Here’s how deepfake vishing attacks work, and why they can be hard to detect
Summary
Deepfake scam calls, in which the caller’s voice is cloned using artificial intelligence (AI), are easy to reproduce at scale and difficult to detect, according to security firm Group-IB.
The process involves collecting voice samples – sometimes as short as three seconds – and using them to train AI-based speech-synthesis engines, such as those offered by Google and Microsoft, as well as smaller companies like ElevenLabs and Resemble AI.
Although such services are supposed to be used only for “beneficial purposes”, Consumer Reports found in March that the safeguards in place could easily be bypassed.
The next step is to spoof the number belonging to the person or organisation being impersonated.
The call is then initiated, either following a pre-recorded script or, in more sophisticated attacks, using real-time speech generated by voice-masking or voice-conversion software, which lets the attacker respond to questions from the recipient.