AI/ML Security

Deepfake Attacks Are Here and Your Security Team Isn't Ready

Jun 3, 2026 · 5 min read

We built technology capable of cloning a person's voice from three seconds of audio, generating a real-time video of their face, and running both simultaneously on a live call. It took the scientific community decades and billions of dollars in research to make this possible.

The first major documented enterprise use case was fraud.

In early 2024, a finance employee at a multinational company in Hong Kong received an invitation to a video call. On the call were several of his colleagues, including the CFO. They discussed an urgent and confidential transaction. He transferred roughly 25 million US dollars (about HK$200 million). Every person on the call except him was an AI-generated deepfake. None of the colleagues were real. The CFO was not real. The call was entirely fabricated from publicly available video and audio of the actual employees.

This is not a science fiction scenario. It happened. It will happen again.

How We Got Here Faster Than Expected

The capability gap closed faster than most security teams anticipated because the inputs required to create convincing deepfakes are freely available for almost every senior person in an organization. Executive keynote speeches on YouTube. Podcast appearances. LinkedIn video posts. Earnings call recordings. The more public-facing a person is, the more training data exists for their voice and likeness.

Voice cloning in particular has become alarmingly accessible. Several commercially available tools can produce a convincing vocal clone from a short audio sample. The barrier is no longer technical. It is a question of intent and a free trial account.

The scenarios this enables go well beyond video calls. Voicemail from what sounds exactly like a manager requesting urgent action. A voice note in a messaging app that matches someone's cadence and vocabulary. Phone calls that pass the "does this sound like them" test that employees have been implicitly relying on as an authentication method for decades.

Why Traditional Security Training Does Not Cover This

Security awareness training was built around a threat model where social engineering attacks had tells. The grammar was slightly off. The email came from a domain that was almost right. The urgent request had logical inconsistencies. The attacker's accent didn't match their claimed location.

Deepfake attacks remove most of those tells. The person looks right, sounds right, and behaves consistently with how the target would expect them to behave, because the model was trained on real footage of the real person. The cognitive shortcuts employees use to validate identity do not work when the identity itself is fabricated.

There is also a broader problem: most employees have no mental model for this threat at all. They understand that phishing emails exist. The concept that a video call with their CFO might be entirely synthetic is not something that has landed as a realistic possibility for most non-security people. Attackers exploit the gap between what people believe is technically possible and what actually is.

What You Can Actually Do About It

Use verification protocols that do not rely on audiovisual identity. Any request for a financial transfer, a credential change, or access to sensitive systems should require confirmation through a separate, pre-established channel, no matter how the request arrives, including a seemingly legitimate call. A call that looks like your CFO authorizing a wire transfer gets confirmed via a direct call to a number already on file, not a number provided during that call.
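As a rough illustration of that rule, here is a minimal Python sketch. Everything in it is hypothetical: the CONTACTS_ON_FILE directory, the action names, and the Request fields are assumptions made for the example, not a reference to any real system. The point it shows is that the callback number always comes from records made in advance, and any number offered during the request is ignored.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical directory of numbers recorded before any request arrives.
# In practice this would come from a trusted internal source.
CONTACTS_ON_FILE = {
    "cfo@example.com": "+1-555-0100",
}

# Hypothetical set of actions that always require out-of-band confirmation.
HIGH_RISK_ACTIONS = {"wire_transfer", "credential_change", "access_grant"}


@dataclass
class Request:
    requester: str                       # claimed identity of the person asking
    action: str                          # what they are asking for
    inbound_channel: str                 # how the request arrived, e.g. "video_call"
    offered_number: Optional[str] = None  # number supplied during the request


def callback_number_for(request: Request) -> str:
    """Return the number to call back, taken only from records on file."""
    # request.offered_number is deliberately never consulted here.
    if request.requester not in CONTACTS_ON_FILE:
        raise LookupError(f"no contact on file for {request.requester}; escalate")
    return CONTACTS_ON_FILE[request.requester]


def may_proceed(request: Request, confirmed_via_callback: bool) -> bool:
    """High-risk actions proceed only after confirmation on the second channel."""
    if request.action not in HIGH_RISK_ACTIONS:
        return True
    return confirmed_via_callback


request = Request(
    requester="cfo@example.com",
    action="wire_transfer",
    inbound_channel="video_call",
    offered_number="+1-555-0199",  # supplied on the call, so not trusted
)
print(callback_number_for(request))
print(may_proceed(request, confirmed_via_callback=False))
```

The design choice worth noting is that how convincing the inbound call looked is not an input to the decision at all.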

Agree on code words and challenge phrases for high-stakes internal communications. They must be established in advance and never transmitted over the channel being verified. They are simple and fast to use in practice, and they break the deepfake scenario because the attacker cannot know a phrase that was agreed on internally.
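If a team wants tooling to check a phrase rather than relying on memory alone, one possible approach is sketched below in Python. It is an assumption of this example, not something prescribed above, that the phrase is stored only as a salted hash and compared in constant time. The function names, the iteration count, and the sample phrase are all hypothetical.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # hypothetical work factor for this sketch


def _normalize(phrase: str) -> bytes:
    """Lowercase and collapse whitespace so minor typing differences do not matter."""
    return " ".join(phrase.lower().split()).encode("utf-8")


def enroll_phrase(phrase: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) to store; the phrase itself is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", _normalize(phrase), salt, ITERATIONS)
    return salt, digest


def phrase_matches(candidate: str, salt: bytes, digest: bytes) -> bool:
    """Check a candidate phrase against the stored digest in constant time."""
    attempt = hashlib.pbkdf2_hmac("sha256", _normalize(candidate), salt, ITERATIONS)
    return hmac.compare_digest(attempt, digest)


# Enrollment happens in advance, away from any channel an attacker could observe.
salt, digest = enroll_phrase("blue heron at noon")
print(phrase_matches("Blue  Heron at noon", salt, digest))
print(phrase_matches("blue heron at midnight", salt, digest))
```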

Limit the public audio and video exposure of your most targeted employees where possible. This is genuinely difficult for public-facing executives, but it is worth auditing what is available and what could be removed or access-restricted without meaningful business cost.

Deploy deepfake detection tools with realistic expectations. Detection technology exists and continues to improve. It also does not catch everything and will miss well-constructed attacks. Use it as a layer, not a solution.
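One way to picture "a layer, not a solution" is as a decision rule in which a detector can add scrutiny but can never remove it. The Python sketch below assumes a hypothetical detector that returns a score between 0 and 1, where higher means more likely synthetic. The function name and the threshold are illustrative and not taken from any real product.

```python
from typing import Optional


def requires_out_of_band_check(
    action_is_high_risk: bool,
    detector_score: Optional[float],
    threshold: float = 0.5,  # hypothetical cut-off for "looks synthetic"
) -> bool:
    """Decide whether a request must be confirmed on a separate channel."""
    # High-risk requests are always verified. A clean detector result does
    # not waive this, because detectors miss well-constructed attacks.
    if action_is_high_risk:
        return True
    # For lower-risk requests, a suspicious score escalates to verification.
    if detector_score is not None and detector_score >= threshold:
        return True
    return False


print(requires_out_of_band_check(True, 0.01))
print(requires_out_of_band_check(False, 0.90))
print(requires_out_of_band_check(False, 0.10))
```

The score only ever moves a request toward verification, so a false negative from the detector leaves the high-risk path unchanged.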

The underlying principle is the same one that has always applied to social engineering: verify through a channel the attacker does not control. The channel used to initiate contact is not that channel. It never was.

The Uncomfortable Trajectory

The deepfake that cost 25 million dollars in 2024 required meaningful resources and technical sophistication. The same capability will likely be considerably cheaper and more accessible in two years. Attack techniques that feel exotic when they first appear in headlines tend to become commodities within a few years.

Building the verification habits and internal protocols now, before this is a common attack vector in your industry, is considerably easier than retrofitting them after an incident. The window exists. Use it.