Insight
Ben Colman
Co-Founder and CEO
Imagine attending a virtual conference and talking with trusted colleagues and industry experts on a panel. You assume the people are real, judging by the clarity of the images and the lifelike movements. You’ve seen them in person. You’ve spoken with them and laughed with them dozens of times. The voices seem authentic.
But lurking beneath that trust is a critical vulnerability. We can no longer assume that the people we see interacting in videos are who they appear to be. Deepfakes have advanced to a point where synthetic videos are indistinguishable from the real thing.
To illustrate this point, have a look below at the video of me speaking at the Web Summit last year. The talk was real; I was there, on that stage. But the video is wholly artificial.
Every gesture, every movement, every frame you're watching never existed. It's a deepfake generated from a single photograph.
People are finding it increasingly difficult to tell the difference between real and synthetic videos of people. A recent study of more than 2,000 UK and U.S. consumers found that when presented with a mix of real and deepfake images and videos, only 0.1% correctly identified what was real and what was fake.
The evolution of deepfakes has reached an inflection point. Google's newest AI video generator, Veo 3, produces clips so realistic that most people wouldn't know they weren't made by human filmmakers and actors. As Veo 3 floods the internet with real-looking clips, you could say we've left the uncanny valley behind and entered an era of indistinguishable synthetic media.
Anyone can create convincing deepfakes now. Veo 3 puts this technology in the hands of Google AI subscribers, but any user with a smartphone can use cheaper alternatives — like HeyGen avatars or open-source models — to generate deepfake videos with just a few clicks.
For deepfake audio, realism arrived more quickly than expected: a November 2024 University of Florida study found that AI-generated audio was fooling listeners nearly one in four times. In video, thanks to Veo 3's AI engine, the visual glitches, unnatural movements, and facial distortions that once gave deepfakes away are now virtually undetectable. Fraudsters are weaponizing this accessibility: a deepfake attempt occurred every five minutes in 2024, and losses exceeded $200 million during the first quarter of 2025.
As a solution, watermarking AI-generated content has fallen short, because watermarks are easy for threat actors to remove. In a 2024 study, for example, researchers at ETH Zurich achieved an 85% success rate when removing watermarks from AI-generated text. Watermarks can also be added to authentic content, making it appear AI-generated.
Watermarking's efficacy is further eroded because companies implement it on an opt-in basis, creating platform gaps that fraudsters can exploit. And because platforms rely on a variety of proprietary watermarking methods, there is no consistent way to detect every type of watermark, while readily available tools let bad actors strip or falsify the markers that do exist.
A deepfake can also go viral faster than it can be detected, so removal often comes only after the damage is done. In January 2024, for example, Taylor Swift deepfakes attracted 45 million views and 24,000 reposts in the 17 hours before removal, and a 2023 deepfake of Pope Francis in a Balenciaga jacket quickly garnered more than 20 million views.
Manual detection methods, which rely on spotting visual and behavioral tells, may still work, but they are becoming less reliable as deepfakes mimic natural human movements, expressions, and interactions well enough to fool even experts. Shown the deepfake Web Summit video referenced above without context, several Reality Defender team members were deceived by its convincing audio-visual synchronization and only identified it as synthetic after technical analysis.
Deepfake technology now underpins targeted social engineering and executive impersonation efforts, enabling bad actors to convincingly mimic the voices and appearances of real individuals. CrowdStrike's 2025 Global Threat Report shows a 442% increase in voice phishing attacks in late 2024, driven by AI-generated impersonation tactics. Cybercriminals impersonate senior executives using deepfake audio or video, convincing employees to transfer funds or share sensitive information.
Financial fraud and communications compromise are already happening. In March, the finance director of a multinational company in Singapore was contacted by a scammer impersonating the CFO and was about to transfer $499,000 when law enforcement intervened. In early 2024, an Arup employee was tricked into transferring $25 million following a video call with deepfakes posing as senior management.
Given the growing difficulty of distinguishing deepfakes from authentic human content, a deepfake may go undetected for hours, months, or even longer. Mike Speirs, a director at AI consultancy Faculty, recently said time is running out for manual detection of deepfakes, with models developing at a pace that is both incredible and alarming.
In recent cases, social dynamics helped fraudsters break through company controls: a convincing voice, a sense of urgency, and the impersonated executive's position of authority led employees to skip verification of the requests. Most organizations lack deepfake detection capabilities, so an attack could continue for days, weeks, or months until financial losses are discovered or external parties raise concerns.
Without real-time detection systems, these attacks persist until the damage is already done. Traditional security controls fail when the attack bypasses technology entirely, exploiting human trust in familiar voices and faces.
The future lies in detection systems that analyze multiple signals (audio patterns, visual inconsistencies, behavioral anomalies) without relying on embedded markers.
Multi-modal approaches represent the only viable defense. Human factors, protocols, and separation of duties are essential, but technology must underpin these processes. Advanced systems analyze content across dimensions in real time, including lip-sync accuracy, micro-expressions, voice biometrics, and contextual inconsistencies.
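To make the idea of multi-modal analysis concrete, here is a minimal sketch of fusing per-modality scores into a single manipulation likelihood. The signal names, weights, and the simple weighted average are illustrative assumptions, not Reality Defender's model or any vendor's API; production systems typically learn and calibrate the fusion from data rather than hard-coding it.

```python
from dataclasses import dataclass

@dataclass
class ModalityScores:
    """Hypothetical per-modality scores in [0, 1]; higher = more likely synthetic."""
    lip_sync: float          # audio-visual alignment anomaly
    micro_expression: float  # facial micro-expression consistency
    voice_biometric: float   # voice cloning / liveness likelihood
    context: float           # metadata and contextual inconsistencies

# Illustrative weights only; a real system would learn and calibrate these.
DEFAULT_WEIGHTS = {
    "lip_sync": 0.30,
    "micro_expression": 0.25,
    "voice_biometric": 0.30,
    "context": 0.15,
}

def fuse_scores(scores: ModalityScores, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Combine per-modality scores into one manipulation likelihood via a weighted average."""
    total = sum(weights.values())
    return sum(getattr(scores, name) * w for name, w in weights.items()) / total

if __name__ == "__main__":
    sample = ModalityScores(lip_sync=0.82, micro_expression=0.67,
                            voice_biometric=0.91, context=0.40)
    print(f"Fused manipulation likelihood: {fuse_scores(sample):.2f}")
```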
Solutions like Reality Defender's instant detection capability demonstrate that deepfake identification is technologically feasible, providing the real-time analysis enterprises need to stop attacks before damage occurs. Multiple video detection models analyze faces in real time, producing a score that indicates the likelihood of AI manipulation. If deepfake technology is used to spoof identity verification flows or impersonate a trusted colleague or contact, the system alerts the user with an actionable report.
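Along the same lines, the sketch below shows how per-frame scores from an ensemble of video models might be aggregated over a short window and turned into an alert. The threshold, window size, and report fields are assumptions made for illustration, not the parameters or output format of Reality Defender's product.

```python
from collections import deque
from statistics import mean
from typing import Optional

# Assumed values for this sketch; real deployments would tune them.
ALERT_THRESHOLD = 0.80  # fused score above which content is flagged
WINDOW_FRAMES = 30      # roughly one second of video at 30 fps

class RealTimeAlerter:
    """Aggregates per-frame manipulation scores so a single noisy frame does not
    trigger a false alarm, and emits a report when the rolling average crosses
    the threshold."""

    def __init__(self) -> None:
        self.window: deque = deque(maxlen=WINDOW_FRAMES)

    def ingest(self, frame_score: float) -> Optional[dict]:
        self.window.append(frame_score)
        if len(self.window) < WINDOW_FRAMES:
            return None  # not enough evidence yet
        rolling = mean(self.window)
        if rolling >= ALERT_THRESHOLD:
            # A real "actionable report" would also reference frames,
            # model versions, and recommended next steps.
            return {
                "verdict": "likely_manipulated",
                "confidence": round(rolling, 2),
                "frames_analyzed": len(self.window),
            }
        return None
```

In this toy setup, each frame's score could come from a fusion step like the fuse_scores sketch above.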
Organizations must prepare now, not react later. Investing in proactive cybersecurity measures is cost-effective: the financial burden of rebuilding a reputation after a deepfake attack is significantly greater than the cost of implementing robust protocols ahead of time.
Reality Defender makes comprehensive deepfake detection — across video, voice, and audio — simple and accessible. Book your demo today to see our solutions in action.