Insight
Ben Colman
Co-Founder and CEO
X's latest announcement that AI chatbots will be allowed to write Community Notes represents another step in the platform's evolution of crowd-sourced fact-checking. While the move promises to accelerate and scale content moderation, it fundamentally misunderstands the nature of modern digital threats — particularly deepfakes. The core problem isn't just speed or scale; it's timing.
Community Notes is a Twitter-era feature that Elon Musk has expanded under his ownership of the service, now called X. Users who are part of this fact-checking program can contribute comments that add context to certain posts, which are then checked by other users before they appear attached to a post. Now, with the addition of AI writers, X hopes to speed up this process. But here's the critical issue: Community Notes are inherently reactive.
Think about how this system works: Someone posts misleading content (potentially a deepfake). The content spreads virally, reaching thousands or millions. Eventually, someone flags it for a Community Note. An AI (or human) writes a note. Other users review and approve the note. Finally, the note appears on the original post.
By the time this process completes — even with AI acceleration — the damage is done. The misinformation has already spread far beyond the original post through screenshots, shares, and reposts. It's like trying to recall contaminated food after it's already been consumed.
The asymmetry between original posts and their corrections is staggering. A viral post containing a deepfake can reach millions within hours. But how many people return to see the Community Note added later? The answer is devastating for those hoping crowd-sourced fact-checking can solve our misinformation crisis.
Consider this: when a deepfake video of a CEO announcing bankruptcy spreads across social media, it can tank stock prices within minutes. By the time a Community Note appears hours or days later, investors have already panic-sold, employees have updated their resumes, and competitors have seized the narrative. The correction reaches a fraction of the original audience — those few who happen to revisit the post.
X's addition of AI-powered Community Notes writers introduces new risks. A paper published this week by researchers working on X Community Notes recommends that humans and LLMs work in tandem: human feedback can improve AI note generation through reinforcement learning, with human note raters remaining as a final check before notes are published.
But several problems emerge. Large language models excel at generating convincing text, regardless of accuracy. An AI optimized for creating "helpful" Community Notes might produce beautifully written, seemingly authoritative corrections that are themselves incorrect. OpenAI's ChatGPT, for example, recently experienced issues with a model being overly sycophantic. If an LLM prioritizes "helpfulness" over accurately completing a fact-check, the AI-generated notes may end up being flat-out inaccurate.
Additionally, AI systems trained on existing Community Notes will inherit their biases. If the current system already struggles with certain types of misinformation, AI writers will amplify these blind spots. With AI now able to propose notes, bad actors can deploy their own AI systems to flood the zone with misleading "corrections," creating a new battlefield where competing AIs fight over narrative control.
The challenges of reactive fact-checking become even clearer when we look at Meta's struggling Community Notes pilot. Based on the screenshots collected, a reporter concluded "it's clear that Community Notes on Facebook and Instagram is not yet ready for prime time." After three months, Meta's version of the system remains deeply flawed.
One of the biggest concerns is coordinated manipulation. Given the presence of organized troll networks on social media, there is a high risk that coordinated groups could misuse the program to flag legitimate content as misinformation. Even X's more mature system faces this issue — fewer than nine percent of submitted notes ever reach the general audience.
Deepfakes represent a unique challenge that Community Notes — AI-powered or not — simply cannot address effectively. Unlike traditional misinformation that might involve misleading text or out-of-context images, deepfakes operate at unprecedented speed and scale.
A single deepfake video of an executive in a compromising situation can end careers before any fact-check appears. Financial deepfakes can trigger automated trading systems, causing real economic damage in milliseconds. Deepfake voice calls can breach corporate systems before anyone realizes they've been deceived.
These aren't theoretical risks. We've already seen deepfakes used to defraud companies of millions, manipulate elections, and destroy personal lives. In each case, the damage occurred in real time while fact-checking limped behind.
The answer isn't to make reactive fact-checking faster; it's to detect deepfakes before they spread. This requires a fundamentally different approach: real-time detection that analyzes media as it's uploaded or shared, flagging potential deepfakes immediately. Modern deepfakes span audio, images, and even text, so effective protection requires systems that can detect manipulation across all formats simultaneously.
Rather than adding notes after the fact, deepfake detection needs to be embedded directly into communication channels — video calls, upload processes, and content distribution systems. As deepfake technology advances, detection systems must evolve in parallel through ongoing research, not just crowd-sourced opinions.
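To make the idea concrete, here is a minimal sketch of what an upload-time detection hook could look like. Every name in it (analyze_media, DetectionResult, the 0.9 threshold) is a hypothetical placeholder rather than a real platform or vendor API; the point is simply that analysis happens before content is distributed, not after.

from dataclasses import dataclass

@dataclass
class DetectionResult:
    modality: str               # "video", "audio", "image", or "text"
    manipulation_score: float   # 0.0 = likely authentic, 1.0 = likely manipulated

def analyze_media(file_bytes: bytes, modality: str) -> DetectionResult:
    """Placeholder for a multimodal deepfake detector."""
    # A real detector would run learned models here; this stub returns a
    # fixed score so the example stays self-contained and runnable.
    return DetectionResult(modality=modality, manipulation_score=0.2)

def handle_upload(file_bytes: bytes, modality: str, threshold: float = 0.9) -> str:
    """Gate distribution on a detection pass instead of correcting afterward."""
    result = analyze_media(file_bytes, modality)
    if result.manipulation_score >= threshold:
        # Block or hold for review *before* the content can spread.
        return "held_for_review"
    return "published"

if __name__ == "__main__":
    print(handle_upload(b"...", "video"))  # -> "published" with the stub score

The design choice the sketch illustrates is the ordering: the detection check sits in the upload path, so a flagged item never reaches the audience that a later correction would have to chase.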
X's AI-powered Community Notes represent an evolution of a fundamentally flawed approach. No matter how fast AI can write fact-checks, they'll always arrive after the damage is done. In an era where deepfakes can destroy reputations, manipulate markets, and compromise security in real time, reactive measures are simply inadequate.
The tech industry needs to recognize that the deepfake threat requires proactive, integrated detection systems that operate at the speed of modern communication. Anything less leaves individuals, enterprises, and institutions vulnerable to attacks that move faster than any fact-checker — human or AI — can respond.
As we've seen with Meta's struggling pilot and X's low success rate for displayed notes, even the best reactive systems fail to protect against sophisticated threats. The future of digital security doesn't lie in better corrections — it lies in preventing deception before it spreads.
Reality Defender provides real-time, multimodal deepfake detection that secures communication channels before damage occurs. Learn more about proactive deepfake defense here.