

Insight


Detection Is One Part of the Stack: A Conversation With Yoel Roth

Ben Colman

Co-Founder and CEO

We recently announced the formation of Reality Defender's Ethics Committee. Its founding members are Keith Enright (chief strategy officer at Harvey AI, formerly chief privacy officer at Google), Luciano Floridi (founding director of the Yale Digital Ethics Center), and Yoel Roth (SVP of trust and safety at Match Group, formerly head of trust and safety at Twitter). The committee will advise us on the policy, governance, and accountability questions that come with building detection at enterprise scale.

To mark the announcement of the Ethics Committee, I interviewed each of the members. Below is my conversation with Yoel Roth, who built and led trust and safety at Twitter for more than seven years before joining Match Group. He argues that detection isn't a one-and-done fix but a continuously evolving discipline that has to be paired with the rest of the stack. His answers are practical, candid, and worth reading carefully.

How does your experience battling platform manipulation at a global scale inform your approach to advising an AI detection company like Reality Defender?

Combating adversarial behavior inevitably involves tradeoffs: between safety and privacy, between security and user friction, and so on. And invariably, even good solutions to integrity problems have downsides: systematic bias, inadequate user recourse, and the like. The work I've done throughout my career has been about trying to find the right balance point across all these factors, recognizing that it's a dynamic equilibrium and there are seldom obvious or universally right answers. I take that same mindset when advising companies building in this space: my role is to help balance often competing interests so a company can execute in a constantly changing business and adversarial environment. I see my job as helping to put tradeoffs on the table for consideration, so collectively we can all make better-informed choices.

Platforms often struggle to balance free expression with the need to combat malicious synthetic media. Where do you see the most significant gaps in how major networks currently handle AI-generated content and deepfakes?

You can't view any one solution or approach in isolation. Technology like Reality Defender's is a key part of a detection stack, but it has to be paired with a range of other technologies, both up- and down-funnel, to keep up with adversarial behavior. The biggest gaps I see are in firms that believe the synthetic media problem can be solved with a single approach, or that implementing a technology is a one-and-done fix; it isn't, and ongoing investment is a must.

How has the barrier to entry for threat actors changed with the mainstreaming of generative AI, and what does a successful defense strategy look like today?

This is a tricky one: on one hand, it seems pretty self-evident that mass-market AI tools (most notably LLMs) have "democratized" access to the infrastructure for manipulation at scale. But it's not yet obvious that we're seeing the predicted apocalyptic influx of genAI content. I think back to when StyleGAN first emerged, and everyone predicted that social media would be overrun with impossible-to-detect synthetic faces; what we instead saw at Twitter was that bad actors largely just downloaded the example images from ThisPersonDoesNotExist and recycled those. It's not that the threat is unreal or overblown; it's just that bad actors are integrating these technologies in erratic ways that don't line up with the expectation that they'll always, immediately adopt the newest and most sophisticated tool.

In your experience, how should platforms and organizations translate the technical detection of a deepfake into effective policy and moderation actions?

Start with what your users want, and if you don't know what they want, do research until you do. If you ask your users, customers, and other stakeholders what they want you to do to address a given issue (e.g., deepfake media), they'll often be able to articulate, with significant depth, their preferences and perceived tradeoffs. That can guide when you moderate in a punitive way (removals, bans, and so on) versus adding context through labels. And you have to stay abreast of evolving cultural norms around tech usage: in the dating context, for example, we see people commonly using "AI" apps to do things like retouch their photos to edit out blemishes. That's not really "malicious" as we (or our users) understand it, and treating any use of AI as de facto harmful would overlook that nuance.

Looking ahead, what do you consider the most significant blind spot for the Trust & Safety community when it comes to the rapid advancement of multimodal (audio, video, and text) generative AI?

Content moderation already struggles with a world in which human reviewers may not be able to reliably identify whether content is AI-generated. The entire field is built around the idea that, at some point in the process, a human can make a reliable, conclusive determination of whether a given piece of content or behavior violates a platform's rules. What happens when a human can't make that decision, because the evidence they'd need isn't something you can directly observe with just your eyes? This is already a challenge when it comes to AI-generated imagery, and as the "tells" of AI-generated content become harder and harder to spot, moderation will need to evolve toward approaches that more readily integrate technology as an assistive tool in making these decisions.


About Yoel Roth

Yoel Roth is a trust and safety practitioner and researcher. He is the Senior Vice President of Trust & Safety at Match Group, the parent company of Tinder, Hinge, and dozens of dating apps used by millions of people worldwide. He is also a Non-Resident Scholar at the Carnegie Endowment for International Peace, where his research, teaching, and writing focus on trustworthy governance approaches for social media, AI, and other emerging technologies.

Previously, Yoel was the Head of Trust & Safety at Twitter. For more than seven years, he helped build and lead the teams responsible for Twitter's content moderation, integrity, and election security efforts.

Yoel received his PhD from the Annenberg School for Communication at the University of Pennsylvania. His research examined the technical, policy, business, and governance dynamics of app stores, social networking, and online dating.


Read the announcement of Reality Defender's Ethics Committee.