Reddit Sues Perplexity for Alleged Industrial-Scale Data Scraping

Reddit AI Perplexity Data Scraping Legal OpenAI Tech Law

October 22, 2025

Source: The Verge AI

Data Wars

Media Hype 7/10

Real Impact 8/10

What is the Viqus Verdict?

We evaluate each news story based on its real impact versus its media hype to offer a clear and objective perspective.

AI Analysis:

The immediate media attention is high because of the parties involved and the AI narrative, but the long-term impact will be felt across the entire tech industry as data governance and AI ethics become increasingly central to development.

Article Summary

Reddit has filed a lawsuit against Perplexity, SerpApi, Oxylabs, and AWMProxy, accusing them of industrial-scale data scraping to train AI models. The lawsuit claims that Perplexity, a competitor, uses these data scraping companies to obtain Reddit’s vast trove of user-generated content, circumventing Reddit’s protections and ignoring previous cease-and-desist letters. Reddit argues that Perplexity’s ‘answer engine’ relies heavily on this stolen data, and that the defendants use deceptive practices to mask their identities and bypass security measures. The lawsuit highlights a growing trend of AI companies acquiring large training datasets, often through questionable means. Reddit’s data, which spans billions of posts and conversations, is considered highly valuable for AI model development. Reddit’s earlier API changes were aimed at monetizing this data, and Reddit likens the data scraping companies to ‘would-be bank robbers’ determined to steal it. Perplexity contends that it respects Reddit’s robots.txt and uses only publicly available information, but Reddit says the volume of Reddit citations on Perplexity’s platform has increased since the initial letter.

Key Points

Reddit is suing Perplexity and several data scraping companies for allegedly obtaining its content illegally to train AI models.
Reddit alleges that Perplexity is using these scrapers despite a previous cease-and-desist letter, and claims that Perplexity’s ‘answer engine’ relies on stolen data.
The lawsuit underscores a growing trend of AI companies aggressively seeking large datasets, often bypassing security measures that platforms have put in place.

Why It Matters

This lawsuit is a significant development in the ongoing battle between content platforms and AI developers. It highlights the ethical and legal challenges surrounding data acquisition for AI training and raises concerns about the potential for intellectual property infringement and the exploitation of user-generated content. This case has broader implications for the future of online content and how it’s used to train artificial intelligence, particularly as AI models become increasingly reliant on vast quantities of data. For professionals, this news underscores the increasing legal and regulatory scrutiny surrounding AI development and deployment, and the need for robust data governance policies.
