Anthropic Shifts to User-Generated Training Data, Raises Privacy Concerns
Score: 8
What is the Viqus Verdict?
We evaluate each news story based on its real impact versus its media hype to offer a clear and objective perspective.
AI Analysis:
While the shift itself isn't entirely novel, the scale of data collection and the default opt-in setting have generated significant public awareness and substantial media coverage, indicating a high level of interest and potential controversy.
Article Summary
Anthropic, the AI firm behind the Claude chatbot, is making a significant change to how it trains its AI models. Starting September 28, 2025, the company will begin training its Claude AI models directly on user-generated data, including new chat transcripts and coding sessions, a move away from relying solely on publicly available datasets. Crucially, users can opt out of this data collection, but the default setting is ‘on’. The change also extends Anthropic's data retention period to five years, allowing for more comprehensive model training.

This development heightens privacy concerns, particularly regarding the potential misuse or analysis of sensitive user conversations. The update applies to all consumer subscription tiers – Claude Free, Pro, and Max – and includes Claude Code usage via Amazon Bedrock and Google Cloud's Vertex AI. It does *not* affect Anthropic's commercial usage tiers. Users can adjust their preference via a pop-up notification, but the change applies only to future data, not past sessions. Anthropic emphasizes data filtering and obfuscation techniques to protect user privacy, but the fundamental shift toward training on user-generated data remains a key development.

Key Points
- Anthropic will begin training its Claude AI models on user chat transcripts and coding sessions by default.
- Users have the option to opt out of this data collection, but the default setting is ‘on’ and data will be retained for five years.
- This shift raises significant privacy concerns given the potential for sensitive user data to be used in model training.