Anthropic's Opus 4.5 Model Rolls Out, Addressing Key User Pain Points
Viqus Verdict: 8
What is the Viqus Verdict?
We evaluate each news story based on its real impact versus its media hype to offer a clear and objective perspective.
AI Analysis:
While the release is a notable advancement, the improvements are primarily focused on refinement and usability, suggesting a measured approach aligned with Anthropic’s philosophy. The score reflects the real-world impact of improved usability and efficiency, alongside sustained media attention.
Article Summary
Anthropic has unveiled Opus 4.5, a significant update to its flagship Claude model, designed to directly address a major user frustration. The most notable change is a redesigned conversation management system that prevents abrupt terminations of long conversations, a common issue with previous Claude versions. Instead of simply cutting off conversations that exceed the 200,000-token context window, Opus 4.5 now intelligently summarizes earlier parts of the conversation, discarding extraneous information while retaining key details.

Beyond this core improvement, Opus 4.5 boasts enhanced coding performance, achieving an 80.9% accuracy score on the SWE-Bench Verified benchmark and surpassing GPT-5.1-Codex-Max and Gemini 3 Pro. The model is also more token-efficient, significantly reducing output tokens compared to previous models. For developers, Anthropic has added a new 'effort' parameter that gives precise control over the trade-off between model capability and token consumption. The launch further includes expanded access to Claude Code in the desktop apps and a drastically reduced API pricing structure ($5 input / $25 output per million tokens).

Key Points
- Opus 4.5 addresses abrupt conversation terminations in Claude models, which previously left long-running conversations cut off or incoherent.
- The model achieves an 80.9% accuracy score on the SWE-Bench Verified benchmark, exceeding GPT-5.1-Codex-Max and Gemini 3 Pro in coding performance.
- Opus 4.5 significantly improves token efficiency, offering a substantial reduction in output tokens for comparable results.
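To make the new developer-facing pieces concrete, here is a minimal sketch in Python. It only assembles a request payload and estimates cost; it makes no live API call. The field name `effort`, its value set (`low`/`medium`/`high`), and the model identifier `claude-opus-4-5` are assumptions for illustration; consult Anthropic's API documentation for the actual request shape. Only the $5/$25 per-million-token prices come from the article.

```python
def build_request(prompt: str, effort: str = "high") -> dict:
    """Assemble a Messages-API-style request payload (hypothetical shape,
    not a live call). The 'effort' field name, its value set, and the
    model identifier are assumptions, not confirmed API details."""
    assert effort in {"low", "medium", "high"}  # assumed value set
    return {
        "model": "claude-opus-4-5",   # assumed model identifier
        "max_tokens": 1024,
        "effort": effort,             # assumed top-level placement
        "messages": [{"role": "user", "content": prompt}],
    }


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD at the announced $5 input / $25 output per million tokens."""
    return input_tokens / 1e6 * 5.0 + output_tokens / 1e6 * 25.0
```

For example, a request that consumes 200,000 input tokens and produces 4,000 output tokens would cost `estimate_cost(200_000, 4_000)`, i.e. $1.00 for input plus $0.10 for output.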