Claude Opus 4.1 Hits 74.5% on SWE-bench, Outperforming Claude Opus 4 and Competing with GPT-4 and Gemini Ultra

Simba Gondo

Translate this article

Updated:

August 9, 2025

Anthropic has announced the release of Claude Opus 4.1, an updated version of its Claude Opus 4 model, focusing on improvements in agentic tasks, real-world coding, and reasoning. The new model is now available to paid Claude users, as well as through Claude Code, Anthropic’s API, Amazon Bedrock, and Google Cloud’s Vertex AI. Pricing remains unchanged from Opus 4.

Key Improvements in Claude Opus 4.1

Claude Opus 4.1 demonstrates notable advancements, particularly in coding performance, achieving a 74.5% score on the SWE-bench Verified evaluation. This reflects enhanced capabilities in handling complex coding tasks, with specific strengths in multi-file code refactoring. According to GitHub, Opus 4.1 shows improvements across most capabilities compared to its predecessor, with significant gains in managing intricate codebases.

Rakuten Group highlighted Opus 4.1’s ability to make precise corrections within large codebases without introducing unnecessary changes or bugs, making it a valuable tool for daily debugging tasks. Windsurf reported that Opus 4.1 delivers a one standard deviation improvement over Opus 4 on their junior developer benchmark, marking a performance leap comparable to the transition from Claude Sonnet 3.7 to Sonnet 4.

Beyond coding, Opus 4.1 strengthens Claude’s skills in in-depth research and data analysis, particularly in detail tracking and agentic search, enabling more accurate and efficient handling of complex tasks.

How to Access Claude Opus 4.1

Anthropic recommends upgrading to Opus 4.1 for all users of Opus 4. Developers can access the model via the API using the identifier claude-opus-4-1-20250805. Additional details are available on Anthropic’s system card, model page, pricing page, and documentation.

Anthropic encourages user feedback to refine its models and plans to roll out more substantial updates in the coming weeks. This release underscores their ongoing commitment to improving AI performance in practical, real-world applications.

airesearch and innovation

About the Author

Simba Gondo