On May 22, 2025, Anthropic released Claude Opus 4 and Claude Sonnet 4. The announcement calls Claude Opus 4 “the world’s best coding model,” reporting a leading SWE-bench score of 72.5 percent, with Claude Sonnet 4 close behind at 72.7 percent on the same benchmark.
The release was notable less for raw benchmark numbers than for endurance. Anthropic said Opus 4 “delivers sustained performance on long-running tasks that require focused effort and thousands of steps, with the ability to work continuously for several hours.” The models added capabilities aimed at autonomous work, including parallel tool use and improved memory for maintaining context over extended interactions, “significantly expanding what AI agents can accomplish.”
This mattered because it crystallized a shift in what frontier models were being designed to do. Where earlier systems answered a prompt, Claude 4 was positioned to carry out extended, multi-step work with minimal supervision - the foundation for the agentic coding tools that defined the following year. Combined with Claude Code, it pushed software development toward AI agents that take on whole tasks rather than autocomplete lines.