Anthropic overtakes OpenAI: Claude Opus 4 codes seven hours nonstop, sets record SWE-Bench score and reshapes enterprise AI
Summary
AI company Anthropic has unveiled two new AI models, Claude Opus 4 and Claude Sonnet 4, which are more advanced than its previous models and can maintain focus for longer.
During testing at Rakuten, the Opus 4 model worked on a complex open-source refactoring project for nearly seven hours, a significant advance over the short attention spans of earlier AI models.
This demonstrates that AI systems can now work on complex software engineering projects from start to finish without human intervention.
Claude Opus 4 achieved a 72.5% score on SWE-bench, a software engineering benchmark, ahead of OpenAI’s GPT-4.1, which scored 55% when it launched in April.
Anthropic's new hybrid reasoning model has sparked a shift in how people use AI, with users increasingly treating it as a strategic partner for complicated problems.