Summary

  • AI company Anthropic has unveiled two new AI models, Claude Opus 4 and Claude Sonnet 4, which are more advanced than previous models and maintain focus for longer.
  • During testing at Rakuten, the Opus 4 model worked on a complex open-source refactoring project for nearly seven hours, a significant improvement over the fleeting attention spans of earlier AI models.
  • This demonstrates how AI systems are now capable of working on complex software engineering projects from start to finish without the need for human intervention.
  • Claude Opus 4 has achieved a 72.5% score on SWE-bench, a software engineering benchmark, higher than OpenAI’s GPT-4.1, which scored 55% when launched in April.
  • The new hybrid reasoning model has changed how people use AI, with users increasingly regarding it as a strategic partner for complicated problems.

By Michael Nuñez

Original Article