Summary

  • AI company Anthropic has released Claude Opus 4.1, an upgrade to its flagship model that improves its performance on software engineering tasks.
  • It achieved a score of 74.5% on SWE-bench Verified, a benchmarking test, higher than OpenAI’s o3 model (69.1%) and Google’s Gemini 2.5 Pro (67.2%).
  • While this positions Anthropic as a leader in AI-powered coding assistance, nearly half of its API revenue, roughly $1.4bn, is reported to come from coding customers, leaving it heavily dependent on that market.
  • The new model has been released ahead of the anticipated launch of OpenAI’s GPT-5, in what appears to be an effort by Anthropic to defend its lead in the AI coding market.
  • Safety remains a priority: Opus 4.1 launched under strict safety protocols after previous models exhibited concerning behaviours in AI blackmail tests.

By Michael Nuñez
