
Anthropic has announced the release of Claude Sonnet 4.5, which it claims is the “best coding model in the world” and the “strongest model for building complex agents.”
It scores 77.2% on SWE-bench, a software engineering benchmark, compared with 74.5% for Claude Opus 4.1 and 72.7% for Claude Sonnet 4. For external comparison, GPT-5 Codex scored 74.5%, GPT-5 scored 72.8%, and Gemini 2.5 Pro scored 67.2%.
It also leads on the OSWorld benchmark, which tests AI models on real-world computer tasks. It scored 61.4% on that benchmark, well ahead of Claude Sonnet 4, which scored 42.2%.
“Sonnet 4.5 can produce near-instant responses or extended, step-by-step thinking that’s made visible to the user,” Anthropic says.
According to Anthropic, Claude Sonnet 4.5 also shows better domain-specific knowledge and reasoning in the fields of finance, law, and medicine.
The model performs better on safety and alignment evaluations, the company claims. It shows a reduction in behaviors such as sycophancy, deception, power-seeking, and the tendency to encourage delusional thinking, and it shows progress in defending against prompt injection attacks.
Pricing for Claude Sonnet 4.5 is the same as for Claude Sonnet 4: $3 per million input tokens and $15 per million output tokens.
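To make those rates concrete, here is a minimal sketch of the arithmetic in Python. The function name and the sample token counts are hypothetical and chosen only for illustration. The two prices are the ones quoted above; the sketch assumes a flat per-token rate with no discounts or surcharges, which may not match an actual bill.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the flat rates above.

    Hypothetical helper: assumes simple per-token pricing only.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints $1.35
```

In this made-up example the output tokens account for more of the cost than the input tokens despite being a quarter of the volume, because output is priced at five times the input rate.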
Alongside the launch of Claude Sonnet 4.5, Anthropic also announced updates across several of its products. Claude Code now has checkpoints that allow developers to save their progress and roll back to earlier versions. The Claude API gained a new context editing feature and a memory tool that allow agents to run longer and handle more complex tasks. Additionally, all Claude apps now have access to code execution and file creation.
The company is also releasing the Claude Agent SDK, which developers can use to build their own agents using the same infrastructure Anthropic uses to power Claude Code.
“We built Claude Code because the tool we wanted didn’t exist yet. The Agent SDK gives you the same foundation to build something just as capable for whatever problem you’re solving,” Anthropic wrote in a blog post.