Anthropic released Claude Sonnet 5, a new version of its mid-tier model, around June 30, 2026. In a third-party physics-simulation comparison, it matched the accuracy of more expensive higher-end models while sharply cutting token usage and cost.
June 30, 2026 · Anthropic
Claude Sonnet 5 matches top models on physics code — using half the tokens
In a same-prompt test that built three HTML5 Canvas crash demos, Sonnet 5 reached accuracy comparable to GPT-5.5 and Opus 4.8 while using the fewest tokens at the lowest cost. Anthropic bills it as its "most agentic Sonnet."
15,047
tokens used — fewest of the group
$0.15
task cost — lowest of the group
~33%
cheaper API pricing than Sonnet 4.6
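As a rough sanity check on the two headline figures, the sketch below divides the reported task cost by the reported token count to get a blended rate per million tokens. The function name is hypothetical. The result is only an average: the source gives no split between input and output tokens, and the $0.15 figure is rounded to the cent.

```python
# Figures reported in the comparison (the dollar amount is rounded to the cent).
TOKENS_USED = 15_047
TASK_COST_USD = 0.15


def blended_rate_per_million(cost_usd: float, tokens: int) -> float:
    """Average dollars per one million tokens, input and output combined."""
    return cost_usd / tokens * 1_000_000


rate = blended_rate_per_million(TASK_COST_USD, TOKENS_USED)
print(f"Blended rate: ${rate:.2f} per million tokens")
```

This works out to about $9.97 per million tokens, which should not be read as an official list price.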
[Chart: Tokens per task, same prompt, four models. Lower is better.]
[Chart: Cost per task ($). Lower is better.]
The task & the verdict
One prompt: build three self-contained physics crash demos on HTML5 Canvas.
🚗 Car into a brick wall
🏚 Wrecking ball vs house
🏰 Catapult vs castle wall
Sonnet 5 beat Opus 4.8 on the wrecking-ball test and topped GPT-5.5 on the catapult test — though its graphics detail still trailed the others.
Supporters say
In agentic loops the token count grows with every step, so using fewer tokens per task compounds into lower long-run cost and faster runs, a real edge for building things like learning games.
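The compounding argument can be illustrated with a toy model. This is a minimal sketch, assuming each step of an agentic loop re-reads the whole accumulated context and then appends its own output. The function name and all numbers are hypothetical and are not measurements from the comparison.

```python
def total_tokens(steps: int, prompt_tokens: int, tokens_per_step: int) -> int:
    """Total tokens processed over a loop that re-reads its growing context.

    Assumption: every step reads the full context so far, then adds
    `tokens_per_step` new tokens to it.
    """
    total = 0
    context = prompt_tokens
    for _ in range(steps):
        total += context + tokens_per_step  # context re-read plus new output
        context += tokens_per_step          # output becomes part of the context
    return total


# Hypothetical comparison: same 10-step loop, verbose vs. terse model.
verbose = total_tokens(steps=10, prompt_tokens=1_000, tokens_per_step=1_000)
terse = total_tokens(steps=10, prompt_tokens=1_000, tokens_per_step=500)
print(verbose, terse)
```

Under these assumptions the verbose run processes 65,000 tokens and the terse run 37,500. The gap widens as the loop gets longer, because the re-read context grows with the square of the step count.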
Skeptics say
A single demo does not generalize, and Opus 4.8 may simply be the weaker baseline here. A separate analysis reportedly rated Opus 4.8 higher and found Sonnet 5 pricier. Failure-case behavior still needs testing.