Anthropic Releases Claude Sonnet 5, Matching Larger Models on Physics Demos

June 30, 2026 at 23:15 EDT

Anthropic released Claude Sonnet 5, a new version of its mid-tier model, around June 30, 2026. In a third-party physics-simulation comparison, it matched the accuracy of pricier higher-end models while using 15,047 tokens at a task cost of $0.15, the lowest of the four models tested on both counts.

June 30, 2026 · Anthropic

Claude Sonnet 5 matches top models on physics code, using about half the tokens of GPT-5.5

In a same-prompt test that built three HTML5 Canvas crash demos, Sonnet 5 reached accuracy comparable to GPT-5.5 and Opus 4.8 while using the fewest tokens at the lowest cost. Anthropic calls it its "most agentic Sonnet."

15,047 tokens used, the fewest of the group

$0.15 task cost, the lowest of the group

~33% cheaper API pricing than Sonnet 4.6

Tokens per task (same prompt, four models; lower is better)

Sonnet 5: 15,047
Opus 4.8: 23,063
Sonnet 4.6: 25,824
GPT-5.5: 31,152

Cost per task in dollars (lower is better)

Sonnet 5: $0.15
Sonnet 4.6: $0.39
Opus 4.8: $0.58
GPT-5.5: $0.94
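
The two charts can be cross-checked against each other. Dividing each model's task cost by its token count gives a rough blended rate per million tokens. This is a back-of-the-envelope sketch, not official pricing: it assumes the reported costs (rounded to the cent) cover exactly the reported tokens, and it ignores the split between input and output tokens. The function names are illustrative.

```python
# Figures reported in the article's two charts: (tokens used, task cost in USD).
RESULTS = {
    "Sonnet 5": (15_047, 0.15),
    "Opus 4.8": (23_063, 0.58),
    "Sonnet 4.6": (25_824, 0.39),
    "GPT-5.5": (31_152, 0.94),
}


def blended_rate_per_million(tokens: int, cost_usd: float) -> float:
    """Effective dollars per million tokens for one task (input and output mixed)."""
    return cost_usd / tokens * 1_000_000


def relative_saving(new: float, old: float) -> float:
    """Fraction by which `new` undercuts `old`."""
    return 1 - new / old


rates = {name: blended_rate_per_million(t, c) for name, (t, c) in RESULTS.items()}

# Print the models from cheapest to most expensive blended rate.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{name:<11} ${rate:6.2f} per million tokens")

saving = relative_saving(rates["Sonnet 5"], rates["Sonnet 4.6"])
print(f"Sonnet 5 vs Sonnet 4.6 blended rate: {saving:.0%} lower")
```

On these figures the Sonnet 5 blended rate comes out roughly a third below Sonnet 4.6's, which is in line with the quoted ~33% price cut. The token counts also show where the "half the tokens" claim holds: Sonnet 5 used about 48% of GPT-5.5's tokens but about 65% of Opus 4.8's.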

The task & the verdict

One prompt: build three self-contained physics crash demos on HTML5 Canvas.

🚗 Car into a brick wall

🏚 Wrecking ball vs house

🏰 Catapult vs castle wall

Sonnet 5 beat Opus 4.8 on the wrecking-ball test and topped GPT-5.5 on the catapult test — though its graphics detail still trailed the others.

Supporters say

In agentic loops the context grows with every step, so fewer tokens per task compound into lower long-term cost and faster runs, a real edge for building things like learning games.
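
The supporters' compounding argument can be illustrated with a toy model. It assumes, purely for illustration, that every step of a loop emits the same number of tokens as the single-task figures in the benchmark and that each step re-reads everything produced before it. Real agents that cache, summarize, or truncate context will behave differently, and the ten-step loop length is hypothetical.

```python
def cumulative_tokens(tokens_per_step: int, steps: int) -> int:
    """Total tokens processed when each step re-reads all earlier output.

    Step i processes the i - 1 earlier outputs plus its own, so the total is
    tokens_per_step * (1 + 2 + ... + steps).
    """
    return tokens_per_step * steps * (steps + 1) // 2


STEPS = 10  # hypothetical loop length, not a figure from the benchmark

sonnet_5 = cumulative_tokens(15_047, STEPS)
gpt_5_5 = cumulative_tokens(31_152, STEPS)

print(f"Sonnet 5: {sonnet_5:,} tokens over {STEPS} steps")
print(f"GPT-5.5:  {gpt_5_5:,} tokens over {STEPS} steps")
print(f"Gap:      {gpt_5_5 - sonnet_5:,} tokens")
```

Under these assumptions the ratio between the two models stays the same, but the absolute gap widens from 16,105 tokens on a single task to 885,775 tokens over ten steps.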

Skeptics say

A single demo does not generalize, and Opus 4.8 may simply have been the weaker baseline in this test. A separate analysis reportedly rated Opus 4.8 higher and found Sonnet 5 pricier, and failure-case behavior still needs testing.
