BREAKING
Claude Sonnet 5 wins efficiency test
Tokens used per task
Sonnet 5: 15,047
Opus 4.8: 23,063
Sonnet 4.6: 25,824
GPT-5.5: 31,152
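For a sense of scale, the gaps between the token counts in the chart can be worked out directly. The short sketch below is illustrative arithmetic only: the variable names are made up here, and the only inputs are the four figures shown in the chart.

```python
# Token counts per task, copied from the chart above.
tokens_per_task = {
    "Sonnet 5": 15047,
    "Opus 4.8": 23063,
    "Sonnet 4.6": 25824,
    "GPT-5.5": 31152,
}

baseline = tokens_per_task["Sonnet 5"]

# Percentage fewer tokens Sonnet 5 used relative to each other model.
savings = {
    model: (1 - baseline / count) * 100
    for model, count in tokens_per_task.items()
    if model != "Sonnet 5"
}

for model, pct in savings.items():
    print(f"Sonnet 5 used {pct:.1f}% fewer tokens than {model}")
```

On these figures Sonnet 5 used roughly 35% fewer tokens than Opus 4.8, 42% fewer than Sonnet 4.6, and 52% fewer than GPT-5.5. This measures tokens only, not price, and it reflects a single test.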
What the test asked models to build
1. Car hits wall
2. Wrecking ball
3. Catapult
Strengths versus open questions
Sonnet 5 edge (Pro)
● Most agentic Sonnet
● ~33% cheaper than Sonnet 4.6
● Strong token efficiency
Skeptics say (Caution)
● One-off demo, not general
● Weaker graphics detail
● Failure cases untested
Efficiency claim needs more tests
AI NEWS BLITZ
Anthropic's Claude Sonnet 5 topped a physics code test using the fewest tokens.