Some large customers of Anthropic and OpenAI are cutting their AI costs by more than 90% by switching from flagship models to cheaper options within the same lineups, as the lower-cost models prove "good enough" for a growing range of tasks.
AI Cost War · Enterprise Spending
Your AI bill, cut by 90%+ — without changing vendors
Big customers of Anthropic and OpenAI are slashing costs by routing everyday tasks to cheaper models inside the same lineup. "Good enough" tiers now cover a growing share of the work — piling pressure on premium-priced revenue.
90%+
Bill reductions reported by switching to cheaper tiers
30–70%
Realistic savings with cost-optimization tools
90×
Optimized open model vs Claude Opus 4.1, matching performance
Same task, two bills
A real Claude API case: monthly cost before vs after moving to the cheaper Haiku tier.
≈ 95% lower — a ~21× drop for the same routine work
Inside OpenAI's GPT-5.6 menu
Est. price per 1M tokens (input / output) across three tiers — limited preview, June 2026.
Terra
$2.50 / $15
Mid tier
Luna
$1 / $6
Cheapest, high-volume
At Anthropic the gap is similar: Haiku runs 5×+ cheaper than Opus , and prompt caching adds up to 90% off cache reads .
Good enough — for many tasks
Routine code generation, data aggregation, simple Q&A
Some report Haiku preserved — or improved — quality
Reaching practical levels on SWE-bench
Not a full replacement
Flagships still lead on complex reasoning & coding
Avoiding quality drops needs a routing strategy
Routing adds operational burden
How the savings happen
Score task complexity
→
Route to cheapest capable tier
→
Cache prompts & trim context
→
90%+ lower bill
Continue reading The rest of this article is for AI News Blitz readers. Choose an option below to keep reading.
Already purchased? Sign in ✓ Signed in — this article isn’t included in your current plan.Unlocking the full article…