As AI agents spread across corporate workflows, companies' AI token consumption and spending are surging, prompting some to repurpose cost-management techniques honed during the cloud era to rein in expenses.
2026 · Corporate AI Spend
The Token Bill Comes Due
As AI agents chain multi-step reasoning and tool calls, token consumption is exploding — and big U.S. companies are reaching for cloud-era cost controls: budget caps, monitoring and policy enforcement. Uber burned through its 2026 budget in three to four months.
$1,500
Uber's monthly token cap per employee for internal coding agents
3.2Q
Tokens Google processes monthly — 7× the prior year
$1,200
Spent by Uber's CTO during a single two-hour demo
Token consumption is set to surge 24×
Goldman Sachs forecast for agentic AI, 2026 → 2030 (tokens processed per month)
Q = quadrillion tokens / month · even as unit prices fall, rising volume pushes totals higher
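To put the forecast in annual terms, the sketch below converts the 24× multiple into an implied yearly growth rate and shows how volume and unit price combine into total spend. It assumes smooth compound growth over four years (2026 to 2030), which the forecast as quoted does not state. The 90% price decline in the usage lines is a purely hypothetical input, not a figure from the article.

```python
def implied_annual_multiplier(total_multiple: float, years: int) -> float:
    """Yearly growth factor that compounds to total_multiple over the given years."""
    if total_multiple <= 0 or years <= 0:
        raise ValueError("total_multiple and years must be positive")
    return total_multiple ** (1.0 / years)


def spend_multiple(volume_multiple: float, price_multiple: float) -> float:
    """Change in total spend when volume and unit price both change."""
    return volume_multiple * price_multiple


# 24x over an assumed four-year span
yearly = implied_annual_multiplier(24.0, 4)
print(f"Implied growth: about {yearly:.2f}x per year")

# Hypothetical: unit prices fall 90% while volume grows 24x
print(f"Spend multiple: {spend_multiple(24.0, 0.10):.1f}x")
```

Under these assumptions, volume roughly doubles each year, and even a steep hypothetical price cut leaves total spend higher than where it started.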
The waste problem: most tokens never ship
Only 18% of AI coding tokens contribute to shipped product. Retry loops, context bloat and poor model selection consume the rest — making ROI hard to justify.
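A rough way to see what the 18% figure means for budgets is sketched below. It assumes every token costs the same whether or not it ends up in shipped product, which is a simplification. The function names are illustrative, and the $1,500 input simply reuses the monthly cap quoted earlier.

```python
SHIPPED_SHARE = 0.18  # share of coding tokens that reach shipped product (quoted above)


def cost_multiplier_per_useful_token(useful_share: float) -> float:
    """How many tokens are paid for per token that actually ships."""
    if not 0.0 < useful_share <= 1.0:
        raise ValueError("useful_share must be in (0, 1]")
    return 1.0 / useful_share


def wasted_spend_usd(total_spend_usd: float, useful_share: float) -> float:
    """Spend attributable to tokens that never ship, assuming uniform token prices."""
    return total_spend_usd * (1.0 - useful_share)


print(f"{cost_multiplier_per_useful_token(SHIPPED_SHARE):.2f} tokens bought per shipped token")
print(f"${wasted_spend_usd(1500.0, SHIPPED_SHARE):,.2f} of a $1,500 budget goes to unshipped tokens")
```

On these assumptions, each shipped token carries the cost of roughly five and a half purchased tokens.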
The fix: FinOps for AI tokens
Visibility, tagging, budget caps & per-session limits
Prompt caching: cache reads cost $0.30 per million tokens, 90% below the usual $3.00 per million
Model routers (e.g. Factory) switch by task between cheap and premium models
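Two of the controls listed above, budget caps and prompt caching, can be illustrated with a minimal sketch. It uses the per-million-token prices and the $1,500 monthly cap quoted in this article. The `BudgetTracker` class, the function names and the 80% cache-hit share are hypothetical, and real provider billing may count tokens differently.

```python
from dataclasses import dataclass

# Figures quoted in the article; everything else here is illustrative.
MONTHLY_CAP_USD = 1500.00     # per-employee monthly cap cited for Uber
PRICE_PER_M_UNCACHED = 3.00   # USD per million tokens at the usual rate
PRICE_PER_M_CACHED = 0.30     # USD per million tokens for cache reads


def input_cost_usd(total_tokens: int, cached_fraction: float) -> float:
    """Cost of input tokens when a share of them is served as cache reads."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = total_tokens * cached_fraction
    uncached = total_tokens - cached
    return (cached * PRICE_PER_M_CACHED + uncached * PRICE_PER_M_UNCACHED) / 1_000_000


@dataclass
class BudgetTracker:
    """Hypothetical per-employee tracker that refuses charges beyond the cap."""
    cap_usd: float = MONTHLY_CAP_USD
    spent_usd: float = 0.0

    def try_charge(self, cost_usd: float) -> bool:
        """Record the charge and return True, or return False if it would exceed the cap."""
        if self.spent_usd + cost_usd > self.cap_usd:
            return False
        self.spent_usd += cost_usd
        return True


# 100 million input tokens, with and without an assumed 80% cache-hit share
print(f"No caching:  ${input_cost_usd(100_000_000, 0.0):.2f}")
print(f"80% cached:  ${input_cost_usd(100_000_000, 0.8):.2f}")

tracker = BudgetTracker()
print(tracker.try_charge(1200.00))  # fits under the cap
print(tracker.try_charge(400.00))   # would push spend past $1,500, so it is refused
```

The tracker rejects an over-cap charge outright rather than partially filling it, a deliberately simple policy; a production system might instead downgrade to a cheaper model or require approval.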
Agent sprawl is spreading
Microsoft cancels much of its internal Claude Code licensing
DaVita, Lyft & GitLab tighten internal-platform management
AI now reaches up to 50% of IT spend at some firms