BREAKING
NVIDIA Dynamo 1.0 Boosts AI Inference 7x
Headline metrics (values not captured in this text): throughput gain (x), faster TTFT (x), cost per million tokens ($)
Open Source Under Apache 2.0
How Dynamo Scales Clusters
1. Disaggregated serving
2. KV-aware routing
3. Multi-tier KV cache
4. Resilient inference
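To make the second step concrete, here is a minimal sketch of the general idea behind KV-aware routing: send each request to the worker that already holds the longest cached prefix of its tokens, and break ties by load. This is an illustration only, not Dynamo's API. The names Worker, block_keys, cached_prefix_blocks, route, and the BLOCK size are all hypothetical.

```python
from dataclasses import dataclass, field

BLOCK = 4  # tokens per KV block (illustrative value, an assumption)


def block_keys(tokens):
    """One key per full block; each key covers the whole prefix up to that block."""
    return [tuple(tokens[:end]) for end in range(BLOCK, len(tokens) + 1, BLOCK)]


@dataclass
class Worker:
    # Hypothetical worker record, not a Dynamo type.
    name: str
    active_requests: int = 0
    cached: set = field(default_factory=set)


def cached_prefix_blocks(worker, tokens):
    """Count how many leading blocks of the request are already cached on the worker."""
    count = 0
    for key in block_keys(tokens):
        if key not in worker.cached:
            break
        count += 1
    return count


def route(workers, tokens):
    """Pick the worker with the most reusable prefix blocks; prefer lower load on ties."""
    best = max(
        workers,
        key=lambda w: (cached_prefix_blocks(w, tokens), -w.active_requests),
    )
    # Assume the chosen worker now caches every block of this request.
    best.cached.update(block_keys(tokens))
    best.active_requests += 1
    return best


if __name__ == "__main__":
    pool = [Worker("a"), Worker("b")]
    shared_prompt = list(range(8))
    print(route(pool, shared_prompt + [100, 101, 102, 103]).name)
    print(route(pool, shared_prompt + [200, 201, 202, 203]).name)
    print(route(pool, [9, 9, 9, 9]).name)
```

In this sketch, requests that share a prompt prefix land on the same worker so its cached blocks can be reused, while unrelated requests fall back to the least-loaded worker.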
Adopters and Open Questions
Adopters
● CoreWeave: resilient agents
● Baseten: 2x faster TTFT
● Pinterest: multimodal scale
Open Questions
● New framework maturity
● Kubernetes setup complexity
● Real cost savings at scale
Production Rollout Is the Real Test
AI NEWS BLITZ
NVIDIA pairs Dynamo 1.0 with Blackwell to scale AI agent inference.