Google DeepMind's open-weight multimodal model "Gemma 4 31B" is now available in Public Preview on Cerebras Inference Cloud, delivering inference speeds exceeding 1,800 tokens per second. Around June 29, 2026, Cerebras announced the launch on its official blog, highlighting its ability to run multimodal inference with image input at very high speed. The offering is a limited-period preview accessed through Cerebras Inference Cloud, with the model ID gemma-4-31b.
Continue reading
The rest of this article is for AI News Blitz readers. Choose an option below to keep reading.
Already purchased? Sign in✓ Signed in — this article isn’t included in your current plan.