AI news

Google Releases DiffusionGemma, a 4x Faster Open Text Generation Model

Google launches DiffusionGemma, an open experimental model that generates text up to 4x faster than traditional language models by using diffusion instead of autoregressive decoding.

FounderBuilt AI News · 11/06/2026

What happened

Google has released DiffusionGemma, an open experimental model that generates text up to 4x faster than traditional language models by using diffusion-based decoding instead of the standard autoregressive approach. The model is available as open source, and NVIDIA is already hosting a free endpoint where developers can try it.

Why it matters

Unlike conventional LLMs that generate tokens one at a time, DiffusionGemma uses a diffusion process that produces text in parallel, similar to how image diffusion models work. This makes it particularly suited for speed-critical and interactive local workflows, including pair programming and edge device deployment where low latency is essential.
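
To make the contrast concrete, here is a toy Python sketch of the two decoding styles. It is an illustration only, not DiffusionGemma's actual algorithm: there is no model, the known target sentence stands in for model predictions, and the names autoregressive_decode, diffusion_style_decode, and tokens_per_step are hypothetical. It only counts how many sequential steps each style needs.

```python
import random

MASK = "_"  # placeholder for a position that has not been filled in yet


def autoregressive_decode(target):
    """Emit one token per step, left to right (toy stand-in for an LLM)."""
    out, steps = [], 0
    for tok in target:
        out.append(tok)  # a real model would predict this token from the prefix
        steps += 1
    return out, steps


def diffusion_style_decode(target, tokens_per_step=4, seed=0):
    """Start fully masked and fill several positions per step, in any order.

    tokens_per_step is an assumed knob for this sketch, not a documented
    DiffusionGemma parameter.
    """
    rng = random.Random(seed)
    out = [MASK] * len(target)
    masked = list(range(len(target)))
    rng.shuffle(masked)  # positions are resolved out of order
    steps = 0
    while masked:
        batch, masked = masked[:tokens_per_step], masked[tokens_per_step:]
        for i in batch:
            out[i] = target[i]  # stand-in for a parallel model prediction
        steps += 1
    return out, steps


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    _, ar_steps = autoregressive_decode(sentence)
    _, diff_steps = diffusion_style_decode(sentence, tokens_per_step=4)
    print(f"autoregressive steps: {ar_steps}")
    print(f"diffusion-style steps: {diff_steps}")
```

For this nine-word sentence the autoregressive loop takes nine steps, while the parallel version takes three. The ratio here comes purely from the chosen tokens_per_step and says nothing about the 4x figure Google reports; real speedups depend on the model, the hardware, and how many refinement passes are needed.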

What's next

Early users report that the speed advantage makes it feel like a fundamentally different interaction, closer to real-time pair programming than to waiting for an agent to complete its work. The release signals Google's continued investment in alternative model architectures beyond the standard transformer decoder, opening new possibilities for on-device AI.