Richard Lawler is a senior editor following news across tech, culture, policy, and entertainment. He joined The Verge in 2021 after several years covering news at Engadget.
A little over a year after releasing two “open” Gemma AI models built from the same technology behind its Gemini AI, Google is updating the family with Gemma 3. According to Google’s announcement, the models are aimed at developers building AI applications that can run wherever they’re needed, on anything from a phone to a workstation, with support for over 35 languages and the ability to analyze text, images, and short videos.
The company claims Gemma 3 is the “world’s best single-accelerator model,” outperforming rivals like Meta’s Llama, DeepSeek, and OpenAI’s models when running on a host with a single GPU, and says it’s optimized for Nvidia GPUs and dedicated AI hardware. Gemma 3’s vision encoder has also been upgraded, with support for high-resolution and non-square images.
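For developers who want to try it locally, a minimal sketch of loading one of the smaller Gemma 3 checkpoints with Hugging Face's transformers library might look like the following; the model ID and settings here are assumptions rather than official guidance, so check Google's model cards for exact names and requirements.

    # Minimal sketch (not an official example): running a small Gemma 3 checkpoint
    # locally with Hugging Face transformers. The model ID below is assumed from
    # Google's published Gemma 3 releases and may require accepting the license
    # on the Hugging Face hub first.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="google/gemma-3-1b-it",  # assumed ID for the small instruction-tuned variant
        device_map="auto",             # put the weights on a GPU if one is available
    )

    result = generator("Summarize what Gemma 3 is in one sentence.", max_new_tokens=64)
    print(result[0]["generated_text"])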
15 Comments
CamperBob2
Technically, the 1.58-bit Unsloth quant of DeepSeek R1 runs on a single GPU+128GB of system RAM. It performs amazingly well, but you'd better not be in a hurry.
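(For anyone who wants to reproduce that kind of setup, here's a rough sketch using llama-cpp-python with partial GPU offload; the file name and layer count are placeholders, not tested values.)

    # Rough sketch: partially offloading a heavily quantized GGUF model to one GPU
    # while the remaining layers stay in system RAM, via llama-cpp-python.
    # The model path and n_gpu_layers value are placeholders, not tested settings.
    from llama_cpp import Llama

    llm = Llama(
        model_path="DeepSeek-R1-UD-IQ1_S.gguf",  # placeholder name for the Unsloth quant
        n_gpu_layers=20,   # offload only as many layers as fit in VRAM; the rest use system RAM
        n_ctx=4096,        # keep the context modest to limit memory use
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Explain 1.58-bit quantization briefly."}]
    )
    print(out["choices"][0]["message"]["content"])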
ForTheKidz
…until this coming Tuesday? …let's talk value.
EDIT: I do feel like a fool, thank you.
impure
It's a 27B model; I highly doubt that.
RandyRanderson
So says Gemma 3.
ChrisArchitect
Google post from last week: https://blog.google/technology/developers/gemma-3/
cwoolfe
Apparently it can also pray. Seriously, I asked it for biblical advice about a tough situation today and it said it was praying for me. XD
odysseus
Does it run on the severed floor?
zeroq
I call it the biggest bs since I had my supper.
grej
It lasted until Mistral released 3.1 Small a week later. Such is the pace of AI…
pram
Gemma 3 is a lot better at writing than 2, for sure, but the big improvement is that I can actually use a 32k+ context window without it flipping out and producing random garbage.
williamDafoe
Does anyone use Google AI? For an AI company with an AI CEO using AI language translation, I think their actual GPT-style products are all terrible and have a terrible reputation. And who wants their private conversations shipped back to Google for spying?
timmg
I'm wondering how small a model can be and still be "generally intelligent" (as in LLM intelligent, not AGI). There must be a size too small to hold "all the information."
And I also wonder at what point we'll see specialized small models. Like if I want help coding, it's probably ok if the model doesn't know who directed "Jaws". I suspect that is the future: many small, specialized models.
But maybe training compute will just get to the point where we can run a full-featured model on our desktop (or phone)?
LeoPanthera
Maybe Llama 3.3 70B doesn't count as running on "one GPU", but it certainly runs just fine on one Mac, and in my tests it's far better at holding onto concepts over a longer conversation than Gemma 3 is, which starts getting confused after about 4000 tokens.
m00x
I found Mistral Small 3.1, which was released slightly after Gemma 3, much better.
Far fewer refusals, more accurate, less babbling, generally better overall, but especially at coding.
pretoriusdre
My instinct is that it would be cheaper overall to buy API credits when needed, compared with buying a top-of-the-line GPU which sits idle for most of the day. That also opens up access to larger models.
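(A back-of-the-envelope sketch of that trade-off, using made-up illustrative numbers rather than real prices:)

    # Back-of-the-envelope sketch of the API-credits-vs-local-GPU trade-off.
    # Every figure below is an illustrative assumption, not a real quote.
    gpu_cost = 2000.0            # assumed up-front cost of a high-end GPU, USD
    gpu_lifetime_years = 3       # assumed useful life before replacement
    api_price_per_mtok = 0.50    # assumed blended API price per million tokens, USD
    tokens_per_day = 200_000     # assumed personal daily usage

    api_cost_per_year = tokens_per_day * 365 / 1_000_000 * api_price_per_mtok
    gpu_cost_per_year = gpu_cost / gpu_lifetime_years

    print(f"API credits: ~${api_cost_per_year:.0f}/year at this usage")
    print(f"Local GPU:   ~${gpu_cost_per_year:.0f}/year amortized (ignoring electricity)")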