
Google calls Gemma 3 the most powerful AI model you can run on one GPU by gmays

15 Comments

  • Post Author
    CamperBob2
    Posted March 20, 2025 at 7:40 pm

    Technically, the 1.58-bit Unsloth quant of DeepSeek R1 runs on a single GPU+128GB of system RAM. It performs amazingly well, but you'd better not be in a hurry.
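    For reference, a minimal sketch of what that kind of single-GPU-plus-system-RAM setup can look like with llama-cpp-python; the model filename, layer split, and context size below are illustrative assumptions, not details from the comment.

    ```python
    # Partial-offload sketch: most layers stay in system RAM, a handful go to
    # the lone GPU. Filename and numbers are illustrative, not real settings.
    from llama_cpp import Llama

    llm = Llama(
        model_path="DeepSeek-R1-UD-IQ1_S.gguf",  # hypothetical path to a 1.58-bit quant
        n_gpu_layers=8,   # offload only a few layers to the single GPU
        n_ctx=4096,       # modest context to keep memory use manageable
    )

    out = llm("Explain why partial GPU offload is slow but workable.", max_tokens=128)
    print(out["choices"][0]["text"])
    ```

    With most of the weights streaming from system RAM, generation works but is far slower than a fully GPU-resident model, which matches the "don't be in a hurry" caveat.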

  • Post Author
    ForTheKidz
    Posted March 20, 2025 at 7:52 pm

…until this coming Tuesday? …let's talk value.

    EDIT: I do feel like a fool, thank you.

  • Post Author
    impure
    Posted March 20, 2025 at 7:58 pm

    It’s a 27B model, I highly doubt that.

  • Post Author
    RandyRanderson
    Posted March 20, 2025 at 8:09 pm

    So says Gemma 3.

  • Post Author
    ChrisArchitect
    Posted March 20, 2025 at 8:17 pm
  • Post Author
    cwoolfe
    Posted March 20, 2025 at 8:27 pm

    Apparently it can also pray. Seriously, I asked it for biblical advice about a tough situation today and it said it was praying for me. XD

  • Post Author
    odysseus
    Posted March 20, 2025 at 8:36 pm

    Does it run on the severed floor?

  • Post Author
    zeroq
    Posted March 20, 2025 at 8:36 pm

    I call it the biggest bs since I had my supper.

  • Post Author
    grej
    Posted March 20, 2025 at 8:49 pm

    It lasted until Mistral released 3.1 Small a week later. Such is the pace of AI…

  • Post Author
    pram
    Posted March 20, 2025 at 9:38 pm

Gemma 3 is a lot better at writing than Gemma 2, for sure, but the big improvement is that I can actually use a 32k+ context window without it flipping out with random garbage.
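    The context window is typically a load-time option rather than anything the model negotiates per request. A minimal sketch with llama-cpp-python, assuming a local Gemma 3 27B GGUF (the filename and input file are illustrative):

    ```python
    from llama_cpp import Llama

    # Assumed local Gemma 3 27B GGUF; the filename and path are illustrative.
    llm = Llama(
        model_path="gemma-3-27b-it-Q4_K_M.gguf",
        n_ctx=32768,      # request the 32k window up front, at load time
        n_gpu_layers=-1,  # put every layer on the GPU if it fits
    )

    long_notes = open("meeting_notes.txt").read()  # hypothetical long input
    prompt = f"Summarize the following meeting notes:\n\n{long_notes}\n\nSummary:"
    print(llm(prompt, max_tokens=256)["choices"][0]["text"])
    ```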

  • Post Author
    williamDafoe
    Posted March 20, 2025 at 10:10 pm

Does anyone use GoogleAI? For an AI company with an AI CEO using AI language translation, I think their actual GPT products are all terrible and have a terrible rep. And who wants their private conversations shipped back to Google for spying?

  • Post Author
    timmg
    Posted March 20, 2025 at 10:12 pm

    I'm wondering how small of a model can be "generally intelligent" (as in LLM intelligent, not AGI). Like there must be a size too small to hold "all the information" in.

    And I also wonder at what point we'll see specialized small models. Like if I want help coding, it's probably ok if the model doesn't know who directed "Jaws". I suspect that is the future: many small, specialized models.

    But maybe training compute will just get to the point where we can run a full-featured model on our desktop (or phone)?

  • Post Author
    LeoPanthera
    Posted March 20, 2025 at 10:20 pm

    Maybe Llama 3.3 70B doesn't count as running on "one GPU", but it certainly runs just fine on one Mac, and in my tests it's far better at holding onto concepts over a longer conversation than Gemma 3 is, which starts getting confused after about 4000 tokens.

  • Post Author
    m00x
    Posted March 20, 2025 at 11:54 pm

I found Mistral Small 3.1, which was released slightly after Gemma 3, much better.

Far fewer refusals, more accurate, less babbling, generally better, but especially at coding.

  • Post Author
    pretoriusdre
    Posted March 21, 2025 at 1:39 am

    My instinct is that it would be cheaper overall to buy API credits when needed, compared with buying a top-of-the-line GPU which sits idle for most of the day. That also opens up access to larger models.
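    That break-even is easy to sanity-check with rough numbers. The sketch below uses illustrative prices, not quotes, and ignores electricity and depreciation:

    ```python
    # Rough break-even: how many tokens of API usage cost as much as a local GPU?
    # Both figures below are assumptions for illustration only.
    gpu_cost_usd = 2000.0                 # assumed price of a high-end consumer GPU
    api_price_per_million_tokens = 1.0    # assumed blended API price per million tokens

    break_even_tokens = gpu_cost_usd / api_price_per_million_tokens * 1_000_000
    print(f"Break-even at ~{break_even_tokens:,.0f} tokens of API usage")
    # With these assumptions that's about 2 billion tokens, before counting the
    # GPU's power draw or the hours it spends idle.
    ```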
