
Google calls Gemma 3 the most powerful AI model you can run on one GPU by gmays

15 Comments

  • Post Author
    CamperBob2
    Posted March 20, 2025 at 7:40 pm

    Technically, the 1.58-bit Unsloth quant of DeepSeek R1 runs on a single GPU+128GB of system RAM. It performs amazingly well, but you'd better not be in a hurry.
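    For reference, a minimal sketch of what that kind of single-GPU-plus-system-RAM setup can look like with llama-cpp-python; the model filename, layer split, and context size below are illustrative assumptions, not details from the comment.

    ```python
    # Partial-offload sketch: most layers stay in system RAM, a handful go to
    # the lone GPU. Filename and numbers are illustrative, not real settings.
    from llama_cpp import Llama

    llm = Llama(
        model_path="DeepSeek-R1-UD-IQ1_S.gguf",  # hypothetical path to a 1.58-bit quant
        n_gpu_layers=8,   # offload only a few layers to the single GPU
        n_ctx=4096,       # modest context to keep memory use manageable
    )

    out = llm("Explain why partial GPU offload is slow but workable.", max_tokens=128)
    print(out["choices"][0]["text"])
    ```

    With most of the weights streaming from system RAM, generation works but is far slower than a fully GPU-resident model, which matches the "don't be in a hurry" caveat.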

  • Post Author
    ForTheKidz
    Posted March 20, 2025 at 7:52 pm

…until this coming Tuesday? …let's talk value.

    EDIT: I do feel like a fool, thank you.

  • Post Author
    impure
    Posted March 20, 2025 at 7:58 pm

    It’s a 27B model, I highly doubt that.

  • Post Author
    RandyRanderson
    Posted March 20, 2025 at 8:09 pm

    So says Gemma 3.

  • Post Author
    ChrisArchitect
    Posted March 20, 2025 at 8:17 pm
  • Post Author
    cwoolfe
    Posted March 20, 2025 at 8:27 pm

    Apparently it can also pray. Seriously, I asked it for biblical advice about a tough situation today and it said it was praying for me. XD

  • Post Author
    odysseus
    Posted March 20, 2025 at 8:36 pm

    Does it run on the severed floor?

  • Post Author
    zeroq
    Posted March 20, 2025 at 8:36 pm

    I call it the biggest bs since I had my supper.

  • Post Author
    grej
    Posted March 20, 2025 at 8:49 pm

    It lasted until Mistral released 3.1 Small a week later. Such is the pace of AI…

  • Post Author
    pram
    Posted March 20, 2025 at 9:38 pm

Gemma 3 is a lot better at writing than Gemma 2, for sure, but the big improvement is that I can actually use a 32k+ context window without it flipping out with random garbage.
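    The context window is typically a load-time option rather than anything the model negotiates per request. A minimal sketch with llama-cpp-python, assuming a local Gemma 3 27B GGUF (the filename and input file are illustrative):

    ```python
    from llama_cpp import Llama

    # Assumed local Gemma 3 27B GGUF; the filename and path are illustrative.
    llm = Llama(
        model_path="gemma-3-27b-it-Q4_K_M.gguf",
        n_ctx=32768,      # request the 32k window up front, at load time
        n_gpu_layers=-1,  # put every layer on the GPU if it fits
    )

    long_notes = open("meeting_notes.txt").read()  # hypothetical long input
    prompt = f"Summarize the following meeting notes:\n\n{long_notes}\n\nSummary:"
    print(llm(prompt, max_tokens=256)["choices"][0]["text"])
    ```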

  • Post Author
    williamDafoe
    Posted March 20, 2025 at 10:10 pm

Does anyone use GoogleAI? For an AI company with an AI CEO using AI language translation, I think their actual GPT products are all terrible and have a terrible rep. And who wants their private conversations shipped back to Google for spying?

  • Post Author
    timmg
    Posted March 20, 2025 at 10:12 pm

    I'm wondering how small of a model can be "generally intelligent" (as in LLM intelligent, not AGI). Like there must be a size too small to hold "all the information" in.

    And I also wonder at what point we'll see specialized small models. Like if I want help coding, it's probably ok if the model doesn't know who directed "Jaws". I suspect that is the future: many small, specialized models.

    But maybe training compute will just get to the point where we can run a full-featured model on our desktop (or phone)?

  • Post Author
    LeoPanthera
    Posted March 20, 2025 at 10:20 pm

    Maybe Llama 3.3 70B doesn't count as running on "one GPU", but it certainly runs just fine on one Mac, and in my tests it's far better at holding onto concepts over a longer conversation than Gemma 3 is, which starts getting confused after about 4000 tokens.

  • Post Author
    m00x
    Posted March 20, 2025 at 11:54 pm

I found Mistral Small 3.1, which was released slightly after Gemma 3, much better.

Far fewer refusals, more accurate, less babbling, generally better, but especially at coding.

  • Post Author
    pretoriusdre
    Posted March 21, 2025 at 1:39 am

    My instinct is that it would be cheaper overall to buy API credits when needed, compared with buying a top-of-the-line GPU which sits idle for most of the day. That also opens up access to larger models.
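    That break-even is easy to sanity-check with rough numbers. The sketch below uses illustrative prices, not quotes, and ignores electricity and depreciation:

    ```python
    # Rough break-even: how many tokens of API usage cost as much as a local GPU?
    # Both figures below are assumptions for illustration only.
    gpu_cost_usd = 2000.0                 # assumed price of a high-end consumer GPU
    api_price_per_million_tokens = 1.0    # assumed blended API price per million tokens

    break_even_tokens = gpu_cost_usd / api_price_per_million_tokens * 1_000_000
    print(f"Break-even at ~{break_even_tokens:,.0f} tokens of API usage")
    # With these assumptions that's about 2 billion tokens, before counting the
    # GPU's power draw or the hours it spends idle.
    ```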
