Gemini 2.5 Flash by meetpateltech

30 Comments

  • Post Author
    xnx
    Posted April 17, 2025 at 7:22 pm

    50% price increase from Gemini 2.0 Flash. That sounds like a lot, but Flash is still so cheap when compared to other models of this (or lesser) quality. https://developers.googleblog.com/en/start-building-with-gem…

  • Post Author
    byefruit
    Posted April 17, 2025 at 7:27 pm

    It's interesting that there's nearly a 6x price difference between reasoning and no reasoning.

    This implies it's not a hybrid model that can just skip reasoning steps if requested.

    Anyone know what else they might be doing?

    Reasoning means contexts will be longer (to hold the thinking tokens), and inference does cost more with a longer context, but that isn't going to account for 6x.

    Or is it just market pricing?
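
    For scale, a rough back-of-the-envelope sketch in Python, assuming the output pricing from the announcement post (treat these numbers as assumptions, not gospel):

      # Assumed preview pricing, USD per 1M output tokens, from the announcement post.
      output_price = {
          "2.5-flash-no-thinking": 0.60,
          "2.5-flash-thinking": 3.50,
      }

      ratio = output_price["2.5-flash-thinking"] / output_price["2.5-flash-no-thinking"]
      print(f"thinking vs. no-thinking output price: {ratio:.1f}x")  # ~5.8x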

  • Post Author
    punkpeye
    Posted April 17, 2025 at 7:32 pm

    This is cool, but the rate limits on all of these preview models are a PITA.

  • Post Author
    arnaudsm
    Posted April 17, 2025 at 7:32 pm

    Gemini Flash models get the least hype, but in my experience they have the best bang for the buck and the best multimodal tooling in production.

    Google is silently winning the AI race.

  • Post Author
    transformi
    Posted April 17, 2025 at 7:34 pm

    A bad day for Google.

    First the declaration of an illegal monopoly…

    and now… Google’s latest innovation: programmable overthinking.

    With Gemini 2.5 Flash, you too can now set a thinking_budget—because nothing says "state-of-the-art AI" like manually capping how long it’s allowed to reason. Truly the dream: debugging a production outage at 2am wondering if your LLM didn’t answer correctly because you cheaped out on tokens. lol.

    “Turn thinking off for better performance.” That’s not a model config, that’s a metaphor for Google’s entire AI strategy lately.

    At this point, Gemini isn’t an AI product—it’s a latency-cost-quality compromise simulator with a text interface. Meanwhile, OpenAI and Anthropic are out here just… cooking the benchmarks

  • Post Author
    hmaxwell
    Posted April 17, 2025 at 7:36 pm

    I did some testing this morning:

    Prompt: "can you find any mistakes on my codebase? I put one in there on purpose" + 70,000 tokens of codebase where in one line I have an include for a non-existent file.

    Claude 3.7 Sonnet (Try 1 – new chat): Claimed it found a race condition in the API of the admin interface that would be triggered if two admins changed the room order at the same time. Claude suggested I group all SQL queries in a single transaction. I looked at the code and found that it already used a transaction for all queries. I said: the order_update API is already done in a transaction. Claude replied: "You're absolutely right, and I apologize for my mistake. I was incorrect to claim there was a race condition issue. The transaction ensures atomicity and consistency of the updates, and the SQL queries are properly structured for their intended purpose."

    Claude 3.7 Sonnet (Try 2 – new chat): Found a comment in the admin UI JavaScript code that suggested a potential alternative to event handler cleanup that was not implemented because I decided to go with a cleaner route. Then asked "Is this the issue you intentionally included, or would you like me to look for other potential problems?" I said: "The comment merely suggests an alternative, right?" Claude said: "Yes, you're absolutely right. The comment is merely suggesting an alternative approach that isn't being used in the code, rather than indicating a mistake. So there's no actual bug or mistake in this part of the code – just documentation of different possible approaches. I apologize for misinterpreting this as an issue!"

    Claude 3.7 Sonnet (Try 3 – new chat): When processing items out of the database to generate QR codes in the admin interface, Claude said that my code both attempts to generate QR codes with undefined data AND that my error handling skips undefined data. Claude contradicted itself within two sentences. When asked for clarification, Claude replied: Looking at the code more carefully, I see that the code actually has proper error handling. I incorrectly stated that it "still attempts to call generateQRCode()" in the first part of my analysis, which was wrong. The code properly handles the case when there's no data-room attribute.

    Gemini Advanced 2.5 Pro (Try 1 – new chat): Found the intentional error and said I should stop putting DB creds/API keys into the codebase.

    Gemini Advanced 2.5 Pro (Try 2 – new chat): Found the intentional error and said I should stop putting DB creds/API keys into the codebase.

    Gemini Advanced 2.5 Pro (Try 3 – new chat): Found the intentional error and said I should stop putting DB creds/API keys into the codebase.

    o4-mini-high and o4-mini and o3 and 4.5 and 4o – "The message you submitted was too long, please reload the conversation and submit something shorter."

  • Post Author
    Workaccount2
    Posted April 17, 2025 at 7:36 pm

    OpenAI might win the college students but it looks like Google will lock in enterprise.

  • Post Author
    statements
    Posted April 17, 2025 at 7:37 pm

    Interesting to note that this might be the only model with a knowledge cutoff as recent as January 2025.

  • Post Author
    ein0p
    Posted April 17, 2025 at 7:39 pm

    Absolutely decimated on metrics by o4-mini, straight out of the gate, and not even that much cheaper on output tokens (o4-mini's thinking can't be turned off IIRC).

  • Post Author
    xbmcuser
    Posted April 17, 2025 at 7:55 pm

    For a non-programmer like me, Google is becoming shockingly good. It gives working code on the first try. I was playing around with it and asked it to write code to scrape some data off a website to analyse. I was expecting it to write something that would scrape the data, and that I would then upload the data back to it to analyse. But it actually wrote code that scraped and analysed the data. It was basic categorising and counting of the data, but I was not expecting it to do that.

  • Post Author
    __alexs
    Posted April 17, 2025 at 8:02 pm

    Does billing for the API actually work properly yet?

  • Post Author
    alecco
    Posted April 17, 2025 at 8:05 pm

    Gemini models are very good, but in my experience they tend to overdo things. When I give it material for context plus one specific thing to rework, Gemini often reworks more than the problem I asked about.

    For software this is barely useful, because you want small commits for specific fixes, not a whole refactor/rewrite. I tried many prompts, but it's hard. Even when I give it the function signatures of the APIs that the code I want to fix uses, Gemini rewrites those API functions.

    If anybody knows a prompt hack to avoid this, I'm all ears. Meanwhile I'm staying with Claude Pro.

  • Post Author
    ks2048
    Posted April 17, 2025 at 8:10 pm

    If this announcement is targeting people not up-to-date on the models available, I think they should say what "flash" means. Is there a "Gemini (non-flash)"?

    I see the 4 Google model names in the chart here. Are these 4 the main "families" of models to choose from?

    – Gemini-Pro-Preview

    – Gemini-Flash-Preview

    – Gemini-Flash

    – Gemini-Flash-Lite

  • Post Author
    AStonesThrow
    Posted April 17, 2025 at 8:14 pm

    I've been leveraging the services of 3 LLMs, mainly: Meta, Gemini, and Copilot.

    It depends on what I'm asking. If I'm looking for answers in the realm of history or culture, religion, or I want something creative such as a cute limerick, or a song or dramatic script, I'll ask Copilot. Currently, Copilot has two modes: "Quick Answer"; or "Think Deeply", if you want to wait about 30 seconds for a good answer.

    If I want info on a product, a business, an industry or a field of employment, or on education, technology, etc., I'll inquire of Gemini.

    Both Copilot and Gemini have interactive voice conversation modes. Thankfully, they will also write a transcript of what we said. They also eagerly attempt to engage the user with further questions and followups, with open questions such as "so what's on your mind tonight?"

    And if I want to know about pop stars, film actors, the social world or something related to tourism or recreation in general, I can ask Meta's AI through [Facebook] Messenger.

    One thing I found to be extremely helpful and accurate was Gemini's tax advice. I mean, it was way better than human beings at the entry/poverty level. Commercial tax advisors, even when I'd paid for the Premium Deluxe Tax Software from the Biggest Name, just went and Googled stuff for me. I mean, they didn't even seem to know where stuff was on irs.gov. When I asked for a virtual or phone appointment, they were no-shows, with a litany of excuses. I visited 3 offices in person; the first two were closed, and the third one basically served Navajos living off the reservation.

    So when I asked Gemini about tax information — simple stuff like the terminology, definitions, categories of income, and things like that — Gemini was perfectly capable of giving lucid answers. And citing its sources, so I could immediately go find the IRS.GOV publication and read it "from the horse's mouth".

    Oftentimes I'll ask an LLM just to jog my memory or inform me of what specific terminology I should use. Like "Hey Gemini, what's the PDU for Ethernet called?" and when Gemini says it's a "frame" then I have that search term I can plug into Wikipedia for further research. Or, for an introduction or overview to topics I'm unfamiliar with.

    LLMs are an important evolutionary step in the general-purpose "search engine" industry. One problem was, you see, that it was dangerous, annoying, or risky to go Googling around and click on all those tempting sites. Google knew this: the dot-com sites and all the SEO sites that surfaced to the top were traps, they were bait, they were sometimes legitimate scams.

    So the LLM providers are showing us that we can stay safe in a sandbox, without clicking external links, without coughing up information about our interests and setting cookies and revealing our IPv6 addresses: we can safely ask a local LLM, or an LLM in a trusted service provider, about whatever piques our fancy. And I am glad for this.

    I saw y'all complaining about how every search engine was worthless, and the Internet was clogged with blogspam, and there was no real information anymore. Well, perhaps LLMs, for now, are a safe space, a sandbox to play in, where I don't need to worry about drive-by-zero-click malware, or being inundated with Joomla ads, or popups. For now.

  • Post Author
    cynicalpeace
    Posted April 17, 2025 at 8:18 pm

    1. The main transformative aspect of LLMs has been in writing code.

    2. LLMs have had less transformative aspects in 2025 than we anticipated back in late 2022.

    3. LLMs are unlikely to be very transformative to society, even as their intelligence increases, because intelligence is a minor changemaker in society. Bigger changemakers are motivation, courage, desire, taste, power, sex and hunger.

    4. LLMs are unlikely to develop these more important traits because they are trained on text, not evolved in a rigamarole of ecological challenges.

  • Post Author
    charcircuit
    Posted April 17, 2025 at 8:23 pm

    500 RPD for the free tier is good enough for my coding needs. Nice.

  • Post Author
    AbuAssar
    Posted April 17, 2025 at 8:37 pm

    I noticed that OpenAI doesn't compare its models to third-party models in its announcement posts, unlike Google, Meta and the others.

  • Post Author
    mmaunder
    Posted April 17, 2025 at 8:37 pm

    More great innovation from Google. OpenAI have two major problems.

    The first is Google's vertically integrated chip pipeline and deep supply chain and operational knowledge when it comes to creating AI chips and putting them into production. They have a massive cost advantage at every step. This translates into more free services, cheaper paid services, more capabilities due to more affordable compute, and far more growth.

    The second problem is data starvation, and the unfair advantage that social media has as a source of continually refreshed knowledge. Now that the foundational model providers have churned through Common Crawl and are competing to consume things like video and whatever is left, new data is becoming increasingly valuable as a differentiator and, more importantly, as a provider of sustained value for years to come.

    SamA has signaled both of these problems: he made noises about building a fab a while back, and more recently he has been making noises about launching a social media platform off OpenAI. The smart money among his investors knows these issues are fundamental to whether OAI will succeed or not, and is asking the hard questions.

    If the only answer for both is "we'll build it from scratch", OpenAI is in very big trouble. And it seems that that is the best answer that SamA can come up with. I continue to believe that OpenAI will be the Netscape of the AI revolution.

    The win is Google's for the taking, if they can get out of their own way.

  • Post Author
    mark_l_watson
    Posted April 17, 2025 at 8:39 pm

    Nice! Low price, even with reasoning enabled. I have been working on a short new book titled “Practical AI with Google: A Solo Knowledge Worker's Guide to Gemini, AI Studio, and LLM APIs” but with all of Google’s recent announcements it might not be a short book.

  • Post Author
    serjester
    Posted April 17, 2025 at 9:00 pm

    Just ran it on one of our internal PDF (3 pages, medium difficulty) to json benchmarks:

    gemini-flash-2.0:
    60 ish% accuracy
    6,250 pages per dollar

    gemini-2.5-flash-preview (no thinking):
    80 ish% accuracy
    1,700 pages per dollar

    gemini-2.5-flash-preview (with thinking):
    80 ish% accuracy (not sure what's going on here)
    350 pages per dollar

    gemini-flash-2.5:
    90 ish% accuracy
    150 pages per dollar

    I do wish they separated the thinking variant from the regular one – it's incredibly confusing when a model parameter dramatically impacts pricing.
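
    For context, a minimal sketch of how the thinking toggle shows up in the call, using the google-genai SDK (the preview model id, budget value, and prompt here are assumptions; error handling omitted):

      from google import genai
      from google.genai import types

      client = genai.Client(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

      # Hypothetical 3-page test document, sent inline as a PDF part.
      with open("sample.pdf", "rb") as f:
          pdf_part = types.Part.from_bytes(data=f.read(), mime_type="application/pdf")

      response = client.models.generate_content(
          model="gemini-2.5-flash-preview-04-17",  # assumed preview model id
          contents=[pdf_part, "Extract the invoice fields as JSON."],
          config=types.GenerateContentConfig(
              response_mime_type="application/json",
              # thinking_budget=0 turns thinking off; omitting it (or raising it)
              # allows thinking, which is what moves you to the pricier output tier.
              thinking_config=types.ThinkingConfig(thinking_budget=0),
          ),
      )
      print(response.text)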

  • Post Author
    zoogeny
    Posted April 17, 2025 at 9:02 pm

    Google making Gemini 2.5 Pro (Experimental) free was a big deal. I haven't tried the more expensive OpenAI models, so I can only compare it to the free models of theirs I have used in the past.

    Gemini 2.5 Pro is so much of a step up (IME) that I've become sold on Google's models in general. It not only is smarter than me on most of the subjects I engage with it, it also isn't completely obsequious. The model pushes back on me rather than contorting itself to find a way to agree.

    100% of my casual AI usage is now in Gemini, and I look forward to asking it questions on deep topics because it consistently provides me with insight. I am building new tools with a mind to optimizing my usage and increasing its value to me.

  • Post Author
    minimaxir
    Posted April 17, 2025 at 9:08 pm

    One hidden note from Gemini 2.5 Flash when diving deep into the documentation: for image inputs, not only can the model be instructed to generate 2D bounding boxes of relevant subjects, but it can also create segmentation masks! https://ai.google.dev/gemini-api/docs/image-understanding#se…

    At this price point with the Flash model, creating segmentation masks is pretty nifty.

    The segmentation masks are a bit of a galaxy brain implementation by generating a b64 string representing the mask: https://colab.research.google.com/github/google-gemini/cookb…

    I am trying to test it in AI Studio but it sometimes errors out, likely because it tries to decode the b64 lol.
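
    If anyone wants to poke at it locally, here's a minimal sketch of decoding those masks, assuming the JSON shape from the linked docs (a list of objects with box_2d, label, and a base64-encoded PNG mask):

      import base64, io, json
      from PIL import Image

      def decode_masks(response_text: str):
          """Parse the model's JSON output into (label, box_2d, PIL mask) tuples."""
          results = []
          for item in json.loads(response_text):
              b64 = item["mask"]
              # The cookbook examples include a data-URI prefix; strip it if present.
              if b64.startswith("data:image/png;base64,"):
                  b64 = b64.split(",", 1)[1]
              mask = Image.open(io.BytesIO(base64.b64decode(b64)))  # per-object mask image
              results.append((item.get("label"), item.get("box_2d"), mask))
          return results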

  • Post Author
    simonw
    Posted April 17, 2025 at 9:19 pm

    I spotted something interesting in the Python API library code:

    https://github.com/googleapis/python-genai/blob/473bf4b6b5a6…

      class ThinkingConfig(_common.BaseModel):
          """The thinking features configuration."""
       
          include_thoughts: Optional[bool] = Field(
              default=None,
              description="""Indicates whether to include thoughts in the response. If true, thoughts are returned only if the model supports thought and thoughts are available.
            """,
          )
          thinking_budget: Optional[int] = Field(
              default=None,
              description="""Indicates the thinking budget in tokens.
              """,
          )
    

    That thinking_budget thing is documented, but what's the deal with include_thoughts? It sounds like it's an option to have the API return the thought summary… but I can't figure out how to get it to work, and I've not found documentation or example code that uses it.

    Anyone managed to get Gemini to spit out thought summaries in its API using this option?
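
    For reference, this is the shape of the call I've been experimenting with, via the google-genai SDK (model id assumed; whether include_thoughts actually surfaces anything is exactly the open question):

      from google import genai
      from google.genai import types

      client = genai.Client(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

      response = client.models.generate_content(
          model="gemini-2.5-flash-preview-04-17",  # assumed preview model id
          contents="Which weighs more: a kilogram of feathers or a pound of steel?",
          config=types.GenerateContentConfig(
              thinking_config=types.ThinkingConfig(
                  thinking_budget=1024,   # documented: caps the thinking tokens
                  include_thoughts=True,  # undocumented behaviour -- the question above
              ),
          ),
      )

      # In theory, thought parts would be flagged alongside the answer parts.
      for part in response.candidates[0].content.parts:
          print(getattr(part, "thought", None), part.text)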

  • Post Author
    deanmoriarty
    Posted April 17, 2025 at 9:36 pm

    Genuine naive question: when it comes to Google, HN generally has a negative view of it (pick any random story on Chrome, ads, search, the web, working at FAANG, etc. and this should be obvious from the comments), yet when it comes to AI there is a somewhat notable “cheering effect” for Google to win the AI race that goes beyond a conventional appreciation of a healthy competitive landscape, which may appear as a bit of a double standard.

    Why is this? Is it because OpenAI is seen as such a negative player in this ecosystem that Google “gets a pass on this one”?

    And bonus question: what do people think will happen to OpenAI if Google wins the race? Do you think they’ll literally just go bust?

  • Post Author
    krembo
    Posted April 17, 2025 at 10:17 pm

    How is this sustainable for Google from a business POV? It feels like Google is shooting itself in the foot while "winning" the AI race. In my experience, Google has lost 99% of the ads it used to show me in the search engine.

  • Post Author
    zenGull
    Posted April 17, 2025 at 10:31 pm

    [dead]

  • Post Author
    jdthedisciple
    Posted April 17, 2025 at 10:41 pm

    Very excited to try it, but it is noteworthy that o4-mini is strictly better according to the very benchmarks shown by Google here.

    Of course it's about 4x as expensive too (I believe), but still, given the release of openai/codex as well, o4-mini will remain a strong competitor for now.

  • Post Author
    thimabi
    Posted April 17, 2025 at 11:00 pm

    I find it baffling that Google offers such impressive models through the API and even the free AI Studio with fine-grained control, yet the models used in the Gemini app feel much worse.

    Over the past few weeks, I’ve been using Gemini Advanced on my Workspace account. There, the models think for shorter times, provide shorter outputs, and even their context window is far from the advertised 1 million tokens. It makes me think that Google is intentionally limiting the Gemini app.

    Perhaps the goal is to steer users toward the API or AI Studio, with the free tier that involves data collection for training purposes.

  • Post Author
    bingdig
    Posted April 17, 2025 at 11:23 pm

    It appears that this impacted gemini-2.5-pro-preview-03-25 somehow? Grounding with Google Search no longer works.

    I had a workflow running that would pull news articles from the past 24 hours. It now refuses to believe the current date is 2025-04-17. Even with search turned on, when I ask it what the date is, it always replies with some date in July 2024.
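
    For reference, the workflow turns on grounding roughly like this with the google-genai SDK (the prompt here is illustrative):

      from google import genai
      from google.genai import types

      client = genai.Client(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

      response = client.models.generate_content(
          model="gemini-2.5-pro-preview-03-25",
          contents="What is today's date, and what tech news broke in the last 24 hours?",
          config=types.GenerateContentConfig(
              # Google Search grounding tool, as documented for Gemini 2.0+ models.
              tools=[types.Tool(google_search=types.GoogleSearch())],
          ),
      )
      print(response.text)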

  • Post Author
    Alifatisk
    Posted April 17, 2025 at 11:36 pm

    No matter how good the new Gemini models have become, my bad experience with early Gemini is still stuck with me and I am afraid I still suffer from confirmation bias. Whenever I just look at the Gemini app, I already assume it’s going to be a bad experience.
