Gemini 2.5 Pro Preview by meetpateltech

36 Comments

  • Post Author
    jeswin
    Posted May 6, 2025 at 3:25 pm

Now if only there were a way to add prepaid credits and monitor usage in near real time on a dashboard, like every other vendor. Hey Google, are you listening?

  • Post Author
    ramesh31
    Posted May 6, 2025 at 3:27 pm

    >Best-in-class frontend web development

It really is wild to have seen this happen over the last year. The days of traditional "design-to-code" FE work are completely over. I haven't written a line of HTML/CSS in months. If you are still doing this stuff by hand, you need to adapt fast. In conjunction with an agentic coding IDE and a few MCP tools, weeks' worth of UI work is now done in hours, to a higher level of quality and consistency, with practically zero effort.

  • Post Author
    siwakotisaurav
    Posted May 6, 2025 at 3:28 pm

Usually don't believe the benchmarks, but first place in the web dev arena specifically is crazy. That spot was Claude's for so long, which tracks with my experience.

  • Post Author
    ranyume
    Posted May 6, 2025 at 3:30 pm

I don't know if I'm doing something wrong, but every time I ask Gemini 2.5 for code it outputs SO MANY comments. An exaggerated amount of comments. Section comments, step comments, block comments, inline comments, the whole gang.

  • Post Author
    segphault
    Posted May 6, 2025 at 3:34 pm

    My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both pro and flash, seem significantly less susceptible to this than any other model I've tried.

There are still significant limitations: no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and Stack Overflow for a lot of my day-to-day programming.

  • Post Author
    xnx
    Posted May 6, 2025 at 3:34 pm

This is much bigger news than OpenAI's acquisition of Windsurf.

  • Post Author
    herpdyderp
    Posted May 6, 2025 at 3:36 pm

I agree it's very good, but the UI is still usually an unusable, scroll-jacking disaster. I've found it's best to let a chat sit for a few minutes after it has finished printing the AI's output. Finding the `ms-code-block` element in dev tools and logging `$0.textContent` is reliable too.

  • Post Author
    crat3r
    Posted May 6, 2025 at 3:38 pm

So, are people using these tools without the org they work for knowing? The amount of hoops I would have to jump through to get either of the smaller companies I have worked for since the AI boom to let me use a tool like this would make it absolutely not worth the effort.

I'm assuming large companies are mandating it, but ultimately the work these LLMs seem poised for would benefit smaller companies most, and I don't think they can really afford them. Are people here paying for a personal subscription and then linking it to their work machines?

  • Post Author
    ionwake
    Posted May 6, 2025 at 3:40 pm

Is it possible to use this with Cursor? If so, what is the name of the model? gemini-2.5-pro-preview?

edit> It's gemini-2.5-pro-preview-05-06

edit> Cursor says it doesn't have "good support" yet, but I'm not sure if this is a default message when it doesn't recognize a model. Is this a big deal? Should I wait until it's officially supported by Cursor?

Just trying to save time here for everyone – anyone know the answer?
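
One quick way to sanity-check the exact model id outside Cursor is to list what the Gemini API exposes for your key. A minimal sketch, assuming the google-genai Python SDK (pip install google-genai) and GEMINI_API_KEY set in the environment:

    # List the models visible to this API key and print the 2.5 Pro
    # variants; the model id below is the one named in this thread.
    from google import genai

    client = genai.Client()  # picks up GEMINI_API_KEY from the environment
    for m in client.models.list():
        if "2.5-pro" in m.name:
            print(m.name)  # should include models/gemini-2.5-pro-preview-05-06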

  • Post Author
    xbmcuser
    Posted May 6, 2025 at 3:42 pm

As a non-programmer, I have been really loving Gemini 2.5 Pro for my Python scripting for manipulating text and Excel files and for web scraping. In the past I was able to use ChatGPT to code some of the things that I wanted, but with Gemini 2.5 Pro it has been just another level. If they improved it further, that would be amazing.

  • Post Author
    djrj477dhsnv
    Posted May 6, 2025 at 3:42 pm

I don't understand what I'm doing wrong... it seems like everyone is saying Gemini is better, but I've compared dozens of examples from my work, and Grok has always produced better results.

  • Post Author
    white_beach
    Posted May 6, 2025 at 3:43 pm

    object?

    (aider joke)

  • Post Author
    llm_nerd
    Posted May 6, 2025 at 3:43 pm

Their nomenclature is a bit confused. The Gemini web app has a 2.5 Pro (experimental), yet this announcement apparently refers to 2.5 Pro Preview 05-06.

It would be ideal if they incremented the version number or the like.

  • Post Author
    martinald
    Posted May 6, 2025 at 3:43 pm

I'm totally lost again! If I use Gemini on the website (gemini.google.com), am I using the 2.5 Pro I/O edition, or am I using the old one?

  • Post Author
    oellegaard
    Posted May 6, 2025 at 3:46 pm

Is there anything like Claude Code for other models, such as Gemini?

  • Post Author
    mliker
    Posted May 6, 2025 at 3:49 pm

    The "video to learning app" feature is a cool concept (see it in AI Studio). I just passed in two separate Stanford lectures to see if it could come up with an interesting interactive app. The apps it generated weren't too useful, but I can see with more focus and development, it'd be a game changer for education.

  • Post Author
    brap
    Posted May 6, 2025 at 3:51 pm

    Gemini is now ranked #1 across every category in lmarena.

  • Post Author
    killerstorm
    Posted May 6, 2025 at 3:51 pm

    Why can't they just use version numbers instead of this "new preview" stuff?

    E.g. call it Gemini Pro 2.5.1.

  • Post Author
    andy12_
    Posted May 6, 2025 at 3:55 pm

Interestingly, when comparing benchmarks of Experimental 03-25 [1] and Experimental 05-06 [2], it seems the new version scores slightly lower on everything except LiveCodeBench.

    [1] https://storage.googleapis.com/model-cards/documents/gemini-…
    [2] https://deepmind.google/technologies/gemini/

  • Post Author
    thevillagechief
    Posted May 6, 2025 at 3:55 pm

I've been switching between this and GPT-4o at work, and Gemini is really verbose, but I've been primarily using it. I'm confused, though: the model available in Copilot says Gemini 2.5 Pro (Preview), and I've had it for a few weeks, yet this was just released today. Is this an updated preview? If so, the blog/naming is confusing.

  • Post Author
    CSMastermind
    Posted May 6, 2025 at 3:57 pm

Hasn't Gemini 2.5 Pro been out for a while?

At first I was very impressed with its coding abilities, switching off of Claude for it, but recently I've been using GPT o3, which I find much more concise and generally better at problem solving when you hit an error.

  • Post Author
    laborcontract
    Posted May 6, 2025 at 3:57 pm

My guess is that they've done a lot of tuning to improve diff-based code editing. Gemini 2.5 is fantastic at agentic work, but it's still pretty rough around the edges in terms of generating perfectly matching diffs to edit code. It's probably one of the very few issues with the model. Luckily, aider tracks this.

They measure the old Gemini 2.5 generating proper diffs 92% of the time. I bet this goes up to ~95-98%: https://aider.chat/docs/leaderboards/
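
For context on what that 92% measures: aider asks the model for search/replace edits, and an edit only applies when the SEARCH text matches the file exactly, so any paraphrasing by the model makes the edit fail. A rough sketch of the idea (illustrative, not aider's actual implementation):

    # An aider-style SEARCH/REPLACE edit applies only if the SEARCH
    # text matches the source exactly; otherwise it is rejected and
    # counts against the model's edit-format compliance.
    def apply_edit(source: str, search: str, replace: str) -> str:
        if search not in source:
            raise ValueError("SEARCH block does not match file; edit rejected")
        return source.replace(search, replace, 1)

    code = "def add(a, b):\n    return a + b\n"
    print(apply_edit(code, "return a + b", "return a + b  # overflow?"))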

Question for the Google peeps who monitor these threads: is gemini-2.5-pro-exp (free tier) updated as well, or will it go away?

    Also, in the blog post, it says:

      > The previous iteration (03-25) now points to the most recent version (05-06), so no action is required to use the improved model, and it continues to be available at the same price.
    

    Does this mean gemini-2.5-pro-preview-03-25 now uses 05-06? Does the same apply to gemini-2.5-pro-exp-03-25?

update: I just tried updating the date in the exp model name (gemini-2.5-pro-exp-05-06) and that doesn't work.

  • Post Author
    EliasWatson
    Posted May 6, 2025 at 4:00 pm

I wonder how the latest version of Grok 3 would stack up against Gemini 2.5 Pro on the web dev arena leaderboard. They are still just showing the original early-access model for some reason, despite there being API access to the latest model. I've been using Grok 3 with Aider Chat and have been very impressed with it. I get $150 of free API credits every month by allowing them to train on my data, which I'm fine with since I'm just working on personal side projects. Gemini 2.5 Pro and Claude 3.7 might be a little better than Grok 3, but I can't justify the cost when Grok doesn't cost me a penny to use.

  • Post Author
    mohsen1
    Posted May 6, 2025 at 4:00 pm

I use Gemini for almost everything. But their model card [1] only compares it to o3-mini! On known benchmarks, o3 is still ahead:

            +------------------------------+---------+--------------+
            |         Benchmark            |   o3    | Gemini 2.5   |
            |                              |         |    Pro       |
            +------------------------------+---------+--------------+
            | ARC-AGI (High Compute)       |  87.5%  |     —        |
            | GPQA Diamond (Science)       |  87.7%  |   84.0%      |
            | AIME 2024 (Math)             |  96.7%  |   92.0%      |
            | SWE-bench Verified (Coding)  |  71.7%  |   63.8%      |
            | Codeforces Elo Rating        |  2727   |     —        |
            | MMMU (Visual Reasoning)      |  82.9%  |   81.7%      |
            | MathVista (Visual Math)      |  86.8%  |     —        |
            | Humanity’s Last Exam         |  26.6%  |   18.8%      |
            +------------------------------+---------+--------------+
    

    [1] https://storage.googleapis.com/model-cards/documents/gemini-…

  • Post Author
    gitroom
    Posted May 6, 2025 at 4:02 pm

    man that endless commenting seriously kills my flow – gotta say, even after all the prompts and hacks, still can't get these models to chill out. you think we'll ever get ai to stop overdoing it and actually fit real developer habits or is it always gonna be like this?

  • Post Author
    arnaudsm
    Posted May 6, 2025 at 4:10 pm

Be careful: this model is worse than 03-25 in 10 of the 12 benchmarks (!)

I bet they kept training on coding, made everything else worse along the way, and tried to sweep it under the rug because of the sunk costs.

  • Post Author
    nashashmi
    Posted May 6, 2025 at 4:18 pm

    I keep hearing good things about Gemini online and offline. I wrote them off as terrible when they first launched and have not looked back since.

    How are they now? Sufficiently good? Competent? Competitive? Or limited? My needs are very consumer oriented, not programming/api stuff.

  • Post Author
    ramoz
    Posted May 6, 2025 at 4:26 pm

    Never sleep on Google.

  • Post Author
    panarchy
    Posted May 6, 2025 at 4:54 pm

Is it just me who finds that, while Gemini 2.5 can generate a lot of code, the end results are usually lackluster compared to Claude and even ChatGPT? I also find it hard-headed: it frequently does things in ways I explicitly told it not to. The massive context window is pretty great, though, and enables me to do things I can't with the others, so it still gets used a lot.

  • Post Author
    xyst
    Posted May 6, 2025 at 4:55 pm

    Proprietary junk beats DeepSeek by a mere 213 points?

Oof. G and others are way behind.

  • Post Author
    childintime
    Posted May 6, 2025 at 4:57 pm

How does it perform on anything but Python and JavaScript? In my experience my mileage varied a lot when using C#, for example, or Zig, so I've learned to just let it select the language it wants.

Also, why doesn't Ctrl+C work??

  • Post Author
    obsolete_wagie
    Posted May 6, 2025 at 5:14 pm

o3 is so far ahead of Anthropic and Google that these models aren't even worth using.

  • Post Author
    ionwake
    Posted May 6, 2025 at 5:16 pm

Can someone tell me if Windsurf is better than Cursor? (Preferably someone who has used both for a few days.)

  • Post Author
    m_kos
    Posted May 6, 2025 at 5:33 pm

[Tangent] Anyone here using 2.5 Pro in Gemini Advanced? I have been experiencing a ton of bugs, e.g.:

    – [codes] showing up instead of references,

    – raw search tool output sliding across the screen,

– Gemini continuously answering questions asked two or more messages before but ignoring the most recent one (you need to ask Gemini an unrelated question for it to snap out of this bug for a few minutes),

    – weird messages including text irrelevant to any of my chats with Gemini, like baseball,

    – confusing its own replies with mine,

    – not being able to run its own Python code due to some unsolvable formatting issue,

    – timeouts, and more.

  • Post Author
    paulirish
    Posted May 6, 2025 at 5:54 pm

    > Gemini 2.5 Pro now ranks #1 on the WebDev Arena leaderboard

    It'd make sense to rename WebDev Arena to React/Tailwind Arena. Its system prompt requires [1] those technologies and the entire tool breaks when requesting vanilla JS or other frameworks. The second-order implications of models competing on this narrow definition of webdev are rather troublesome.

    [1] https://blog.lmarena.ai/blog/2025/webdev-arena/#:~:text=PROM…

  • Post Author
    qwertox
    Posted May 6, 2025 at 6:29 pm

I have my issues with the code Gemini Pro in AI Studio generates without customized "System Instructions".

It turns a well-readable code snippet of 5 lines into a 30-line snippet full of comments and mostly unnecessary error handling, code which becomes harder to reason about.

But for sysadmin tasks, like dealing with ZFS and LVM, it is absolutely incredible.
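
A system instruction usually tames this. A minimal sketch, assuming the google-genai Python SDK; the instruction wording is just an example, not a known-good recipe:

    # Steer comment volume and error handling via a system instruction.
    from google import genai
    from google.genai import types

    client = genai.Client()  # reads GEMINI_API_KEY from the environment
    response = client.models.generate_content(
        model="gemini-2.5-pro-preview-05-06",
        config=types.GenerateContentConfig(
            system_instruction=(
                "Write concise code. No section or step comments; comment "
                "only non-obvious logic. No defensive error handling unless asked."
            ),
        ),
        contents="Rewrite this 5-line function without changing behavior: ...",
    )
    print(response.text)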
