Gemini 2.5 Pro Preview by meetpateltech

36 Comments

  • Post Author
    jeswin
    Posted May 6, 2025 at 3:25 pm

Now if only there were a way to add prepaid credits and monitor usage in near real time on a dashboard, like every other vendor. Hey Google, are you listening?

  • Post Author
    ramesh31
    Posted May 6, 2025 at 3:27 pm

    >Best-in-class frontend web development

It really is wild to have seen this happen over the last year. The days of traditional "design-to-code" FE work are completely over. I haven't written a line of HTML/CSS in months. If you are still doing this stuff by hand, you need to adapt fast. In conjunction with an agentic coding IDE and a few MCP tools, weeks' worth of UI work is now done in hours, to a higher level of quality and consistency, with practically zero effort.

  • Post Author
    siwakotisaurav
    Posted May 6, 2025 at 3:28 pm

Usually don't believe the benchmarks, but first place in the web dev arena specifically is crazy. That spot was Claude's for so long, which tracks with my experience.

  • Post Author
    ranyume
    Posted May 6, 2025 at 3:30 pm

I don't know if I'm doing something wrong, but every time I ask Gemini 2.5 for code it outputs SO MANY comments. An exaggerated amount of comments. Section comments, step comments, block comments, inline comments, the whole gang.

  • Post Author
    segphault
    Posted May 6, 2025 at 3:34 pm

    My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both pro and flash, seem significantly less susceptible to this than any other model I've tried.

There are still significant limitations: no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and Stack Overflow for a lot of my day-to-day programming.

  • Post Author
    xnx
    Posted May 6, 2025 at 3:34 pm

This is much bigger news than OpenAI's acquisition of Windsurf.

  • Post Author
    herpdyderp
    Posted May 6, 2025 at 3:36 pm

I agree it's very good, but the UI is still usually an unusable, scroll-jacking disaster. I've found it's best to let a chat sit for a few minutes after it has finished printing the AI's output. Finding the `ms-code-block` element in dev tools and logging `$0.textContent` is reliable too.

  • Post Author
    crat3r
    Posted May 6, 2025 at 3:38 pm

So, are people using these tools without the org they work for knowing? The amount of hoops I would have to jump through to get either of the smaller companies I have worked for since the AI boom to let me use a tool like this would make it absolutely not worth the effort.

I'm assuming large companies are mandating it, but ultimately the work these LLMs seem poised for would benefit smaller companies most, and I don't think they can really afford them. Are people here paying for a personal subscription and then linking it to their work machines?

  • Post Author
    ionwake
    Posted May 6, 2025 at 3:40 pm

Is it possible to use this with Cursor? If so, what is the name of the model? gemini-2.5-pro-preview?

edit> It's gemini-2.5-pro-preview-05-06

edit> Cursor says it doesn't have "good support" yet, but I'm not sure if this is a default message when it doesn't recognize a model. Is this a big deal? Should I wait until it's officially supported by Cursor?

Just trying to save time here for everyone – anyone know the answer?
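
One quick way to sanity-check the exact model id outside Cursor is to list what the Gemini API exposes for your key. A minimal sketch, assuming the google-genai Python SDK (pip install google-genai) and GEMINI_API_KEY set in the environment:

    # List the models visible to this API key and print the 2.5 Pro
    # variants; the model id below is the one named in this thread.
    from google import genai

    client = genai.Client()  # picks up GEMINI_API_KEY from the environment
    for m in client.models.list():
        if "2.5-pro" in m.name:
            print(m.name)  # should include models/gemini-2.5-pro-preview-05-06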

  • Post Author
    xbmcuser
    Posted May 6, 2025 at 3:42 pm

As a non-programmer, I have been really loving Gemini 2.5 Pro for my Python scripting for manipulating text and Excel files and for web scraping. In the past I was able to use ChatGPT to code some of the things that I wanted, but with Gemini 2.5 Pro it has been just another level. If they improved it further, that would be amazing.

  • Post Author
    djrj477dhsnv
    Posted May 6, 2025 at 3:42 pm

I don't understand what I'm doing wrong... it seems like everyone is saying Gemini is better, but I've compared dozens of examples from my work, and Grok has always produced better results.

  • Post Author
    white_beach
    Posted May 6, 2025 at 3:43 pm

    object?

    (aider joke)

  • Post Author
    llm_nerd
    Posted May 6, 2025 at 3:43 pm

Their nomenclature is a bit confused. The Gemini web app has a 2.5 Pro (experimental), yet this announcement apparently refers to 2.5 Pro Preview 05-06.

It would be ideal if they incremented the version number or the like.

  • Post Author
    martinald
    Posted May 6, 2025 at 3:43 pm

I'm totally lost again! If I use Gemini on the website (gemini.google.com), am I using the 2.5 Pro I/O edition, or am I using the old one?

  • Post Author
    oellegaard
    Posted May 6, 2025 at 3:46 pm

Is there anything like Claude Code for other models, such as Gemini?

  • Post Author
    mliker
    Posted May 6, 2025 at 3:49 pm

    The "video to learning app" feature is a cool concept (see it in AI Studio). I just passed in two separate Stanford lectures to see if it could come up with an interesting interactive app. The apps it generated weren't too useful, but I can see with more focus and development, it'd be a game changer for education.

  • Post Author
    brap
    Posted May 6, 2025 at 3:51 pm

    Gemini is now ranked #1 across every category in lmarena.

  • Post Author
    killerstorm
    Posted May 6, 2025 at 3:51 pm

    Why can't they just use version numbers instead of this "new preview" stuff?

    E.g. call it Gemini Pro 2.5.1.

  • Post Author
    andy12_
    Posted May 6, 2025 at 3:55 pm

Interestingly, when comparing benchmarks of Experimental 03-25 [1] and Experimental 05-06 [2], it seems the new version scores slightly lower on everything except LiveCodeBench.

    [1] https://storage.googleapis.com/model-cards/documents/gemini-…
    [2] https://deepmind.google/technologies/gemini/

  • Post Author
    thevillagechief
    Posted May 6, 2025 at 3:55 pm

I've been switching between this and GPT-4o at work, and Gemini is really verbose, but I've been primarily using it. I'm confused, though: the model available in Copilot says Gemini 2.5 Pro (Preview), and I've had it for a few weeks, yet this was just released today. Is this an updated preview? If so, the blog/naming is confusing.

  • Post Author
    CSMastermind
    Posted May 6, 2025 at 3:57 pm

Hasn't Gemini 2.5 Pro been out for a while?

At first I was very impressed with its coding abilities, switching off of Claude for it, but recently I've been using GPT o3, which I find much more concise and generally better at problem solving when you hit an error.

  • Post Author
    laborcontract
    Posted May 6, 2025 at 3:57 pm

My guess is that they've done a lot of tuning to improve diff-based code editing. Gemini 2.5 is fantastic at agentic work, but it's still pretty rough around the edges in terms of generating perfectly matching diffs to edit code. It's probably one of the very few issues with the model. Luckily, aider tracks this.

They measure the old Gemini 2.5 generating proper diffs 92% of the time. I bet this goes up to ~95-98%: https://aider.chat/docs/leaderboards/
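
For context on what that 92% measures: aider asks the model for search/replace edits, and an edit only applies when the SEARCH text matches the file exactly, so any paraphrasing by the model makes the edit fail. A rough sketch of the idea (illustrative, not aider's actual implementation):

    # An aider-style SEARCH/REPLACE edit applies only if the SEARCH
    # text matches the source exactly; otherwise it is rejected and
    # counts against the model's edit-format compliance.
    def apply_edit(source: str, search: str, replace: str) -> str:
        if search not in source:
            raise ValueError("SEARCH block does not match file; edit rejected")
        return source.replace(search, replace, 1)

    code = "def add(a, b):\n    return a + b\n"
    print(apply_edit(code, "return a + b", "return a + b  # overflow?"))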

Question for the Google peeps who monitor these threads: is gemini-2.5-pro-exp (free tier) updated as well, or will it go away?

    Also, in the blog post, it says:

      > The previous iteration (03-25) now points to the most recent version (05-06), so no action is required to use the improved model, and it continues to be available at the same price.
    

    Does this mean gemini-2.5-pro-preview-03-25 now uses 05-06? Does the same apply to gemini-2.5-pro-exp-03-25?

update: I just tried updating the date in the exp model name (gemini-2.5-pro-exp-05-06) and that doesn't work.

  • Post Author
    EliasWatson
    Posted May 6, 2025 at 4:00 pm

I wonder how the latest version of Grok 3 would stack up against Gemini 2.5 Pro on the web dev arena leaderboard. They are still just showing the original early-access model for some reason, despite there being API access to the latest model. I've been using Grok 3 with Aider Chat and have been very impressed with it. I get $150 of free API credits every month by allowing them to train on my data, which I'm fine with since I'm just working on personal side projects. Gemini 2.5 Pro and Claude 3.7 might be a little better than Grok 3, but I can't justify the cost when Grok doesn't cost me a penny to use.

  • Post Author
    mohsen1
    Posted May 6, 2025 at 4:00 pm

I use Gemini for almost everything. But their model card [1] only compares it to o3-mini! On known benchmarks, o3 is still ahead:

            +------------------------------+---------+--------------+
            |         Benchmark            |   o3    | Gemini 2.5   |
            |                              |         |    Pro       |
            +------------------------------+---------+--------------+
            | ARC-AGI (High Compute)       |  87.5%  |     —        |
            | GPQA Diamond (Science)       |  87.7%  |   84.0%      |
            | AIME 2024 (Math)             |  96.7%  |   92.0%      |
            | SWE-bench Verified (Coding)  |  71.7%  |   63.8%      |
            | Codeforces Elo Rating        |  2727   |     —        |
            | MMMU (Visual Reasoning)      |  82.9%  |   81.7%      |
            | MathVista (Visual Math)      |  86.8%  |     —        |
            | Humanity’s Last Exam         |  26.6%  |   18.8%      |
            +------------------------------+---------+--------------+
    

    [1] https://storage.googleapis.com/model-cards/documents/gemini-…

  • Post Author
    gitroom
    Posted May 6, 2025 at 4:02 pm

    man that endless commenting seriously kills my flow – gotta say, even after all the prompts and hacks, still can't get these models to chill out. you think we'll ever get ai to stop overdoing it and actually fit real developer habits or is it always gonna be like this?

  • Post Author
    arnaudsm
    Posted May 6, 2025 at 4:10 pm

Be careful: this model is worse than 03-25 in 10 of the 12 benchmarks (!)

I bet they kept training on coding, made everything else worse along the way, and tried to sweep it under the rug because of the sunk costs.

  • Post Author
    nashashmi
    Posted May 6, 2025 at 4:18 pm

    I keep hearing good things about Gemini online and offline. I wrote them off as terrible when they first launched and have not looked back since.

    How are they now? Sufficiently good? Competent? Competitive? Or limited? My needs are very consumer oriented, not programming/api stuff.

  • Post Author
    ramoz
    Posted May 6, 2025 at 4:26 pm

    Never sleep on Google.

  • Post Author
    panarchy
    Posted May 6, 2025 at 4:54 pm

Is it just me who finds that, while Gemini 2.5 can generate a lot of code, the end results are usually lackluster compared to Claude and even ChatGPT? I also find it hard-headed: it frequently does things in ways I explicitly told it not to. The massive context window is pretty great, though, and enables me to do things I can't with the others, so it still gets used a lot.

  • Post Author
    xyst
    Posted May 6, 2025 at 4:55 pm

    Proprietary junk beats DeepSeek by a mere 213 points?

Oof. G and others are way behind.

  • Post Author
    childintime
    Posted May 6, 2025 at 4:57 pm

How does it perform on anything but Python and JavaScript? In my experience my mileage varied a lot when using C#, for example, or Zig, so I've learned to just let it select the language it wants.

Also, why doesn't Ctrl+C work??

  • Post Author
    obsolete_wagie
    Posted May 6, 2025 at 5:14 pm

o3 is so far ahead of Anthropic and Google that these models aren't even worth using.

  • Post Author
    ionwake
    Posted May 6, 2025 at 5:16 pm

Can someone tell me if Windsurf is better than Cursor? (Preferably someone who has used both for a few days.)

  • Post Author
    m_kos
    Posted May 6, 2025 at 5:33 pm

[Tangent] Anyone here using 2.5 Pro in Gemini Advanced? I have been experiencing a ton of bugs, e.g.:

    – [codes] showing up instead of references,

    – raw search tool output sliding across the screen,

– Gemini continuously answering questions asked two or more messages before but ignoring the most recent one (you need to ask Gemini an unrelated question for it to snap out of this bug for a few minutes),

    – weird messages including text irrelevant to any of my chats with Gemini, like baseball,

    – confusing its own replies with mine,

    – not being able to run its own Python code due to some unsolvable formatting issue,

    – timeouts, and more.

  • Post Author
    paulirish
    Posted May 6, 2025 at 5:54 pm

    > Gemini 2.5 Pro now ranks #1 on the WebDev Arena leaderboard

    It'd make sense to rename WebDev Arena to React/Tailwind Arena. Its system prompt requires [1] those technologies and the entire tool breaks when requesting vanilla JS or other frameworks. The second-order implications of models competing on this narrow definition of webdev are rather troublesome.

    [1] https://blog.lmarena.ai/blog/2025/webdev-arena/#:~:text=PROM…

  • Post Author
    qwertox
    Posted May 6, 2025 at 6:29 pm

I have my issues with the code Gemini Pro in AI Studio generates without customized "System Instructions".

It turns a well-readable code snippet of 5 lines into a 30-line snippet full of comments and mostly unnecessary error handling, code which becomes harder to reason about.

But for sysadmin tasks, like dealing with ZFS and LVM, it is absolutely incredible.
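
A system instruction usually tames this. A minimal sketch, assuming the google-genai Python SDK; the instruction wording is just an example, not a known-good recipe:

    # Steer comment volume and error handling via a system instruction.
    from google import genai
    from google.genai import types

    client = genai.Client()  # reads GEMINI_API_KEY from the environment
    response = client.models.generate_content(
        model="gemini-2.5-pro-preview-05-06",
        config=types.GenerateContentConfig(
            system_instruction=(
                "Write concise code. No section or step comments; comment "
                "only non-obvious logic. No defensive error handling unless asked."
            ),
        ),
        contents="Rewrite this 5-line function without changing behavior: ...",
    )
    print(response.text)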
