
Gemini 2.5 Pro Preview by meetpateltech
Senior Product Manager
Gemini API and Google AI Studio
We’ve seen developers doing amazing things with Gemini 2.5 Pro, so we decided to release an updated version a couple of weeks early to get it into developers’ hands sooner.
Today we’re excited to release Gemini 2.5 Pro Preview (I/O edition). This update features even stronger coding capabilities, so you can start building with it before Google I/O later this month. Expect meaningful improvements for front-end and UI development, alongside improvements in fundamental coding tasks such as transforming and editing code, and creating sophisticated agentic workflows.
“We found Gemini 2.5 Pro to be the best frontier model when it comes to “capability over latency” ratio. I look forward to rolling it out on Replit Agent whenever a latency-sensitive task needs to be accomplished with a high degree of reliability.”
– Michele Catasta, President, Replit
Best-in-class frontend web development
Gemini 2.5 Pro now ranks #1 on the WebDev Arena leaderboard, which measures human preference for a model’s ability to build aesthetically pleasing and functional web apps. Drawing on this leading capability, Gemini 2.5 Pro powers Cursor’s innovative code agent and empowers our collaborations with companies like Cognition and Replit. Together, we’re pushing the frontiers of agentic programming to unlock new possibilities for developers.
“The updated Gemini 2.5 Pro achieves leading performance […]”
36 Comments
jeswin
Now if only there were a way to add prepaid credits and monitor usage in near real-time on a dashboard, like every other vendor. Hey Google, are you listening?
ramesh31
>Best-in-class frontend web development
It really is wild to have seen this happen over the last year. The days of traditional "design-to-code" FE work are completely over. I haven't written a line of HTML/CSS in months. If you are still doing this stuff by hand, you need to adapt fast. In conjunction with an agentic coding IDE and a few MCP tools, weeks' worth of UI work is now done in hours, to a higher level of quality and consistency, with practically zero effort.
siwakotisaurav
I usually don’t believe the benchmarks, but first place in WebDev Arena specifically is crazy. That one was Claude’s for so long, which tracks with my experience.
ranyume
I don't know if I'm doing something wrong, but every time I ask Gemini 2.5 for code it outputs SO MANY comments. An exaggerated amount of comments: section comments, step comments, block comments, inline comments, the whole gang.
segphault
My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both pro and flash, seem significantly less susceptible to this than any other model I've tried.
There are still significant limitations; no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and Stack Overflow for a lot of my day-to-day programming.
xnx
This is much bigger news than OpenAI's acquisition of Windsurf.
herpdyderp
I agree it's very good, but the UI is still usually an unusable, scroll-jacking disaster. I've found it's best to let a chat sit for a few minutes after it has finished printing the AI's output. Finding the `ms-code-block` element in dev tools and logging `$0.textContent` is reliable too.
crat3r
So, are people using these tools without the org they work for knowing? The amount of hoops I would have to jump through to get either of the smaller companies I have worked for since the AI boom to let me use a tool like this would make it absolutely not worth the effort.
I'm assuming large companies are mandating it, but ultimately the work these LLMs seem poised for would benefit smaller companies most, and I don't think they can really afford to use them. Are people here paying for a personal subscription and then linking it to their work machines?
ionwake
Is it possible to use this with Cursor? If so, what is the name of the model? gemini-2.5-pro-preview?
edit> It's gemini-2.5-pro-preview-05-06
edit> Cursor says it doesn't have "good support" yet, but I'm not sure if this is a default message when it doesn't recognise a model. Is this a big deal? Should I wait until it's officially supported by Cursor?
Just trying to save time here for everyone – anyone know the answer?
xbmcuser
As a non-programmer, I have been really loving Gemini 2.5 Pro for my Python scripting for manipulating text and Excel files and for web scraping. In the past I was able to use ChatGPT to code some of the things that I wanted, but Gemini 2.5 Pro has been on another level. If they improve it further, that would be amazing.
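(For context, a minimal sketch of the kind of script being described: scrape the headings from a page and write them to an Excel file. The URL, file name, and tag choice are placeholders, not anything the commenter actually ran.)

```python
import requests
from bs4 import BeautifulSoup
from openpyxl import Workbook

# Fetch the page and parse it (placeholder URL).
resp = requests.get("https://example.com/articles", timeout=30)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")

# Write one row per scraped heading to a new workbook.
wb = Workbook()
ws = wb.active
ws.append(["Title"])  # header row
for h2 in soup.find_all("h2"):
    ws.append([h2.get_text(strip=True)])
wb.save("titles.xlsx")
```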
djrj477dhsnv
I don't understand what I'm doing wrong. It seems like everyone is saying Gemini is better, but I've compared dozens of examples from my work, and Grok has always produced better results.
white_beach
object?
(aider joke)
llm_nerd
Their nomenclature is a bit confused. The Gemini web app has a 2.5 Pro (experimental), yet this apparently is referring to 2.5 Pro Preview 05-06.
Would be ideal if they incremented the version number or the like.
martinald
I'm totally lost again! If I use Gemini on the website (gemini.google.com), am I using 2.5 Pro IO edition, or am I using the old one?
oellegaard
Is there anything like Claude Code for other models such as Gemini?
mliker
The "video to learning app" feature is a cool concept (see it in AI Studio). I just passed in two separate Stanford lectures to see if it could come up with an interesting interactive app. The apps it generated weren't too useful, but I can see with more focus and development, it'd be a game changer for education.
brap
Gemini is now ranked #1 across every category in lmarena.
killerstorm
Why can't they just use version numbers instead of this "new preview" stuff?
E.g. call it Gemini Pro 2.5.1.
andy12_
Interestingly, when comparing benchmarks of Experimental 03-25 [1] and Preview 05-06 [2], it seems the new version scores slightly lower on everything except LiveCodeBench.
[1] https://storage.googleapis.com/model-cards/documents/gemini-…
[2] https://deepmind.google/technologies/gemini/
thevillagechief
I've been switching between this and GPT-4o at work, and Gemini is really verbose, but I've been primarily using it. I'm confused, though: the model available in Copilot says Gemini 2.5 Pro (Preview), and I've had it for a few weeks, yet this was just released today. Is this an updated preview? If so, the blog/naming is confusing.
CSMastermind
Hasn't Gemini 2.5 Pro been out for a while?
At first I was very impressed with its coding abilities, switching off of Claude for it, but recently I've been using o3, which I find is much more concise and generally better at problem solving when you hit an error.
laborcontract
My guess is that they've done a lot of tuning to improve diff based code editing. Gemini 2.5 is fantastic at agentic work, but it still is pretty rough around the edges in terms of generating perfectly matching diffs to edit code. It's probably one of the very few issues with the model. Luckily, aider tracks this.
They measure the old gemini 2.5 generating proper diffs 92% of the time. I bet this goes up to ~95-98% https://aider.chat/docs/leaderboards/
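For reference, aider's search/replace edit format (the thing that leaderboard measures, as I understand it) looks roughly like this; the file name and code are made-up examples:

```
greeting.py
<<<<<<< SEARCH
def greet():
    print("hello")
=======
def greet(name):
    print(f"hello, {name}")
>>>>>>> REPLACE
```

An edit counts as malformed when the SEARCH text doesn't exactly match the file contents, which is the rough edge described above.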
Question for the google peeps who monitor these threads: Is gemini-2.5-pro-exp (free tier) updated as well, or will it go away?
Also, the blog post says the previous preview endpoint now points to the newest model. Does this mean gemini-2.5-pro-preview-03-25 now uses 05-06? Does the same apply to gemini-2.5-pro-exp-03-25?
update: I just tried updating the date in the exp model (gemini-2.5-pro-exp-05-06) and that doesn't work.
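For what it's worth, a minimal sketch (google-generativeai Python SDK; the API key is a placeholder) of pinning the dated snapshot explicitly, which sidesteps the alias question:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# The dated name pins this exact snapshot; the 03-25 preview name is the
# alias the blog post says now points at the new model.
model = genai.GenerativeModel("gemini-2.5-pro-preview-05-06")
print(model.generate_content("Say hello").text)
```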
EliasWatson
I wonder how the latest version of Grok 3 would stack up to Gemini 2.5 Pro on the web dev arena leaderboard. They are still just showing the original early access model for some reason, despite there being API access to the latest model. I've been using Grok 3 with Aider Chat and have been very impressed with it. I get $150 of free API credits every month by allowing them to train on my data, which I'm fine with since I'm just working on personal side projects. Gemini 2.5 Pro and Claude 3.7 might be a little better than Grok 3, but I can't justify the cost when Grok doesn't cost me a penny to use.
mohsen1
I use Gemini for almost everything. But their model card[1] only compares to o3-mini! On known benchmarks, o3 is still ahead.
[1] https://storage.googleapis.com/model-cards/documents/gemini-…
gitroom
Man, that endless commenting seriously kills my flow. Gotta say, even after all the prompts and hacks, I still can't get these models to chill out. Do you think we'll ever get AI to stop overdoing it and actually fit real developer habits, or is it always gonna be like this?
arnaudsm
Be careful, this model is worse than 03-25 in 10 of the 12 benchmarks (!)
I bet they kept training on coding, made everything else worse along the way, and tried to sweep it under the rug because of the sunk costs.
nashashmi
I keep hearing good things about Gemini online and offline. I wrote them off as terrible when they first launched and have not looked back since.
How are they now? Sufficiently good? Competent? Competitive? Or limited? My needs are very consumer oriented, not programming/api stuff.
ramoz
Never sleep on Google.
panarchy
Is it just me, or does Gemini 2.5 generate a lot of code while the end results are usually lackluster compared to Claude and even ChatGPT? I also find it hard-headed: it frequently does things in ways I explicitly told it not to. The massive context window is pretty great, though, and enables me to do things I can't with the others, so it still gets used a lot.
xyst
Proprietary junk beats DeepSeek by a mere 213 points?
Oof. G and others are way behind
childintime
How does it perform on anything but Python and JavaScript? In my experience my mileage varied a lot when using C#, for example, or Zig, so I've learnt to just let it select the language it wants.
Also, why doesn't Ctrl+C work??
obsolete_wagie
o3 is so far ahead of Anthropic and Google, these models aren't even worth using
ionwake
Can someone tell me if Windsurf is better than Cursor? (Preferably someone who has used both for a few days.)
m_kos
[Tangent] Anyone here using 2.5 Pro in Gemini Advanced? I have been experiencing a ton of bugs, e.g.,:
– [codes] showing up instead of references,
– raw search tool output sliding across the screen,
– Gemini continually answering questions asked two or more messages before but ignoring the most recent one (you need to ask Gemini an unrelated question for it to snap out of this bug for a few minutes),
– weird messages including text irrelevant to any of my chats with Gemini, like baseball,
– confusing its own replies with mine,
– not being able to run its own Python code due to some unsolvable formatting issue,
– timeouts, and more.
paulirish
> Gemini 2.5 Pro now ranks #1 on the WebDev Arena leaderboard
It'd make sense to rename WebDev Arena to React/Tailwind Arena. Its system prompt requires [1] those technologies and the entire tool breaks when requesting vanilla JS or other frameworks. The second-order implications of models competing on this narrow definition of webdev are rather troublesome.
[1] https://blog.lmarena.ai/blog/2025/webdev-arena/#:~:text=PROM…
qwertox
I have my issues with the code Gemini Pro in AI Studio generates without customized "System Instructions".
It turns a perfectly readable 5-line code snippet into a 30-line snippet full of comments and mostly unnecessary error handling, code which becomes harder to reason about.
But for Sysadmin tasks, like dealing with ZFS and LVM, it is absolutely incredible.
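For anyone fighting the same verbosity: a minimal sketch of the kind of system instruction that helps, using the google-generativeai Python SDK (the instruction wording and prompt are just examples):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Steer the model away from comment spam and speculative error handling.
model = genai.GenerativeModel(
    "gemini-2.5-pro-preview-05-06",
    system_instruction=(
        "Write concise code. Do not add comments unless asked. "
        "Only add error handling the task strictly requires."
    ),
)
print(model.generate_content("Rename all .txt files in cwd to .md").text)
```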