
Google Gemini has the worst LLM API by indigodaddy
Google is on the frontier with recent Gemini releases. They now have:
- A competitive coding & reasoning model (Gemini-2.5-pro)
- The longest-context frontier models (1 or 2M context – OAI just achieved parity since GPT-4.1 has a 1M context too)
- The best frontier audio, video & document multimodal models. (Some others support audio but at a much higher cost)
- The best long-context fine-tuning offering (Gemini-2.0-flash can be tuned at 131k, others top out earlier)
- The best multimodal fine-tuning offering (Gemini-2.0-flash can be tuned with audio, image & documents)
And yet these models hide behind the poorest developer experience in the market.
1. The Gemini API is available in two places, with differing functionality
You can use Gemini via either Vertex AI, or Google AI Studio. Google Developer Advocates will tell you: if you’re a startup or hobbyist, you should use Google AI Studio, as Vertex AI is more enterprise-focused.
That makes sense: it parallels the distinction between the OpenAI API and the Azure OpenAI API. Or Anthropic API vs Amazon Bedrock Claude.
It’s not unusual to offer an enterprise API with stronger security or compliance guarantees, but slower feature roll-out.
But unfortunately that comparison doesn’t hold. Functionality is released to AI Studio & Vertex AI at different times, and some functionality never comes to AI Studio.
This often becomes a dealbreaker and you end up needing to juggle both options as a startup.
2. The documentation is bad
There are two documentation sites for AI Studio vs Vertex. You will land on the wrong one.
This gets worse due to the non-equivalent functionality. For example, AI Studio’s docs may lead you to believe that Google only supports fine-tuning for Gemini 1.5. But, Vertex AI supports fine-tuning for Gemini 2.0, it’s just that AI Studio does not.
In general, the docs are worse than most. Much of the AI Studio docs still refer to Gemini 1.5, which is now deprecated & make it generally unclear whether Gemini 2.5 still supports the same features.
The API is also the quirkiest around. For example, there are “safety settings” that reject certain requests by default, but can be disabled. There is an OpenAI-compatible SDK for Google AI Studio (not for Vertex AI), but it doesn’t support multimodality. You should probably avoid it.
3. The Vertex AI SDK doesn’t support API key authentication
Most LLM providers use be
16 Comments
jauntywundrkind
In general, it's just wild to see Google squander such an intense lead.
In 2012, Google was far ahead of the world in making the vast majority of their offerings intensely API-first, intensely API accessible.
It all changed in such a tectonic shift. The Google Plus/Google+ era was this weird new reality where everything Google did had to feed into this social network. But there was nearly no API available to anyone else (short of some very simple posting APIs), where Google flipped a bit, where the whole company stopped caring about the rest of the world and APIs and grew intensely focused on internal use, on themselves, looked only within.
I don't know enough about the LLM situation to comment, but Google squandering such a huge lead, so clearly stopping caring about the world & intertwingularity, becoming so intensely internally focused was such a clear clear clear fall. There's the Google Graveyard of products, but the loss in my mind is more clearly that Google gave up on APIs long ago, and has never performed any clear acts of repentance for such a grevious mis-step against the open world, open possibilities, against closed & internal focus.
simonw
I still don't really understand what Vertex AI is.
If you can ignore Vertex most of the complaints here are solved – the non-Vertex APIs have easy to use API keys, a great debugging tool (https://aistudio.google.com), a well documented HTTP API and good client libraries too.
I actually use their HTTP API directly (with the ijson streaming JSON parser for Python) and the code is reasonably straight-forward: https://github.com/simonw/llm-gemini/blob/61a97766ff0873936a…
You have to be very careful when searching (using Google, haha) that you don't accidentally end up in the Vertext documentation though.
Worth noting that Gemini does now have an OpenAI-compatible API endpoint which makes it very easy to switch apps that use an OpenAI client library over to backing against Gemini instead: https://ai.google.dev/gemini-api/docs/openai
Anthropic have the same feature now as well: https://docs.anthropic.com/en/api/openai-sdk
ryao
I have not pushed my local commits to GitHub lately (and probably should), but my experience with the Gemini API so far has been relatively positive:
https://github.com/ryao/gemini-chat
The main thing I do not like is that token counting is rated limited. My local offline copies have stripped out the token counting since I found that the service becomes unusable if you get anywhere near the token limits, so there is no point in trimming the history to make it fit. Another thing I found is that I prefer to use the REST API directly rather than their Python wrapper.
Also, that comment about 500 errors is obsolete. I will fix it when I do new pushes.
SmellTheGlove
Google’s APIs are all kind of challenging to ramp up on. I’m not sure if it’s the API itself or the docs just feeling really fragmented. It’s hard to find what you’re looking for even if you use their own search engine.
asadm
I don't get the outrage. Just use their OpenAI endpoints: https://ai.google.dev/gemini-api/docs/openai
It's the best model out there.
behnamoh
Even their OAI-compatible API isn't fully compatible. Tools like Instructor have special-casing for Gemini…
lemming
Additionally, there's no OpenAPI spec, so you have to generate one from their protobuf specs if you want to use that to generate a client model. Their protobuf specs live in a repo at https://github.com/googleapis/googleapis/tree/master/google/…. Now you might think that v1 would be the latest there, but you would be wrong – everyone uses v1beta (not v1, not v1alpha, not v1beta3) for reasons that are completely unclear. Additionally, this repo is frequently not up to date with the actual API (it took them ages to get the new thinking config added, for example, and their usage fields were out of date for the longest time). It's really frustrating.
rafram
Site seems to be down – I can’t get the article to load – but by far the most maddening part of Vertex AI is the way it deals with multimodal inputs. You can’t just attach an image to your request. You have to use their file manager to upload the file, then make sure it gets deleted once you’re done.
That would all still be OK-ish except that their JS library only accepts a local path, which it then attempts to read using the Node `fs` API. Serverless? Better figure out how to shim `fs`!
It would be trivial to accept standard JS buffers. But it’s not clear that anyone at Google cares enough about this crappy API to fix it.
bionhoward
Also has the same customer noncompete copy pasted from ClosedAI. Not that anyone seemingly cares about the risk of lawsuits from Google for using Gemini in a way that happens to compete with random-Gemini-tentacle-123
tom_m
Doesn't matter much, Google already won the AI race. They had all the eyeballs already. There's a huge reason why they are getting slapped with anti-trust right now. The other companies aren't happy.
I agree though, their marketing and product positioning is super confusing and weird. They are running their AI business in a very very very strange way. This has created a delay, I don't think opportunity for others, in their dominance in this space.
Using Gemini inside BigQuery (this is via Vertex) is such a stupid good solution. Along with all of the other products that support BigQuery (datastream from cloudsql MySQL/postgres, dataform for query aggregation and transformation jobs, BigQuery functions, etc.), there's an absolutely insane amount of power to bring data over to Gemini and back out.
It's literally impossible for OpenAI to compete because Google has all of the other ingredients here already and again, the user base.
I'm surprised AWS didn't come out stronger here, weird.
fumeux_fume
I’m sorry have you used Azure? I’ve worked with all the major cloud providers and Google has its warts, but pales in comparison to the hoops Azure make you jump through to make a simple API call.
simianwords
Am I the only one who prefers a more serious approach to prefix caching? It is a powerful tool and having an endpoint dedicated to it and being able to control TTL's using parameters seems like the best approach.
On the other hand the first two approaches from OpenAI and Anthropic are frankly bad. Automatically detecting what should be prefix cached? Yuck! And I can't even set my own TTL's in Anthropic API (feel free to correct me – a quick search revealed this).
Serious features require serious approaches.
chrisheecho
Hey there, I’m Chris Cho (x: chrischo_pm, Vertex PM focusing on DevEx) and Ivan Nardini (x: ivnardini, DevRel). We heard you and let us answer your questions directly as possible.
First of all, thank you for your sentiment for our latest 2.5 Gemini model. We are so glad that you find the models useful! We really appreciate this thread and everyone for the feedback on Gemini/Vertex
We read through all your comments. And YES, – clearly, we've got some friction in the DevEx. This stuff is super valuable, helps me to prioritize. Our goal is to listen, gather your insights, offer clarity, and point to potential solutions or workarounds.
I’m going to respond to some of the comments given here directly on the thread
franze
yeah, also grounding with Google in Google 2.5 Pro does not
… deliver any URLs back, just the domains from where it grounded it response
it should return vertexai urls that redirect to the sources, but doesn't do it in all cases (in non of mine) according to the docs
plus you mandatory need to display an HTML fragment with search links that you are not allowed to edit
basically a corporate infight as an API
Havoc
Definitely designed by multiple teams with no coordination.
The very generous free tier is pretty much the only reason I'm using it at all
mattw1810
Their patchy JSON schema support for tool calls & structured generation is also very annoying… things like unions that you’d think are table stakes (and in fact work fine with both OpenAI and Anthropic) get rejected & you have to go reengineer your entire setup to accommodate it.