
The Future of Compute: Nvidia’s Crown Is Slipping by wilson090
No one has benefitted from the scaling hypothesis quite like NVIDIA. On the back of an AI boom and GPU monopoly, they’ve become the fastest scaling hardware company in history–adding $2T of value in 13 months with SaaS-like margins.
While the H100 generation likely represents peak pricing power (new B200s have lower margins and higher COGS), an immediate lack of alternatives means they’ll continue to print cash.
The open question is long-term (>6yrs) durability. Hyperscalers (Google, Microsoft, Amazon, and Meta) are aggressively consolidating AI demand to become the dominant consumers of AI accelerators, while developing competitive, highly-credible chip efforts.
Simultaneously, the sheer scale of compute needs has hit limits on capex, power availability, and infrastructure development. This is driving an enormous shift towards distributed, vertically-integrated, and co-optimized systems (chips, racks, networking, cooling, infrastructure software, power) that NVIDIA is ill-prepared to supply.
In this paradigm, NVIDIA can lose even with the highest-performing GPUs; the implications will reverberate at every level of the AI stack–from fabs and semiconductors, to infrastructure, clouds, model developers, and the application layer.
Demand Consolidation
NVIDIA's predicament has been driven by hyperscalers' consolidation of AI workloads and accelerator demand–setting the stage for toothier custom silicon and evolving infrastructure requirements.
Already, ~50% of NVIDIA's datacenter demand comes from hyperscalers; the other half comes from a large number of startups, enterprises, VCs, and national consortiums.
That share is set to shrink–the tidal wave of startup spending on GPUs was a transient phenomenon to secure access in a fiercely competitive market. Today, most startups simply don’t have unusual control or infrastructure requirements and are better served by the cloud.
As such, early purchases were ill-fated; this is borne out by low utilization and abysmal ROIs for small, short-term GPU rentals (usually offered by startups that over-provisioned and are now forced to rent out at a loss). This is eerily reminiscent of dot-com-era startups defending costly server hardware as the world moved to the cloud.
More fundamentally, and contrary to early expectations, the model rollout has aggressively consolidated around a few closed-source APIs. Even open-source and edge models are now the domain of the hyperscalers. Custom, small-to-midsize models trained on unique data for specific uses have struggled (e.g. BloombergGPT). Scaled frontier models can be cheaper while performing and generalizing better, especially with effective RAG and widely available fine-tuning. Thus, the value proposition for most companies training proprietary models is unclear. Going forward, demand from this long tail of buyers looks shaky, considerably consolidating NVIDIA's revenue base.
Meanwhile, the smaller independent clouds (CoreWeave, Lambda, Crusoe, RunPod, etc.) have very uncertain futures. NVIDIA propped these businesses up with direct investments and preferential GPU allocations in order to drive fragmentation and reduce its reliance on the hyperscalers. Yet they face long-term headwinds without the product variety, infrastructure, and talent to cross-sell and establish lock-in, forcing them to sell commoditized H100 hours. NVIDIA's production ramp has eroded the scarcity and attractive margins baked into initial assumptions, while undermining the "moat" of favorable allocation. These companies are also extremely leveraged on third-party demand, and have been relying on GPU-secured debt plus heroic fundraising to expand fast enough to reach competitive economies of scale. It's an open question whether this will work, but things look ugly. The effects of lukewarm third-party demand are already visible in the high availability and declining GPU-hour costs at small clouds.
Price cuts have reduced rental rates by 40%+ since last year and show no signs of stopping. This is disastrous for the durability and unit economics of independent clouds. Currently, you can rent GPUs for $1.99/hour. At those prices, providers are getting <10% ROE; if prices dip below ~
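The <10% ROE claim can be sanity-checked with a back-of-envelope model. All inputs below (all-in capex, utilization, power draw and cost, opex fraction, depreciation schedule) are illustrative assumptions, not figures from the article; only the $1.99/hour rental price comes from the text above.

```python
# Back-of-envelope unit economics for one rented-out datacenter GPU.
# Every default below is an illustrative assumption, not sourced data.

HOURS_PER_YEAR = 24 * 365  # 8760

def annual_roe(price_per_hr, capex=30_000.0, utilization=0.60,
               power_kw=1.0, power_cost_kwh=0.10, opex_frac=0.10,
               lifetime_years=5.0):
    """Rough annual return on capital for a single rented GPU.

    capex          -- assumed all-in cost: GPU plus its share of server,
                      networking, and buildout
    utilization    -- assumed fraction of hours actually billed
    power_kw       -- assumed average draw including cooling overhead
    opex_frac      -- assumed staff/colo/misc costs per year, as a
                      fraction of capex
    """
    revenue = price_per_hr * HOURS_PER_YEAR * utilization
    power_cost = power_kw * power_cost_kwh * HOURS_PER_YEAR * utilization
    depreciation = capex / lifetime_years  # straight-line
    other_opex = capex * opex_frac
    profit = revenue - power_cost - depreciation - other_opex
    return profit / capex

print(f"ROE at $1.99/hr: {annual_roe(1.99):.1%}")
print(f"ROE at $1.49/hr: {annual_roe(1.49):.1%}")
```

Under these assumptions, $1.99/hour yields a low single-digit ROE (consistent with the article's "<10%"), and a further price cut pushes the business underwater before any debt service is counted.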
alephnerd
Services! Services services services!
This is what will help protect Nvidia now that DC and cluster spend is cooling.
They own the ecosystem thanks to CUDA, Infiniband, NGC, NVLink, and other key tools. Now they should add additional applications (the AI Foundry is a good way to do that), or forays into adjacent spaces like white-labeled cluster management.
Working on building custom designs and consulting on custom GPU projects would be helpful as well by helping monetize their existing design practice during slower markets.
Of course, Nvidia is starting to do both: Nvidia AI Foundry covers the former, and for the latter it is standing up a GPU architecture and design consulting practice, as announced at GTC under McKinney.
ivape
Interesting, Marvell is actually down over 50% this year. I just don't understand the bear case at all. I'm a nobody and I'm still willing to buy a $1500 gpu, and that GPU still can't do what the cloud does. The next $1500 gpu probably can't either. It feels like we're over thinking this. The hardware roll-out is all there is imho. Jensen has mentioned he sees Nvidia being a 10 trillion-dollar company, and I'm willing to meet him half-way with my faith here.
Edit:
– I wonder what's stopping Nvidia from releasing an AI phone
– An LLM competitor service (Hey, how about you guys make your own chips?)
– They are already releasing an AI PC
– Their own self driving cars
– Their own robots
If you mess with them, why won't they just compete with you?
Just wanted to say one more thing: Warren Buffett famously said he regretted not investing in both Google and Apple. I think something like this is happening again, especially as there are lulls that the mainstream public perceives but enthusiasts don't. To maintain the hyperbole, if you are not a full believer as a developer, then you are simply out of your mind.
ein0p
How can it be "slipping" if they sell out of all their stuff years in advance? I still can't find any sanely priced 5090s. And before you point out that 5090s are not their main revenue driver, they're sold out of H100s and so on years in advance, too.
echelon
> While the H100 generation likely represents peak pricing power (new B200s have lower margins and higher COGS), an immediate lack of alternatives means they’ll continue to print cash.
That's not a trend yet. We're about to enter an era where most media is generated. Demand is only going to go up, and margins may not matter if volume goes up.
> The open question is long-term (>6yrs) durability. Hyperscalers (Google, Microsoft, Amazon, and Meta) are aggressively consolidating AI demand to become the dominant consumers of AI accelerators; while developing competitive, highly-credible chip efforts.
Hyperscalers aren't the only players building large GPU farms. There are large foundation model companies doing it too, and there are also new clouds that offer compute outside of the hyperscaler offerings (CoreWeave, Lambda, and dozens of others). Granted, these may be a drop in the bucket and hyperscalers may still win this trend.
01100011
Seems like another article based on the assumption that Nvidia just sits there doing nothing while everyone who has so far proven unable to compete suddenly figures it out and steals their lunch.
At some point one of these Nvidia doomers will be right but there is a long line of them who failed miserably.
Havoc
They’re basically going from a functional monopoly to having to compete.
Not ideal for them but hardly a death blow
mkoubaa
When takes like these go mainstream (Financial Times, etc) I buy.
lvl155
I am starting to think AMD is doing this on purpose and they have some secret handshake deal with Nvidia. Nvidia has at least two more years of “sellout at any price” market. Not because they have the best solution (which they do atm) but because they basically share the monopoly with Apple at TSMC. And Apple is content wasting that away on iPhones.
cornhole
nvidia's gpu driver qa is definitely slipping
latchkey
Oct 2024
otistravel
The author completely underestimates NVIDIA's strategic position. They don't need to win the hardware game forever – they're building the entire AI stack: hardware, networking, software, models, developer tools. Nobody else is doing this comprehensively. While hyperscalers are making custom chips for their own use cases, NVIDIA is building a unified platform that everyone else will use. This isn't about who makes the best GPU, it's about who builds the ecosystem that becomes the industry standard.