Doug O’Laughlin is the author of Fabricated Knowledge and has been writing about the interaction between semiconductors and the AI revolution for years. In this interview, we focus on Nvidia — how it rose to prominence, its importance to the large language model revolution, and the corporate and policy implications of its trillion-dollar valuation. Do note we recorded this episode before the latest reporting around possible enhanced chip controls and cloud compute restrictions.
In the conversation below, we cover:
- Nvidia’s origin in the graphics card industry, and CEO Jensen Huang’s creation of a GPU ecosystem, which set up Nvidia to become a dominant player;
- How the rise of transformer models in AI benefited from Nvidia’s compute and software ecosystem, allowing for larger, more scalable models;
- The absence (for now) of foreign and domestic competitors to Nvidia, especially in China;
- What US export controls on Chinese hardware mean for US-China AI competition; and
- The limits and opportunities that accompany China’s potential access to foreign cloud services.
Jordan Schneider: Let’s start with Jensen Huang, who was born in Taiwan, moved to the US in 1967 at the age of four, and later decided he wanted to do computer graphics. Take us from there. Doug, what are the deep origins of Nvidia?
Doug O’Laughlin: Nvidia was truly a fly-by-night thing. They knew they wanted to do graphics cards. There were a few other competitors. The company’s first name had its origin in “NV,” for “next version,” the name they gave to all their files. At some point, they had to incorporate and said, “We’re going to do ‘Nvidia,’” playing on invidia, the Latin word for “envy.”
They were always just focused on the next chips. The first chip they made was the NV1, in 1995. That chip was just a low-end card for the graphics market. This was when the industry was starting to add graphical user interfaces to computers. They partnered with what is now STMicroelectronics to launch that first chip. It was okay. Then they launched their second chip, which was a little better. They skirted bankruptcy.
At this point, they’re duking it out with Silicon Graphics, 3dfx Interactive, and S3 Graphics. There are a lot of other companies, but they don’t matter, because Nvidia and ATI Technologies — which was later acquired by AMD — are the only two companies that made it through this intense period. There were tons of graphics card makers, and Nvidia was the winner of them all.
Around the period of 2000 to 2002, Nvidia becomes the stalwart. It has an amazing series of products and takes a lot of share, usually at the high end of the market.
That is the story of the beginning of Nvidia. It went from a tiny, fledgling, fabless chip company to shipping new products and eventually winning market share. They became dominant and have held that position in gaming ever since.
Jordan Schneider: Nvidia is the king of computer gaming. But that wasn’t enough for the company’s leadership, it seems. How did they take the firm to the next level?
Doug O’Laughlin: Jensen has always been very vocal about accelerated compute. There’s an important shift here, so I want to explain the difference between parallel computing and the rest of computing. x86 is one of the CPU architectures you’re familiar with. A CPU fetches an instruction, does the work, and writes the result back, and it does that very quickly, one step at a time. GPUs, however, are specifically meant for rendering every single pixel on your screen. Each pixel’s color and location is a separate little problem, and all of those problems can be worked on in parallel.
Let’s take 1080p. That’s roughly two million pixels, and each pixel needs to know how to move and how to change. You can’t just do this with a CPU, because it would have to calculate each individual pixel one after another. You need a machine that is extremely wide and parallel so that it can do all these little computations at the same time. That’s how you get a graphical user interface.
Pixels, shaders, triangle calculations — all of that is best done with matrix multiplication, which is important for AI. The type of calculation GPUs were meant for — the highly parallel graphics processing of all those pixels — ends up being almost a perfect match for the primary, heaviest part of AI computing.
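To make that connection concrete, here is a minimal, purely illustrative CUDA sketch (our own example with made-up names, not Nvidia’s production graphics or AI code): a naive matrix-multiply kernel in which each GPU thread computes a single output element, the same one-thread-per-output pattern a GPU uses when it shades every pixel of a frame.

```cuda
// Purely illustrative sketch (not Nvidia's actual graphics or AI code):
// a naive CUDA matrix multiply. Each GPU thread computes one output element,
// the same "one thread per pixel" pattern used when shading a frame.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void matmul(const float* A, const float* B, float* C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // which output row this thread owns
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // which output column
    if (row < N && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < N; ++k)
            acc += A[row * N + k] * B[k * N + col];   // dot product for this one cell
        C[row * N + col] = acc;
    }
}

int main() {
    const int N = 1024;
    size_t bytes = (size_t)N * N * sizeof(float);
    float *A, *B, *C;
    cudaMallocManaged(&A, bytes);   // unified memory keeps the sketch short
    cudaMallocManaged(&B, bytes);
    cudaMallocManaged(&C, bytes);
    for (int i = 0; i < N * N; ++i) { A[i] = 1.0f; B[i] = 2.0f; }

    dim3 threads(16, 16);                          // 256 threads per block
    dim3 blocks((N + 15) / 16, (N + 15) / 16);     // enough blocks to cover the whole matrix
    matmul<<<blocks, threads>>>(A, B, C, N);       // roughly a million tiny jobs run in parallel
    cudaDeviceSynchronize();

    printf("C[0] = %.1f (expect %.1f)\n", C[0], 2.0f * N);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```

The grid and block sizes here are arbitrary choices for the sketch; the point is only that every output element gets its own thread, just as every pixel does in graphics.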
Jordan Schneider: Is it just happenstance that the GPUs that render Tribes II are the same ones the deep learning revolution requires? Or is something more fundamental going on?
Doug O’Laughlin: I would say it’s a mix of both. The type of processor ends up being well-suited for gaming. This market has a need that Nvidia can fulfill in the near term, and it can make money the entire way. But Jensen definitely, clearly had his eye on the ball.
He was talking in the 2010s about accelerated computing — about how all the workloads of the world needed to be sped up. Every year he would talk about it, and everyone was like, “Oh, that’s pretty cool,” but year after year we never really saw it happen. The entire time, though, Jensen was giving away the ecosystem.
Remember, most code is not written to run on a graphics card. It has to be split into small pieces and then fed into the machine in parallel.
There’s something called CUDA, which is Nvidia’s software platform for writing code in that parallel form so you can run it on the GPU.
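To give a rough flavor of what that splitting looks like (a sketch with hypothetical function names, not any particular library’s code), here is the same scaling loop written twice: once serially for a CPU, and once as a CUDA kernel, where the loop disappears and each GPU thread handles a single element.

```cuda
// Illustrative sketch of the serial-to-parallel shift described above.
#include <cuda_runtime.h>

// CPU version: one core walks the array element by element.
// (Shown only for contrast; it is not called below.)
void scale_cpu(float* data, float factor, int n) {
    for (int i = 0; i < n; ++i) data[i] *= factor;
}

// GPU version: the loop disappears; each thread handles exactly one element.
__global__ void scale_gpu(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;                      // about a million elements
    float* data;
    cudaMallocManaged(&data, n * sizeof(float));
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover every element
    scale_gpu<<<blocks, threads>>>(data, 2.0f, n);
    cudaDeviceSynchronize();

    cudaFree(data);
    return 0;
}
```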
He started giving it away for free as much as he could to all the researchers, maybe as early as 2010. He would just give away GPUs and CUDA and make sure all the researchers were working on and using GPUs. That way, they would only know how to solve their problems on GPUs, and they would optimize their physics libraries for GPUs.
Jensen had his eye on the ball and knew he was creating an ecosystem and making his product the one to use. He gives it away for free so everyone knows how to use it. Then everyone uses it in their workflows and optimizes around it.
He does this for about a decade. The whole time, Jensen is looking at these problems and knows that these massive matrix-multiplication problems are the future of big data.
I don’t think that would have been a spicy opinion in the 2010s — that matrix multiplication would be used for very large data sets and hard, complicated problems; that’s not a big leap. But pursuing that path, seeing that vision, and creating the ecosystem around it — giving away a lot of it for free — is how Jensen locked in that ecosystem ten years ago.
Jordan Schneider: How does the revolution in transformers connect to the compute and software ecosystem Nvidia built?
Doug O’Laughlin: Transformers are a specific type of AI model. Each transformer takes a lot of data — say, a phrase or sentence fed to a large language model — and packs all the information each word carries into a transformer cell.
At a small size, transformers don’t perform as well as other neural network architectures do. (Recurrent neural networks and convolutional neural networks are two examples, and there are other types as well.)
These transformers are the tiny, sin