
The Revolution at OpenAI by webmaven
Sam Altman doesn’t know where artificial intelligence will lead humanity. But he’s taking us there anyway.


On a Monday morning in April, Sam Altman sat inside OpenAI’s San Francisco headquarters, telling me about a dangerous artificial intelligence that his company had built but would never release. His employees, he later said, often lose sleep worrying about the AIs they might one day release without fully appreciating their dangers. With his heel perched on the edge of his swivel chair, he looked relaxed. The powerful AI that his company had released in November had captured the world’s imagination like nothing in tech’s recent history. There was grousing in some quarters about the things ChatGPT could not yet do well, and in others about the future it may portend, but Altman wasn’t sweating it; this was, for him, a moment of triumph.
In small doses, Altman’s large blue eyes emit a beam of earnest intellectual attention, and he seems to understand that, in large doses, their intensity might unsettle. In this case, he was willing to chance it: He wanted me to know that whatever AI’s ultimate risks turn out to be, he has zero regrets about letting ChatGPT loose into the world. To the contrary, he believes it was a great public service.
“We could have gone off and just built this in our building here for five more years,” he said, “and we would have had something jaw-dropping.” But the public wouldn’t have been able to prepare for the shock waves that followed, an outcome that he finds “deeply unpleasant to imagine.” Altman believes that people need time to reckon with the idea that we may soon share Earth with a powerful new intelligence, before it remakes everything from work to human relationships. ChatGPT was a way of serving notice.
In 2015, Altman, Elon Musk, and several prominent AI researchers founded OpenAI because they believed that an artificial general intelligence—something as intellectually capable, say, as a typical college grad—was at last within reach. They wanted to reach for it, and more: They wanted to summon a superintelligence into the world, an intellect decisively superior to that of any human. And whereas a big tech company might recklessly rush to get there first, for its own ends, they wanted to do it safely, “to benefit humanity as a whole.” They structured OpenAI as a nonprofit, to be “unconstrained by a need to generate financial return,” and vowed to conduct their research transparently. There would be no retreat to a top-secret lab in the New Mexico desert.
For years, the public didn’t hear much about OpenAI. When Altman became CEO in 2019, reportedly after a power struggle with Musk, it was barely a story. OpenAI published papers, including one that same year about a new AI. That got the full attention of the Silicon Valley tech community, but the technology’s potential was not apparent to the general public until last year, when people began to play with ChatGPT.
The engine that now powers ChatGPT is called GPT-4. Altman described it to me as an alien intelligence. Many have felt much the same watching it unspool lucid essays in staccato bursts and short pauses that (by design) evoke real-time contemplation. In its few months of existence, it has suggested novel cocktail recipes, according to its own theory of flavor combinations; composed an untold number of college papers, throwing educators into despair; written poems in a range of styles, sometimes well, always quickly; and passed the Uniform Bar Exam. It makes factual errors, but it will charmingly admit to being wrong. Altman can still remember where he was the first time he saw GPT-4 write complex computer code, an ability for which it was not explicitly designed. “It was like, ‘Here we are,’ ” he said.
Within nine weeks of ChatGPT’s release, it had reached an estimated 100 million monthly users, according to a UBS study, likely making it, at the time, the most rapidly adopted consumer product in history. Its success roused tech’s accelerationist id: Big investors and huge companies in the U.S. and China quickly diverted tens of billions of dollars into R&D modeled on OpenAI’s approach. Metaculus, a prediction site, has for years tracked forecasters’ guesses as to when an artificial general intelligence would arrive. Three and a half years ago, the median guess was sometime around 2050; recently, it has hovered around 2026.
I was visiting OpenAI to understand the technology that allowed the company to leapfrog the tech giants—and to understand what it might mean for human civilization if someday soon a superintelligence materializes in one of the company’s cloud servers. Ever since the computing revolution’s earliest hours, AI has been mythologized as a technology destined to bring about a profound rupture. Our culture has generated an entire imaginarium of AIs that end history in one way or another. Some are godlike beings that wipe away every tear, healing the sick and repairing our relationship with the Earth, before they usher in an eternity of frictionless abundance and beauty. Others reduce all but an elite few of us to gig serfs, or drive us to extinction.
Altman has entertained the most far-out scenarios. “When I was a younger adult,” he said, “I had this fear, anxiety … and, to be honest, 2 percent of excitement mixed in, too, that we were going to create this thing” that “was going to far surpass us,” and “it was going to go off, colonize the universe, and humans were going to be left to the solar system.”
“As a nature reserve?” I asked.
“Exactly,” he said. “And that now strikes me as so naive.”

Across several conversations in the United States and Asia, Altman laid out his new vision of the AI future in his excitable midwestern patter. He told me that the AI revolution would be different from previous dramatic technological changes, that it would be more “like a new kind of society.” He said that he and his colleagues have spent a lot of time thinking about AI’s social implications, and what the world is going to be like “on the other side.”
But the more we talked, the more indistinct that other side seemed. Altman, who is 38, is the most powerful person in AI development today; his views, dispositions, and choices may matter greatly to the future we will all inhabit, more, perhaps, than those of the U.S. president. But by his own admission, that future is uncertain and beset with serious dangers. Altman doesn’t know how powerful AI will become, or what its ascendance will mean for the average person, or whether it will put humanity at risk. I don’t hold that against him, exactly—I don’t think anyone knows where this is all going, except that we’re going there fast, whether or not we should be. Of that, Altman convinced me.

OpenAI’s headquarters are in a four-story former factory in the Mission District, beneath the fog-wreathed Sutro Tower. Enter its lobby from the street, and the first wall you encounter is covered by a mandala, a spiritual representation of the universe, fashioned from circuits, copper wire, and other materials of computation. To the left, a secure door leads into an open-plan maze of handsome blond woods, elegant tile work, and other hallmarks of billionaire chic. Plants are ubiquitous, including hanging ferns and an impressive collection of extra-large bonsai, each the size of a crouched gorilla. The office was packed every day that I was there, and unsurprisingly, I didn’t see anyone who looked older than 50. Apart from a two-story library complete with sliding ladder, the space didn’t look much like a research laboratory, because the thing being built exists only in the cloud, at least for now. It looked more like the world’s most expensive West Elm.
One morning I met with Ilya Sutskever, OpenAI’s chief scientist. Sutskever, who is 37, has the affect of a mystic, sometimes to a fault: Last year he caused a small brouhaha by claiming that today’s large neural networks may be “slightly conscious.” He first made his name as a star student of Geoffrey Hinton, the University of Toronto professor emeritus who resigned from Google this spring so that he could speak more freely about AI’s danger to humanity.
Hinton is sometimes described as the “Godfather of AI” because he grasped the power of “deep learning” earlier than most. In the 1980s, shortly after Hinton completed his Ph.D., the field’s progress had all but come to a halt. Senior researchers were still coding top-down AI systems: AIs would be programmed with an exhaustive set of interlocking rules—about language, or the principles of geology or of medical diagnosis—in the hope that someday this approach would add up to human-level cognition. Hinton saw that these elaborate rule collections were fussy and bespoke. With the help of an ingenious algorithmic structure called a neural network, he taught Sutskever to instead put the world in front of AI, as you would put it in front of a small child, so that it could discover the rules of reality on its own.
Sutskever described a neural network to me as beautiful and brainlike. At one point, he rose from the table where we were sitting, approached a whiteboard, and uncapped a red marker. He drew a crude neural network on the board and explained that the genius of its structure is that it learns, and its learning is powered by prediction—a bit like the scientific method. The neurons sit in layers. An input layer receives a chunk of data, a bit of text or an image, for example. The magic happens in the middle—or “hidden”—layers, which process the chunk of data, so that the output layer can spit out its prediction.
Imagine a neural network that has been programmed to predict the next word in a text. It will be preloaded with a gigantic number of possible words. But before it’s trained, it won’t yet have any experience in distinguishing among them, and so its predictions will be shoddy. If it is fed the sentence “The day after Wednesday is …” its initial output might be “purple.” A neural network learns because its training data include the correct predictions, which means it can grade its own outputs. When it sees the gulf between its answer, “purple,” and the correct answer, “Thursday,” it adjusts the connections among words in its hidden layers accordingly. Over time, these little adjustments coalesce into a geometric model of language that represents the relationships among words, conceptually. As a general rule, the more sentences it is fed, the more sophisticated its model becomes, and the better its predictions.
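For readers who want to see that loop in miniature, here is a rough sketch in Python. It is an illustration under loud assumptions, not OpenAI's method: the three-sentence corpus is invented, the names `train` and `predict` are hypothetical, and the model has no hidden layers at all, just one table of adjustable connections between the words it has seen and the word it guesses next. The grading step is the same in spirit, though: compare the guess with the correct next word, then nudge the connections.

```python
import math

# Toy corpus, invented for illustration (an assumption, not real training data).
CORPUS = [
    "the day after wednesday is thursday",
    "the day after thursday is friday",
    "the day after friday is saturday",
]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_probs(weights, index, context):
    """Score every vocabulary word given the words seen so far."""
    size = len(weights)
    logits = [sum(weights[index[w]][j] for w in context) for j in range(size)]
    return softmax(logits)

def train(sentences, epochs=300, lr=0.5):
    """Learn connection weights by grading each guess against the true next word."""
    vocab = sorted({w for s in sentences for w in s.split()})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    weights = [[0.0] * size for _ in range(size)]  # untrained: no preferences yet
    for _ in range(epochs):
        for sentence in sentences:
            words = sentence.split()
            for i in range(1, len(words)):
                context, target = words[:i], words[i]
                probs = predict_probs(weights, index, context)
                for j in range(size):
                    # Gap between the guess and the correct answer for word j.
                    error = probs[j] - (1.0 if j == index[target] else 0.0)
                    for w in context:
                        weights[index[w]][j] -= lr * error
    return weights, vocab, index

def predict(weights, vocab, index, text):
    """Return the single most probable next word for a prompt."""
    probs = predict_probs(weights, index, text.split())
    return vocab[max(range(len(vocab)), key=probs.__getitem__)]

weights, vocab, index = train(CORPUS)
print(predict(weights, vocab, index, "the day after wednesday is"))
```

Before training, every word in this toy vocabulary is equally likely, so a guess like "purple" would be no worse than any other. The real systems differ in almost every detail, including the hidden layers that this sketch leaves out.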
That’s not to say that the path from the first neural networks to GPT-4’s glimmers of humanlike intelligence was easy. Altman has compared early-stage AI research to teaching a human baby. “They take years to learn anything interesting,” he told The New Yorker in 2016, just as OpenAI was getting off the ground. “If A.I. researchers were developing an algorithm and stumbled across the one for a human baby, they’d get bored watching it, decide it wasn’t working, and shut it down.” The first few years at OpenAI were a slog, in part because no one there knew whether they were training a baby or pursuing a spectacularly expensive dead end.
“Nothing was working, and Google had everything: all the talent, all the people, all the money,” Altman told me. The founders had put up millions of dollars to start the company, and failure seemed like a real possibility. Greg Brockman, the 35-year-old president, told me that in 2017, he was so discouraged that he started lifting weights as a compensatory measure. He wasn’t sure that OpenAI was going to survive the year, he said, and he wanted “to have something to show for my time.”
Neural networks were already doing intelligent things, but it wasn’t clear which of them might lead to general intelligence. Just after OpenAI was founded, an AI called AlphaGo had stunned the world by beating Lee Se-dol at Go, a game substantially more complicated than chess. Lee, the vanquished world champion, described AlphaGo’s moves as “beautiful” and “creative.” Another top player said that they could never have been conceived by a human. OpenAI tried training an AI on Dota 2, a more complicated game still, involving multifront fantastical warfare in a three-dimensional patchwork of forests, fields, and forts. It eventually beat the best human players, but its intelligence never translated to other settings. Sutskever and his colleagues were like disappointed parents who had allowed their kids to play video games for thousands of hours against their better judgment.
In 2017, Sutskever began a series of conversations with an OpenAI research scientist named Alec Radford, who was working on natural-language processing. Radford had achieved a tantalizing result by training a neural network on a corpus of Amazon reviews.
The inner workings of ChatGPT—all of those mysterious things that happen in GPT-4’s hidden layers—are too complex for any human to understand, at least with current tools. Tracking what’s happening across the model—almost certainly composed of billions of neurons—is, today, hopeless. But Radford’s model was simple enough to allow for understanding. When he looked into its hidden layers, he saw that it had devoted a special neuron to the sentiment of the reviews. Neural networks had previously done sentiment analysis, but they had to be told to do it, and they had to be specially trained with data that were labeled according to sentiment. This one had developed the capability on its own.
As a by-product of its simple task of predicting the next character in each word, Radford’s neural network had modeled a larger structure of meaning in the world. Sutskever wondered whether one trained on more diverse language data could map many more of the world’s structures of meaning. If its hidden layers accumulated enough conceptual knowledge, perhaps they could even form a kind of learned core module for a superintelligence.
It’s worth pausing to understand why language is such a special information source. Suppose you are a fresh intelligence that pops into existence here on Earth. Surrounding you is the planet’s atmosphere, the sun and Milky Way, and hundreds of billions of other galaxies, each one sloughing off light waves, sound vibrations, and all manner of other information. Language is different from these data sources. It isn’t a direct physical signal like light or sound. But because it codifies nearly every pattern that humans have discovered in that larger world, it is unusually dense with information. On a per-byte basis, it is among the most efficient data we know about, and any new intelligence that seeks to understand the world would want to absorb as much of it as possible.
Sutskever told Radford to think bigger than Amazon reviews. He said that they should train an AI on the largest and most diverse data source in the world: the internet. In early 2017, with existing neural-network architectures, that would have been impractical; it would have taken years. But in June of that year, Sutskever’s ex-colleagues at Google Brain published a working paper about a new neural-network architecture called the transformer. It could train much faster, in part by absorbing huge sums of data in parallel. “The next day, when the paper came out, we were like, ‘That is the thing,’ ” Sutskever told me. “ ‘It gives us everything we want.’ ”

One year later, in June 2018, OpenAI released GPT, a transformer model trained on more than 7,000 books. GPT didn’t start with a basic book like See Spot Run and work its way up to Proust. It didn’t even read books straight through. It absorbed random chunks of them simultaneously. Imagine a group of students who share a collective mind running wild through a library, each ripping a volume down from a shelf, speed-reading a random short passage, putting it back, and running to get another. They would predict word after word as they went, sharpening their collective mind’s linguistic instincts, until at last, weeks later, they’d taken in every book.
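The library image can be sketched in a few lines of Python. This is a hedged illustration only: the function names `sample_chunks` and `next_word_pairs`, the two tiny "books," and the chunk and batch sizes are all assumptions made for the example, and nothing here reflects how OpenAI actually prepared its data. It shows the general idea of pulling random short passages from random volumes and turning each into word-by-word prediction exercises.

```python
import random

def sample_chunks(books, chunk_len, batch_size, seed=0):
    """Draw `batch_size` random passages of `chunk_len` words from random books."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    batch = []
    for _ in range(batch_size):
        words = rng.choice(books)                       # pull a volume off the shelf
        start = rng.randrange(len(words) - chunk_len + 1)
        batch.append(words[start:start + chunk_len])    # read a short passage
    return batch

def next_word_pairs(chunk):
    """Turn one passage into (words so far, next word) prediction exercises."""
    return [(chunk[:i], chunk[i]) for i in range(1, len(chunk))]

# Two toy "books," already split into words (illustrative text only).
books = [
    "see spot run . run spot run .".split(),
    "for a long time i used to go to bed early .".split(),
]

batch = sample_chunks(books, chunk_len=4, batch_size=3)
for chunk in batch:
    for context, target in next_word_pairs(chunk):
        print(" ".join(context), "->", target)
```

Because each passage is drawn independently, many of them can be processed at once, which is the property the students-in-a-library picture is meant to convey.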
GPT discovered many patterns in all those passages it read. You could tell it to finish a sentence. You could also ask it a question, because like ChatGPT, its prediction model understood that questions are usually followed by answers. Still, it was janky, more proof of concept than harbinger of a superintelligence. Four months later, Google released BERT, a suppler language model that got better press. But by then, OpenAI was already training a new model, GPT-2, on a data set of more than 8 million webpages, each of which had cleared a minimum threshold of upvotes on Reddit, which is not the strictest filter but perhaps better than no filter at all.
Sutskever wasn’t sure how powerful GPT-2 would be after ingesting a body of text that would take a human reader centuries to absorb. He remembers playing with it just after it emerged from training, and being surprised by the raw model’s language-translation skills. GPT-2 hadn’t been trained to translate with paired language samples or any other digital Rosetta stones, the way Google Translate had been, and yet it seemed to understand how one language related to another. The AI had developed an emergent ability unimagined by its creators.

Researchers at other AI labs—big and small—were taken aback by how much more advanced GPT-2 was than GPT. Google, Meta, and others quickly began to train larger language models. Altman, a St. Louis native, Stanford dropout, and serial entrepreneur, had previously led Silicon Valley’s preeminent start-up accelerator, Y Combinator; he’d seen plenty of young companies with a good idea get crushed by incumbents. To raise capital, OpenAI added a for-profit arm, which now comprises more than 99 percent of the organization’s head count. (Musk, who had by then left the company’s board, has compared this move to turning a rainforest-conservation organization into a lumber outfit.) Microsoft invested $1 billion soon after, and has reportedly invested another $12 billion since. OpenAI said that initial investors’ returns would be capped at 100 times the value of the original investment—with any overages going to education or other initiatives intended to benefit humanity—but the company would not confirm Microsoft’s cap.
Altman and OpenAI’s other leaders seemed confident that the restructuring would not interfere with the company’s mission, and indeed would only accelerate its completion. Altman tends to take a rosy view of these matters. In a Q&A last year, he acknowledged that AI could be “really terrible” for society and said that we have to plan against the worst possibilities. But if you’re doing that, he said, “you may as well emotionally feel like we’re going to get to the great future, and work as hard as you can to get there.”
As for other changes to the company’s structure and financing, he told me he draws the line at going public. “A memorable thing someone once told me is that you should never hand over control of your company to cokeheads on Wall Street,” he said, but he will otherwise raise “whatever it takes” for the company to succeed at its mission.
Whether or not OpenAI ever feels the pressure of a quarterly earnings report, the company now finds itself in a race against tech’s largest, most powerful conglomerates to train models of increasing scale and sophistication—and to commercialize them for their investors. Earlier this year, Musk founded an AI lab of his own—xAI—to compete with OpenAI. (“Elon is a super-sharp dude,” Altman said diplomatically when I asked him about the company. “I assume he’ll do a good job there.”) Meanwhile, Amazon is revamping Alexa using much larger language models than it has in the past.
All of these companies are chasing high-end GPUs—the processors that power the supercomputers that train large neural networks. Musk has said that they are now “considerably harder to get than drugs.” Even with GPUs scarce, in recent years the scale of the largest AI training runs has doubled about every six months.
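It is easy to underestimate what a six-month doubling time implies. The short Python sketch below works through the arithmetic; the function name `growth_factor` is hypothetical, and it assumes the doubling pace stays perfectly steady, which real training runs need not do.

```python
def growth_factor(years, doubling_months=6.0):
    """How many times larger something becomes after `years` of steady doubling."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings

for years in (1, 2, 5):
    print(years, "year(s):", growth_factor(years), "times larger")
```

At that pace, scale quadruples each year and grows roughly a thousandfold in five.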
No one has yet outpaced OpenAI, which went all in on GPT-4. Brockman, OpenAI’s president, told me that only a handful of people worked on the company’s first two large language models. The development of GPT-4 involved more than 100, and the AI was trained on a data set of unprecedented size, which included not just text but images too.
When GPT-4 emerged fully formed from its world-historical knowledge binge, the whole company began experimenting wit