
AGI Is Still 30 Years Away – Ege Erdil and Tamay Besiroglu
Ege Erdil and Tamay Besiroglu have 2045+ timelines, think the whole “alignment” framing is wrong, don’t think an intelligence explosion is plausible, but are convinced we’ll see explosive economic growth.
This discussion offers a totally different scenario than my recent interview with Scott and Daniel.
Ege and Tamay are the co-founders of Mechanize, a startup dedicated to fully automating work. Before founding Mechanize, Ege and Tamay worked on AI forecasts at Epoch AI.
Watch on YouTube; listen on Apple Podcasts or Spotify.
-
WorkOS makes it easy to become enterprise-ready. With simple APIs for essential enterprise features like SSO and SCIM, WorkOS helps companies like Vercel, Plaid, and OpenAI meet the requirements of their biggest customers. To learn more about how they can help you do the same, visit workos.com
-
Scale’s Data Foundry gives major AI labs access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you’re an AI researcher or engineer, learn about how Scale’s Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh
-
Google’s Gemini 2.5 Pro is the model we use the most at Dwarkesh Podcast: it helps us generate transcripts, identify interesting clips, and code up new tools. If you want to try it for yourself, it’s now available in Preview with higher rate limits! Start building with it today at aistudio.google.com.
To sponsor a future episode, visit dwarkesh.com/advertise.
(00:00:00) – AGI will take another 3 decades
(00:22:27) – Even reasoning models lack animal intelligence
(00:45:04) – Intelligence explosion
(01:00:57) – Ege & Tamay’s story
(01:06:24) – Explosive economic growth
(01:33:00) – Will there be a separate AI economy?
(01:47:08) – Can we predictably influence the future?
(02:19:48) – Arms race dynamic
(02:29:48) – Is superintelligence a real thing?
(02:35:45) – Reasons not to expect explosive growth
(02:49:00) – Fully automated firms
(02:54:43) – Will central planning work after AGI?
(02:58:20) – Career advice
Dwarkesh Patel 00:00:00
Today, I’m chatting with Tamay Besiroglu and Ege Erdil. They were previously running Epoch AI and are now launching Mechanize, which is a company dedicated to automating all work. One of the interesting points you made recently, Tamay, is that the whole idea of the intelligence explosion is mistaken or misleading. Why don’t you explain what you’re talking about there?
Tamay Besiroglu 00:00:22
Yeah, I think it’s not a very useful concept. It’s kind of like calling the Industrial Revolution a horsepower explosion. Sure, during the Industrial Revolution, we saw this drastic acceleration in raw physical power, but there are many other things that were maybe equally important in explaining the acceleration of growth and technological change that we saw during the Industrial Revolution.
Dwarkesh Patel 00:00:42
What is a way to characterize the broader set of things that the horsepower perspective would miss about the Industrial Revolution?
Tamay Besiroglu 00:00:50
So I think in the case of the Industrial Revolution, it was a bunch of these complementary changes to many different sectors in the economy. So you had agriculture, you had transportation, you had law and finance, you had urbanization and moving from rural areas into cities. There were just many different innovations that happened simultaneously that gave rise to this change in the way of economically organizing our society.
It wasn’t just that we had more horsepower. I mean, that was part of it, but that’s not the kind of central thing to focus on when thinking about the Industrial Revolution. And I think similarly, for the development of AI, sure, we’ll get a lot of very smart AI systems, but that will be one part among very many different moving parts that explain why we expect to get this transition and this acceleration and growth and technological change.
Dwarkesh Patel 00:01:46
I want to better understand how you think about that broader transformation. Before we do, the other really interesting part of your worldview is that you have longer timelines to get to AGI than most of the people in San Francisco who think about AI. When do you expect a drop-in remote worker replacement?
Ege Erdil 00:02:05
Maybe for me, that would be around 2045.
Dwarkesh Patel 00:02:10
Wow. Wait, and you?
Tamay Besiroglu 00:02:11
Again, I’m a little bit more bullish. I mean, it depends what you mean by “drop-in remote worker” and whether it’s able to do literally everything that can be done remotely, or do most things.
Ege Erdil 00:02:21
I’m saying literally everything.
Tamay Besiroglu 00:02:22
For literally everything. Just shade Ege’s predictions by five years or by 20% or something.
Dwarkesh Patel 00:02:27
Why? Because we’ve seen so much progress over even the last few years. We’ve gone from ChatGPT two years ago to now having models that can literally do reasoning and are better coders than me, and I studied software engineering in college. I mean, I did become a podcaster; I’m not saying I’m the best coder in the world.
But if you made this much progress in the last two years, why would it take another 30 to get to full automation of remote work?
Ege Erdil 00:03:01
So I think that a lot of people have this intuition that progress has been very fast. They look at the trend lines and just extrapolate; obviously, it’s going to happen in, I don’t know, 2027 or 2030 or whatever. They’re just very bullish. And obviously, that’s not a thing you can literally do.
There isn’t a trend you can literally extrapolate of “when do we get to full automation?”. Because if you look at the fraction of the economy that is actually automated by AI, it’s very small. So if you just extrapolate that trend, which is something, say, Robin Hanson likes to do, you’re going to say, “well, it’s going to take centuries” or something.
Now, we don’t agree with that view. But I think one way of thinking about this is: how many big things are there? How many core capabilities, competences are there that the AI systems need to be good at in order to have this very broad economic impact, maybe a 10x acceleration in growth or something? How many of those things have you gotten over the past 10 or 15 years? And we also have this compute-centric view…
Tamay Besiroglu 00:04:05
So just to double click on that, I think what Ege is referring to is, if you look at the past 10 years of AI progress, we’ve gone through about nine or 10 orders of magnitude of compute, and we got various capabilities that were unlocked. So in the early period, people were solving gameplay on specific games, on very complex games. And that happened from 2015 to 2020, Go and Chess and Dota and other games. And then you had maybe sophisticated language capabilities that were unlocked with these large language models, and maybe advanced abstract reasoning and coding and maybe math. That was maybe another big capability that got unlocked.
And so maybe there are a couple of these big unlocks that happened over the past 10 years, but that happened on the order of once every three years or so, or maybe one every three orders of magnitude of compute scaling. And then you might ask the question, “how many more such competencies might we need to unlock in order to be able to have an AI system that can match the capabilities of humans across the board?” Maybe specifically just on remote work tasks. And so then you might ask, well, maybe you need kind of coherence over very long horizons, or you need agency and autonomy, or maybe you need full multimodal understanding, just like a human would.
And then you ask the question, “okay, how long might that take?” And so you can think about, well, just in terms of calendar years, the previous unlocks took about, you get one every three years or so. But of course, that previous period coincided with this rapid scale-up of the amount of compute that we use for training. So we went through maybe 9 or 10 orders of magnitude since AlexNet compared to the biggest models we have today. And we’re getting to a level where it’s becoming harder and harder to scale up compute. And we’ve done some extrapolations and some analysis looking at specific constraints, like energy or GPU production.
And based on that, it looks like we might have maybe three or four orders of magnitude of scaling left. And then you’re really spending a pretty sizable fraction or a non-trivial fraction of world output on just building up data centers, energy infrastructure, fabs, and so on.
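To make the arithmetic behind this argument concrete, here is a minimal back-of-envelope sketch. The roughly 9-10 orders of magnitude since AlexNet, the handful of big unlocks, and the 3-4 OOMs of headroom are figures from the conversation; the number of remaining unlocks is an illustrative assumption, not something Ege or Tamay commit to.

```python
# Back-of-envelope sketch of the "capability unlocks per order of magnitude" argument.
# Figures marked (transcript) come from the conversation; the rest are illustrative assumptions.

ooms_scaled_so_far = 9.5   # (transcript) ~9-10 OOMs of training compute since AlexNet
unlocks_so_far = 3         # (transcript) e.g. game-playing, language, abstract reasoning/math
ooms_per_unlock = ooms_scaled_so_far / unlocks_so_far  # ~3 OOMs per big capability unlock

remaining_unlocks = 3      # assumption: e.g. long-horizon agency, multimodality, coherence
ooms_needed = remaining_unlocks * ooms_per_unlock      # ~9-10 more OOMs at the historical rate

ooms_feasible = 3.5        # (transcript) ~3-4 OOMs left before energy/fab/GDP constraints bind
shortfall = ooms_needed - ooms_feasible

print(f"OOMs per unlock so far: {ooms_per_unlock:.1f}")
print(f"OOMs implied by remaining unlocks: {ooms_needed:.1f}")
print(f"OOMs of further scaling that look feasible: {ooms_feasible}")
print(f"Shortfall to be covered by time and algorithmic progress: {shortfall:.1f} OOMs")
```

On these assumptions, the remaining unlocks would call for more scaling than the economy can plausibly supply on the historical schedule, which is the gap that pushes Ege and Tamay’s timelines out toward the 2040s.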
Dwarkesh Patel 00:06:40
Which is already like 2% of GDP, right?
Tamay Besiroglu 00:06:42
I mean, currently it’s less than 2%.
Ege Erdil 00:06:44
Yeah, but also, currently most of it is actually not going towards AI chips. Even most TSMC capacity is currently going towards mobile phone chips or something like that, right?
Dwarkesh Patel 00:06:52
Even leading edge. It’s like 5% of leading edge.
Tamay Besiroglu 00:06:55
Yeah, even leading edge is pretty small. But yeah, so that suggests that we might need a lot more compute scaling to get these additional capabilities to be unlocked. And then there’s a question of do we really have that in us as an economy to be able to sustain that scaling?
Dwarkesh Patel 00:07:14
But it seems like you have this intuition that there’s just a lot left to intelligence. When you play with these models, they’re almost there. You forget you’re often talking to an AI.
Ege Erdil 00:07:26
What do you mean they’re almost there? I don’t know. I can’t ask Claude to pick up this cup and put it over there.
Dwarkesh Patel 00:07:31
Remote work, you know?
Ege Erdil 00:07:32
Okay. But even for remote work, I can’t ask Claude to… I think the current computer use systems can’t even book a flight properly.
Dwarkesh Patel 00:07:38
How much of an update would it be if by the end of 2026, they could book a flight?
Ege Erdil 00:07:43
I probably think by the end of this year, they’re going to be able to do that. But that’s a very simple… Nobody gets a job where they’re paid to book flights. That’s not a task.
Dwarkesh Patel 00:07:54
I think some people do.
Tamay Besiroglu 00:07:56
If it’s literally just a book flight job, and without-
Ege Erdil 00:08:00
But I think that’s an important point, because a lot of people look at jobs in the economy, and then they’re like, “oh, that person, their job is to just do X”. But then that’s not true. That’s something they do in their job. But if you look at the fraction of their time on the job that they spend on doing that, it’s a very small fraction of what they actually do. It’s just this popular conception people have. Or travel agents, they just book hotels and flights. But that’s not actually most of their job. So automating that actually wouldn’t automate their job, and it wouldn’t have that much of an impact on the economy.
So I think this is actually an important worldview difference that separates us from people who are much more bullish: they think jobs in the economy are much simpler in some sense, and that it will take far fewer competences to actually fully automate them.
Dwarkesh Patel 00:08:47
So our friend Leopold has this perspective of, quote unquote, ‘unhobblings’. The way to characterize it might be that they’re basically baby AGIs already, and we’re holding back what is fundamentally a little intelligence from being as productive as it could be through the constraints we artificially impose on them: for example, only training them on text and not giving them the training data necessary to understand a Slack environment or a Gmail environment; previously, before inference-time scaling, not giving them the chance to meditate upon what they’re saying and really think it through; and not giving them the context about what is actually involved in a job, only this piecemeal couple of minutes’ worth of context in the prompt. The implication is that unhobblings just seem easier to solve for than entirely new capabilities of intelligence. What do you make of that framework?
Tamay Besiroglu 00:09:46
I mean, I guess you could have made similar points five years ago and say “you look at AlphaZero and there’s this mini AGI there, and if only you unhobbled it by training it on text and giving it all your context” and so on, that just wouldn’t really have worked. I think you do really need to rethink how you train these models in order to get these capabilities.
Dwarkesh Patel 00:10:08
But I think the surprising thing over the last few years has been that you can start off with this pre-trained corpus of the internet, and it’s actually quite easy. ChatGPT is an example of this unhobbling, where 1% of additional compute spent on getting it to talk in a chatbot-like fashion with post-training is enough to make it competent, really competent, at that capability.
Reasoning is another example where it seems like the amount of compute that is spent on RL right now in these models is a small fraction of total compute. Again, reasoning seems complicated, and then you just do 1% of compute and it gets you that. Why not think that computer use, or long-term agency on computer use, is a similar thing?
Tamay Besiroglu 00:10:55
So when you say “reasoning is easy” and “it only took this much compute” and “it wasn’t very much”, and maybe “you look at the sheer number of tokens and it wasn’t very much, and so it looks easy”, well, that’s true from our position today. But I think if you ask someone to build a reasoning model in 2015, then it would have looked insurmountable. You would have had to train a model on tens of thousands of GPUs, you would have had to solve that problem, and each order of magnitude of scaling from where they were would pose new challenges that they would need to solve.
You would need to produce internet scale, or tens of trillions of tokens of data in order to actually train a model that has the knowledge that you can then unlock and access by way of training it to be a reasoning model. You need to maybe make the model more efficient at doing inference and maybe distill it, because if it’s very slow then you have a reasoning model that’s not particularly useful, so you also need to make various innovations to get the model to be distilled so that you can train it more quickly, because these rollouts take very long.
It actually becomes a product that’s valuable if it’s a couple tokens a second, as a reasoning model that would have been very difficult to work with. So in some sense, it looks easy from our point of view, standing on this huge stack of technology that we’ve built up over the past five years or so, but at the time, it would have been very hard.
And so my claim would be something like: I think the agency part might be easy in a similar sense, in that in three or five years’ time or whatever, we will look at what unlocked agency and it’ll look fairly simple. But the amount of work, in terms of these complementary innovations that enable the model to learn how to become a competent agent, might have just been very difficult and taken years of innovation and a bunch of improvements in hardware and scaling and various other things.
Dwarkesh Patel 00:12:54
Yeah, I feel like what’s dissimilar between 2015 and now… in 2015 if you were trying to solve reasoning, you just didn’t have a base to start on. Maybe if you tried formal proof methods or something, but there was no leg to stand on, where now you’d actually have the thing- you have the pre-trained base model, you have these techniques of scaffolding, of post-training, of RL. And so it seems like you think that those will look to the future as, say, AlphaGo looks to us now in terms of the basis of a broader intelligence.
I’m curious if you have intuitions on why not think that language models as we have them now are like, we got the big missing piece right and now we’re just like plugging things on top of it?
Ege Erdil 00:13:51
Well, I mean, I guess what is the reason for believing that? I mean, you could have looked at AlphaGo or AlphaGo Zero, AlphaZero, those seemed very impressive at the time. I mean, you’re just learning to play this game with no human knowledge, you’re just learning to play it from scratch. And I think at the time it did impress a lot of people. But then people tried to apply it to math, they tried to apply it to other domains, and it didn’t work very well, they weren’t able to get competent agents at math.
So it’s very possible that these models, at least the way we have them right now, you’re going to try to do the same thing people did for reasoning, but for agency, it’s not going to work very well. And then you’re not going to-
Dwarkesh Patel 00:14:32
I’m sorry, you’re saying by the end of 2026, we will have agentic computer use.
Tamay Besiroglu 00:14:36
I think Ege said you’d be able to book a flight, which is very different from having full agentic computer use.
Dwarkesh Patel 00:14:44
I mean, the other things you need to do on a computer are just made up of things like booking a flight.
Ege Erdil 00:14:49
I mean, sure, but they are not disconnected tasks. That’s like saying everything you do in the world is just like you just move parts of your body, and then you move your mouth and your tongue, and then you roll your head. Yeah, individually those things are simple, but then how do you put them together, right?
Dwarkesh Patel 00:15:09
Yeah. Okay. So there’s two pieces of evidence that you can have that are quite dissimilar.
One, the METR eval, which we’ve been talking about privately, which shows the task length over certain kinds of tasks- I can already see you getting ready. Measured by how long the corresponding task takes a human- 10 minutes, an hour, four hours- these models seem to be doubling the task length they can handle every seven months. The idea being that by 2030, if you extrapolate this curve, they could be doing tasks that take humans one month to do, or one year to do. And this long-term coherency in executing on tasks is fundamentally what intelligence is. So this curve suggests that we’re getting there.
The other piece of evidence- I kind of feel like my own mind works this way. I get distracted easily, and it’s hard to keep a long-term plan in my head at the same time. And I’m slightly better at it than these models. But they don’t seem that dissimilar to me. I would have guessed reasoning is just a really complicated thing, and then it seems like, “oh, it’s just something like learning 10 tokens worth of MCTS” of “wait, let’s go back, let’s think about this another way”.
Chain of thought alone just gets you this boost. And it just seems like intelligence is simpler than we thought. Maybe agency is also simpler in this way.
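As a rough illustration of the extrapolation being described here, a small sketch follows. The seven-month doubling time is the figure quoted above; the starting date and the one-hour starting task horizon are assumptions added for illustration.

```python
from datetime import date, timedelta

# Illustrative extrapolation of the METR-style trend described above:
# the task horizon (in human time) doubles roughly every seven months.
doubling_months = 7
start = date(2025, 1, 1)     # assumption: starting point of the extrapolation
horizon_hours = 1.0          # assumption: ~1 hour of human task length at the start

month = timedelta(days=30.44)  # average month length
for years_out in (1, 2, 3, 4, 5):
    months_elapsed = 12 * years_out
    projected = horizon_hours * 2 ** (months_elapsed / doubling_months)
    when = start + months_elapsed * month
    print(f"{when.year}: ~{projected:,.0f} human-hours per task")
```

Under these assumptions, the projected horizon by 2030 works out to a few hundred human-hours per task, on the order of one to two months of full-time work, which is roughly the claim being made above.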
Ege Erdil 00:16:39
Yeah. I mean, I think there’s a reason to expect complex reasoning to not be as difficult as people might have thought, even in advance, because a lot of the tasks that AI solved very early on were tasks of various kinds of complex reasoning. So it wasn’t the kind of reasoning that goes into when a human solves a math problem.
But if you look at the major AI milestones over, I don’t know, since 1950, a lot of them are for complex reasoning. Like chess is, you can say, a complex reasoning task. Go is, you could say, a complex reasoning task.
Dwarkesh Patel 00:17:14
But I think there are also examples of long-term agency. Like winning at StarCraft is an example of being agentic over a meaningful period of time.
Ege Erdil 00:17:24
That’s right. So the problem in that case is that it’s a very specific, narrow environment. You can say that playing Go or playing chess, that also requires a certain amount of agency. And that’s true. But it’s a very narrow task. So that’s like saying if you construct a software system that is able to react to a very specific, very particular kind of image, or very specific video feeds or whatever, then you’re getting close to general sensorimotor skill automation.
But the general skill is something that’s very different. And I think we’re seeing that. We still are very far, it seems like, from an AI model that can take a generic game off Steam. Let’s say you just download a game released this year. You don’t know how to play this game. And then you just have to play it. And then most games are actually not that difficult for a human.
Dwarkesh Patel 00:18:21
I mean, what about Claude Plays Pokemon? I don’t think it was trained on Pokemon.
Ege Erdil 00:18:25
Right, so that’s an interesting example. First of all, I find the example very interesting, because yeah, it was not trained explicitly. They didn’t do some RL on playing Pokemon Red. But obviously, the model knows how it’s supposed to play Pokemon Red, because there’s tons of material about Pokemon Red on the internet.
In fact, if you were playing Pokemon Red, and you got stuck somewhere, you didn’t know what to do, you could probably go to Claude and ask “I’m stuck in Mount Moon, and what am I supposed to do?” And then it’s probably able to give you a fairly decent answer. But that doesn’t stop it from getting stuck in Mount Moon for 48 hours. So that’s a very interesting thing, where it has explicit knowledge, but then when it’s actually playing the game, it doesn’t behave in a way which reflects that it has that knowledge.
Dwarkesh Patel 00:19:09
All it’s got to do is plug the explicit knowledge into its actions.
Ege Erdil 00:19:13
Yeah, but is that easy?
Dwarkesh Patel 00:19:15
Okay, if you can leverage your knowledge from pre-training about these games in order to be somewhat competent at them, okay, they’re going to be leveraging a different base of skills. But with that same leverage, they’re going to have a similar repertoire of abilities. If you’ve read everything about whatever skill that every human has ever seen.
Ege Erdil 00:19:43
A lot of the skills that people have, they don’t have very good training data for them.
Dwarkesh Patel 00:19:48
That’s right. What would you want to see over the next few years that would make you think, “oh, no, I’m actually wrong and this was the last unlock, and it was now just a matter of ironing out the kinks”. And then we get the thing that will kick off the, dare I say, intelligence explosion.
Tamay Besiroglu 00:20:04
I think something that would reveal its ability to do very long-context things, use multimodal capabilities in a meaningful way, and integrate that with reasoning and other types of systems. And also agency: being able to take action over a long horizon and accomplish tasks that take very long for humans to do, not just in specific software environments, but very broadly. Say, downloading an arbitrary game from Steam, something that it’s never seen before and doesn’t really have much training data for, maybe a game released after its training cutoff so there are no tutorials and no earlier versions of the game that have been discussed on the Internet, and then actually playing that game to the end and accomplishing the various milestones that are challenging for humans. That would be a substantial update. I mean, there are other things that would update me, too, like OpenAI making a lot more revenue than it’s currently doing.
Dwarkesh Patel 00:21:11
Is the hundred billion in revenue that would, according to their contract, mark them as AGI enough?
Tamay Besiroglu 00:21:15
I think that’s not a huge update to me if that were to happen. So I think the update would come if it was, in fact, $500 billion in revenue or something like that. But then I would certainly update quite a lot. But a hundred billion, that seems pretty kind of likely to me. I would assign that maybe a 40 percent chance or something.
Dwarkesh Patel 00:21:37
If you’ve got a system that is, in producer surplus terms, worth a hundred billion. And the difference between this and AlphaZero is AlphaZero is never going to make a hundred billion dollars in the marketplace. So just what is intelligence? It’s like something able to usefully accomplish its goals, or your goals. If people are willing to pay a hundred billion dollars for it, that’s pretty good evidence that it’s like accomplishing some goals.
Tamay Besiroglu 00:22:05
I mean, people pay a hundred billion dollars for all sorts of things. That itself is not a very strong piece of evidence that it’s going to be transformative, I think.
Ege Erdil 00:22:13
People pay trillions of dollars for oil. I don’t know, it seems like a very basic point. But the fact that people pay a lot of money for something doesn’t mean it’s going to transform the world economy if only we manage to unhobble it. Like that’s a very different claim.
Dwarkesh Patel 00:22:27
So then this brings us to the intelligence explosion, because what people will say is, we don’t need to automate literally everything that is needed for automating remote work, let alone all human labor in general. We just need to automate the things which are necessary to fully close the R&D cycle needed to make smarter intelligences.
And if you do this, you get a very rapid intelligence explosion. And the end product of that explosion is not only an AGI, but something that is superhuman potentially. These things are extremely good at coding, and reasoning. It seems like the kinds of things that would be necessary to automate R&D at AI labs. What do you make of that logic?
Ege Erdil 00:24:14
I think if you look at their capability profile, if you compare it to a random job in the economy, I agree they are better at doing coding tasks that will be involved in R&D compared to a random job in the economy. But in absolute terms, I don’t think they’re that good. I think they are good at things that maybe impress us about human coders. If you wanted to see what makes a person a really impressive coder, you might look at their competitive programming performance. In fact, companies often hire people, if they’re relatively junior, based on their performance on these kinds of problems. But that is just impressive in the human distribution.
So if you look in absolute terms at what are the skills you need to actually automate the process of being a researcher, then what fraction of those skills do the AI systems actually have? Even in coding, a lot of coding is, you have a very large code base you have to work with, and the instructions are very vague. For example, you mentioned the METR eval, in which, because they needed to make it an eval, all the tasks have to be compact and closed and have clear evaluation metrics: “here’s a model, get its loss on this data set as low as possible”. Or “here’s another model and its embedding matrix has been scrambled, just fix it to recover most of its original performance”, etc.
Those are not problems that you actually work on in AI R&D. They’re very artificial problems. Now, if a human was good at doing those problems, you would infer, I think logically, that that human is likely to actually be a good researcher. But if an AI is able to do them, the AI lacks so many other competences that a human would have- not just the researcher, just an ordinary human- that we don’t think about in the process of research. So our view would be that automating research is, first of all, more difficult than people give it credit for. I think you need more skills to do it and definitely more than models are displaying right now.
And on top of that, even if you did automate the process of research, we think a lot of the software progress has been driven not by cognitive effort- that has played a part- but by compute scaling. We just have more GPUs: you can do more experiments to figure out more things, and your experiments can be done at larger scales. And that is just a very important driver. If, 10 or 15 years ago, you were trying to figure out which software innovations were going to be important in 10 or 15 years, you would have had a very difficult time. In fact, you probably wouldn’t even have conceived of the right kind of innovations to be looking at, because you would be so far removed from the context of that later time, with its much more abundant compute and all the things that people would have learned by that point.
So these are two components of our view: Research is harder than people think, and depends a lot on compute scale.
Dwarkesh Patel 00:27:17
Can you put a finer point on what is an example of the kind of task which is very dissimilar from ‘train a classifier’ or ‘debug a classifier’ that is relevant to AI R&D?
Tamay Besiroglu 00:27:30
Examples might be introducing novel innovations that are very useful for unlocking innovations in the future. So that might be introducing some novel way of thinking about a problem. A good example might be in mathematics, where we have these reasoning models that are extremely good at solving math problems.
Ege Erdil 00:27:57
Very short horizon.
Tamay Besiroglu 00:28:00
Sure. Maybe not extremely good, but certainly better than I can and better than maybe most undergrads can. And so they can do that very well, but they’re not very good at coming up with novel conceptual schemes that are useful for making progress in mathematics. So it’s able to solve these problems that you can kind of neatly excise out of some very messy context, and it’s able to make a lot of progress there.
But within some much messier context, it’s not very good at figuring out what directions are especially useful for you to build things or make incremental progress on that enables you to have a big kind of innovation later down the line. So thinking about both this larger context, as well as maybe much longer horizon, much fuzzier things that you’re optimizing for, I think it’s much worse at those types of things.
Ege Erdil 00:28:54
Right. So I think one interesting thing is if you just look at these reasoning models, they know so much, especially the larger ones, because they know in literal terms more than any human does in some sense. And we have unlocked these reasoning capabilities on top of that knowledge, and I think that is actually what’s enabling them to solve a lot of these problems. But if you actually look at the way they approach problems, the reason what they do looks impressive to us is because we have so much less knowledge.
And the model is approaching the problems in a fundamentally different way compared to how a human would. A human would have much more limited knowledge, and they would usually have to be much more creative in solving problems because they have this lack of knowledge, while the model knows so much. But you’d ask it some obscure math question where you need some specific theorem from 1850 or something, and then it would just know that, if it’s a large model. So that makes the difficulty profile very different.
And if you look at the way they approach problems, the reasoning models, they are usually not creative. They are very effectively able to leverage the knowledge they have, which is extremely vast. And that makes them very effective in a bunch of ways. But you might ask the question, has a reasoning model ever come up with a math concept that even seems slightly interesting to a human mathematician? And I’ve never seen that.
Dwarkesh Patel 00:30:19
I mean, they’ve been around for all of six months,
Tamay Besiroglu 00:30:23
I mean, that’s a long time. One mathematician might have been able to do a bunch of work over that time, and they have produced orders of magnitude fewer tokens on math.
Ege Erdil 00:30:34
And then I just want to emphasize it, because just think about the sheer scale of knowledge that these models have. It’s enormous from a human point of view. So it is actually quite remarkable that there is no interesting recombination, no interesting, “oh, this thing in this field looks kind of like this thing in this other field”. There’s no innovation that comes out of that. And it doesn’t have to be a big math concept, it could be just a small thing that maybe you could add to a Sunday magazine on math that people used to have. But there isn’t even an example of that.
Tamay Besiroglu 00:31:09
I think it’s useful for us to explain a very important framework for our thinking about what AI is good at and what AI is lagging in, which is this idea of Moravec’s paradox, that things that seem very hard for humans, AI systems tend to make much faster progress on, whereas things that look a bunch easier for us, AI systems totally struggle or are often totally incapable of doing that thing. And so this kind of abstract reasoning, playing chess, playing Go, playing Jeopardy, doing kind of advanced math and solving math problems.
Ege Erdil 00:31:49
There are even stronger examples, like multiplying 100 digit numbers in your head, which is just the one that got solved first out of almost any other problem. Or following very complex symbolic logic arguments, like deduction arguments, which people actually struggle with a lot. Like how do premises logically follow from conclusions? People have a very hard time with that. Very easy for formal proof systems.
Tamay Besiroglu 00:32:12
An insight that is related and is quite important here is that the tasks that humans seem to struggle on and AI systems seem to make much faster progress on are things that have emerged fairly recently in evolutionary time. So, advanced language use emerged in humans maybe 100,000 years ago, and certainly playing chess and Go and so on are very recent innovations. And so evolution has had much less time to optimize for them, in part because they’re very new, but also in part because when they emerged, there was a lot less pressure because it conferred kind of small fitness gains to humans and so evolution didn’t optimize for these things very strongly.
And so it’s not surprising that on these specific tasks that humans find very impressive when other humans are able to do it, that AI systems are able to make a lot of fast progress. In humans, these things are often very strongly correlated with other competencies, like being good at achieving your goals, or being a good coder is often very strongly correlated with solving coding problems, or being a good engineer is often correlated with solving competitive coding problems.
But in AI systems, the correlation isn’t quite as strong. And even within AI systems, it’s the case that the strongest systems on competitive programming are not even the ones that are best at actually helping you code. So o3-mini-high seems to be maybe the best at solving competitive coding problems, but it isn’t the best at actually helping you write code.
Ege Erdil 00:33:54
And it isn’t getting most of the enterprise revenue from places like Cursor or whatever, that’s just Claude, right?
Tamay Besiroglu 00:33:59
But an important insight here is that the things that we find very impressive when humans are able to do it, we should expect that AI systems are able to make a lot more progress on that. But we shouldn’t update too strongly about just their general competence or something, because we should recognize that this is a very narrow subset of relevant tasks that humans do in order to be a competent, economically valuable agent.
Dwarkesh Patel 00:34:26
Yeah. First of all, I actually just really appreciate that there is an AI organization out there like this. Because there are other people who take the compute perspective seriously, or try to think empirically about scaling laws and data and whatever, and taking that perspective seriously leads them to just be like, “okay, 2027 AGI”, which might be correct. But it is just interesting to hear, “no, we’ve also looked at the exact same arguments, the same papers, the same numbers. And we’ve come to a totally different conclusion”.
So I asked Dario this exact question two years ago, when I interviewed him, and it went viral.
Ege Erdil 00:35:11
Didn’t he say AGI in two years?
Dwarkesh Patel 00:35:13
That, but Dario’s always had short timelines.
Ege Erdil 00:35:15
Okay, but we are two years later.
Dwarkesh Patel 00:35:18
Did he say two years? I think he actually did say two years.
Ege Erdil 00:35:20
Did he say three years?
Tamay Besiroglu 00:35:21
So we have one more year.
Dwarkesh Patel 00:35:22
One more year.
Tamay Besiroglu 00:35:23
Better work hard.
Dwarkesh Patel 00:35:27
But he’s, I mean, I think he’s like, he in particular has not been that well calibrated. In 2018, he had like…
Tamay Besiroglu 00:35:33
I remember talking to a very senior person who’s now at Anthropic, in 2017. And then he told various people that they shouldn’t do a PhD because by the time they completed it, everyone would be automated.
Dwarkesh Patel 00:35:49
So anyways, I asked him this exact same question because he has short timelines, which is that if a human knew the amount of things these models know, they would be finding all these different connections. And in fact, I was asking Scott about this the other day when I interviewed him, Scott Alexander, and he said, “look, humans also don’t have this kind of logical omniscience”.
I’m not saying we’re omniscient, but we have examples of humans finding these kinds of connections. This is not an uncommon thing, right? I think his response was that these things are just not trained in order to find these kinds of connections, but their view is that it would not take that much extra compute to build some RL environment in which they’re incentivized to find these connections. Next-token prediction just isn’t incentivizing them to do this, but the RL required to do this would not be that much- that, or set up some sort of scaffolds. I think Google DeepMind actually did do a similar scaffold to make new discoveries. I didn’t look into how impressive the new discovery was, but they claim that some new discovery was made by an LLM as a result.
On the Moravec paradox thing, this is actually a super interesting way to think about AI progress. But I would also say that if you compare animals to humans, long term intelligent planning… an animal is not gonna help you book a flight either. An animal is not gonna do remote work for you.
I think what separates humans from other animals is that we can hold a long-term plan: we can come up with a plan and execute on it. Whereas other animals often have to go by instinct, or operate within the kinds of environments that they have evolutionary knowledge of, rather than, “I’m put in the middle of the savanna, or the desert, or the tundra, and I’ll learn how to make use of the tools and whatever is there”. I actually think there’s a huge discontinuity between humans and animals in their ability to survive in different environments just based on their knowledge. And so it’s a recently optimized thing as well. And then I’d be like, “okay, well, we’ll get it soon. AIs will optimize for it fast”.
Ege Erdil 00:37:50
Right. So I would say if you’re comparing animals to humans, it’s kind of a different thing.
I think if you could put the competences that animals have into AI systems, that might already get you to AGI. I think the reason why there is such a big discontinuity between animals and humans is that animals have to rely entirely on natural-world data, basically, to train themselves. Imagine that, as a human, nobody talked to you, you didn’t read anything, and you just had to learn by experience, maybe to some extent by imitating other people, but with no explicit communication. It would be very inefficient.
What’s actually happening- I think some other people have made this point as well- is that evolution is sort of this outer optimizer that’s improving the software efficiency of the brain in a bunch of ways. There’s some genetic knowledge that you inherit, though not that much, because there isn’t that much space in the genome. And then you have lifetime learning, during which you don’t actually see that much data, and a lot of it is redundant and so on.
So what seems to have changed with humans compared to other animals is that humans became able to have culture, and they have language, which enables a much more efficient training data modality compared to animals. They also have, I think, stronger tendencies to imitate other humans and learn their skills, so that also enables this knowledge to be passed on. I think animals are pretty bad at that compared to humans. So basically, as a human, you’re being trained on much more efficient data, and that creates further pressure to become efficient at learning from it, and then that creates this feedback loop where the selection pressure gets much more intense.
So I think that’s roughly what happened with humans. But a lot of the capabilities that you need to be a good worker in the human economy, animals already have. They have quite sophisticated sensorimotor skills. I think they are actually able to pursue long-term goals.
Dwarkesh Patel 00:40:03
But ones that have been instilled by evolution. I think a lion will find a gazelle and that is a complicated thing to do and requires stalking and blah, blah, blah-
Ege Erdil 00:40:12
But when you say it’s been instilled by evolution, there isn’t that much information in the genome.
Dwarkesh Patel 00:40:16
But I think if you put the lion in the Sahara and you’re like, “go find lizards instead”.
Ege Erdil 00:40:22
Okay. So suppose you put a human and they haven’t seen the relevant training data.
Dwarkesh Patel 00:40:27
I think they’d be slightly better.
Ege Erdil 00:40:29
Slightly better, but not that much better. Again, didn’t you recently have an interview?
Dwarkesh Patel 00:40:36
Joseph Henrich.
Ege Erdil 00:40:37
Yeah. So he would probably tell you that.
Dwarkesh Patel 00:40:40
I think what you’re making is actually a very interesting and subtle point that has an interesting implication. So often people say that ASI will be this huge discontinuity, because while we have this huge discontinuity in the animal-to-human transition, not that much changed between pre-human primates and humans genetically, but it resulted in this humongous change in capabilities. And so they say, “well, why not expect something similar between human level intelligence and superhuman intelligence?”
And one implication of the point you’re making is that actually it wasn’t that we just gained this incredible intelligence. Because of biological constraints, animals have been held back in this really weird way that no AI system is: they’re arbitrarily held back by not being able to communicate with other copies or with other knowledge sources. And since AIs are not held back artificially in this way, there’s not going to be a point where we take away that hobbling and then they suddenly explode.
Now, actually, I think I would disagree with that. The implication that I made, I would actually disagree with- I’m like a sort of unsteerable chain of thought.
We wrote a blog post together about AI corporations where we discuss that there will actually be a similar unhobbling with future AIs, not in their intelligence, but in their bandwidth and communication and collaboration with other AIs: a change similar in magnitude to the jump from non-human animals to humans in terms of social collaboration, because of AIs’ ability to copy all their knowledge exactly, to merge, and to distill themselves.
Tamay Besiroglu 00:42:28
Maybe before we talk about that, there’s a very important point to make here, which I think underlies some of the disagreement that we have with others about this argument from the transition from non-human animals to humans: it’s this focus on intelligence and reasoning, and the R&D enabled by that intelligence, as being enormously important. If you think you get this very important difference from the transition from non-human primates to humans, then you think that in some sense you get this enormously important unlock from fairly small scaling in, say, brain size or something.
And so then you might think, “well, if we scale training runs beyond the amount of training compute that the human brain uses, which is maybe on the order of 1e24 FLOP or whatever, and which we’ve recently surpassed, then maybe surpassing it just a little bit more enables us to unlock very sophisticated intelligence in the same way that humans have much more sophisticated intelligence compared to non-human primates”. And I think part of our disagreement is that intelligence is kind of important, but just having a lot more intelligence and good reasoning isn’t something that will accelerate technological change and economic growth very substantially.
It isn’t the case that the world today is totally bottlenecked by not having enough good reasoning, that’s not really what’s bottlenecking the world’s ability to grow much more substantially. I think that we might have some disagreement about this particular argument, but I think what’s also really important is just that we have a different view as to how this acceleration happens, that it’s not just having a bunch of really good reasoners that give you this technology that then accelerates things very drastically. Because that alone is not sufficient. You need kind of complementary innovations in other industries. You need the economy as a whole growing and supporting the development of these various technologies. You need the various supply chains to be upgraded. You might need demand for the various products that are being built.
And so we have this view where actually this very broad upgrading of your technology and your economy is important rather than just having very good reasoners and very, very, very good reasoning tokens that gives us this acceleration.
Dwarkesh Patel 00:45:04
All right. So this brings us back to the intelligence explosion. Here’s the argument for the intelligence explosion:
You’re right that certain kinds of things might take longer to come about, but this core loop of software R&D that’s required, if you just look at what kind of progress is needed to make a more general intelligence, you might be right that it needs more experimental compute, but as you guys have documented, we’re just getting a shit-ton more compute every single year for the next few years. So you can imagine an intelligence explosion in the next few years where in 2027, there’ll be like 10x more compute than there is now for AI.
And you’ll have this effect where the AIs that are doing software R&D are finding ways to make running copies of them more efficient, which has two effects. One, you’re increasing the population of AIs who are doing this research, so more of that in parallel can find these different optimizations. And a subtle point that they’d often make here is software R&D in AI is not just Ilya-type coming up with new transformer-like architectures.
To your point, it actually is a lot of- I mean, I’m not an AI researcher, but I assume there’s, from the lowest level libraries to the kernels, to making RL environments, to finding the best optimizer, to… there’s just so much to do, and in parallel you can be doing all these things or finding optimizations across them. And so you have two effects, going back to this. One is, if you look at the original GPT-4 compared to the current GPT-4o, I think it’s, what, how much cheaper is it to run?
Tamay Besiroglu 00:46:57
It’s like, maybe a hundred times for the same capability or something.
Dwarkesh Patel 00:47:03
Right. So they’re finding ways in which to run more copies of them at a hundred X cheaper or something, which means that the population of them is increasing and the higher populations are helping you find more efficiencies.
And not only does that mean you have more researchers, but to the extent that the complementary input is experimental compute, it’s not the compute itself, it’s the experiments.
And the more efficient it is to run a copy or to develop a copy, the more parallel experiments you can run, because now you can do a GPT-4 scale training run for much cheaper than you could do it in 2024 or 2023. And so for that reason, also this software-only singularity sees more researcher copies who can run experiments for cheaper, dot, dot, dot. They initially are maybe handicapped in certain ways that you mentioned, but through this process, they are rapidly becoming much more capable. What is wrong with this logic?
Tamay Besiroglu 00:47:57
So I think the logic seems fine. I think this is like a decent way to think about this problem, but I think that it’s useful to draw on a bunch of work that, say, economists have done for studying the returns to R&D and what happens if you 10X your inputs, the number of researchers, what happens to innovation or the rate of innovation.
And there, they point out these two effects where, as you do more innovation and you get to stand on top of the shoulders of giants and you get the benefit from past discoveries and it makes you as a scientist more productive. But then there’s also kind of diminishing returns, that the low hanging fruit has been picked, and it becomes harder to make progress. And overall, you can summarize those estimates as thinking about the kind of returns to research effort.
And we’ve looked into the returns to research effort in software specifically. And we look at a bunch of domains in traditional software or linear integer solvers or SAT solvers, but also in AI; computer vision and RL and language modeling. And there, if this model is true, that all you need is just cognitive effort, it seems like the estimates are a bit ambiguous about whether this results in this acceleration or whether it results in just merely exponential growth.
And then you might also think about, well, it isn’t just your research effort that you have to scale up to make these innovations, because you might have complementary inputs. So as you mentioned, experiments are the thing that might kind of bottleneck you. And I think there’s a lot of evidence that in fact, these experiments and scaling up hardware, it’s just very important for getting progress in the algorithms and the architecture and so on. So in AI- this is true for software in general- where if you look at progress in software, it often matches very closely the rate of progress we see in hardware. So for traditional software, we see about a 30% roughly increase per year, which kind of basically matches Moore’s law. And in AI, we’ve seen the same until you get to the deep learning era, and then you get this acceleration, which in fact coincides with the acceleration we see in compute scaling, which gives you a hint that actually the compute scaling might have been very important.
Besides this coincidence in rates of progress, other pieces of evidence are the fact that innovation in algorithms and architectures is often concentrated in GPU-rich labs and not in the GPU-poor parts of the world, like academia or maybe smaller research institutes. That also suggests that having a lot of hardware is very important. And if you look at specific innovations that seem very important, the big innovations over the past five years, many of them have some kind of scaling or hardware-related motivation. The transformer itself was about how to harness more parallel compute. Things like FlashAttention were literally about how to implement the attention mechanism more efficiently, or things like the Chinchilla scaling law.
And so many of these big innovations were just about how to harness your compute more effectively. That also tells you that actually the scaling of compute might be very important. And I think there’s just many pieces of evidence that point towards this complementarity picture.
So I would say that even if you assume that experiments are not particularly important, the evidence we have, both from estimates of AI and other software- although the data is not great- suggests that maybe you don’t get this kind of hyperbolic, faster-than-exponential super-growth in the overall algorithmic efficiency of systems.
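A toy way to see what is at stake in these returns-to-research estimates: if software efficiency feeds back into effective research effort, whether you get merely exponential growth or a runaway “explosion” hinges on whether the returns parameter exceeds one. The sketch below is a generic idealized model with illustrative parameter values, not Epoch’s actual estimates.

```python
# Toy model of the "software-only singularity" question: software efficiency A grows with
# research effort, and research effort itself scales with A (cheaper copies -> more researchers).
# Collapsing that into dA/dt = A**r, the returns parameter r decides the regime:
#   r < 1  -> sub-exponential growth
#   r = 1  -> exponential growth
#   r > 1  -> hyperbolic growth (finite-time blow-up), i.e. an "explosion"
# The parameter values below are illustrative assumptions, not estimates from the conversation.

def simulate(r, steps=2000, dt=0.01, a0=1.0):
    a = a0
    for _ in range(steps):
        a += dt * a ** r
        if a > 1e12:          # treat runaway growth as a blow-up
            return float("inf")
    return a

for r in (0.7, 1.0, 1.3):
    result = simulate(r)
    label = "blows up (hyperbolic)" if result == float("inf") else f"reaches ~{result:.3g}"
    print(f"r = {r}: after 20 time units, A {label}")
```

The empirical question being pointed at here is which side of one the estimated returns fall on once compute is treated as a complementary, hard-to-scale input rather than a free byproduct of cognitive effort.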
Dwarkesh Patel 00:51:56
I’m not sure I buy the argument that, because these two things, compute and AI progress, have risen so concomitantly, there is a sort of causal relationship.
So broadly, the industry as a whole has been getting more compute and, as a result, making more progress. But if you look at the top players, there have been multiple examples of a company with much less compute, but a more coherent vision and more concentrated research effort, beating an incumbent with much more compute. So OpenAI initially beating Google DeepMind. And if you remember, there were these emails that were released between Elon and Sam and so forth, like, “we’ve got to start this company because they’ve got this bottleneck on the compute” and “look how much more compute Google DeepMind has”. And then OpenAI made a lot of progress. Similarly now with OpenAI versus Anthropic and so forth. And then I think just generally, your argument is too ‘outside view’. Rather than leaning on this very macroeconomic argument, we actually do know a lot here directly, so I’m like, well, why don’t we just ask the AI researchers?
Tamay Besiroglu 00:53:01
I mean, AI researchers will often kind of overstate the extent to which just cognitive effort and doing research is important for driving these innovations, because that’s often convenient or useful. They will say the insight was derived from some nice idea about statistical mechanics or some nice equation in physics that says that we should do it this way. But often that’s an ad hoc story that they tell to make it a bit more compelling to reviewers.
Dwarkesh Patel 00:53:35
So Daniel Kokotajlo mentioned this survey he did where he asked a bunch of AI researchers, “if you had one thirtieth the amount of compute”- and he did one thirtieth because AIs will be, they suppose, 30 times faster- “if you had one thirtieth the amount of compute, how much would your progress slow down?” And they say, “I’d make a third of the amount of progress I normally do”. So that’s a pretty good substitution effect: you get one thirtieth the compute, and your progress only goes down to a third.
And then I was talking to an AI researcher the other day, one of these cracked people, gets paid tens of millions of dollars a year, probably. And we asked him, how much does the AI models help you in AI research? And he said, “in domains that I’m already quite familiar with, where it’s closer to autocomplete, it saves me four to eight hours a week”. And then he said, “but in domains where I’m actually less familiar, where I need to drive new connections, I need to understand how these different parts relate to each other, and so forth. It saves me close to 24 to 36 hours a week”.
And that’s the current models. And I’m just like, “he didn’t get more compute, but it still saved him a shit-ton more time”. Just draw that forward. That’s a crazy implication, a crazy trend, right?
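One crude way to read the survey figures above: if you assume progress scales as a power law in compute, which is an assumption added here rather than anything claimed in the conversation, the “one thirtieth the compute, one third the progress” answer implies an elasticity of roughly 0.3.

```python
import math

# If progress ~ compute**alpha (a simple power-law assumption, not from the conversation),
# the reported "1/30th the compute -> 1/3rd the progress" survey answer implies:
compute_ratio = 1 / 30    # hypothetical compute cut posed in the survey
progress_ratio = 1 / 3    # researchers' estimated resulting progress

alpha = math.log(progress_ratio) / math.log(compute_ratio)
print(f"Implied elasticity of progress with respect to compute: {alpha:.2f}")  # ~0.32
# Under a Cobb-Douglas reading, the remaining ~0.68 would be attributable to research effort.
```

That low implied elasticity is exactly why the more bullish camp reads the survey as evidence that cognitive effort, not compute, carries most of the weight.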
Ege Erdil 00:54:58
I mean, I’m skeptical of the claims that we have actually seen that much of an acceleration in the process of R&D. These claims seem to me, like they’re not borne out by the actual data I’m seeing. So I’m not sure how much to trust them.
Dwarkesh Patel 00:55:18
I mean, on the general intuition that cognitive effort alone can give you a lot of AI progress, it seems like a big important thing the labs do is this science of deep learning. Scaling laws… I mean, it ultimately netted out in an experiment, but the experiment is motivated by cognitive effort.
Ege Erdil 00:55:36
So for what it’s worth, when you say that A and B are complementary, you’re not saying, just as A can bottleneck you, B can also bottleneck you. So when you say you need compute and experiments and data, but you also need cognitive effort, that doesn’t mean the lab which has the most compute is going to win, right? That’s a very simple point, either one can be the bottleneck.
I mean, if you just have a really dysfunctional culture and you don’t actually prioritize using your compute very well and you just waste it, well then you’re not going to make a lot of progress, right? So it doesn’t contradict the picture that someone with a much better vision, a much better team, much better prioritization can make better use of their compute if someone else was just bottlenecked heavily on that part of the equation. The question here is, once you get these automated AI researchers and you start this software singularity, your software efficiency is going to improve by many orders of magnitude, while your compute stock, at least in the short run, is going to remain fairly fixed. So how many OOMs of improvement can you get before you become bottlenecked by the other part of the equation? And once you actually factor that in, how much progress should you expect?
That’s the kind of question I don’t think people have good answers to. I think it’s hard for people to have good intuitions about this, because people usually don’t run the experiments. You don’t get to see, at a company level or at an industry level, what would have happened if the entire industry had 30 times less compute. Maybe as an individual you have a better idea of what would happen if you had 30 times less compute, but that’s a very local experiment, and you might be benefiting a lot from spillovers from other people who actually have more compute.
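One minimal way to make that bottleneck intuition concrete is a CES production function in which software efficiency and compute are complements: hold compute fixed, scale software efficiency by many OOMs, and watch output saturate. The sketch below is purely illustrative; the CES form, the elasticity, and the parameter values are my assumptions, not anything the speakers specify.

```python
# Illustrative sketch (assumed CES form and parameters, not from the transcript):
# with gross complements (rho < 0), boosting only software efficiency while the
# compute stock stays fixed yields a bounded overall speedup.

def ces_output(software: float, compute: float, rho: float = -1.0, share: float = 0.5) -> float:
    """CES aggregate: (share*software^rho + (1-share)*compute^rho)^(1/rho)."""
    return (share * software**rho + (1 - share) * compute**rho) ** (1 / rho)

compute = 1.0  # compute stock held fixed in the short run
for ooms in range(6):  # software efficiency improves by successive orders of magnitude
    software = 10.0 ** ooms
    print(f"software x{software:>9,.0f} -> output {ces_output(software, compute):.3f}")
# Output climbs from 1.0 toward an asymptote of 2.0: past the first OOM or two,
# further software gains barely help because fixed compute is the bottleneck.
```

If you instead assume an elasticity of substitution above one (rho > 0), the same loop shows no ceiling at all, which is exactly the disagreement over how substitutable cognitive effort and compute really are.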
32 Comments
pelagicAustral
Two more weeks
codingwagie
I just used o3 to design a distributed scheduler that scales to 1M+ schedules a day. It was perfect, and did better than two weeks of thought around the best way to build this.
andrewstuart
LLMs are basically a library that can talk.
That’s not artificial intelligence.
EliRivers
Would we even recognise it if it arrived? We'd recognise human-level intelligence, probably, but that's specialised. What would general intelligence even look like?
cruzcampo
AGI is never gonna happen – it's the tech equivalent of the second coming of Christ, a capitalist version of the religious savior trope.
moralestapia
"Literally who" and "literally who" put out statements while others out there ship out products.
Many such cases.
dicroce
Doesn't even matter. The capabilities of the AI that's out NOW will take a decade or more to digest.
_Algernon_
The new fusion power
fusionadvocate
Can someone throw some light on this Dwarkesh character? He landed a Zucc podcast pretty early on… how connected is he? Is he an industry plant?
dcchambers
And in 30 years it will be another 30 years away.
LLMs are so incredibly useful and powerful but they will NEVER be AGI. I actually wonder if the success of (and subsequent obsession with) LLMs is putting true AGI further out of reach. All that these AI companies see are the $$$. When the biggest "AI Research Labs" like OpenAI shifted to product-izing their LLM offerings I think the writing was on the wall that they don't actually care about finding AGI.
throw7
AGI is here today… go have a kid.
ksec
Is AGI even important? I believe the next 10 to 15 years will be Assisted Intelligence. There are things current LLMs are so poor at that I don't believe a 100x increase in perf/watt is going to make much difference. But it is going to be good enough that there won't be an AI winter, since current AI has already reached escape velocity and actually increases productivity in many areas.
The most intriguing part is whether humanoid factory-worker programming will be made 1,000 to 10,000x more cost-effective with LLMs, effectively ending all human production. I know this is a sensitive topic, but I don't think we are far off. And I often wonder if this is what the current administration has in its sights. (Likely not.)
csours
1. LLM interactions can feel real. Projection and psychological mirroring are very real.
2. I believe that AI researchers will require some level of embodiment to demonstrate:
a. ability to understand the physical world.
b. make changes to the physical world.
c. predict the outcomes of changes in the physical world.
d. learn from the success or failure of those predictions and update their internal model of the external world.
—
I cannot quickly find proposed tests in this discussion.
lo_zamoyski
Thirty years. Just enough time to call it quits and head to Costa Rica.
xnx
I'll take the "under" on 30 years. Demis Hassabis (who has more credibility than whoever these 3 people are combined) says 5-10 years: https://time.com/7277608/demis-hassabis-interview-time100-20…
arkj
“‘AGI is x years away’ is a proposition that is both true and false at the same time. Like all such propositions, it is therefore meaningless.”
antisthenes
You cannot have AGI without a physical manifestation that can generate its own training data based on inputs from the external world, e.g. with sensors, and constantly refine its model.
Pure language or pure image-models are just one aspect of intelligence – just very refined pattern recognition.
You will also probably need some aspect of self-awareness in order for the system to set auxiliary goals and directives related to self-maintenance.
But you don't need AGI in order to have something useful (which I think a lot of readers are confused about). No one is making the argument that you need AGI to bring tons of value.
Zambyte
Related: https://en.wikipedia.org/wiki/AI_effect
kgwxd
Again?
lucisferre
Huh, so it should be ready around the same time as practical fusion reactors then. I'll warm up the car.
shortrounddev2
Hopefully more!
yibg
Might as well be 10 – 1000 years. Reality is no one knows how long it'll take to get to AGI, because:
1) No one knows what exactly makes humans "intelligent" and therefore
2) No one knows what it would take to achieve AGI
Go back through history and AI / AGI has been a couple of decades away for several decades now.
ValveFan6969
I do not like those who try to play God. The future of humanity will not be determined by some tech giant in their ivory tower, no matter how high it may be. This is a battle that goes deeper than ones and zeros. It's a battle for the soul of our society. It's a battle we must win, or face the consequences of a future we cannot even imagine… and that, I fear, is truly terrifying.
sebastiennight
The thing is, AGI is not needed to enable incredible business/societal value, and there is good reason to believe that actual AGI would damage our society and our economy, and, if many experts in the field are to be believed, threaten humanity's survival as well.
So I feel happy that models keep improving, and not worried at all that they're reaching an asymptote.
stared
My pet peeve: talking about AGI without defining it. There’s no consistent, universally accepted definition. Without that, the discussion may be intellectually entertaining—but ultimately moot.
And we run into the motte-and-bailey fallacy: at one moment, AGI refers to something known to be mathematically impossible (e.g., due to the No Free Lunch theorem); the next, it’s something we already have with GPT-4 (which, while clearly not superintelligent, is general enough to approach novel problems beyond simple image classification).
There are two reasonable approaches in such cases. One is to clearly define what we mean by the term. The second (IMHO, much more fruitful) is to taboo your words (https://www.lesswrong.com/posts/WBdvyyHLdxZSAMmoz/taboo-your…)—that is, avoid vague terms like AGI (or even AI!) and instead use something more concrete. For example: “When will it outperform 90% of software engineers at writing code?” or “When will all AI development be in the hands of AI?”
dmwilcox
I've been saying this for a decade already but I guess it is worth saying here. I'm not afraid AI or a hammer is going to become intelligent (or jump up and hit me in the head either).
It is science fiction to think that a system like a computer can behave at all like a brain. Computers are incredibly rigid systems with only the limited variance we permit. "Software" is flexible in comparison to creating dedicated circuits for our computations but is nothing by comparison to our minds.
Ask yourself, why is it so hard to get a cryptographically secure random number? Because computers are pure unadulterated determinism — put the same random seed value in your code and get the same "random numbers" every time in the same order. Computers need to be like this to be good tools.
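(A three-line illustration of that point, using Python's standard random module: re-seed and you get back the identical "random" sequence.)

```python
# Seeded pseudo-randomness is fully deterministic: the same seed
# reproduces the exact same "random" sequence every time.
import random

random.seed(42)
first_run = [random.random() for _ in range(3)]

random.seed(42)
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True
```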
Assuming that AGI is possible in the kinds of computers we know how to build means that we think a mind can be reduced to a probabilistic or deterministic system. And from my brief experience on this planet I don't believe that premise. Your experience may differ and it might be fun to talk about.
In Aristotle's ethics he talks a lot about ergon (purpose) — hammers are different than people, computers are different than people, they have an obvious purpose (because they are tools made with an end in mind). Minds strive — we have desires, wants and needs — even if it is simply to survive or better yet thrive (eudaimonia).
An attempt to create a mind is another thing entirely and not something we know how to start. Rolling dice hasn't gotten anywhere. So I'd wager AGI somewhere in the realm of 30 years to never.
lexarflash8g
Apparently Dwarkesh's podcast is a big hit in SV — it was covered by the Economist just recently. I thought the "All In" podcast was the voice of tech, but their content has been going political with MAGA lately and their episodes are basically shouting matches with their guests.
And for folks who want to read rather than listen to a podcast, why not create an article (they are using Gemini) rather than just posting the whole transcript? Who is going to read a 60 min long transcript?
swframe2
Anthropic's research on how LLMs reason shows that LLMs are quite flawed.
I wonder if we can use an LLM to deeply analyze and fix the flaws.
colesantiago
This "AGI" definition is extremely loose depending on who you talk to. Ask "what does AGI mean to you" and sometimes the answer is:
1. Millions of layoffs across industries due to AI with some form of questionable UBI (not sure if this works)
2. 100BN in profits. (Microsoft / OpenAI definition)
3. Abundance in slopware. (VC's definition)
4. Raise more money to reach AGI / ASI.
5. Any job that a human can do which is economically significant.
6. Safe AI (Researchers definition).
7. All the above that AI could possibly do better.
I am sure there must be an industry-aligned, concrete definition that everyone can agree on, rather than these goalpost-moving definitions.
alecco
Is it just me, or is the signal-to-noise ratio needle-in-a-haystack for all these cheerleader tech podcasts? In general, I really miss the podcast scene from 10 years ago: less polished, but more human, with reasonable content. Not this speculative blabber that seems designed to generate clickbait clips. I don't know what happened a few years ago, but even solid podcasts are practically garbage now.
I used to listen to podcasts daily for at least an hour. Now I'm stuck with uploading blogs and pdfs to Eleven Reader. I tried the Google thing to make a podcast but it's very repetitive and dumb.
ChicagoDave
You can’t put a date on AGI until the required technology is invented and that hasn’t happened yet.
owenthejumper
I "love" how the interviewer keeps conflating intelligence with "Hey OpenAI will make $100b"