Last week I had the privilege to sit down with Sam Altman and 20 other developers to discuss OpenAI’s APIs and their product plans. Sam was remarkably open. The discussion touched on practical developer issues as well as bigger-picture questions related to OpenAI’s mission and the societal impact of AI. Here are the key takeaways:
1. OpenAI is heavily GPU limited at present
A common theme throughout the discussion was that OpenAI is currently extremely GPU-limited, and this is delaying a lot of their short-term plans. The biggest customer complaint was about the reliability and speed of the API. Sam acknowledged the concern and explained that most of the issue was a result of GPU shortages.
The longer 32k context can’t yet be rolled out to more people. OpenAI hasn’t overcome the O(n^2) scaling of attention, so while it seemed plausible they would have 100k to 1M token context windows soon (this year), anything bigger would require a research breakthrough.
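To make the O(n^2) point concrete, here is a rough back-of-the-envelope sketch in Python. It only counts the entries of the n x n attention score matrix for a single head in a single layer and compares longer contexts against 32k. The function name and the round 32,000 figure are my own assumptions for illustration, and real systems use optimizations that this ignores, so treat the numbers as a scaling intuition rather than a statement about OpenAI's actual costs.

```python
def attention_score_entries(context_tokens: int) -> int:
    """Entries in the n x n attention score matrix (one head, one layer).

    Hypothetical helper for illustration only: every token attends to
    every other token, so the count grows with the square of the length.
    """
    return context_tokens * context_tokens


# Assumption: treat "32k" as a round 32,000 tokens for easy arithmetic.
BASELINE_TOKENS = 32_000

for n in (32_000, 100_000, 1_000_000):
    ratio = attention_score_entries(n) / attention_score_entries(BASELINE_TOKENS)
    print(f"{n:>9,} tokens: {ratio:,.1f}x the score-matrix entries of a 32k context")
```

Doubling the context quadruples this count, which is why going from 32k to 1M tokens is roughly a thousandfold increase in this term rather than a thirtyfold one.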
The finetuning API is also currently bottlenecked by GPU availability. They don’t yet use efficient finetuning methods like Adapters or LoRA, so finetuning is very compute-intensive to run and manage. Better support for finetuning will come in the future. They may even host a marketplace of community-contributed models.
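For a sense of why methods like LoRA are considered efficient, the sketch below compares trainable parameter counts for one weight matrix. Full finetuning updates every entry of a d_out x d_in matrix, while LoRA freezes it and trains two low-rank factors with rank * (d_in + d_out) parameters in total. The layer size (4096 x 4096) and rank (8) are hypothetical values I picked for illustration. Nothing here reflects OpenAI's models or setup, which were not described in the discussion.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank); the original
    weight matrix W stays frozen.
    """
    return rank * d_in + d_out * rank


# Hypothetical layer size and rank, chosen only to show the ratio.
D_IN, D_OUT, RANK = 4096, 4096, 8

full = full_finetune_params(D_IN, D_OUT)
lora = lora_params(D_IN, D_OUT, RANK)
print(f"full finetuning: {full:,} trainable parameters")
print(f"LoRA (rank {RANK}): {lora:,} trainable parameters")
print(f"reduction: {full // lora}x fewer trainable parameters for this matrix")
```

Fewer trainable parameters means less optimizer state and smaller per-customer artifacts to store and serve, which is the kind of saving that would matter when GPUs are the bottleneck.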
The dedicated capacity offering is also limited by GPU availability. It provides customers with a private copy of the model. To access this service, customers must be willing to commit to a $100k spend upfront.
2. OpenAI’s near-term roadmap
Sam shared what he saw as OpenAI’s provisional near-term roadmap for the API.
2023:
- Cheaper and faster GPT-4 — This is their top priority. In general, OpenAI’s aim is to drive “the cost of intelligence” down as far as possible and so they will work hard to c