Disclaimer: The views and opinions expressed in this blog are entirely my own and do not necessarily reflect the views of my current or any previous employer. This blog may also contain links to other websites or resources. I am not responsible for the content on those external sites or any changes that may occur after the publication of my posts.
End Disclaimer
I wasn’t going to write anything on DeepSeek because everybody already has, but I’ve been asked enough times by people since the release that I’ll put my thoughts down in writing. (First mistake)
Everyone is giving you their 2 cents. I’m going to give you 1 cent more for a total of My 3 Cents (future autobiography title).
Everything about this story seems excessive, so I thought I’d add to it (1 cent more).
On January 10th, 2025, a year-old Chinese startup called DeepSeek released a chatbot based on their large language model called DeepSeek-R1. This, in and of itself, was not news; AI companies have been releasing new versions of their models at a steady clip for a while now. What was news, however, was that the model reportedly cost ~$5-$6 million to build and was roughly on par with OpenAI’s state-of-the-art reasoning model, o1. The $6 million represented, by some accounts, a 50x reduction in the cost of building a state-of-the-art LLM. Things percolated for a week or two, but it’s important to remember that, for the most part, things were really quiet. People seemed to be more or less digesting the information and recognizing it as an impressive improvement in model methodology, training, and cost reduction.
But then, all of a sudden…
Sunday, January 26th, the DeepSeek app is the 1st or 2nd most downloaded free app in the iOS App Store in the US.
That Sunday night, the futures markets are down, various tech pundits are calling this AI’s “Sputnik moment”, and the bellwether of all AI stocks, Nvidia, is down 11% in the premarket.
The markets were freaking out. This is it, everybody: NVDA finally cracked. It’s over.
NVDA would end that Monday down ~18%, a loss of ~$590B in market cap. The single biggest one-day loss of any company…in history.
The sheer level of reaction to DeepSeek has been amazing.
Hyperbole.
Hysteria.
DeepSeek is impressive. Among the purported improvements is showing that pure reinforcement learning (RL), without initial supervised learning, can develop sophisticated reasoning capabilities, with further improvements p