Note: This podcast is designed to be heard. If you are able, we strongly encourage you to listen to the
audio, which includes emphasis that’s not on the page.
Introduction
Adam:
Hello and welcome to CoRecursive. I’m Adam Gordon Bell. Each episode of CoRecursive, someone shares the fascinating story behind some piece of software being built. On April 1st, 2014, an open source maintainer got an email from Google about a security issue, and this was not an April Fool’s joke. This was HeartBleed and the project was OpenSSL. 17% of the world’s web servers were affected, and by the time the dust settled, people started asking questions like, “Why was this open source maintainer who received only $2,000 in donations a year responsible for 17% of the world’s encrypted web traffic?”
Ever since then, I’ve been curious about these critical pieces of infrastructure. What happens if your fun side project ended up powering the world? Do you try to monetize it? Do you focus on it full time? Does the weight of the maintenance crush you and you just leave computers and go to focus on building furniture? I have the perfect guest for discussing this topic.
Richard:
I’m Richard Hipp and I work on SQLite.
Adam:
Today’s show, I’m talking to Richard about how to survive becoming core infrastructure for the world. SQLite is everywhere. It’s in your web browser, it’s in your phone, it’s probably in your car, and it’s definitely in commercial planes. It’s where your iMessages and WhatsApp messages are stored, and if you just do a find on your computer for *.db, you’ll be amazed at how many SQLite databases you find. Today, Richard is going to share his story. The idea for SQLite actually came out of his frustrations with an existing database called Informix that was installed on a literal battleship.
The Battleship
Adam:
Richard was a contractor for Bath Iron Works working on software for the DDG-79 Oscar Austin. That is a battleship, the type that protects a fleet by being armed to the hilt.
Richard:
There’s a big, complex ship, and stuff’s always breaking. Suppose a pipe ruptures. You need to isolate that damage by closing valves on either side of the pipe, but then you also need to open valves elsewhere to restore the working fluid to other systems that are downstream so that they don’t go offline, and locating all those valves and whether you open them or close them can get very complicated, and so Automated Common Diagrams is a program that says, “Oh, here’s the problem. Here’s the valves you close. Here’s the valves you open. Here’s where they’re located.”
That was the original problem, and all the data for where all the pipes are running and all the valves are located, that was in the database. The computer was already installed on the ship. We didn’t have any control over that. The database was already installed on the ship. We didn’t have any control over that. We just had to use what was there.
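As a rough illustration of the problem Richard describes, the sketch below models a toy pipe network in which each valve joins two pipe segments. Everything here (the `VALVES` table, the segment names, and both helper functions) is hypothetical and invented for illustration; it is not the Automated Common Diagrams code, and the real system dealt with far larger networks.

```python
from collections import deque

# Hypothetical toy network: each valve joins two pipe segments.
VALVES = {
    "V1": ("pump", "A"),
    "V2": ("A", "B"),
    "V3": ("B", "C"),
    "V4": ("pump", "D"),
    "V5": ("D", "C"),
}

def valves_to_close(ruptured):
    """Return the valves touching the ruptured segment; closing them isolates it."""
    return sorted(v for v, ends in VALVES.items() if ruptured in ends)

def reachable_segments(source, closed):
    """Return the segments still fed from `source` when the valves in `closed` are shut."""
    seen = {source}
    queue = deque([source])
    while queue:
        seg = queue.popleft()
        for valve, (a, b) in VALVES.items():
            if valve in closed or seg not in (a, b):
                continue
            other = b if seg == a else a
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen

closed = valves_to_close("B")                  # isolate the ruptured segment B
fed = reachable_segments("pump", set(closed))  # what the pump can still supply
print("close:", closed)
print("still fed:", sorted(fed))
```

In this toy case segment C stays supplied through the alternate route via D, which is the "open valves elsewhere" half of the problem.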
NP-Complete Problems
Adam:
Richard was brought in because the solution to this problem was computationally complex and Richard was known for solving hard problems.
Richard:
Really, when you come right down to it, the types of systems that are designed by humans tend to be solvable in polynomial time. It’s just, the general description of the problem, where you have an arbitrary directed graph, is NP-complete, so they were trying to write code that would solve this, and they hadn’t analyzed it and they didn’t realize this. They were, “You know, we’re not getting a solution. It’s just running forever and chewing up CPU cycles. What’s going on?” Well, that’s because it’s NP-complete, and so you have to use heuristics that will find fast approximate solutions and put lots of things in there to verify that it’s not stuck in a loop somehow, and really, for the way they design these ships, the heuristics can find the exact solution, the optimal solution pretty quickly in every case, but it’s just, you can’t write a simple, naïve algorithm and expect it to finish quickly because you will get stuck in an exponential search, trying every valve combination to see which one’s going to give you the best solution.
I was leading a team that was working on this, but Informix just wasn’t really working really well. Once it was working, it worked great, but sometimes the server would go down, and then our application wouldn’t run, and that was embarrassing. Dialog box would pop. They’d double click on the thing and a dialog box would pop that says, “Can’t connect to database server,” and it wasn’t our fault. We didn’t have any control over the database server, but what do you do if you can’t connect to the server, so we got the blame all the same because we were painting the dialog box.
Adam:
Yeah. I can imagine, when some pipe bursts and they try to use your program and they get a database error, they’re not too happy.
Richard:
No. No, and of course, it’s a warship, so, of course, things are always breaking and they use it all the time, but the idea is it’s supposed to be able to work if you take battle damage, so it’s more than one pipe breaking and there’s going to be a lot of stuff broke, and people are going to be crazy and there’s going to be smoke and blood and chaos, and in a situation like that they don’t want a dialog box that says, “Cannot connect to database server.” That’s just not what they want to see, so it needs to be reliable. All we’re doing is reading the data into RAM. We’re not doing transactions. We’re not doing anything like that. It’s just, we’re pulling a bunch of data into memory so that we can solve this problem.
Why do we even need a server? Why can’t I pull this directly off the disk drive? That way, if the computer is healthy enough that it can run our application at all, we don’t have dependencies that can fail and cause us to fail, and I looked around and there were no SQL database engines that would do that, and one of the guys I was working with says, “Richard, why don’t you just write one?” “Okay, I’ll give it a try.” I didn’t do that right away, but later on, it was a funding hiatus. This was back in 2000, and if I recall correctly, Newt Gingrich and Bill Clinton were having a fight of some sort, so all government contracts got shut down, so I was out of work for a few months, and I thought, “Well, I’ll just write that database engine now.”
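The "no server" idea can be seen with today's SQLite through Python's standard `sqlite3` module. This is a minimal sketch, not the shipboard application: the `valve` table, its columns, and the file name `ship.db` are made up for illustration. The point is only that the database is an ordinary file opened in-process, so there is no separate server to connect to.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "ship.db")

# Write some illustrative data. No server process is started or contacted.
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE valve (name TEXT PRIMARY KEY, location TEXT)")
conn.executemany(
    "INSERT INTO valve VALUES (?, ?)",
    [("V1", "deck 2"), ("V2", "deck 3")],
)
conn.commit()
conn.close()

# Later, the application reads the data straight off the disk into memory.
conn = sqlite3.connect(path)
valves = conn.execute("SELECT name, location FROM valve ORDER BY name").fetchall()
conn.close()

print(valves)
```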
Building SQLite V1
Adam:
This is the year 2000. Wikipedia didn’t exist yet. People with internet were mainly using dial-up, and only 1% of US households had broadband internet. You couldn’t just Google how to build a database and get pointed in the right direction, but Richard had a plan based on his previous experience building compilers.
Richard:
If we think about each SQL statement as a program, my task is to take that program and compile it into some sort of executable code, so I wrote a byte code engine that would actually run a query and then I wrote a compiler that would translate SQL into that byte code and voila, SQLite was born. It wasn’t really used for that project that I was working on, because, well, that one was shut down at the time, but it started back up later, and we incorporated it into that project for testing purposes because the customer’s insisting on Informix, so that’s fine, but Informix is a real hassle to use for development.
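Current SQLite releases still let you look at the byte code a statement compiles to, by prefixing it with `EXPLAIN`. The sketch below uses Python's standard `sqlite3` module and a made-up `valve` table. It shows the modern engine, so the instruction set may differ from the first version Richard describes, and the exact opcodes vary between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE valve (name TEXT, location TEXT)")  # illustrative table

# EXPLAIN returns one row per byte code instruction instead of running the query.
program = conn.execute(
    "EXPLAIN SELECT name FROM valve WHERE location = 'deck 2'"
).fetchall()
conn.close()

# Column 0 is the instruction address, column 1 the opcode name,
# and columns 2-4 are the operands p1, p2, p3.
opcodes = [row[1] for row in program]
for row in program:
    print(row[0], row[1], row[2], row[3], row[4])
```

Each printed row is one instruction of the small program that the byte code engine would execute to answer the query.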
For development purposes, we would use my engine just for testing and whatnot, but it was never an official part of the project, but I put it out there on the internet and other people started picking it up. I remember, this was before Twitter or anything like that, but there was netnews back then, and [inaudible 00:07:42] somebody put a posting [inaudible 00:07:43], was like, “Wow, I’ve got an SQL database running on my Palm Pilot.” I’m not kidding. It really attracted a lot of attention, and that encouraged me to work on it.
Motorola Phones
Adam:
Richard kept tinkering with his database project on the side until he got a phone call from a tech giant.
Richard:
It was from Motorola. Back, 2002, 2001 when this happened, Motorola was one of the tech giants, and these days, the tech giants are Apple and Android and Google and Microsoft and Facebook, but back then the tech giants were things, people like AOL and Motorola and Nokia, so I got a phone call from some people at Motorola and they said, “Listen, we’re designing a new cell phone operating system and we want SQLite to be part of it. Can you support this for us?” Of course, I was real cool about it, said, “Oh, sure, sure. I can do that for you,” but they said, “Well, do you have any pricing information?” “Well, look, I tell you what, let’s have a call tomorrow and I’ll get back to you on that.”
Of course, inside, I was like, “What? You can make money with open source software? How does this work? How do I price this? I have no idea how to do this.” I scrambled around and came up with some pricing strategy. They wanted some enhancements to it so it could go in their phones, and I gave them a quote and at the time, I thought this was a quote for all the money in the world. It was just huge.
Adam:
What was it? Can you share?
Richard:
I think it was $80,000 or something like that. It was not very much money by today’s standards, but for me, that was everything, and I brought three of the guys that I’d worked with on to work on it, and we worked that project and that was sort of the beginning.
America Online Phones
Adam:
After Motorola, the next tech giant to reach out was America Online, who wanted Richard to visit their office and talk about a contract for some enhancements they needed.
Richard:
This was back when AOL was the world’s leading service provider. Maybe you’re too young to remember, Adam. We used to get …
Adam:
The CDs …
Richard:
yeah, CDs in the mail. Yeah, you know, the CD. Put this in your thing and $10 a month. They needed a database on that CD, and they had some ad hoc thing and they wanted to use SQLite on that. They had limited space, and so, “Hey, we need to put this on the CD.” I just put in this new feature which I thought was a really cool idea, that you could create a temp index on a real table, so a table is shared amongst processes, you can create a temporary index on it, and I thought that was a really cool idea. I was up there telling them about this, and mid-sentence, as I was getting ready to explain what this temp index, it suddenly occurred to me that if you have a temporary index, only one of th