- Author: Ben Johnson

We’re Fly.io. We run apps for our users on hardware we host around the world. This post isn’t about our platform. Rather, it’s an elaborate plot to get you to write some code just for the hell of it.
In the field of computer science, the industry is represented by two separate yet equally important groups: the software developers who build Rails applications and mobile games, and the academics who write theory papers about why the problems those apps try to solve are NP-hard. This is a story about both.
Distributed systems span the practical-academic divide. Reading a stack of MIT PhD dissertations may be a good Friday night, but it won’t equip you for debugging a multi-service outage at 2am. That requires real-world experience.
Likewise, building a fleet of microservices won’t give you the conceptual tools to handle failure gracefully and safely. Many failure scenarios are rare. They don’t show up in unit tests. But they’re devastating when they do show up. Nailing down the theory gives you a fighting chance at designing a correct system in the first place.
The practical and academic tracks seldom converge. To fix this, we teamed up with Kyle Kingsbury, author of Jepsen, to develop a series of distributed systems challenges that combine real code with the academic rigor of Jepsen’s verification system.
We call these challenges the Gossip Glomers.
What the f$#* is a Glomer?
It’s an elaborate pun about the CAP theorem.
How It Works
You know Kyle Kingsbury from his “Call Me Maybe” blog posts that eviscerate distributed databases. You may also know about Jepsen, the Clojure-based open-source tooling Kyle uses to conduct these analyses. Well, Kyle also wrote a tool on top of Jepsen called Maelstrom.
Maelstrom runs toy distributed systems on a simulated network. It easily runs on a laptop. Kyle uses it to teach distributed systems. We all thought it’d be neat to build a series of challenges that would teach people around the Interne