
In this post, I will cover the Bitcask paper and walk through the code of an implementation that I wrote in Java. Papers like this one are concise enough to give a high-level idea of the technology while still providing the pieces needed to build a working implementation. This approach has helped me go one layer deeper in my understanding, and I hope you too will gain something useful from this post about the internal workings of a storage engine.
Bitcask originates from a NoSQL key-value database known as Riak. Each node in a Riak cluster uses pluggable local key-value storage. A few of the goals that this key-value storage aims to achieve are:
- Low latency for read and write operations
- Ability to recover from a crash quickly, with minimal delay
- Easy-to-understand data format
- Easy backup of the data held in storage
Bitcask achieves the above goals with a design that is easy to understand and implement. Let us dive into the core design of Bitcask. Along the way, we will also look at the code for the major components that form Bitcask.
Key-Value Store Operations
From the data format's perspective, Bitcask is very easy to understand. Think of a Bitcask instance as a directory in your file system. All the data resides in this directory, and only one process is allowed to write to the directory at a time. At any moment there is exactly one active file, to which data is appended. Processing a write request is therefore as simple as appending a record entry to this file. Because you never update the existing contents of the file and only append to it, a write request is processed with minimal latency.
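To make the append-only write path concrete, here is a minimal sketch. The class name `AppendOnlyFile` and the length-prefixed record layout (key size, value size, key bytes, value bytes) are assumptions made for illustration; they are not taken from the implementation discussed in this post, and a real record entry may carry more fields.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical class for illustration only.
class AppendOnlyFile implements AutoCloseable {
    private final FileChannel channel;

    AppendOnlyFile(Path path) throws IOException {
        // APPEND makes every write land at the current end of the file.
        this.channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    /** Appends one record and returns the offset at which it starts. */
    long append(String key, String value) throws IOException {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        // Assumed layout: [key size][value size][key bytes][value bytes]
        ByteBuffer buf = ByteBuffer.allocate(2 * Integer.BYTES + k.length + v.length);
        buf.putInt(k.length).putInt(v.length).put(k).put(v);
        buf.flip();
        long offset = channel.size();
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        return offset;
    }

    long size() throws IOException {
        return channel.size();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Nothing already in the file is ever rewritten, so a write only extends the file. The returned offset is the kind of information a reader would later need to locate the record.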
Once the size of the active file reaches a certain threshold, a new active file is created in the same directory and the previous active file is considered immutable. Contents can still be read from that file, but it is never modified again. At the code level, it looks something like this:
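A minimal sketch of this rotation follows. The class name `RotatingWriter`, the `<id>.data` file naming scheme, and the size check performed before each write are all assumptions for illustration, not necessarily how the implementation discussed in this post does it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: names, file naming, and threshold handling are assumptions.
class RotatingWriter implements AutoCloseable {
    private final Path directory;
    private final long maxFileSize;
    private int activeFileId = 0;
    private FileChannel active;

    RotatingWriter(Path directory, long maxFileSize) throws IOException {
        this.directory = Files.createDirectories(directory);
        this.maxFileSize = maxFileSize;
        this.active = openNextFile();
    }

    private FileChannel openNextFile() throws IOException {
        activeFileId++;
        Path file = directory.resolve(activeFileId + ".data");
        // CREATE_NEW fails if the file already exists, so an old file is never reopened for writing.
        return FileChannel.open(file,
                StandardOpenOption.CREATE_NEW,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    /** Appends raw record bytes and returns the id of the file that received them. */
    int append(byte[] record) throws IOException {
        if (active.size() >= maxFileSize) {
            // The old file is closed for writing and treated as immutable from here on.
            active.close();
            active = openNextFile();
        }
        ByteBuffer buf = ByteBuffer.wrap(record);
        while (buf.hasRemaining()) {
            active.write(buf);
        }
        return activeFileId;
    }

    @Override
    public void close() throws IOException {
        active.close();
    }
}
```

With a threshold of 10 bytes and 6-byte records, the first two records go to file 1 (which grows to 12 bytes), and the third record triggers the creation of file 2.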

A new file is also created when the database server is shut down and restarted. In other words, as soon as the database server goes down, the active file becomes immutable, and when the server comes back up, it starts working with a new file marked as active.
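One way to picture the restart behavior is the sketch below, which assumes data files are named `<id>.data` with increasing numeric ids. The class `StartupFiles` and both of its methods are hypothetical helpers, not part of the implementation discussed in this post. On startup, the existing files are left untouched and a fresh file with the next id becomes the active one.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the "<id>.data" naming scheme is an assumption.
class StartupFiles {
    private static final String SUFFIX = ".data";

    /** Returns the highest numeric file id found in the directory, or 0 if there is none. */
    static int highestFileId(Path directory) throws IOException {
        int highest = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(directory, "*" + SUFFIX)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                String stem = name.substring(0, name.length() - SUFFIX.length());
                try {
                    highest = Math.max(highest, Integer.parseInt(stem));
                } catch (NumberFormatException ignored) {
                    // Skip files that do not follow the assumed naming scheme.
                }
            }
        }
        return highest;
    }

    /** Creates a fresh active file; every existing file stays as it is (immutable). */
    static Path createNewActiveFile(Path directory) throws IOException {
        int nextId = highestFileId(directory) + 1;
        return Files.createFile(directory.resolve(nextId + SUFFIX));
    }
}
```

Calling `createNewActiveFile` on a directory that already holds `1.data` and `2.data` yields `3.data`, so writes after a restart never touch files from an earlier run.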
The FileRecord
that we ar