Rebuilding the world’s most popular data structure.
The simple array is probably the most popular data structure in programming. It’s a straightforward yet powerful tool — it lets you represent an ordered list of items with fast random access. It doesn’t matter if you’re looking for index 1 or index 500 — with the array, both accesses take the same amount of time.
If you’ve been a developer for a while, you probably use arrays on a daily basis without thinking too much about them. But have you ever wondered how arrays actually work?
In this post, I want to dive into the technical details of the array and figure out how you might invent the array yourself. We’re going to start with a simple array that only supports numbers, then later build that up to the flexible, dynamic array we all know and love from JavaScript.
Let’s get started!
Memory API
To implement our array, we’ll use our computer’s memory API (or a super simplified version of it). This API has four methods:
-
get(address: number)
— returns the value at a specific memory address. -
set(address: number, value: number)
— sets the given address to the given value. -
allocate(bytes: number)
— allocates a given number of bytes, returning a pointer to the first allocated byte. -
free(pointer: Pointer)
— frees the memory location pointed to by the pointer.
If you’ve written code in a lower level language before, some of these methods might be familiar to you; if not, that’s okay too — we’ll go over how everything works in this section.
Reading, Writing, and Addresses
As a starting point, you can think of memory as itself a super long array, where each element of the array represents a single byte:
1
2
3
You can refer to specific bytes in memory using their address — just like how you can use an index to refer to specific elements in an array.
Memory Allocation
One key restriction behind our computer’s memory system is that you can’t freely read and write to any old address in memory*. In fact, when you begin, you can’t read and write to any address at all!
*And for a good reason too — your computer memory is shared between the hundreds of different programs running in parallel. Imagine if programs can change memory that’s being used by other programs!
To get past this restriction, you have to ask for space in memory by allocating it. To allocate space, you use the allocate
function, passing in the specific number of bytes that you need:
const block = Mem.allocate(bytes: 2)
The allocate
call returns the address of the first byte of the allocated block. In this case, the call returns the address 0
because our 2-byte block begins at address 0
.
Great! Now that we have our allocated slice of memory, we can read and write from it as we please:
const block = Memory.allocate(2)
1 / 4
Notice how we weren’t allowed to write to block + 2
– that’s because we only allocated 2 bytes, so the third byte, block + 2
, is out of bounds!
Freeing Up Space
Finally, once you’re done with the data and don’t need it anymore, you can free up that space so the computer can use the memory for other things:
const pointer = Mem.allocate(/* num of bytes */ 12)
// do stuff
Mem.free(pointer)
One thing you might notice from this allocate-free workflow is that you end up not referencing addresses directly. Instead, these addresses live inside variables, and all you need to do is move them around. Overall, the typical memory management workflow looks like this:
// Allocate
const block = Mem.allocate(3)
// Use
Mem.set(block, 'a')
Mem.set(block + 1, 'b')
console.log(Mem.get(block)) // prints 'a'
// Free
Mem.free(block)
1 / 7
As a quick summary:
- Memory, for our purposes, is a long array of bytes where each byte has an associated address.
- To use memory, you need to allocate it first; this allocation process returns the address of the first byte of the allocated block.
- Once allocated, you can freely read and write to that memory location.
- When you’re done, you can free the allocated space so it can be used elsewhere.
Building the Array
Now that we know a bit about memory and how it works, let’s get on with our array! We’ll start with a few assumptions to make our lives easier:
-
The array has a fixed length (i.e. it cannot grow in size), and
-
The array can only contain numbers.
These assumptions may seem really restrictive, but don’t worry — we’ll ease them as we move on.
Allocating Space
The first thing that we need to do is allocate space for our array. But, how much space?
const data = Mem.allocate(/* num of bytes */ ???)
Thankfully, our assumptions let us determine the size of our array ahead of time. Since our array is a fixed length and can only contain numbers, the total number of bytes we need to allocate is:
total # of bytes = length * (# of bytes for 1 number)
For the number of bytes for 1 number, we’ll