

[Doodle: a flattened AST for the expression a * b + c.]
Arenas, a.k.a. regions, are everywhere in modern language implementations.
One form of arenas is both super simple and surprisingly effective for compilers and compiler-like things.
Maybe because of its simplicity, I haven’t seen the basic technique in many compiler courses—or anywhere else in a CS curriculum for that matter.
This post is an introduction to the idea and its many virtues.
Arenas or regions mean many different things to different people, so I’m going to call the specific flavor I’m interested in here data structure flattening.
Flattening uses an arena that only holds one type, so it’s actually just a plain array, and you can use array indices where you would otherwise need pointers.
We’ll focus here on flattening abstract syntax trees (ASTs), but the idea applies to any pointer-laden data structure.
To learn about flattening, we’ll build a basic interpreter twice:
first the normal way and then the flat way.
Follow along with the code in this repository, where you can compare and contrast the two branches.
The key thing to notice is that the changes are pretty small,
but we’ll see that they make a microbenchmark go 2.4× faster.
Besides performance, flattening also brings some ergonomics advantages that I’ll outline.
A Normal AST
Let’s start with the textbook way to represent an AST. Imagine the world’s simplest language of arithmetic expressions, where all you can do is apply the four basic binary arithmetic operators to literal integers. Some “programs” you can write in this language include `42`, `0 + 14 * 3`, and `(100 - 16) / 2`.
Maybe the clearest way to write the AST for this language would be as an ML type declaration:
type binop = Add | Sub | Mul | Div
type expr = Binary of binop * expr * expr
          | Literal of int
But for this post, we’ll use Rust instead. Here are the equivalent types in Rust:
enum BinOp { Add, Sub, Mul, Div }

enum Expr {
    Binary(BinOp, Box<Expr>, Box<Expr>),
    Literal(i64),
}
If you’re not a committed Rustacean, `Box` may look a little weird, but that’s just Rust for “a plain ol’ pointer to an `Expr`.” In C, we’d write `Expr*` to mean morally the same thing; in Java or Python or OCaml, it would just be `Expr` because everything is a reference by default.[1]
With the AST in hand, we can write all the textbook parts of a language implementation, like a parser, a pretty-printer, and an interpreter.
All of them are thoroughly unremarkable.
The whole interpreter is just one method on `Expr`:
fn interp(&self) -> i64 {
    match self {
        Expr::Binary(op, lhs, rhs) => {
            let lhs = lhs.interp();
            let rhs = rhs.interp();
            match op {
                BinOp::Add => lhs.wrapping_add(rhs),
                BinOp::Sub => lhs.wrapping_sub(rhs),
                BinOp::Mul => lhs.wrapping_mul(rhs),
                BinOp::Div => lhs.checked_div(rhs).unwrap_or(0),
            }
        }
        Expr::Literal(num) => *num,
    }
}
My language has keep-on-truckin’ semantics; every expression eventually evaluates to an `i64`, even if it’s not the number you wanted.[2]
For extra credit, I also wrote a little random program generator. It’s also not all that interesting to look at; it just uses a recursively increasing probability of generating a literal so it eventually terminates. Using fixed PRNG seeds, the random generator enables some easy microbenchmarking. By generating and then immediately evaluating an expression, we can measure the performance of AST manipulation without the I/O costs of parsing and pretty-printing.
You can check out the relevant repo and try it out:
$ echo '(29 * 3) - 9 * 5' | cargo run
$ cargo run gen_interp # Generate and immediately evaluate a random program.
Flattening the AST
The flattening idea has two pieces:
- Instead of allocating `Expr` objects willy-nilly on the heap, we’ll pack them into a single, contiguous array.
- Instead of referring to children via pointers, `Expr`s will refer to their children using their indices in that array.

Let’s look back at the doodle from the top of the post.
We want to use a single `Expr` array to hold all our AST nodes.
These nodes still need to point to each other; they’ll now do that by referring to “earlier” slots in that array.
Plain old integers will take the place of pointers.
If that plan sounds simple, it is—it’s probably even simpler than you’re thinking.
The main thing we need is an array of `Expr`s. I’ll use Rust’s newtype idiom to declare our arena type, `ExprPool`, as a shorthand for an `Expr` vector:
struct ExprPool(Vec<Expr>);
To keep things fancy, we’ll also give a name to the plain old integers we’ll use to index into an `ExprPool`:

struct ExprRef(u32);
The idea is that, everywhere we previously used a pointer to an `Expr` (i.e., `Box<Expr>` or sometimes `&Expr`), we’ll use an `ExprRef` instead. `ExprRef`s are just 32-bit unsigned integers, but by giving them this special name, we’ll avoid confusing them with other `u32`s.
Most importantly, we need to change the definition of `Expr` itself:
 enum Expr {
-    Binary(BinOp, Box<Expr>, Box<Expr>),
+    Binary(BinOp, ExprRef, ExprRef),
     Literal(i64),
 }
Next, we need to add utilities to `ExprPool` to create `Expr`s (allocation) and look them up (dereferencing). In my implementation, these little functions are called `add` and `get`, and their implementations are extremely boring.
To use them, we need to look over our code and find every place where we create new `Expr`s or follow a pointer to an `Expr`. For example, our `parse` function used to be a method on `Expr`, but we’ll make it a method on `ExprPool` instead:
-fn parse(tree: Pair) -> Self {
+fn parse(&mut self, tree: Pair) -> ExprRef {
And where we used to return a newly allocated `Expr` directly, we’ll now wrap that in `self.add()` to return an `ExprRef` instead. Here’s the `match` case for constructing a literal expression:
 Rule::number => {
     let num = tree.as_str().parse().unwrap();
-    Expr::Literal(num)
+    self.add(Expr::Literal(num))
 }
Our interpreter gets the same treatment. It also becomes an `ExprPool` method, and we have to add `self.get()` to go from an `ExprRef` to an `Expr` we can pattern-match on:
-fn interp(&self) -> i64 {
+fn interp(&self, expr: ExprRef) -> i64 {
- match self {
+ match self.get(expr) {
That’s about it.
I think it’s pretty cool how few changes are required—see for yourself in the complete diff.
You replace `Box` with `ExprRef`, insert `add` and `get` calls in the obvious places, and you’ve got a flattened version of your code.
Neat!
But Why?
Flattened ASTs come with a bunch of benefits.
The classic ones most people cite are all about performance:
- Locality. Allocating normal pointer-based `Expr`s runs the risk of fragmentation. Flattened `Expr`s are packed together in a contiguous region of memory, which is good for spatial locality. Your data caches will work better because `Expr`s are more likely to share a cache line, and even simple prefetchers will do a better job of predicting which `Expr`s to load before you need them. A sufficiently smart memory allocator might achieve the same thing, especially if you allocate the whole AST up front and never add to it, but using a dense array removes all uncertainty.
- Smaller references. Normal data structures use pointers for references; on modern architectures, those are always 64 bits. After flattening, you can use smaller integers—if you’re pretty sure you’ll never need more than 4,294,967,295 AST nodes, you can get by with 32-bit references, like we did in our example. That’s a 50% space savings for all your references, which could amount to a substantial overall memory reduction in pointer-heavy data structures like ASTs. Smaller memory footprints mean less bandwidth pressure and even better spatial locality. And you might save even more if you can get away with 16- or even 8-bit references for especially small data structures.
- Cheap allocation. In flatland, there is no need for a call to `malloc`