In some domains of programming it’s common to want to write a data structure or algorithm that can work with elements of many different types, such as a generic list or a sorting algorithm that only needs a comparison function. Different programming languages have come up with all sorts of solutions to this problem: From just pointing people to existing general features that can be useful for the purpose (e.g C, Go) to generics systems so powerful they become Turing-complete (e.g. Rust, C++). In this post I’m going to take you on a tour of the generics systems in many different languages and how they are implemented. I’ll start from how languages without a special generics system like C solve the problem and then I’ll show how gradually adding extensions in different directions leads to the systems found in other languages.
One reason I think generics are an interesting case is that they’re a simple case of the general problem of metaprogramming: writing programs that can generate classes of other programs. As evidence I’ll describe how three different fully general metaprogramming methods can be seen as extensions from different directions in the space of generics systems: dynamic languages like Python, procedural macro systems like Template Haskell, and staged compilation like Zig and Terra.
Overview
I made a flow chart of all the systems I discuss to give you an overview of what this post will contain and how everything fits together:
The basic ideas
Let’s say we’re programming in a language without a generics system and we want to make a generic stack data structure which works for any data type. The problem is that each function and type definition we write only works for data that’s the same size, is copied the same way, and generally acts the same way.
Two ideas for how to get around this are to find a way to make all data types act the same way in our data structure, or to make multiple copies of our data structure with slight tweaks to deal with each data type the correct way. These two ideas form the basis of the two major classes of solutions to generics: “boxing” and “monomorphization”.
Boxing is where we put everything in uniform “boxes” so that they all act the same way. This is usually done by allocating things on the heap and just putting pointers in the data structure. We can make pointers to all different types act the same way so that the same code can deal with all data types! However this can come at the cost of extra memory allocation, dynamic lookups and cache misses. In C this corresponds to making your data structure store void*
pointers and just casting your data to and from void*
(allocating on the heap if the data isn’t already on the heap).
Monomorphization is where we copy the code multiple times for the different types of data we want to store. This way each instance of the code can directly use the size and methods of the data it is working with, without any dynamic lookups. This produces the fastest possible code, but comes at the cost of bloat in code size and compile times as the same code with minor tweaks is compiled many times. In C this corresponds to defining your entire data structure in a macro and calling it for each type you want to use it with.
Overall the tradeoff is basically that boxing leads to better compile times but can hurt runtime performance, whereas monomorphization will generate the fastest code but takes extra time to compile and optimize all the different generated instances. They also differ in how they can be extended: Extensions to boxing allow more dynamic behavior at runtime, while monomorphization is more flexible with how different instances of generic code can differ. It’s also worth noting that in some larger programs the performance advantage of monomorphization might be canceled out by the additional instruction cache misses from all the extra generated code.
Each of these schools of generics has many directions it can be extended in to add additional power or safety, and different languages have taken them in very interesting directions. Some languages like Rust and C# even provide both options!
Boxing
Let’s start with an example of the basic boxing approach in Go:
type Stack struct {
values []interface{}
}
func (this *Stack) Push(value interface{}) {
this.values = append(this.values, value)
}
func (this *Stack) Pop() interface{} {
x := this.values[len(this.values)-1]
this.values = this.values[:len(this.values)-1]
return x
}
Example languages that use basic boxing: C (void*
), Go (interface{}
), pre-generics Java (Object
), pre-generics Objective-C (id
)
Type-erased boxed generics
Here’s some problems with the basic boxing approach:
- Depending on the language we often need to cast values to and from the correct type every time we read or write to the data structure.
- Nothing stops us from putting elements of different types into the structure, which could allow bugs that manifest as runtime crashes.
A solution to both of these problems is to add generics functionality to the type system, while still using the basic boxing method exactly as before at runtime. This approach is often called type erasure, because the types in the generics system are “erased” and all become the same type (like Object
) under the hood.
Java and Objective-C both started out with basic boxing, and later added language features for generics with type erasure, even using the exact same collection types as before for compatibility, but with optional generic type parameters. See the following example from the Wikipedia article on Java Generics:
List v = new ArrayList();
v.add("test"); // A String that cannot be cast to an Integer
Integer i = (Integer)v.get(0); // Run time error
List<String> v = new ArrayList<String>();
v.add("test");
Integer i = v.get(0); // (type error) compilation-time error
Inferred boxed generics with a uniform representation
OCaml takes this idea even further with a uniform representation where there are no primitive types that require an additional boxing allocation (like int
needing to be turned into an Integer
to go in an ArrayList
in Java), because everything is either already boxed or represented by a pointer-sized integer, so everything is one machine word. However when the garbage collector looks at data stored in generic structures it needs to tell pointers from integers, so integers are tagged using a 1 bit in a place where valid aligned pointers never have a 1 bit, leaving only 31 or 63 bits of range.
OCaml also has a type inference system so you can write a function and the compiler will infer the most generic type possible if you don’t annotate it, which can lead to functions that look like a dynamically typed language:
let first (head :: tail) = head
(* inferred type: 'a list -> 'a *)
The inferred type is read as “a function from a list of elements of type 'a
to something of type 'a
”. Which encodes the relation that the return type is the same as the list type but can be any type.
Interfaces
A different limitation in the basic boxing approach is that the boxed types are completely opaque. This is fine for data structures like a stack, but things like a generic sorting function need some extra functionality, like a type-specific comparison function. There’s a number of different ways of both implementing this at runtime and exposing this in the language, which are technically different axes and you can implement the same language using multiple of these approaches. However, it seems like different language features mostly lend themselves towards being implemented a certain way, and then language extensions take advantage of the strengths of the chosen implementation. This means there’s mostly two families of languages based around the different runtime approaches: vtables and dictionary passing.
Interface vtables
If we want to expose type-specific functions while sticking with the boxing strategy of a uniform way of working with everything, we can just make sure that there’s a uniform way to find the type-specific function we want from an object. This approach is called using “vtables” (shortened from “virtual method tables” but nobody uses the full name) and how it is implemented is that at offset zero in every object in the generic structure is a pointer to some tables of function pointers with a consistent layout. These tables allow the generic code to look up a pointer to the type-specific functions in the same way for every type by indexing certain pointers at fixed offsets.
This is how interface
types are implemented in Go and dyn
trait
objects are implemented in Rust. When you cast a type to an interface type for something it implements, it creates a wrapper that contains a pointer to the original object and a pointer to a vtable of the type-specific functions for that interface. However this requires an extra layer of pointer indirection and a different layout, which is why sorting in Go uses an interface for the container with a Swap method instead of taking a slice of a Comparable
interface, because it would require allocating an entire new slice of the interface types and then it would only sort that and not the original slice!
Object-oriented programming
Object oriented programming is a language feature that makes good use of the power of vtables. Instead of having separate interface objects that contain the vtables, object-oriented languages like Java just have a vtable po