Aria Beingessner
March 19th, 2022
I think about unsafe pointers in Rust a lot.
I literally wrote the book on unsafe Rust. And the book on pointers in Rust. And redesigned the Rust’s pointer APIs. And designed the standard library’s abstraction for unsafe heap-allocated buffers. And maintain the alternative Vec layout.
I think about unsafe pointers in Rust a lot, and I absolutely hate them.
Don’t get me wrong, I think all of the above work has made them better but they are still deeply flawed. Actually they’ve gotten a lot worse. Not because the APIs have changed, but because when I was working on this stuff we had too naive of an understanding of how pointers should work. Others have done a lot of great work to expand this understanding, and now the flaws are all the more glaring.
This article is broken up into 3 parts: conceptual background, problems with the current design, and proposed solutions.
This section can be skipped entirely if you know everything about computers.
Aliasing is a very important concept in compilers and language semantics. At a high-level, it is the study of the observability of modifications to memory. We call it aliasing because the problem is very easy until you can refer to a piece of memory in more than one way. Pointers are just nicknames for memory.
The primary function of alias analysis is as a model for when the compiler can semantically cache memory accesses. This can either mean assuming a value in memory hasn’t been modified or assuming a write to memory isn’t necessary. This is exceptionally important because essentially all program state is semantically in memory. It is literally impossible for a general purpose programming language that does anything on the behalf of the programmer to allow arbitrary reads and writes to memory.
As a hopefully extremely obvious example that we can all agree on, the compiler should be able to assume that the following program will pass 1
to println!
:
let mut x = 0;
let mut y = 1;
x = 2;
println!("{y}");
When we talk about alias analysis we usually jump immediately to pointers because that’s the hard part but like, the fact that this has deterministic behaviour is part of your aliasing model! Variables are semantically unaliased until you actually take a reference to them!
This is actually a foundational assumption for putting things in registers, because putting something in a register is caching it. If a compiler can’t decide it’s ok to put values in general purpose registers or spill them to the stack, it’s an assembler at best. We would like to build languages that are higher-level than an assembler!
This is all well and good until we introduce pointers and have to start answering harder questions. For instance, in the following function, can we assume that input
and output
refer to different regions of memory?
fn compute(input: &u32, output: &mut u32) {
if *input > 10 {
*output = 1;
}
if *input > 5 {
*output *= 2;
}
}
If we can, then the compiler is free to rewrite it as follows:
fn compute(input: &u32, output: &mut u32) {
let cached_input = *input;
if cached_input > 10 {
*output = 2;
} else if cached_input > 5 {
*output *= 2;
}
}
If they do point to the same memory, then the write *output = 1
and the read *input > 5
would alias. When we perform (potentially) aliasing accesses, the compiler has to conservatively load and store from memory as much as the source code implies.
Now it’s often clumsy to talk about accesses aliasing, so we usually talk about pointers aliasing as a shorthand. So one would reasonably say that input
and output
are aliased. The reason that the actual model is in terms of accesses and not pointers is because that’s the thing that we care about. We don’t actually care if you pass in two pointers that alias but only use one. Similarly we don’t actually care if two pointers alias but are only used for reads – one read can’t observably effect the other read (this assumption is why memory mapped hardware has to use volatile
)
This is also why Rust has such a distinct schism between “unique mutable” references (&mut
) and “shared immutable” references (&
). It’s fine to make as many copies as you want of pointers that have readonly access, but if you want to actually mutate some memory it’s really important that it isn’t aliased!
(You may notice that this is a simplified model full of lies, if you would like less lies, read my extremely detailed discussion of Stacked Borrows.)
Now with all that said, I will use the following shorthands:
- memory is anonymous if the programmer cannot refer to it by name or pointer.
- memory is unaliased if there is currently only one way to refer to it.
Anonymous values are in some sense “completely under the control of the compiler” and can therefore be freely assumed to be unaliased and trusted/modified by the compiler. Unaliased memory cannot be “randomly” modified by something seemingly “unrelated” (we’ll get to what that means in the next section).
Languages can have stricter or weaker aliasing models. A stricter model allows the compiler to do more optimizations but puts heavier restrictions on what the programmer is allowed to do within the confines of the language. Here are some common rules, in vaguely increasing strictness:
- Callee-saved values pushed to the stack are anonymous (return pointer, frame pointer).
- “Scratch” values the compiler spills to the stack are anonymous.
- A newly declared variable is unaliased until a reference is taken to it.
- The memory returned by
malloc
is unaliased. - Fields of a struct do not alias eachother (bitfields are made of sadness).
- Padding bytes are vaguely anonymous (messy because of memcpy/memset/unions/punning).
- Immutable variables are functionally unaliased in that they can never change values.
- In Rust,
&mut
is unaliased (Stacked Borrows). - In C(++),
T*
andU*
cannot alias ifT!=U
and neither ischar
(Strict Aliasing).
(I cannot emphasize enough how shorthanded all of this is, the devil is extremely in the details and formally specifying these things in this subject of untold numbers of PhD theses. I am not trying to write a PhD thesis right now. Unless you literally work on a C/C++ Standard Committee or are named Ralf Jung I will not be accepting your Umm Actually’s on these definitions and terms.)
1.2 Alias Analysis and Pointer Provenance
Ok so all of that aliasing stuff is all well and good, but as soon as you take a pointer to something, or copy a pointer, or offset a pointer… that all goes out the window, right? Like you have to assume anything can be aliased by anything else?
No! Aliasing rules are some of the most foundational parts of the language’s semantics and optimizations. If you run afoul of the language’s aliasing rules you have Undefined Behaviour and the miscompilations can be extremely brutal!
But yes, once you start faffing around with pointers things get a lot harder for the compiler, memory model, and programmer. To make aliasing a useful notion once pointers start getting thrown around, memory models very quickly find the need to define two concepts:
- Allocations
- Pointer Provenance
Allocations abstractly describe things like individual variables and the heap allocations. A freshly created allocation (variable decl, malloc) is always brought into the world unaliased and therefore acts like a sandbox with One True Name – there is no way to access the memory in the sandbox except through the One True Name (that isn’t Undefined Behaviour).
Pointer Provenance describes the way “permission to access an allocation’s sandbox” can be delegated from the One True Name by deriving a new pointer from it or something derived from it. The process of tracking this “chain of custody” from the One True Name to all of its derived pointers is Pointer Provenance.
From a formal memory model perspective, all accesses to an allocation must have provenance tracking back to that allocation. If pointer provenance isn’t satisfied, then that means the programmer broke out of the sandbox or pulled a pointer from the aether that happened to point into some random sandbox. Either way, everything is chaos and nothing makes sense anymore if that’s allowed.
From a compiler optimization perspective, tracking provenance allows the compiler to prove that two accesses don’t alias. If it ever loses track of provenance (i.e. if a pointer is passed to an opaque function) then it just conservatively assumes they do alias. But if the compiler neverloses track of all the derived pointers to an allocation, it can have perfect aliasing information and do Some Good Codegen.
This is the fundamental trick compilers apply to all basically impossible problems: have a simple analysis that can answer your query with “YES”, “NO”, or “MAYBE” and then convert “MAYBE” to “YES” or “NO” based on whichever one is safer. Do these two accesses MAYBE alias? Then YES they alias. Problem Solved.
But once you get low-level enough the compiler needs you to help it out and actually follow some dang rules. This is why the llvm GetElementPointer (GEP) instruction which computes a pointer offset, is almost always emitted by compilers with the inbounds
keyword. The inbounds
keyword is basically “I promise this offset won’t break the pointer out of its allocation sandbox and completely trash aliasing and provenance”. Which like, yeah all of your pointer offsets should follow that rule!
Let’s go up a level and look at Rust: any time you do (*ptr).my_field
the compiler will emit GEP inbounds
. Have you ever wondered why the documentation for ptr::offset and friends is so weird and complicated? Because they lower to GEP inbounds
and need to follow its rules! ptr::wrapping_offset is just GEP
without the inbounds
. And even wrapping_offset
isn’t actually allowed to break provenance:
Compared to
offset
, this method basically delays the requirement of staying within the same allocated object:offset
is immediate Undefined Behavior when crossing object boundaries;wrapping_offset
produces a pointer but still leads to Undefined Behavior if a pointer is dereferenced when it is out-of-bounds of the object it is attached to
Mea culpa, I spent years calling CHERI completely unshippable vaporware! I was pretty confident, but I’ll eat my hat because ARM Morello actually built and shipped a full CHERI-based CPU. Congratulations to everyone who worked on it!
So what is CHERI? I’m not going to get into the nitty-gritty details but roughly speaking it’s a 128-bit architecture. Well actually it’s 129-bit. Well actually it’s 64-bit. Sorry what?
Ok so the whole Idea with CHERI is that it actually reifies and implements the “sandboxing” model from the previous section. Every pointer is tagged with extra metadata that the hardware maintains and validates. If you ever break out of your sandbox the hardware will catch it and the OS will presumably kill your process.
I don’t know the full details of the encoding or metadata, but the part we care abo