
Cheating the Reaper in Go by ingve
Even though I am a C++ programmer at heart, Go fascinates me for none of the reasons you think. Go has made several interesting design decisions:
-
It has virtually no Undefined Behavior1.
-
It has very simple GC semantics that they’re mostly stuck with due to design decisions in the surface language.
These things mean that despite Go having a GC, it’s possible to do manual memory management in pure Go and in cooperation with the GC (although without any help from the runtime
package). To demonstrate this, we will be building an untyped, garbage-collected arena abstraction in Go which relies on several GC implementation details.
I would never play this kind of game in Rust or C++, because LLVM is extremely intelligent and able to find all kinds of ways to break you over the course of frequent compiler upgrades. On the other hand, although Go does not promise any compatibility across versions for code that imports unsafe
, in practice, two forces work against Go doing this:
-
Go does not attempt to define what is and isn’t allowed:
unsafe
lacks any operational semantics. -
Go prioritizes not breaking the ecosystem; this allows to assume that Hyrum’s Law will protect certain observable behaviors of the runtime, from which we may infer what can or cannot break easily.
This is in contrast to a high-performance native compiler like LLVM, which has a carefully defined boundary around all UB, allowing them to arbitrarily break programs that cross it (mostly) without fear of breaking the ecosystem.
So, let’s dive in and cheat death.
What Are We Building?
Our goal is to build an arena, which is a data structure for efficient allocation of memory that has the same lifetime. This reduces pressure on the general-purpose allocator by only requesting memory in large chunks and then freeing it all at once.
For a comparison in Go, consider the following program:
package main
import "fmt"
func main() {
var s []int
for i := range 1000 {
prev := cap(s)
s = append(s, i)
if cap(s) != prev {
fmt.Println(cap(s))
}
}
}
This program will print successive powers of 2: this is because append
is implemented approximately like so:
func append[S ~[]T, T any](a, b S) S {
// If needed, grow the allocation.
if cap(a) - len(a) < len
8 Comments
silisili
I've been doing some performance tuning in Go lately to really squeak performance, and ended up with a very similar arena design except using byte slices for buf and chunks instead of unsafe pointers. I think I tried that too and it wasn't any faster and a whole lot uglier, but I'll have to double check before saying that with 100% confidence.
A couple other easy wins –
if you start with a small slice and find some payloads append large amounts, write your own append that preemptively is more aggressive in cap bumping before calling the builtin append.
unsafe.String is rather new and great for passing strings out of byte slices without allocating. Just read the warnings carefully and understand what you're doing.
mholt
Off topic, but I love the minimap on the side — for pages where I might be jumping around the content (long, technical articles, to refer back to something I read earlier but forgot) — how can I get that on my site? Way cool.
kristianp
Just a quick meta note. This article is really lengthy, I don't have time to read this level of detail for the background. For example the "Mark and Sweep" section takes up more than 4 pages on my laptop screen. That section starts more than 5 pages into the article. Is this the result of having AI help to write sections, and as a result making it too comprehensive? It's easy to generate content, but the editing decisions to keep the important parts haven't been made. I just want to know the part about the Arena allocator, I don't need a tutorial on garbage collection as well.
foundry27
tl;dr for anyone who may be put off by the article length:
OP built an arena allocator in Go using unsafe to speed allocator operations up, especially for cases when you're allocating a bunch of stuff that you know lives and dies together. The main issue they ran into is that Go's GC needs to know the layout of your data (specifically, where pointers are) to work correctly, and if you just allocate raw bytes with unsafe.Pointer, the GC might mistakenly free things pointed to from your arena because it can't see those pointers properly. But to make it work even with pointers (as long as they point to other stuff in the same arena), you keep the whole arena alive if any part of it is still referenced. That means (1) keeping a slice (chunks) pointing to all the big memory blocks the arena got from the system, and (2) using reflect.StructOf to create new types for these blocks that include an extra pointer field at the end (pointing back to the Arena). So if the GC finds any pointer into a chunk, it’ll also find the back-pointer, therefore mark the arena as alive, and therefore keep the chunks slice alive. Then they get into a bunch of really interesting optimizations to remove various internal checks and and write barriers using funky techniques you might not've seen before
mkhattab
> Go prioritizes not breaking the ecosystem; this allows to assume that Hyrum’s Law will protect certain observable behaviors of the runtime, from which we may infer what can or cannot break easily.
If this assertion is correct, then effectively Go as a language is an evolutionary dead end. Not sure if I would Go fascinating in this case.
curtisszmania
[dead]
fmstephe
This article is a fun read.
If you enjoyed this, or if you need more control over some memory allocations in Go, please have a look at this package I wrote. I would love to have some feedback or have someone else use it.
https://github.com/fmstephe/memorymanager
It bypasses the GC altogether by allocating its own memory separately from the runtime. It also disallows pointer types in allocations, but replaces them with a Reference[T] type, which offers the same functionality. Freeing memory is manual though – so you can't rely on anything being garbage collected.
These custom allocators in Go tend to be arena's intended to support groups of allocations which live and die together. But the offheap package was intended to build large long-lived datastructures with zero garbage collection cost. Things like large in-memory caches or databases.
0xCafeBabee
Interesting stuff! For folks building off-heap or arena-style allocators in Go—how do you usually test or benchmark memory safety and GC interactions in practice?