Aria Beingessner
November 7th, 2019
For those who don’t follow Swift’s development, ABI stability has been one of its most ambitious projects and possibly its defining feature, and it finally shipped in Swift 5. The result is something I find endlessly fascinating, because I think Swift has pushed the notion of ABI stability farther than any language without much compromise.
So I decided to write up a bunch of the interesting high-level details of Swift’s ABI. This is not a complete reference for Swift’s ABI, but rather an abstract look at its implementation strategy. If you really want to know exactly how it allocates registers or mangles names, look somewhere else.
Also for context on why I’m writing this, I’m just naturally inclined to compare the design of Swift to Rust, because those are the two languages I have helped develop. Also some folks like to complain that Rust doesn’t bother with ABI stability, and I think looking at how Swift does it helps elucidate why that is.
This article is broken up into two sections: background and details. Feel free to skip to the details if you’re very comfortable with the problems inherent to producing a robust dynamically linked system interface.
If you aren’t comfortable with the basic concepts of type layouts, ABIs, and calling conventions, I recommend reading the article I wrote on the basic concepts of type layout and ABI as they pertain to Rust.
Also huge thanks to the Swift devs for answering all of the questions I had and correcting my misunderstandings!
1.1 Swift TLDR
I know a lot of people don’t really follow Swift, and it can be hard to understand what they’ve really accomplished without some context of what the language is like, so here’s a TL;DR of the language’s shape:
- Exists to replace Objective-C on Apple’s platforms, oriented at application development
  - natively interoperates with Objective-C
  - has actual classes and inheritance
- At a distance, very similar to Rust (but “higher-level”)
  - interfaces, generics, closures, enums with payloads, unsafe escape hatch
  - no lifetimes; Automatic Reference Counting (ARC) used for complex cases
  - simple function-scoped mutable borrows (inout)
  - Ahead-Of-Time (AOT) compiled
- An emphasis on “value semantics”
  - structs/primitives (“values”) are “mutable xor shared”, stored inline
  - collections implement value semantics by being Copy-On-Write (CoW) (using ARC)
  - classes are mutably shared and boxed (using ARC), undermining value semantics (can even cause data races)
- An emphasis on things Just Working
  - language may freely allocate to make things Work
  - generic code may be polymorphically compiled
  - fields may secretly be getter-setter pairs
  - ARC and CoW can easily result in surprising performance cliffs
  - tons of overloading and syntactic sugar
Don’t worry about fully understanding all of these, we’ll dig into the really important ones and their implications as we go on.
1.2 What Is ABI Stability and Dynamic Linking
When the Swift developers talk about “ABI Stability” they have exactly one thing in mind: they want native system APIs for MacOS and iOS to be written in Swift, and for you to dynamically link to them. This includes dynamically linking to a single system-wide copy of the Swift Standard Library.
Ok so what’s dynamic linking? For our purposes it’s a system where you can compile an application against some abstract description of an interface without providing an actual implementation of it. This produces an application that on its own will not work properly, as part of its implementation is missing.
To run properly, it must tell the system’s dynamic linker about all of the interfaces it needs implementations for, which we call dynamic libraries (dylibs). Assuming everything goes right, those implementations get hooked up to the application and everything Just Works.
Dynamic linking is very important to system APIs because it’s what allows the system’s implementation to be updated without also rebuilding all the applications that run on it. The applications don’t care about what implementation they get, as long as it conforms to the interface they were built against.
It can also significantly reduce a system’s memory footprint by making every application share the same implementation of a library (Apple cares about this a lot on its mobile devices).
Since Swift is AOT compiled, the application and the dylib both have to make a bunch of assumptions on how to communicate with the other side long before they’re linked together. These assumptions are what we call ABI (an Application Binary Interface), and since it needs to be consistent over a long period of time, that ABI better be stable.
So dynamic linking is our goal, and ABI stability is just a means to that end.
For our purposes, an ABI can be regarded as 3 things:
If you can define these details and never break them, you have a stable ABI, and dynamic linking can be performed. (Ignoring trivial cases where both the dylib and application were built together and ABI stability is irrelevant.)
Now to be clear, ABI stability isn’t technically a property of a programming language. It’s really a property of a system and its toolchain. To understand this, let’s look at history’s greatest champion of ABI stability and dynamic linking: C.
All the major OSes make use of C for their dynamically linked system APIs. From this we can conclude that C “has” a stable ABI. But here’s the catch: if you compile some C code for dynamic linking on Ubuntu, that compiled artifact won’t work on MacOS or Windows. Heck, even if you compile it for 64-bit Windows it won’t work on 32-bit Windows!
Why? Because ABI is something defined by the platform. It’s not even something that necessarily needs to be documented. The platform vendor can just require you to use a particular compiler toolchain that happens to implement their stable ABI.
(As it turns out, this is actually the reality of Swift’s Stabilized ABIs on Apple platforms. They’re not actually properly documented, Xcode just implements them and the devs will do their best not to break them. They’re not opposed to documenting them, it’s just a lot of work and shipping was understandably higher-priority. Thankfully I don’t really care about the details, or the difference between the ABIs on MacOS and iOS, or implementations other than Apple’s, so I can keep saying “Swift’s ABI” and it won’t be a problem.)
But if that’s the case, why don’t platform vendors provide stable ABIs for lots of other languages? Well it turns out that the language isn’t completely irrelevant here. Although ABI isn’t “part” of C itself, it is relatively friendly to the concept. Many other languages aren’t.
To understand why C is friendly to ABI stability, let’s look at its much less friendly big brother, C++.
Templated C++ functions cannot have their implementations dynamically linked. If I provide you with a system header that provides the following declaration, you simply can’t use it:
template <typename T>
bool process(T value);
This is because it has no symbol. C++ templates are monomorphically compiled, which is a fancy way of saying that the way to use them is to copy-paste the implementation with all the templates replaced with a particular value.
So if I want to call process<int>, I need to have the implementation available to copy-paste it with int replacing T. Needing to have the implementation available at compile-time completely undermines the concept of dynamic linking.
Now perhaps the platform could make a promise that it has precompiled several monomorphic instances, so say symbols for process<int> and process<bool> are available. You could make that work, but then the function wouldn’t really be meaningfully templated anymore, as only those two explicitly blessed substitutions would be valid.
There would be little difference from simply providing a header containing:
bool process(int value);
bool process(bool value);
Now a header could just include the template’s implementation, but what that would really be guaranteeing is that that particular implementation will always be valid. Future versions of the header could introduce new implementations, but a robust system would have to assume applications could be using either, or perhaps even both at the same time.
This is no different from a C macro or inline function, but I think it’s fair to say that templates are a little more important in C++.
For comparison, most platforms provide a dynamically linked version of the C standard library, and everyone uses it. On the other hand, C++’s standard library isn’t very useful to dynamically link to; it’s literally called the Standard Template Library!
In spite of this issue (and many others), C++ can be dynamically linked and used in an ABI-stable way! It’s just that it ends up looking a lot more like a C interface due to the limitations.
Idiomatic Rust is similarly hostile to dynamic linking (it also uses monomorphization), and so an ABI-stable Rust would also end up only really supporting C-like interfaces. Rust has largely just embraced that fact, focusing its attention on other concerns.
1.3 Swift’s Stable ABI
I have now made some seemingly contradictory claims:
- Swift has similar features to Rust
- Rust’s features make it hostile to dynamic linking
- Swift is great at dynamic linking
The secret lies in where the two languages diverge: dynamism. Rust is a very static and explicit language, reflecting the sensibilities of its developers and early adopters. Swift’s developers preferred a much more dynamic and implicit design, and so that’s what they made.
As it turns out, hiding implementation details and doing more work at runtime is really friendly to dynamic linking. Who’d’ve thought dynamic linking was dynamic?
But what’s really interesting about Swift is the ways it’s not dynamic.
It’s actually fairly trivial to dynamically link a system where all the implementation details are hidden behind uniformity and dynamism. In the extreme case, we could make a system where everything is an opaque pointer and there’s only one function that just sends things strings containing commands. Such a system would have a very simple ABI!
And indeed, in the 90’s there was a big push in this direction with Microsoft embracing COM and Apple embracing Objective-C as ways to build system interfaces with simple and robust ABIs.
But Swift didn’t do this. Swift tries its hardest to generate code comparable to what you would expect from Rust or C++, and how it accomplishes that is what makes its ABI so interesting.
It’s worth noting that the Swift devs disagree with the Rust and C++ codegen orthodoxy in one major way: they care much more about code sizes (as in the amount of executable code produced). More specifically, they care a lot more about making efficient usage of the CPU’s instruction cache, because they believe it’s better for system-wide power usage. Apple championing this concern makes a lot of sense, given their suite of battery-powered devices.
It’s harder for third party developers to care about this, as they will naturally only control some small part of the software running on a device, and typical benchmarking strategies don’t really capture “this change made your application run faster but is making some background services less responsive and hurting battery life”. Hence C++ and Rust inevitably pushing towards “more code, more fast”.
This is all to say that some things which seem like compromises made for ABI stability’s sake are genuinely just regarded as desirable.
I never got any great concrete numbers on this concern from the Swift or Foundation folks, would definitely love to see some! Waves at the Apple employees reading this.
1.4 Resilience and Library Evolution
The Swift developers cover this topic fairly well in their documentation. I’ll just be giving a simplified vers