This repo is still very early and rough; it’s mostly just notes, speculative writing, and exploratory theorem proving. Most of the files here are “mad scribblings” that I haven’t refined enough to actually stand by!
If you prefer video, this presentation talks about the core ideas that make formal verification and Magmide possible, and the design goals and intentions of the project:
In this readme I give a broad overview and answer a few possible questions. Enjoy!
The goal of this project is to create a programming language capable of making formal verification and provably correct software practical and mainstream. The language and its surrounding education/tooling ecosystem should provide a foundation strong enough to create verified software for any system or environment.
Software is an increasingly critical component of our society, underpinning almost everything we do. It’s also extremely vulnerable and unreliable. Software vulnerabilities and errors have likely caused humanity trillions of dollars in damage, social harm, waste, and lost growth opportunity in the digital age (it seems clear Tony Hoare’s estimate is way too conservative, especially if you include more than `null` errors).
What would it look like if it was both possible and tractable for working software engineers to build and deploy software that was provably correct? Using proof assistant languages such as Coq it’s possible to define logical assertions as code, and then write proofs of those assertions that can be automatically checked for consistency and correctness. Systems like this are extremely powerful, but have only been suited for niche academic applications until the fairly recent invention of separation logic.
Separation logic isn’t a tool, but a paradigm for making logical assertions about mutable and destructible state. The Rust ownership system was directly inspired by separation logic, which shows us that it really can be used to unlock revolutionary levels of productivity and excitement. Separation logic makes it possible to verify things about practical imperative code, rather than simply outlawing mutation and side effects as is done in functional languages.
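To make that connection concrete, here is a minimal Rust sketch (my illustration, not taken from the Magmide materials) of how ownership echoes separation logic: moving a value into a function grants it exclusive access to that state, just as a separation logic assertion grants exclusive ownership of a disjoint region.

```rust
// Ownership as exclusive access: once `data` is moved into `consume`,
// no other code may touch it, mirroring separation logic's ownership
// of a disjoint piece of mutable state.
fn consume(v: Vec<i32>) -> i32 {
    v.iter().sum()
}

fn main() {
    let data = vec![1, 2, 3];
    let total = consume(data); // ownership of `data` moves here
    // println!("{}", data.len()); // rejected by the compiler: `data` was moved
    assert_eq!(total, 6);
}
```

The compiler statically rejects any later use of `data`, which is exactly the kind of “this state is no longer yours” reasoning that separation logic makes formal.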
However Rust only exposes a simplified subset of separation logic, rather than exposing the full power of the paradigm. The Iris separation logic was recently created by a team of academics to fully verify the correctness of the Rust type system and several core implementations that use `unsafe`. Iris is a fully powered separation logic, making it uniquely capable of verifying the kind of complex, concurrent, arbitrarily flexible assertions that could be implied by practical Rust code, even those that use `unsafe`. Iris could be used to do the same for any other practical and realistic language.
Isn’t that amazing?! A system that can prove, completely and eternally, that a use of `unsafe` isn’t actually unsafe! You’d think the entire Rust and systems programming community would be over the moon!
But as so often happens in academia, Iris is only being used to write papers rather than to create tools that apply it to real software. All existing uses of Iris perform the proofs “on the side”, using transcribed syntax tree versions of the target code rather than directly reading the original source. And although the Iris papers are more approachable than most academic papers, they’re still academic papers, so basically no working engineers have even heard of any of this.
This is why I’m building Magmide, which is intended to be to both Coq and LLVM what Rust has been to C. There are quite a few proof languages capable of proving logical assertions in code, but none exist that are specifically designed to be used by working engineers to build real imperative programs. None have placed a full separation logic, particularly one as powerful as Iris, at the heart of their design, but instead are overly dogmatic about the pure functional paradigm. None make it possible to type-check, prove assertions about, and compile practical programs all using one unified tool. And all existing proof languages are hopelessly mired in the obtuse and unapproachable fog of research debt created by the culture of academia. Even if formal verification is already capable of producing provably safe and secure code, it isn’t good enough if only professors have the time to gain the necessary expertise. We need to pull all this amazing knowledge out of the ivory tower and finally put it to work to make computing truly safe and robust.
I strongly believe a world with mainstream formal verification would see not only a significant improvement in the magnitude of social good produced by software, but a significant improvement in its kind. In the same way that Rust gave engineers much more capability to safely compose pieces of software, thereby enabling them to confidently build much more ambitious systems, a language that gives them the ability to automatically check arbitrary conditions will make safe composition and ambitious design arbitrarily easier to do correctly.
What kinds of ambitious software projects have been conceived but not pursued because getting them working would simply be too difficult? With machine checkable proofs in many more hands could we finally build truly secure operating systems, trustless networks, or electronic voting methods? How many people could be making previously unimagined contributions to computer science, mathematics, and even other logical fields such as economics and philosophy if only they had approachable tools to do so? I speculate about some possibilities at the end of this readme.
Read `posts/design-of-magmide.md` or `posts/comparisons-with-other-projects.md` to more deeply understand how Magmide is designed and how it differs from other projects.
This is a huge goal, and a language capable of achieving it must have all these qualities:
## Capable of arbitrary logic
In order to really deliver the kind of truly transformative correctness guarantees that will inspire working engineers to learn and use a difficult new language, it doesn’t make sense to stop short and only give them an “easy mode” verification tool. It should be possible to formalize and attempt to prove any proposition humanity is capable of representing logically, not only those that a fully automated tool like an SMT solver can figure out. A language with full logical expressiveness and manual proofs can still use convenient automation as well, but the opposite isn’t true.
To achieve this goal, the language will be fully dependently typed and use the Calculus of Constructions, much like Coq. I find Adam Chlipala’s “Why Coq?” arguments convincing in regard to this choice. Coq will also be used to bootstrap the first version of the compiler, allowing it to be self-hosting and even self-verifying using a minimally small trusted theory base. Read more about the design and bootstrapping plan in `posts/design-of-magmide.md`. The MetaCoq and “Coq Coq Correct!” projects have already done the work of formalizing and verifying Coq using Coq, so they will be very helpful.
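As a small taste of what dependent typing in this family of type theories makes possible, here is an illustrative Lean sketch (Lean shares Coq’s type-theoretic lineage; this is not Magmide syntax):

```lean
-- A length-indexed list: the length is part of the type, so
-- "head of an empty vector" is a type error, not a runtime crash.
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons : α → Vec α n → Vec α (n + 1)

-- Only accepts vectors the type system knows are non-empty.
def head : Vec α (n + 1) → α
  | .cons a _ => a

-- Propositions are types, and proofs are machine-checked terms:
theorem add_zero_example (n : Nat) : n + 0 = n := rfl
```

The same machinery that types `Vec` also expresses arbitrary logical propositions, which is why a fully dependently typed core can serve both programming and proving.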
It’s absolutely possible for mainstream engineers to learn and use these powerful logical concepts. The core ideas of formal verification (dependent types, proof objects, higher-order logic, separation logic) aren’t actually that complicated. They just haven’t ever been properly explained because of research debt, and they weren’t even all that practical before separation logic and Iris. I’ve been working on better explanations in the (extremely rough and early) `posts/intro-verification-logic-in-magmide.md` and `posts/coq-for-engineers.md`.
## Capable of bare metal performance
Software needs to perform well! Not all software has the same requirements, but performance is often intrinsically tied to correct execution. The software that most needs to be correct frequently also most needs to perform well. If the language is capable of truly bare metal performance, it can still choose to create easy abstractions that sacrifice performance where that makes sense.
To achieve this goal, the language will include in its core libraries a formalization of the basic principles of von Neumann computation, allowing users to specify the axiomatic assumptions of any software execution environment, from concrete instruction set architectures, to any abstract assembly language such as LLVM capable of compiling to many targets, and even up to operating system userlands or bytecode environments such as WebAssembly. Making it possible to specify software at this level of fidelity helps ensure it is aligned with reality and isn’t making unrealistic assumptions.
Because of separation logic and Iris, it is finally possible to verify code at this extremely granular level.
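The kind of execution-environment formalization described above can be sketched in a few lines. Here is a toy Lean model of a von Neumann machine; every name here is a hypothetical illustration, not Magmide’s actual core library:

```lean
-- A toy von Neumann machine: a register file and flat memory.
structure Machine where
  regs : Nat → Nat
  mem  : Nat → Nat

-- A minimal instruction set.
inductive Instr where
  | load  (dst addr : Nat)  -- dst ← mem[addr]
  | store (src addr : Nat)  -- mem[addr] ← regs[src]
  | add   (dst a b : Nat)   -- dst ← regs[a] + regs[b]

-- Small-step semantics as a pure function over machine states,
-- the axiomatic base that proofs about real code would build on.
def step (m : Machine) : Instr → Machine
  | .load dst addr  => { m with regs := fun r => if r = dst then m.mem addr else m.regs r }
  | .store src addr => { m with mem  := fun a => if a = addr then m.regs src else m.mem a }
  | .add dst a b    => { m with regs := fun r => if r = dst then m.regs a + m.regs b else m.regs r }
```

A real formalization would model a concrete ISA or an abstract target like LLVM or WebAssembly, but the shape is the same: states, instructions, and a step relation that proofs can reason about.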
## Gradually verifiable
Just because it’s possible to fully verify all code doesn’t mean it should be required. It simply isn’t practical to completely rewrite a legacy system in order to verify it. Successful languages with goals of increased rigor, such as Rust and TypeScript, strategically use concessions in the language such as `unsafe` and `any` to allow more rigorous code to coexist with legacy code as it’s incrementally replaced. The only problem is that these concessions introduce genuine soundness gaps into the language, and it’s often difficult or impossible to really understand how exposed your program is to these safety gaps.
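A small Rust example of such a concession (a standard pattern, not specific to this project): inside an `unsafe` block the compiler simply trusts an unverified claim, and nothing in the type system records that trust.

```rust
// `unsafe` lets rigorous code coexist with unchecked code, but the
// safety obligation (`i < v.len()`) silently becomes the caller's problem.
fn element_unchecked(v: &[i32], i: usize) -> i32 {
    // Nothing here verifies the bound; a bad `i` is undefined behavior.
    unsafe { *v.get_unchecked(i) }
}

fn main() {
    let v = [10, 20, 30];
    assert_eq!(element_unchecked(&v, 1), 20);
    // `element_unchecked(&v, 99)` would compile just as happily and be
    // undefined behavior at runtime: the soundness gap is invisible.
}
```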
We can get both practical incremental adoption and complete understanding of the current safety of our program by leveraging work done in the Iron obligation management logic built using Iris. We can use a concept of trackable effects to allow some safety conditions to be optional, so a project can successfully compile code without proving everything about it is safe, making fast iteration and incremental adoption possible.
Trackable effects will work by requiring a piece of some “correctness token” to be forever given up in order to perform a dangerous operation without justifying its safety with a proof. This would infect the violating code block with an effect type that will bubble up through any parent blocks. Defining effects in this way makes them completely composable resources rather than wrappers, meaning that they’re more flexible and powerful than existing effect systems. Systems like algebraic effects or effect monads could be implemented using this resource paradigm, but the opposite isn’t true.
If the trackable effect system is defined in a sufficiently generic way then custom trackable effects could be created, allowing different projects to introduce new kinds of safety and correctness tracking, such as ensuring asynchronous code doesn’t block the executor, or a web app doesn’t render raw untrusted input, or a server doesn’t leak secrets.
Even if a project chooses to ignore some effects, they’ll always know those effects are there, which means that other possible users of the project will know as well. Project teams could choose to fail compilation if their program isn’t memory safe or could panic, while others could tolerate some possible effects or write proofs to assert they only happen in certain well-defined circumstances. It would even be possible to create code that provably sandboxes an effect by ensuring it can’t be detected at any higher level if contained within the sandbox. With all these systems in place, we can finally have a genuinely secure software ecosystem!
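The token idea above can be loosely approximated in today’s Rust with marker types in signatures. This is only a hypothetical sketch of the concept; the names (`Pure`, `MayPanic`, `Tracked`) are illustrative and not a real Magmide or Iron API, and a genuine resource-based effect system would be far more powerful:

```rust
use std::marker::PhantomData;

// Hypothetical effect markers (illustrative names only).
struct Pure;
struct MayPanic;

// A value tagged with the effects incurred while computing it.
struct Tracked<T, E> {
    value: T,
    _effect: PhantomData<E>,
}

// Justified version: the danger is handled, so the result stays Pure.
fn div_checked(a: i32, b: i32) -> Tracked<Option<i32>, Pure> {
    Tracked { value: a.checked_div(b), _effect: PhantomData }
}

// Unjustified version: performing the dangerous operation "spends"
// purity, and the MayPanic effect appears in the signature.
fn div_unchecked(a: i32, b: i32) -> Tracked<i32, MayPanic> {
    Tracked { value: a / b, _effect: PhantomData }
}

// A caller that uses the unjustified version inherits the effect:
// the violation bubbles up through every parent signature.
fn average_per(total: i32, count: i32) -> Tracked<i32, MayPanic> {
    div_unchecked(total, count)
}

fn main() {
    assert_eq!(div_checked(10, 2).value, Some(5));
    assert_eq!(average_per(10, 2).value, 5);
}
```

In the real design the effect would be a composable logical resource rather than a wrapper type, so it could be split, sandboxed, or discharged by a proof instead of merely propagated.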
## Fully reusable
We can’t write all software in assembly language! Including first-class support for powerful metaprogramming, alongside a query-based compiler, will allow users of this language to build verified abstractions that “combine upward” into higher levels, while still allowing the possibility for those higher levels to “drop down” back into the lower levels. Being a proof assistant, these escape hatches don’t have to be unsafe, as higher level code can provide proofs to the lower level to justify its actions.
This ability to create fully verifiable higher level abstractions means we can create a “verification pyramid”, with excruciatingly verified software forming a foundation for a spectrum of software that decreases in importance and rigor. Not all software has the same constraints, and it would be dumb to verify a recipe app as rigorously as a cryptography function. But even a recipe app would benefit from its foundations removing the need to worry about whole classes of safety and soundness conditions.
Magmide itself doesn’t have to achieve mainstream success to massively improve the quality of all downstream software; it’s enough for some sub-language built on it to succeed. Many engineers have never heard of LLVM, but they still implicitly rely on it every day. Magmide would seek to do the same. We don’t have to make formal verification fully mainstream; we just have to make it available to the handful of people willing to do the work. If a full theorem prover is sitting right below the high-level language you’re currently working in, you don’t have to bother with it most of the time, but you still have the option when it makes sense.
The metaprogramming can of course also be used directly in the dependently typed language, allowing compile-time manipulation of proofs, functions, and data. Verified proof tactics, macros, and higher-level embedded programming languages are all possible. This is the layer where absolutely essential proof automation tactics similar to Coq’s `auto` or Adam Chlipala’s `crush`, or fast counter-example searchers such as `quickcheck`, would be implemented.
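To illustrate the counter-example-search idea, here is a tiny hand-rolled Rust sketch in the spirit of `quickcheck` (the real library generates random inputs and shrinks failing cases; this exhaustive version just shows the concept):

```rust
// Search for the smallest input that violates a property.
fn find_counterexample(prop: impl Fn(u64) -> bool, bound: u64) -> Option<u64> {
    (0..bound).find(|&n| !prop(n))
}

fn main() {
    // A false conjecture: "n squared is always under 1000".
    // The search refutes it with the smallest witness, 32 (32*32 = 1024).
    assert_eq!(find_counterexample(|n| n * n < 1000, 10_000), Some(32));

    // A conjecture the search cannot refute within the bound.
    assert_eq!(find_counterexample(|n| n + 1 > n, 10_000), None);
}
```

A failed search is not a proof, but finding a concrete counterexample in milliseconds saves you from attempting to prove a false proposition, which is why such tools complement manual proof so well.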
Importantly, the language will be self-hosting, so metaprogramming functions will benefit from the same bare metal performance and full verifiability.
You can find rough notes about the current design thinking for the metaprogramming interface in `posts/design-of-magmide.md`.
## Practical and ergonomic
My experience using languages like Coq has been extremely painful; the interface is “more knife than handle”. I’ve been astounded at how willing academics seem to be to use extremely clunky workflows and syntaxes just to avoid having to build better tools.
To achieve this goal, this project will learn heavily from `cargo` and other excellent projects. It should be possible to verify, interactively prove, query, compile, and run any Magmide code with a single tool.
An important design choice that will likely make the tool easier to understand and use is the “Logic/Host split” (discussed deeply in `posts/design-of-magmide.md`), which clearly separates “imaginary” pure functional code from the “real” imperative computational code.
## Taught effectively
Working engineers are resource constrained and don’t have years of free time to wade through arcane and disconnected academic papers. Academics aren’t incentivized to properly explain and expose their amazing work, and a massive amount of research debt has accrued in many fields, including formal verification.
To achieve this goal, this project will enshrine the following values in regard to teaching materials:
- Speak to a person who wants to get something done and not a review committee evaluating academic merit.
- Put concrete examples front and center.
- Point the audience toward truly necessary prerequisites rather than assuming shared knowledge.
- Prefer graspable human words to represent ideas; never use opaque and unsearchable non-ASCII symbols, and only use symbolic notation when it’s both truly useful and properly explained.
- Prioritize the hard work of finding clear and distilled explanations.
How do we design something capable of meeting all the above design constraints? By choosing three pillar features that nicely interlock with each other and then “maxing them out” to the most powerful versions of themselves, we can build a tool capable of achieving our goal with the smallest amount of effort possible. These pillars hold up the entire design:
- Maxed out in logical power by using the strongest type theory we can find. Without this, the design wouldn’t be able to handle lots of interesting/useful/necessary problems and couldn’t be adopted by many teams. It wouldn’t be able to act as a true foundation for verified computing. It’s important to note that this one design feature enables all the other features involving formal logic, such as the ability to formalize bare metal computation at a high level of abstraction and the use of trackable effects.
- Maxed out in computational power by self-hosting in a bare metal language. If the language were interpreted or garbage collected then it would always perform worse than is strictly