$ cargo build --example basic --features usdt-probes
[...snip...]
error: could not compile `dropshot`
Caused by:
  process didn't exit successfully: `rustc [...snip...]` (signal: 11, SIGSEGV: invalid memory reference)
Achievement unlocked: rustc segfault.
Stack trace
fffffc7fce3fcbc0 librustc_driver-77cef3efbfa7284c.so`llvm::BranchProbabilityInfo::computeEestimateBlockWeight(llvm::Function const&, llvm::DominatorTree*, llvm::PostDominatorTree*)+0xd84()
fffffc7fce3fd370 librustc_driver-77cef3efbfa7284c.so`llvm::BranchProbabilityInfo::calculate(llvm::Function const&, llvm::LoopInfo const&, llvm::TargetLibraryInfo const*, llvm::DominatorTree*, llvm::PostDominatorTree*)+0x131()
fffffc7fce3fd3c0 librustc_driver-77cef3efbfa7284c.so`llvm::BranchProbabilityAnalysis::run(llvm::Function&, llvm::AnalysisManager&)+0x134()
fffffc7fce3fd5f0 librustc_driver-77cef3efbfa7284c.so`llvm::detail::AnalysisPassModel::Invalidator>::run(llvm::Function&, llvm::AnalysisManager &)+0x2f()
fffffc7fce3fd6a0 librustc_driver-77cef3efbfa7284c.so`llvm::AnalysisManager ::getResultImpl(llvm::AnalysisKey*, llvm::Function&)+0x2de()
fffffc7fce3fd6d0 librustc_driver-77cef3efbfa7284c.so`llvm::BlockFrequencyAnalysis::run(llvm::Function&, llvm::AnalysisManager &)+0x3f()
fffffc7fce3fd710 librustc_driver-77cef3efbfa7284c.so`llvm::detail::AnalysisPassModel::Invalidator>::run(llvm::Function&, llvm::AnalysisManager &)+0x26()
fffffc7fce3fd7c0 librustc_driver-77cef3efbfa7284c.so`llvm::AnalysisManager ::getResultImpl(llvm::AnalysisKey*, llvm::Function&)+0x2de()
fffffc7fce3fdc30 librustc_driver-77cef3efbfa7284c.so`llvm::AlwaysInlinerPass::run(llvm::Module&, llvm::AnalysisManager&)+0xa2c()
fffffc7fce3fdc50 librustc_driver-77cef3efbfa7284c.so`llvm::detail::PassModel>::run(llvm::Module&, llvm::AnalysisManager &)+0x15()
fffffc7fce3fddc0 librustc_driver-77cef3efbfa7284c.so`llvm::PassManager>::run(llvm::Module&, llvm::AnalysisManager &)+0x4b5()
fffffc7fce3ff170 librustc_driver-77cef3efbfa7284c.so`LLVMRustOptimizeWithNewPassManager+0x7f2()
fffffc7fce3ff3a0 librustc_driver-77cef3efbfa7284c.so`rustc_codegen_llvm::back::write::optimize_with_new_llvm_pass_manager+0x372()
fffffc7fce3ff5b0 librustc_driver-77cef3efbfa7284c.so`rustc_codegen_llvm::back::write::optimize+0x388()
fffffc7fce3ff900 librustc_driver-77cef3efbfa7284c.so`rustc_codegen_ssa::back::write::execute_work_item::+0x1f3()
fffffc7fce3ffdb0 librustc_driver-77cef3efbfa7284c.so`std::sys_common::backtrace::__rust_begin_short_backtrace::< ::spawn_named_thread::{closure#0}, ()>::{closure#0}, ()>+0xf7()
fffffc7fce3fff60 librustc_driver-77cef3efbfa7284c.so`<::spawn_unchecked_<::spawn_named_thread::{closure#0}, ()>::{closure#0}, ()>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}+0xa9()
fffffc7fce3fffb0 libstd-ef15f81a900bedf3.so`std::sys::unix::thread::Thread::new::thread_start::h24133bfe318082b5+0x27()
fffffc7fce3fffe0 libc.so.1`_thrp_setup+0x6c(fffffc7fed642280)
fffffc7fce3ffff0 libc.so.1`_lwp_start()
Ok, so it looks like we’re faulting somewhere in LLVM. From Cliff’s initial investigation:
Anyway, yeah, something about the CFG construction there is generating either an empty basic block or a basic block ending in an unexpected type of instruction (something that is not an LLVM IR terminator instruction) and triggering https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/BasicBlock.h#L121
First order of business then is to just check if the IR is valid. LLVM has a pass to do just that and we can ask rustc to run it first by passing -Z verify-llvm-ir=yes (note we need to switch to nightly to use -Z flags):
$ RUSTFLAGS="-Z verify-llvm-ir=yes" cargo +nightly build --example basic --features usdt-probes
Haha, nope:
Basic Block in function '_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE' does not have terminator!
label %bb24
LLVM ERROR: Broken module found, compilation aborted!
# Demangle w/ rustfilt (c++filt works well enough too)
# Single quotes important to not misinterpret $ as shell vars!
$ rustfilt '_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE'
dropshot::server::http_request_handle_wrap::{{closure}}
The IR generated for a closure in dropshot::server::http_request_handle_wrap is invalid: some basic block is missing a terminator.
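As a side note on the quoting warning above: the legacy Rust mangling scheme spells `{` as `$u7b$` and `}` as `$u7d$`, so inside double quotes the shell would try to expand pieces such as `$u7b` and `$$` before rustfilt ever saw them. The sketch below decodes just those two escapes with sed to show where `{{closure}}` comes from. It is an illustration only, and `decode_braces` is a made-up helper name; rustfilt handles the full scheme.

```shell
#!/bin/sh
# Fragment of the mangled symbol from the verifier error.
# Single quotes keep the shell from expanding $u7b, $closure, $$ and friends.
frag='28_$u7b$$u7b$closure$u7d$$u7d$'

# Hypothetical helper: decode only the two brace escapes
# ($u7b$ is '{' and $u7d$ is '}', i.e. the hex code points 0x7b and 0x7d).
decode_braces() {
    printf '%s\n' "$1" | sed -e 's/\$u7b\$/{/g' -e 's/\$u7d\$/}/g'
}

decode_braces "$frag"   # prints 28_{{closure}}
```

The leading `28_` is the length prefix of the path component, which rustfilt also strips.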
Ok, is rustc generating the bad IR directly, or is it the result of some transformation pass miscompiling it?
But first, let’s cheat and just get the final failing rustc command so we don’t need to rebuild all the deps anytime we change RUSTFLAGS. Re-running the failing cargo command should just output the failing rustc invocation:
$ cargo +nightly build --example basic --features usdt-probes
   Compiling dropshot v0.6.1-dev (/src/dropshot/dropshot)
error: could not compile `dropshot`
Caused by:
  process didn't exit successfully: `rustc [...snip...]` (signal: 11, SIGSEGV: invalid memory reference)
From this point we can just directly run the rustc command as printed, with a few modifications:
- add +nightly, otherwise the rustc wrapper will attempt to use the Rust version mentioned in rust-toolchain.toml
- remove the --error-format=json and --json=... flags for human-readable output
- add -Z verify-llvm-ir=yes
- change the --emit argument to --emit=llvm-ir because that should be enough to trigger the issue and we’d like to look at the IR later
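The four edits above are mechanical enough to sketch as a small shell function. This is only an illustration: `rewrite_rustc_args` is a made-up name, and the argument list in the example call is a hypothetical, heavily shortened stand-in for the real invocation that cargo prints. The function prints the rewritten command one argument per line instead of running it.

```shell
#!/bin/sh
# Sketch: apply the listed edits to a rustc argument list (hypothetical helper).
rewrite_rustc_args() {
    # Use the nightly toolchain instead of the one from rust-toolchain.toml.
    printf '%s\n' rustc +nightly
    for arg in "$@"; do
        case "$arg" in
            --error-format=json|--json=*) ;;            # drop machine-readable output flags
            --emit=*) printf '%s\n' --emit=llvm-ir ;;   # emit LLVM IR only
            *) printf '%s\n' "$arg" ;;                  # keep everything else as-is
        esac
    done
    # Ask rustc to run LLVM's IR verifier.
    printf '%s\n' -Z verify-llvm-ir=yes
}

# Hypothetical, shortened argument list; the real one comes from cargo's error output.
rewrite_rustc_args --crate-name basic --error-format=json \
    --json=diagnostic-rendered-ansi --emit=dep-info,link examples/basic.rs
```

In practice it is just as easy to make these edits by hand once and paste the result into a script.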
Stick this in a simple shell script to easily modify it and run it; call it repro.sh. Verify it still fails as expected:
$ ./repro.sh
Basic Block in function '_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE' does not have terminator!
label %bb24
LLVM ERROR: Broken module found, compilation aborted!
Now back to figuring out where this invalid IR is coming from. Even though we’re doing a debug build, there are still some LLVM passes that get run. So if we want to verify the IR that rustc directly generated, we need to make sure no LLVM passes are run at all (aside from the verify pass itself). The way to do that is via -C no-prepopulate-passes so let’s edit our repro.sh and run it again:
Ok, rustc has been proven innocent. Looks like some LLVM pass generates invalid IR, which really shouldn’t happen!
Well, now what? Let’s try to find out what pass is responsible!
Our first attempt is to ask LLVM to print the IR after each pass; maybe we’ll get lucky and see the offending pass last. We do this by modifying repro.sh again:
- remove -C no-prepopulate-passes and -Z verify-llvm-ir=yes
- add -C llvm-args=--print-after-all to print the IR after every pass
- add -C codegen-units=1 -Z no-parallel-llvm to make the output a bit more readable
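With those flags in place, the idea is to capture the dump and look at the last pass banner in it. A minimal sketch, assuming the dump goes to stderr and that each banner contains the text `IR Dump After` (the exact banner format varies between LLVM versions); `last_dumped_pass` and the log file name are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the last "IR Dump After" banner in a captured log.
last_dumped_pass() {
    grep 'IR Dump After' "$1" | tail -n 1
}

# Intended use (repro.sh is the script from earlier; passes.log is an arbitrary name):
#   ./repro.sh 2> passes.log
#   last_dumped_pass passes.log
```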
Alas, this doesn’t go the way we want as we get the same segfault as before without any of the actual output we wanted :(
Ok, new attempt. Let’s skip rustc and see if we can just invoke the LLVM machinery directly via opt. For that, let’s first install it:
$ rustup component add --toolchain nightly llvm-tools-preview
It is not the most discoverable because it just gets plopped somewhere into rustc’s sysroot directory:
$ OPT=$(find $(rustc +nightly --print sysroot) -name opt)
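If the component is missing, the find above silently leaves OPT empty and later commands fail in confusing ways. A slightly more defensive variant might look like the sketch below; `find_tool` is a made-up helper name, and it simply takes the first match under the given directory.

```shell
#!/bin/sh
# Hypothetical helper: locate a tool by name under a directory, printing the first match.
find_tool() {
    find "$1" -type f -name "$2" | head -n 1
}

# Intended use, with a guard so a missing component is reported up front:
#   OPT=$(find_tool "$(rustc +nightly --print sysroot)" opt)
#   [ -x "$OPT" ] || echo "opt not found; is llvm-tools-preview installed?" >&2
```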
We also need the actual IR to pass to opt, so let’s go back and modify our repro.sh to only pass -C no-prepopulate-passes. We should find our initial rustc-generated IR. It’s also worth removing -C debuginfo=2 to make the IR a bit smaller:
$ ls ./target/debug/examples/basic*.ll
./target/debug/examples/basic-5f5f0491fbb5b7d3.ll
Let’s try something simple first and just run the IR through opt without any flags as a smoke test:
$ $OPT ./target/debug/examples/basic-5f5f0491fbb5b7d3.ll
opt: ./target/debug/examples/basic-5f5f0491fbb5b7d3.ll:425470:1: error: expected instruction opcode
bb25: