It may not seem like it, but the world dodged a big one on March 28, 2024. A software security hack was identified and shared with the public that was absolutely breathtaking not only for its potential for harm but for the method of its insertion into the software ecosystem. It cannot be said we dodged the big one because, as will be explained below, the nature of this attack points out a much larger vulnerability in the entire software ecosystem stemming from the world's over-reliance upon open-source software frameworks and the world's underfunding of support for those open-source frameworks. Leaving that gap unaddressed guarantees more security flaws leveraging these weak points will be encountered in the future.
The XZ Utils Vulnerability - In a Nutshell
The security vulnerability publicly identified on March 28, 2024 involved a module of code called XZ Utils ("utilities") built for Linux based operating systems which provides functions for performing data compression / decompression on streams of data as a host reads or writes data from disk or relays it from one external source to another destination. The XZ module was maliciously altered to allow a remote third party to trigger the capture of data passing through the XZ module in unencrypted form and allow remote execution of code. Since the XZ module is used by many other programs running on a system, including remote logins via SSH and file compression utilities used to pack files to send to remote systems, this hack served as a common lever to exploit against multiple tools on the host.
At the time of discovery, the hack was found to have only propagated into nightly "development builds" and pre-release branches of a handful of Linux distributions, including SUSE, Red Hat, Fedora, Debian and Arch Linux. For the hack to be exploited, a host had to be running one of the tainted Linux distributions and the host would have to be reachable for inbound connections from the Internet. Luckily, most large corporations do NOT run the latest version of ANY system in production and MOST hosts operate within protected interior network segments which are NOT accessible from any random point on the Internet. As such, the nature of this hack did not immediately jeopardize the integrity of 100% of all Linux servers on the planet.
However...
The DESIGN of the hack and the method used to inject it into builds of the module chosen for the attack have exposed two huge problems the software industry has largely failed to confront. The XZ hack utilized a technical design that recursively obfuscated the malicious code being inserted and delayed pulling that altered code into final binaries until very late in the build process, helping to evade detection. More importantly, the developers of the hack exploited unique HUMAN vulnerabilities in the open-source model for maintaining code. Those vulnerabilities are the result of extreme gambles mega corporations are taking by USING open source to save money and earn billions for investors without FUNDING staff to provide the peer review needed by open-source code. As a result, the open-source world is demonstrating "tragedy of the commons" dynamics as private parties attempt to leverage a community "good" while contributing nothing to its support and betting someone else is, so they get a free ride.
The Technical Attack Design
The party that developed the XZ hack did not inject the hack into XZ by directly modifying the source code of the particular targeted module. That approach would have put the change in clear view of both the module's official "maintainer" and the larger pool ("community") of developers and users. Instead, injection of the hack was obfuscated by at least four layers of indirection.
- Bogus test files consisting of compressed data containing the actual "hacked" code were added to a separate /test directory of the entire module. Every file in a module's source is subject to version control and delta tracking but developers are accustomed to ignoring the content of test files.
- This bogus "test file" was actually structured to alternate between 1024 bytes of random characters and 1024 bytes of actual code.
- A script used to prepare the source code directory for a build was altered to uncompress this "test file", strip out the sub-blocks of random characters and write the remaining code to a separate file for execution later in the build.
- After compiling the original unaltered source code for the XZ module, a final step in the build tested for certain platform target flags (kernel version, x86_64 architecture and GCC compiler) then linked the XZ module being packaged into binary through the ALTERED file containing the malicious code.
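To make the second and third layers concrete, here is a minimal sketch of the alternating-block layout described above. It is an illustration only, not the attacker's script: the 1024-byte block size is taken from the description in this article, the function names are hypothetical, the "payload" is a harmless string, and the compression layer is left out entirely.

```python
import os

BLOCK = 1024  # block size as described in the text; an assumption for this sketch


def interleave(payload: bytes, block: int = BLOCK) -> bytes:
    """Disguise payload by putting a block of random filler before each payload block."""
    out = bytearray()
    for i in range(0, len(payload), block):
        out += os.urandom(block)        # filler that makes the file look like noise
        out += payload[i:i + block]     # the real content, one block at a time
    return bytes(out)


def deinterleave(blob: bytes, block: int = BLOCK) -> bytes:
    """Recover the payload by skipping every filler block."""
    out = bytearray()
    for i in range(0, len(blob), 2 * block):
        out += blob[i + block:i + 2 * block]
    return bytes(out)


# Harmless stand-in for the hidden content.
message = b"echo hello from a hidden script\n" * 100
disguised = interleave(message)
recovered = deinterleave(disguised)

print(len(message), len(disguised), recovered == message)
```

The point of the sketch is how little code the "strip out the filler" step needs. A few lines buried in a build script are enough to turn a file that looks like random test data back into something executable.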
That's a pretty ingenious approach for evading detection. Mask the hacked source code as compressed test data outside the normal "source" directory of the module. Further mask that compressed, hacked code content by intermixing blocks of random text to make that "test file" LOOK like random data rather than code if anyone ever happened to look at it. Compile all of the original code of the module into the expected final building blocks. Pull in the hacked code at the very last stage of the build when linking the final executable binary.
As ingenious as the TECHNICAL design might be, that technical ingenuity is not what is causing developers and security experts to freak out over this hack.
The Human Attack Design
As clever as the code-level hack design may have been, the party originating this hack still had to INJECT the layers of the hack into the XZ source repository for victims to begin downloading it and building it into other systems. Each code project may have dozens or hundreds of developers with rights to check in suggested modifications but every project has a much smaller number of developers with "maintainer" rights to control which changes actually make it into a final build. Most projects have at most a handful of maintainers. The XZ Utils project had a SINGLE maintainer.
The hackers began interacting with the XZ maintainer in early 2022, at first posing as a willing developer named "Jia Tan" eager to "help" catch up with minor bug fixes and enhancement changes. "Jia Tan's" first change was checked into the code base on February 2, 2022. On May 19, 2022, user "Dennis Ens" (no way to know if the name is real) posed a question on the email forum used to manage the project following up on a question posed one week earlier. In the resulting chain of emails, another user "Jigar Kumar" (again, no way to know if the name is real) chimed in, browbeating the maintainer and suggesting the maintainer role be handed over to someone else with more time to keep up with requests.
These emails went back and forth from May 19 to June 29, 2022 at which point the maintainer stated he might consider assigning maintainer rights to "Jia Tan" who had been working on bugs and minor enhancements since 2022 though it wasn't clear if he was versed in the entire build. Between June 29, 2022 and mid 2023, the maintainer DID cede some maintainer rights to "Jia Tan". On June 28, 2023, the first change was added to the code ostensibly in unit test code. On July 8, 2023, a second change was made to disable the logic of the first change when performing certain types of test builds. On February 16, 2024, the project was altered to configure its version control system to stop tracking any changes to the script file that performs the last minute linking substitution. Finally, on March 9, 2024, the obfuscated and compressed script was added as a "data file" in the /test portion of the build and release 5.6.0 was published representing the full hack. All maintained by "Jia Tan."
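One defensive check suggested by this timeline is to compare what a published release actually contains against what the version control system tracks. The sketch below is a hedged illustration of that idea, not a tool used by any distribution: the function names and the file name `build-helper.m4` are hypothetical, and the two directories are created on the fly purely for demonstration.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_hashes(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            hashes[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def compare_trees(repo_dir: Path, release_dir: Path) -> dict[str, list[str]]:
    """Report files that exist only in the release or differ from the repo."""
    repo = tree_hashes(repo_dir)
    release = tree_hashes(release_dir)
    return {
        "only_in_release": sorted(set(release) - set(repo)),
        "changed": sorted(p for p in release if p in repo and release[p] != repo[p]),
    }


# Demonstration with two throwaway directories standing in for a repository
# checkout and an unpacked release archive.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp, "repo")
    release = Path(tmp, "release")
    for d in (repo, release):
        (d / "src").mkdir(parents=True)
        (d / "src" / "main.c").write_text("int main(void) { return 0; }\n")
    # Hypothetical file that ships in the release but is not tracked in the repo.
    (release / "build-helper.m4").write_text("# present only in the release archive\n")
    report = compare_trees(repo, release)

print(report)
```

Release archives often legitimately contain generated files that are not in the repository, so a report like this is a list of things for a human to review, not proof of tampering. That caveat is exactly why an untracked build script can hide in plain sight.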
In essence, the party that created this hack found a useful module that would provide useful access to multiple critical functions on hosts, found that module was maintained by a SINGLE PERSON, began interacting with the maintainer as a "contributor" to build up trust over the course of two years, then began harassing the maintainer under multiple identities over a period of months to convince the maintainer to cede control to a contributor who was actually one of the bad guys. And the rest of the internet world dependent upon the reliability and security of this module paid ZERO attention to control of this software being ceded to someone whose actual identity was never confirmed.
Avoidance by Luck Rather Than Design
Given the technical nature of the hack and the human angle used to initially inject it into the supply chain, there were no steps in the process that provided assurance that this vulnerability would be found before it was introduced into the much larger universe of Linux distributions. In reality, it was PURE LUCK that this vulnerability was found at all.
The hack was found when Andres Freund, an engineer working for Microsoft (yes, a MICROSOFT employee helped find a catastrophic bug with LINUX software...), was testing software completely unrelated to the XZ module and noticed that a host running his software was not performing as expected. The engineer enabled additional monitoring on all processes running on his host and that information showed that incoming SSH connections were consuming more CPU resources than expected. Knowing that the software being tested was NOT a heavy user of SSH connections in the first place, the developer reviewed the SSHD process of the host and found it was spending extra time in calls to a library named liblzma provided by the XZ Utils package. The developer then noticed the XZ Utils module running on his system was not unique to his Linux distro but had been altered in the original maintainer repository where it might be pulled into MANY different Linux distribution builds. The developer reviewed that code, analyzed where the new library module originated, saw how that file was being created by the script extracted from uncompressing the "test file", then uncovered the hack.
Freund happened to find this because one of his test machines was loaded with a "nightly build" image of one of the targeted Linux distributions. Had he not undertaken this investigation, that same 5.6.0 release could have been folded into the next "Generally Available" release builds of dozens of Linux distributions in two or three weeks, rapidly widening the scope of the vulnerability.
Keep in mind, the developer who found this...
- wasn't responsible for maintaining or testing XZ Utils
- found the hack in a nightly image of his Linux distribution and might have ignored it as a temporary bug common to nightly builds versus "Generally Available" stable builds
- happened to have enough insight to notice a problem with SSH
- happened to have enough skills to reverse engineer the behavior of XZ Utils
- happened to have enough skills to reverse engineer the build script for XZ Utils
- was diligent enough to spend the TIME to trace the entire chain to find the hack
In other words, Freund's skillset is NOT typical of most developers using open source modules in their work. Most developers are not intimately familiar with the build scripts of every module they use. Most developers are NOT experts at monitoring low level performance statistics of software such as memory, CPU load, open file handles, etc. to know what's normal and abnormal. Most importantly, most developers are conditioned to assume that if they are pulling external modules into their build from a trusted repository and if a SHA-256 checksum of the downloaded package matches the published checksum from the owner of the module, then the module is "safe" and unaltered. As long as the module builds locally without error, the module will be assumed safe.
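It helps to see how little a checksum comparison actually establishes. The following minimal sketch uses Python's standard `hashlib` module; the byte string standing in for a downloaded archive and the helper names are assumptions made for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_published(data: bytes, published_hex: str) -> bool:
    """True if the downloaded bytes hash to the checksum the publisher posted."""
    return sha256_hex(data) == published_hex.strip().lower()


# Stand-in for a downloaded release archive.
package = b"pretend this is a downloaded release archive"
# Stand-in for the checksum posted by whoever published the release.
published = sha256_hex(package)

print(matches_published(package, published))          # unmodified download
print(matches_published(package + b"x", published))   # altered in transit
```

A match only proves the download is byte-for-byte what the publisher released. If the person publishing the release is the attacker, the checksum faithfully vouches for the malicious content too.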
What's Wrong with This Model?
The current model for open-source software is the result of two different rationales that have influenced adherents for fifty years. The older rationale for open-source was more idealistic and started with an assumption that software was a nearly pure reflection of human thought and since humans benefit from an unrestricted sharing of ideas, humans should benefit from an unrestricted sharing of code as well. Sharing code FOR FREE reduces the cost of useful software, shares useful programming techniques to encourage the development of MORE useful software and creates a positive feedback loop that benefits everyone. For ease of reference, we'll call this the "hippy" rationale.
A newer rationale, not entirely in conflict with the first, evolved in parallel with the first that pitched open-source software as the best means to improve the QUALITY of software by allowing more people to look at frequently used code and improve it by identifying and fixing defects and by devising additional functionality and helping implement those new functions. This rationale will be labeled the "transparency" rationale for open source.
Adherents to the "transparency" rationale for open source prefer it to proprietary, closed-source code because they believe the POSSIBILITY of more eyes reviewing code that has been publicly shared discourages innocent but sloppy code from being included in the distribution and allows ALL changes to be placed under a collective microscope of a pool of developers much larger than the core contributors to spot unintentional or intentional bad code. With proprietary source, in contrast, it is generally difficult to analyze the intended and actual behavior of a compiled program written by a small set of developers as part of a corporate product without access to the source code. More importantly, declaring code open source allows other developers to use the code within other projects and alter the code as needed for those projects. This attracts MORE interest in the code and provides a strong incentive for more outsiders to continually review the state of the code watching out for defects, security flaws or outright hacks.
At first, the "transparency" rationale for open-source can seem more practical than the original "hippy" rationale. Professionals in the field generally feel less silly citing the "transparency" rationale for open source than the "hippy" rationale, especially when lobbying for use of open source with their bosses. But in a modern business setting, the transparency rationale can be equally delusional for developers and managers alike, and far more dangerous to actual code quality. Why?
The transparency rationale is built upon a series of assumptions that are almost never all true for any given open-source module:
- The thousands of developers USING an open source module are seldom as versed in the module's design or the language in which it is written as its maintainers and contributors. The idea that the average developer can review the average open source project and provide useful critiques of potential problems is a fantasy.
- Those developers in the user community who DO possess the expertise to provide a sanity check on the code's quality / security do not necessarily have the time or incentive to actually perform such reviews and communicate concerns to the maintainer. If the code works for the user's intended use case and compiles successfully, generally no user will raise questions about the module. Surely, someone ELSE checked it, right?
- Merely publishing code as open source does not guarantee even tens or hundreds of people will find value in the code to USE it or attempt to REVIEW it to identify defects or suspect functionality.
- Publishing code as open source DOES guarantee that bad actors will have access to the code. Bad actors may actually spend time reviewing the code looking for defects to be exploited but their goal isn't to help the maintainer fix the problem, their goal is to track how the module gets used in larger projects that might be running in a target such as a large corporation or government entity.
- The fact that thousands of open-source projects are managed in GitHub which has been mined as training data for ChatGPT makes it that much easier for bad actors to use information about a generic type of programming defect suitable for exploit to find open-source projects exhibiting that defect pattern to begin targeting a hack on those projects.
- Not all open source projects have multiple maintainers and contributors to act as security checks on any one contributor adding nefarious modifications. In fact, the number of other open source projects USING a particular open source module has absolutely ZERO correlation with the number of maintainers and contributors. Some of the most widely used and critical open source packages have a SINGLE maintainer. Like XZ Utils.
That last bullet is the key takeaway for this entire exploit. The XZ Utils package is used by dozens of other utilities that are bundled with EVERY distribution of Linux which run on tens of millions of hosts, yet the XZ Utils package had exactly ONE maintainer for years, handling all bug fix and feature enhancement requests. The party creating the hack EXPLICITLY targeted this module BECAUSE of its use by tools performing security critical functions on a host and BECAUSE the code was maintained by a single person.
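The mismatch between how widely a module is used and how few people control it is something an organization could check for in its own dependency inventory. The sketch below shows one way to flag such packages. Everything in it is hypothetical: the package names, the counts, and the threshold are made up for illustration, and a real audit would need to pull this data from actual project and package metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    maintainers: int   # people with rights to approve and release changes
    dependents: int    # other packages that pull this one in


def single_points_of_failure(packages, min_dependents=10):
    """Return widely used packages with at most one maintainer, most used first."""
    risky = [p for p in packages
             if p.maintainers <= 1 and p.dependents >= min_dependents]
    return sorted(risky, key=lambda p: p.dependents, reverse=True)


# Made-up inventory for illustration only.
inventory = [
    Package("compress-lib", maintainers=1, dependents=40),
    Package("web-framework", maintainers=12, dependents=300),
    Package("date-helper", maintainers=1, dependents=3),
    Package("tls-wrapper", maintainers=1, dependents=95),
]

for pkg in single_points_of_failure(inventory):
    print(f"{pkg.name}: {pkg.maintainers} maintainer, {pkg.dependents} dependents")
```

A list like this does not say a package is compromised. It says where a single tired or pressured person stands between an attacker and everything downstream.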
This XZ exploit is NOT an argument for abandoning open source and reverting to proprietary code. This exploit IS a wakeup call to governments and mega corporations around the world that open source software is not free software, even if no money is paid for licensing to use it. Entities benefiting from open source need to internally fund more staff with appropriate skills or fund participation in shared forums for ensuring EVERY project embedded in every application or operating system has multiple, independent experts analyzing code integrity. Entities creating entire applications or operating systems using open source for commercial distribution must be accountable for verifying the full names, addresses, contact information and government / corporate affiliation of every maintainer of every open source module included in their product. None of this has been happening prior to March 28, 2024.
WTH