Supply Chain Threats 101
The recent vulnerability found in the xz library puts the spotlight on a topic that often receives less attention than it should: threats to the software supply chain. As usual, when we are not handling something well, the first thing to do is to start from the beginning and understand the real issue. In this article I will not talk about the xz vulnerability itself, but about what it represents.
What we mean by “supply chain”
When we talk about the supply chain in software development, we are referring to all the processes and components that contribute to delivering the software application. It’s clearly not a physical movement of goods, but rather the flow of information, code, libraries, and tools:
- Code: the core of the software, consisting of custom code written to address business requirements;
- Dependencies: all the external components like libraries, plugins, and open-source code that our software relies on to function;
- Tools: the tools and processes used to automate building, testing, deploying, and monitoring the software. This also includes all the tools used to manage the project, or to enable communication between developers;
- People: of course, developers, engineers, testers, and other professionals who work together to design, build, and maintain the software are part of the whole supply chain;
- Documentation: clear and concise documentation ensures that everyone involved understands the code, dependencies, and processes.
With such a broad set of components, the attack surface is equally broad. Consequently, our development process has many potential weak points.
Supply chain threats
Once we know the threats we may face, we can act to mitigate and prevent attacks. A standard classification and description of these threats is offered by Supply-chain Levels for Software Artifacts, also known as SLSA (read “salsa”). SLSA is a collaborative security framework led by a vendor-neutral steering group (the list includes Cloud Native Computing Foundation, The Linux Foundation, Google, Intel) committed to improving the security ecosystem for everyone, and it is part of the Open Source Security Foundation.
SLSA is a set of incrementally adoptable guidelines for supply chain security. It includes common definitions, checklists, best practices, and many suggestions for improving the overall security of our software. These guidelines define supply chain threats by grouping them into three areas and labeling them with letters, from A to H, as depicted below.
The three areas are:
- Source threats, where we can find threats related to the source code;
- Build threats, where the problem can be in the build or in the package phase of the development process;
- Dependency threats, where we can have issues in our dependencies.
Let’s now get into details for each of these areas.
Source threats
In the Source threats area, we find all those situations where the source code is somehow compromised:
- A – Submission of unauthorized code: this happens, for example, when code is submitted bypassing code review. By “bypassing”, we mean any way of getting malicious code past the review step, including, for example, stealing a reviewer’s credentials and using them to log into the system and approve the unauthorized changes;
- B – Source repository is compromised: this is the case where the attacker compromises the source control system itself. In one known incident of this type, the attacker compromised PHP’s self-hosted git server and was potentially able to change the source code arbitrarily;
- C – Build from code that was modified after the actual code review: an attacker could easily achieve this, for example, after compromising the source repository, making any mitigation of threat A meaningless.
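Threats A and C can be expressed as two simple policy checks before a build starts. The following is only a minimal sketch, not something prescribed by SLSA: the `Change` record, its fields, and the `policy_violations` function are hypothetical names introduced here for illustration, and a real system would read this information from the code review tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical record of a reviewed change (illustration only)."""
    author: str
    reviewed_commit: str                      # commit hash the reviewers approved
    approvers: frozenset = frozenset()        # accounts that approved the change


def policy_violations(change: Change, build_commit: str) -> list:
    """Return the list of policy violations for building `build_commit`."""
    violations = []

    # Threat A: an approval coming only from the author is not an
    # independent review.
    independent_approvers = change.approvers - {change.author}
    if not independent_approvers:
        violations.append("no independent approval")

    # Threat C: the commit being built must be exactly the one reviewed.
    if build_commit != change.reviewed_commit:
        violations.append("build commit differs from reviewed commit")

    return violations


# Usage with made-up values.
change = Change(author="alice", reviewed_commit="abc123",
                approvers=frozenset({"bob"}))
print(policy_violations(change, "abc123"))   # reviewed commit, independent approver
print(policy_violations(change, "def456"))   # commit changed after review
```

Note that a check like this does not help when the reviewer’s credentials are stolen, as in the example for threat A, because the approval then looks legitimate to the system.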
Build threats
After the threats to the source code come the ones affecting the build and deploy phases. Note that the attacker, at this stage, may already have compromised the source code.
- E – Compromise of the build process: this happens when the build platform is compromised. Note that the entry point may not be the build platform itself. In one of my previous experiences, I found that an attacker had compromised a blogging platform by going through a vulnerable e-learning platform that I was hosting on the same web server. The attacker then compromised executables at the OS level, so that any call to usually harmless commands led to data leaks;
- F – Upload of a modified package: the attacker gains access to the storage system where packages are kept and can then upload a malicious version of a package. As when the attacker submits malicious code and approves it, the attack here targets the final package directly. Again, the root cause is very often leaked credentials;
- G – Compromise of the package registry: similar to threat E, but here the attacker compromises the package registry platform. In 2008, a team of researchers from the University of Arizona ran mirrors for several popular package registries and found that systems using those registries were actively pulling from the mirrors. An attacker could have used this approach to serve malicious packages;
- H – Use of a compromised package: at this stage, the malicious code or package is available for use, and once someone uses it, the attack is already underway. An example is the Browserify typosquatting attack: the attacker created a package that imitated the legitimate one, using a very similar name. This is very difficult to catch without appropriate mitigation practices.
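To give an idea of what a mitigation for threat H might look like, here is a minimal sketch that flags package names suspiciously close to a list of known packages. It uses `difflib.SequenceMatcher` from the Python standard library. The function name, the allowlist, the `0.85` threshold, and the misspelled name `browserfy` are all assumptions made up for this example; they are not the names involved in the real incident.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def closest_known_package(name: str, known: Iterable[str],
                          threshold: float = 0.85) -> Optional[str]:
    """Return the known package that `name` closely resembles, or None.

    An exact match returns None, because that is the legitimate package.
    """
    candidate = name.lower()
    known_names = sorted({k.lower() for k in known})
    if candidate in known_names:
        return None

    best_name, best_ratio = None, threshold
    for legit in known_names:
        # ratio() is 1.0 for identical strings and lower for dissimilar ones.
        ratio = SequenceMatcher(None, candidate, legit).ratio()
        if ratio >= best_ratio:
            best_name, best_ratio = legit, ratio
    return best_name


# Hypothetical allowlist and requested names.
KNOWN = ["browserify", "express", "lodash"]
for requested in ["browserify", "browserfy", "left-pad"]:
    lookalike = closest_known_package(requested, KNOWN)
    if lookalike is not None:
        print(f"{requested!r} looks like {lookalike!r}: review before installing")
```

A similarity check is only a heuristic: the threshold trades false alarms against missed lookalikes, and it cannot tell whether a similarly named package is actually malicious.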
Dependency threats
Last but not least, there are dependencies:
- D – Use of a compromised dependency: when one of the previous threats is not correctly addressed and the malicious package ends up being used, it may be a dependency of another system. That system is then attacked through a component it considers trusted. This is the case of xz: the attacker used threat A to inject malicious code. The code could then activate a malicious package, injected using threat F. Once the compromised package was released, anyone could have used it as a dependency.
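One common way to reduce exposure to a swapped or tampered dependency is to pin each dependency to the digest of the artifact that was vetted, and to refuse anything else. The sketch below assumes a hypothetical in-memory mapping of package names to SHA-256 digests; the names `sha256_hex`, `verify_dependency`, and `examplelib` are illustrative only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_dependency(name: str, data: bytes, pinned: dict) -> bool:
    """Check a downloaded artifact against its pinned digest.

    `pinned` maps package names to expected SHA-256 hex digests.
    Dependencies without a pin are rejected.
    """
    expected = pinned.get(name)
    if expected is None:
        return False
    return sha256_hex(data) == expected.lower()


# Usage with made-up artifact contents.
vetted_artifact = b"contents of the vetted release"
pins = {"examplelib": sha256_hex(vetted_artifact)}

print(verify_dependency("examplelib", vetted_artifact, pins))
print(verify_dependency("examplelib", b"contents changed upstream", pins))
```

A pin only guarantees that we receive the same bytes we recorded earlier. It says nothing about whether those bytes were trustworthy in the first place, so it complements the review of a dependency rather than replacing it.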
Additional threats by Google
In addition to the threats that SLSA lists as possible sources of supply chain vulnerabilities, Google also considers further attack vectors, which I think are very common and important:
- 1 – Writing insecure code: how many times? If I think back to when I started writing code, I can recall a lot of times when I’ve written insecure code, unintentionally. Training is definitely important, in this case, but we can also add systems that perform checks for us;
- 2 – Compromise the deployment process: similar to threat E, but this time it is the whole deployment mechanism that is compromised. For example, a compromised system could inject a malicious sidecar container into one of our pods;
- 3 – Deployment of compromised or non-compliant software: as for threat H, this is the final stage, when the attack takes effect if we have a compromised deployment process;
- 4 – Vulnerabilities and misconfigurations in running software: in some cases, it’s the configuration that is malicious or vulnerable, and can be used as an attack vector. One example? How often do we actually set a non-root user in our Dockerfiles? We should: it is a best practice.
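The non-root example can be turned into a simple automated check. The sketch below scans the text of a Dockerfile and reports whether the final stage would run as root. It is deliberately simplified: it assumes that a base image runs as root unless a `USER` instruction says otherwise, and it ignores line continuations and build arguments. The function names and the image names in the usage example are hypothetical.

```python
from typing import Optional


def final_user(dockerfile_text: str) -> Optional[str]:
    """Return the last USER value of the final build stage, or None."""
    user = None
    for raw_line in dockerfile_text.splitlines():
        parts = raw_line.strip().split(None, 1)
        if not parts:
            continue
        instruction = parts[0].upper()
        if instruction == "FROM":
            # A new stage starts from its base image, so earlier USER
            # instructions no longer apply.
            user = None
        elif instruction == "USER" and len(parts) == 2:
            user = parts[1].strip()
    return user


def runs_as_root(dockerfile_text: str) -> bool:
    """Return True if the final stage appears to run as root."""
    user = final_user(dockerfile_text)
    if user is None:
        return True  # assumption: the base image defaults to root
    # USER accepts "name", "uid", "name:group" or "uid:gid".
    name = user.split(":", 1)[0]
    return name in ("root", "0")


# Usage with a made-up Dockerfile.
dockerfile = """\
FROM example/base
RUN useradd --create-home app
USER app
CMD ["./server"]
"""
print(runs_as_root(dockerfile))
```

A check like this could run in a pipeline before the image is built, so that the misconfiguration is caught before it reaches running software.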
Next steps: mitigation
We’ve seen which threats we can face when dealing with the supply chain (that is, basically, always). We’ve also seen some real-world examples of these threats. What now? How do we address these threats, and how do we mitigate them?
SLSA provides a set of use cases, principles and suggested mitigations for each of the threats. In the following articles, I will talk about them, also providing reference implementations on Google Cloud Platform.