Pre-Build Stage Security: Securing Your Supply Chain Before a Single Line Compiles

Attackers do not wait for your code to compile. They strike at the supply chain—your dependencies, your base images, your repositories—before your build ever starts. Pre-build security determines what code even gets the chance to execute.

graph TB
    Source["Source Code<br/>Git Repository"]
    Deps["Dependencies<br/>npm, PyPI<br/>Maven, etc"]
    Base["Base Images<br/>Docker Hub<br/>registries"]
    Source -->|Verify| SourceCheck["Check commits<br/>Verify authors<br/>Scan code"]
    Deps -->|Verify| DepsCheck["Check versions<br/>Verify hashes<br/>Scan packages"]
    Base -->|Verify| BaseCheck["Verify digest<br/>Check for CVEs<br/>Scan layers"]
    SourceCheck --> Combined["Pre-build<br/>Artifacts"]
    DepsCheck --> Combined
    BaseCheck --> Combined
    Combined -->|All verified| Build["Build Can Start<br/>✅ Supply chain secure"]
    Combined -->|Something wrong| Halt["Build Halted<br/>❌ Risk blocked"]
    style Build fill:#ccffcc
    style Halt fill:#ffcccc

The Supply Chain Before the Build

Your build starts with three ingredients: source code, dependencies, and base images. Each enters your pipeline from external sources. Each can be poisoned.

Your Git repository provides source code through commits, branches, and tags. Package registries (npm, PyPI, Maven, Go modules, and custom registries) supply dependencies. These two sources feed into lockfile state, which is recorded in files like package-lock.json, go.sum, and Pipfile.lock. The lockfile state is critical because it determines exactly which versions will be used.

From this lockfile state, two parallel validation processes occur. Source verification ensures commits are properly signed, maintainers are trustworthy, and code is properly reviewed. Dependency validation verifies cryptographic hashes, ensures versions are pinned rather than using ranges, and scans for license conflicts.

Both of these validation streams converge on base image selection. Here you verify the registry source is trusted, pin images by digest rather than tag, and scan images for known vulnerabilities.

Without controls at this stage, malicious code enters your build environment with full privileges.

Source Code Risks: When Your Repository Is Compromised

Your Git repository is an attack target, and attackers gain access through multiple vectors. Maintainer account takeover occurs when a developer's GitHub credentials are stolen, allowing an attacker to commit malicious code directly to main. Your next CI/CD run pulls the poisoned code. A closely related incident hit the PHP project in March 2021: attackers pushed two backdoored commits to the php-src repository on the project's self-hosted Git server, forged under the names of two core maintainers. The commits were caught and reverted before they reached a release.

Malicious pull requests involve an attacker submitting a PR with legitimate-looking changes that hide malicious behavior. The change might be in a build script, in a CI/CD configuration file, or in a dependency update. Code reviewers miss it.

Forked repository attacks occur when you integrate code from a fork (a developer's personal GitHub account) that later proves to be compromised. The fork was never the source of truth; the upstream repository was. But your CI/CD pulled from the fork.

Compromised release tags happen when an attacker with write access to your repository pushes a tag that points to a commit containing malicious code. Semantic versioning v1.2.3 no longer means what you think it means.

Lack of commit signing is a serious risk. Unsigned commits do not prove authorship: anyone can author a commit under any name by setting git config user.name and git config user.email. Signed commits, made with GPG or SSH keys, show that the commit was created by someone holding the signing key, which your Git host can tie to a verified account.
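
As a rough illustration of checking signatures before a build, the sketch below parses the output of git log --format="%H %G?", where %G? is Git's signature-status placeholder (G for a good signature, N for none, B for bad, and so on). The function names and the decision to accept only G are assumptions made for this example, not a prescribed policy.

```python
import subprocess

# Signature status codes printed by git for the %G? placeholder.
# This sketch accepts only "G" (good signature); that choice is an assumption.
ACCEPTED_STATUSES = {"G"}


def find_untrusted_commits(log_output: str) -> list[tuple[str, str]]:
    """Return (commit_hash, status) pairs whose signature status is not accepted.

    Expects one commit per line in the form "<hash> <status>".
    """
    untrusted = []
    for line in log_output.splitlines():
        line = line.strip()
        if not line:
            continue
        commit_hash, _, status = line.partition(" ")
        if status not in ACCEPTED_STATUSES:
            untrusted.append((commit_hash, status))
    return untrusted


def read_signature_log(rev_range: str) -> str:
    """Run git to collect signature statuses for a revision range."""
    result = subprocess.run(
        ["git", "log", "--format=%H %G?", rev_range],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A CI step could call find_untrusted_commits(read_signature_log("origin/main..HEAD")) and stop the pipeline if the list is non-empty. The runner needs the signers' public keys available, otherwise Git reports E (cannot be checked) rather than G.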

Source control hardening requires multiple complementary controls.

Branch protection rules require pull request reviews before merging to main, enforce that status checks such as CI and security scanning pass before merging, and require that all commits be signed with cryptographic keys.

Commit signing enforcement uses GPG or SSH keys to sign commits, with the Git host configured to reject unsigned commits on protected branches.

Access control limits write access to the repository, uses role-based access control to separate levels of privilege, and requires MFA for all accounts with write access.

Audit log monitoring watches for unexpected access, branch deletions, or permission changes that might indicate a compromise.

Code review requires at least one, preferably two, human reviews before merging, with reviewers trained to look for supply chain attacks hidden in build scripts and CI/CD configurations.

Dependency Risks: The Code You Did Not Write

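You download code from registries every time you declare a dependency. That code runs with your privileges. In most applications the bulk of the shipped code comes from dependencies rather than first-party code, and much of the known-vulnerability exposure comes with it.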

Typosquatting attacks occur when an attacker publishes a package with a name similar to a popular library: lodash-es becomes loadsh-es. Developers mistype the package name in package.json. The malicious package installs and executes arbitrary code at install time (via postinstall scripts in npm).
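
One way to catch near-miss names before install is to compare each declared package against a list of names you expect to see. The sketch below uses plain Levenshtein edit distance; the function names, the distance threshold of 2, and the idea of maintaining a "well-known" list are all assumptions for illustration, and real tooling uses richer signals.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def possible_typosquats(declared, well_known, max_distance=2):
    """Return (declared_name, similar_known_name) pairs that look like near-misses."""
    known = set(well_known)
    suspects = []
    for name in declared:
        if name in known:
            continue  # an exact match with an expected name is fine
        for candidate in sorted(known):
            if edit_distance(name, candidate) <= max_distance:
                suspects.append((name, candidate))
    return suspects
```

For example, possible_typosquats(["loadsh-es"], ["lodash-es"]) flags the misspelling from the paragraph above, because swapping two adjacent letters costs two edits. Short package names produce many false positives at this threshold, so treat the result as a prompt for human review rather than a verdict.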

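Dependency confusion happens when your organization publishes an internal package, for example acme-payment-processor, to a private registry. An attacker publishes a package with the same name to the public npm registry with a higher version number. If the build resolves that name against both registries, resolution can prefer the higher version, and the public package installs instead. Scoped names such as @acme/payment-processor help only if your organization owns the @acme scope on the public registry and the client is configured to map that scope to the private registry.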

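Compromised popular packages result when an attacker gains control of a widely used package through account takeover or by taking over the project from a burnt-out maintainer. The attacker publishes a poisoned version, and every build that resolves to it pulls the malicious code. Real examples include event-stream (2018, roughly 2 million weekly downloads, handed to a new maintainer who added a malicious dependency) and ua-parser-js (2021, roughly 7 to 8 million weekly downloads, published from a hijacked maintainer account). A related failure mode is maintainer sabotage: in January 2022 the maintainer of colors.js and faker.js published protest releases that flooded console output (colors.js) or removed the library's functionality (faker.js).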

Unlocked dependency versions occur when a package.json declares "lodash": "^4.17.0" instead of "lodash": "4.17.21". The caret allows any version from 4.17.0 up to (but not including) 5.0.0. If a new version 4.17.50 is released and contains malicious code, any build that resolves dependencies without an enforced lockfile pulls it automatically.
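
A pre-build check for version ranges can be sketched in a few lines. The function name and the decision to inspect only dependencies and devDependencies are assumptions for this example; the regular expression accepts exact semantic versions and treats everything else (carets, tildes, comparison operators, tags, URLs) as unpinned.

```python
import json
import re

# Matches an exact semantic version such as 4.17.21 or 1.0.0-beta.1.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def unpinned_dependencies(package_json_text: str) -> dict[str, str]:
    """Return {name: spec} for every dependency that is not an exact version."""
    manifest = json.loads(package_json_text)
    loose = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                loose[name] = spec
    return loose
```

Given a manifest with "lodash": "^4.17.0" and "express": "4.18.2", the function reports only lodash. A gate built on it would fail the build whenever the returned dictionary is non-empty.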

Forgotten transitive dependencies are common. You declare a direct dependency on express. Express depends on body-parser. Body-parser depends on bytes. You never explicitly declared bytes, but you are vulnerable to attacks against bytes. Transitive dependency poisoning is common because maintainers of popular packages often have dozens of dependencies.
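
To make the reach of transitive dependencies concrete, the sketch below walks a dependency graph breadth-first and lists everything pulled in indirectly. The graph is a deliberately simplified, hypothetical stand-in for the express example above; in practice you would build it from the lockfile.

```python
from collections import deque


def transitive_dependencies(direct, graph):
    """Return every package reachable from the direct dependencies, excluding them.

    `graph` maps a package name to the list of packages it depends on.
    """
    seen = set(direct)
    queue = deque(direct)
    while queue:
        package = queue.popleft()
        for child in graph.get(package, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen - set(direct))


# Hypothetical, heavily trimmed graph for illustration only.
example_graph = {
    "express": ["body-parser"],
    "body-parser": ["bytes"],
}
```

Calling transitive_dependencies(["express"], example_graph) returns body-parser and bytes: two packages you never declared but still execute. Every name in that list needs the same scanning and pinning as a direct dependency.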

Unlocked base image versions occur when you declare FROM node:16 in your Dockerfile. The tag follows the newest 16.x image: when Node releases version 16.13.2, your next rebuild picks it up automatically. If a later 16.x release (e.g., 16.50.0) is poisoned, your rebuild pulls that as well.

Dependency security controls work together.

Lock files such as package-lock.json (npm), poetry.lock (Python), go.sum (Go), and Cargo.lock (Rust) record the exact version and hash of every direct and transitive dependency. In Go the work is split: go.mod records the selected versions and go.sum records the hashes. Always commit lock files to version control, never delete them to work around a conflict, and install in CI with a command that refuses to modify them, such as npm ci.

Dependency scanning uses tools like Dependabot, Renovate, Snyk, or Trivy to check dependencies daily for known CVEs, with the build failing if critical vulnerabilities are found.

Hash verification checks the cryptographic hash of each downloaded package against the value recorded in the lockfile or published by the registry. npm verifies the integrity field in package-lock.json on install, and pip can enforce hashes with --require-hashes.

Version pinning specifies direct dependencies as exact versions in package.json or requirements.txt rather than as ranges using ^ or ~.

Dependency review examines changelogs and commit histories before upgrading, and asks why a package needs the permissions, install scripts, or network access it requests.

Private registries serve internal packages, with build environments configured to authenticate to the private registry and not fall back to public registries for internal names.

License compliance scanning identifies dependency licenses that conflict with the project's license, such as GPL or AGPL dependencies in a project distributed under permissive or proprietary terms.
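
Hash verification itself is simple to sketch. npm lockfiles store integrity values in the Subresource Integrity style, an algorithm name followed by a base64 digest (for example sha512-...). The helper names below are made up for this example, and the sketch handles a single hash per string, whereas real integrity fields may list several separated by spaces.

```python
import base64
import hashlib
import hmac

SUPPORTED_ALGORITHMS = ("sha256", "sha384", "sha512")


def compute_integrity(data: bytes, algorithm: str = "sha512") -> str:
    """Build an '<algorithm>-<base64 digest>' string for the given bytes."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def verify_integrity(data: bytes, integrity: str) -> bool:
    """Check downloaded bytes against a single integrity string."""
    algorithm, _, expected_b64 = integrity.partition("-")
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    try:
        expected = base64.b64decode(expected_b64, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    actual = hashlib.new(algorithm, data).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(actual, expected)
```

In a pipeline, data would be the bytes of the downloaded tarball and integrity the value recorded in the committed lockfile; a mismatch means the artifact is not the one that was reviewed.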

Base Image Risks: The Foundation You Cannot See

A container base image is a pre-built filesystem with an OS, package manager, and common utilities. When your Dockerfile starts with FROM ubuntu:20.04, you inherit everything in that image, including its vulnerabilities.

Unverified images present a risk when you pull an image named docker.io/mycompany/trusted-base:latest from Docker Hub, assuming it came from your organization. Docker Hub allows anyone to create an account and push images, and the mycompany namespace belongs to whoever registered it. Without explicit verification (image signing, digest pinning), you could pull a malicious image from an attacker's account.

Stale images with known CVEs are problematic. Suppose you pulled ubuntu:20.04 months ago and your build hosts or internal mirror still serve that cached copy. Somewhere in that image is a vulnerable version of OpenSSL. Ubuntu published a patched image weeks ago, but nothing in your pipeline forces a fresh pull, so your CI/CD system keeps rebuilding your app container on the stale base image and inheriting the OpenSSL vulnerability.

Bloated images contain tools you will never use: git, curl, vim, perl, compiler toolchains. Each additional package increases your attack surface and your image size. Any of them may carry its own vulnerabilities, and an attacker who gains code execution inside the container can use them to download payloads, explore the environment, and move laterally.

Image tag mutability is a security issue. The tag :latest can point to different images over time. Today node:18-alpine points to Node.js 18.12.1. Tomorrow it might point to 18.13.0. Your builds are not reproducible. An attacker who controls the registry can push a poisoned image with the same tag, and your next build pulls it.

Base image security controls address each of these risks.

Digest pinning replaces FROM node:18-alpine with FROM node:18-alpine@sha256:a1b2c3d4e5f6..., specifying the cryptographic hash of the image rather than relying on a mutable tag. A digest identifies exactly one image. If an attacker pushes a different image under the same tag, your build is unaffected: it keeps pulling the content you pinned, and any content that does not match the digest is rejected.

Image scanning checks base images for known CVEs before use, with tools such as Trivy, Clair, or a registry's built-in scanner identifying vulnerable packages.

Minimal base images such as alpine or distroless have significantly smaller attack surfaces than full OS images like Ubuntu.

Image signing uses Docker Content Trust, Notary, or Cosign to sign images cryptographically, with builds verifying signatures before using an image, so that unsigned or tampered images are rejected.

Trusted registries restrict builds to pulling images from specific approved registries, with public registries denied if not needed.

Regular base image updates rebuild base images monthly or whenever upstream patches are released, then update the pinned digest, keeping the image dependency tree current and patched.
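
A digest-pinning check can run before the build by scanning the Dockerfile's FROM lines. The sketch below is a simplified parser written for this example (hypothetical function name; it ignores ARG substitution and line continuations) and treats references to scratch or to earlier build stages as acceptable.

```python
import re

# A pinned reference ends in @sha256: followed by 64 hex characters.
DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base image references that are not pinned by digest."""
    stage_names = set()
    unpinned = []
    for raw_line in dockerfile_text.splitlines():
        tokens = raw_line.strip().split()
        if not tokens or tokens[0].upper() != "FROM":
            continue
        # Drop optional flags such as --platform=linux/amd64.
        args = [token for token in tokens[1:] if not token.startswith("--")]
        if not args:
            continue
        image = args[0]
        # "scratch" and earlier stage names are not registry pulls.
        is_local_reference = image == "scratch" or image in stage_names
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2])
        if not is_local_reference and not DIGEST_SUFFIX.search(image):
            unpinned.append(image)
    return unpinned
```

A Dockerfile containing FROM node:18-alpine AS build is reported, while a later FROM build AS test in the same file is not, because build refers to the earlier stage. Failing the pipeline when the list is non-empty implements the "pin by digest, not by tag" rule as an automated check.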

Lockfile Integrity: Your Ground Truth

A lockfile records the exact version and hash of every dependency. package-lock.json, go.sum, Pipfile.lock, Cargo.lock—these are your ground truth.

Lockfile manipulation occurs when an attacker with repository access edits package-lock.json so that a trusted package name resolves to a different registry URL or to a higher version containing malicious code. Code reviewers often miss this because lockfile diffs are long, machine-generated, and frequently collapsed by default in review tools, so one changed URL or hash is easy to overlook.

Lockfile deletion happens when an attacker deletes the lockfile, forcing the build system to regenerate it from loose version constraints. The regeneration pulls the latest versions of all dependencies, including poisoned ones.

Lockfile controls include the following.

Commit lockfiles to version control. Always commit them, and never add them to .gitignore.

Review lockfile changes carefully in every PR. Use git diff --word-diff to highlight changed resolved URLs and integrity hashes, and npm diff to inspect what actually changed between two published versions of a package.

Validate the lockfile before the build starts: confirm it parses, that every entry carries an integrity hash, and that every resolved URL points at an approved registry.

Freeze dependencies on stable branches (e.g., release/1.2.3). Do not allow automatic dependency updates there; require explicit, reviewed updates.
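
The validation step can be sketched as a small audit of package-lock.json. The code below assumes the lockfile version 2 or 3 layout, where a top-level packages object maps install paths to entries with resolved and integrity fields; the function name and the host allowlist are assumptions for this example.

```python
import json
from urllib.parse import urlparse


def audit_lockfile(lockfile_text: str, allowed_hosts: set) -> list:
    """Return (install_path, problem) pairs for suspicious lockfile entries."""
    lock = json.loads(lockfile_text)
    findings = []
    for path, entry in lock.get("packages", {}).items():
        if path == "" or entry.get("link"):
            continue  # the root project or a locally linked package
        resolved = entry.get("resolved")
        if not resolved:
            continue  # e.g. bundled packages; out of scope for this sketch
        host = urlparse(resolved).hostname
        if host not in allowed_hosts:
            findings.append((path, f"unexpected registry host: {host}"))
        if "integrity" not in entry:
            findings.append((path, "missing integrity hash"))
    return findings
```

Running audit_lockfile(text, {"registry.npmjs.org"}) on a lockfile where one entry resolves to another host returns that entry's install path with the offending hostname. This surfaces exactly the kind of one-line change that is easy to miss in a long lockfile diff.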

Pre-Build Scanning: Detect Problems Before Build

Scanning dependencies before the build starts catches many supply chain attacks. Software composition analysis (SCA) scanning uses tools like Snyk, Dependabot, or open-source alternatives (OWASP Dependency-Check, Trivy) to scan your package.json, go.mod, requirements.txt, and lock files. They match declared versions against vulnerability databases (NVD, GitHub Security Advisories, vendor advisories) and alert you to known vulnerabilities.

Secret scanning scans code before it enters the repository to look for accidentally committed secrets: API keys, database passwords, private keys. Tools like git-secrets, TruffleHog, or GitHub's built-in secret scanning can run in pre-commit hooks or in CI/CD.
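
At its core, a secret scanner is a set of patterns run over each line of a change. The sketch below uses two well-known shapes, AWS access key IDs (AKIA followed by 16 uppercase letters or digits) and PEM private key headers; the pattern set and function name are assumptions for illustration, and the dedicated tools named above ship far larger rule sets plus entropy checks.

```python
import re

# A deliberately small, illustrative rule set.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan_for_secrets(text: str) -> list:
    """Return (line_number, rule_name) for every line matching a secret pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings
```

A pre-commit hook could pass the staged diff to scan_for_secrets and reject the commit when anything is returned. Reporting only the line number and rule name, not the matched text, keeps the secret out of CI logs.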

License compliance scanning flags license conflicts using SCA tools. If your project is distributed under the MIT license but links a GPL-licensed dependency, you may have a compliance issue, because the GPL's copyleft terms can apply to the combined work.

Source code static analysis uses tools like SonarQube, Fortify, or Checkmarx to scan your source code (not dependencies) for security bugs before compilation. These catch developer mistakes, not supply chain attacks, but they are part of the pre-build defense.

SLSA Framework and Pre-Build Security

SLSA (Supply-chain Levels for Software Artifacts) is a framework introduced by Google and now maintained under the Open Source Security Foundation (OpenSSF) to prevent software supply chain attacks. The original four-level model (SLSA v0.1) relates to pre-build security as follows.

SLSA Level 1 requires a scripted build with provenance available. This offers little protection on its own, but you have documentation of where artifacts came from.

SLSA Level 2 requires version-controlled source and a hosted build service that generates authenticated provenance.

SLSA Level 3 hardens the source and build platforms: source history is verified and retained, builds are defined as code and run in ephemeral, isolated environments, and provenance is non-falsifiable.

SLSA Level 4 adds two-person review of all changes plus hermetic builds (no network access during the build) and reproducible builds. It gives the highest confidence, though no level makes compromise impossible.

SLSA v1.0 reorganized these requirements into tracks, starting with a Build track that defines Levels 1 through 3, so check which version of the specification a tool or policy refers to.

Pre-build controls supply the source-side requirements at each level: version control for Level 2, verified history and access controls for Level 3, and mandatory review for Level 4. Commit signing and pre-merge testing are not SLSA requirements in themselves, but they support the verified history and review that the higher levels depend on. Hermetic builds in turn depend on pre-build work, because every dependency must be pinned and fetched before the build is cut off from the network.

Pre-Build Policy Gates

Policy gates are automated rules that block the build if conditions are not met. Define policy gates such as the following. If any dependency has a critical CVE (CVSS score of 9.0 or higher), fail the build. If any direct dependency is unlocked (uses version ranges), fail the build. If any commit to the main branch is unsigned, fail the build. If the base image is referenced by tag rather than pinned by digest, fail the build. If any commit comes from an account without MFA enabled, fail the build.
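
A policy engine for these gates can be very small. The sketch below is a minimal illustration, not a specific product: the PreBuildFacts fields and evaluate_gates function are hypothetical names, the facts are assumed to be collected by earlier pipeline steps, and a CVSS score of 9.0 or above is treated as critical.

```python
from dataclasses import dataclass, field


@dataclass
class PreBuildFacts:
    """Inputs gathered by earlier pipeline steps (all field names are hypothetical)."""
    highest_cvss: float = 0.0
    unpinned_dependencies: list = field(default_factory=list)
    unsigned_commits: list = field(default_factory=list)
    base_image_pinned: bool = True


def evaluate_gates(facts: PreBuildFacts) -> list:
    """Return human-readable violations; an empty list means the build may start."""
    violations = []
    if facts.highest_cvss >= 9.0:
        violations.append(f"critical CVE present (CVSS {facts.highest_cvss})")
    if facts.unpinned_dependencies:
        violations.append(
            "unpinned dependencies: " + ", ".join(facts.unpinned_dependencies)
        )
    if facts.unsigned_commits:
        violations.append(f"{len(facts.unsigned_commits)} unsigned commit(s) on main")
    if not facts.base_image_pinned:
        violations.append("base image is not pinned by digest")
    return violations
```

A CI step would print the violations and exit with a non-zero status when the list is non-empty, which is what stops the build. Collecting every violation instead of stopping at the first gives the team one complete report per run.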

Implement policy gates in CI/CD. Before the build starts, a policy engine checks these conditions. If any fail, the build stops and alerts go to the team.

From Pre-Build to Build Stage

Pre-build security establishes what code gets into your build. But pre-build controls do not protect the build itself. Once code enters the build, a new set of risks emerges.

See Build Stage Security: What Happens Inside the Build and Why It Matters in Pillar 2 for how to secure the compilation, linking, and image assembly process.

After the build completes and an image is created, runtime risks take over. See Runtime Stage Security: Protecting Containers After They're Running in Pillar 7 for how to execute containers safely and detect runtime attacks.

Key Tools and Implementations

Dependency Scanning tools include Dependabot (GitHub native), Renovate (multi-platform), Snyk (SCA + SAST), Trivy (open-source, fast), and OWASP Dependency-Check (open-source).

Secret Scanning tools include git-secrets (client-side), TruffleHog (multi-source), GitHub Advanced Security (native), and GitLab Secret Detection (native).

License Compliance tools include FOSSA (SaaS), Black Duck (enterprise, formerly sold by Synopsys), and FOSSology (open-source).

Version Control Security is provided by GitHub, GitLab, and Bitbucket (branch protection, access control), Gitpolicies (policy enforcement), and Biscuit (distributed authorization).

Base Image Management tools include Docker Content Trust (image signing), Notary (distribution and signing), Cosign (container image signing, part of Sigstore), and admission controllers (enforce signed images in Kubernetes).

Pre-Build Security Checklist

Ensure all commits to protected branches are signed (GPG or SSH keys). Require all pull requests to have at least one code review before merging. Train code reviewers to spot supply chain attacks in CI/CD configurations and build scripts. Lock all dependencies in package-lock.json, go.sum, Pipfile.lock, or equivalent. Avoid version ranges (e.g., ^, ~, >=) in direct dependencies. Run dependency scanning (SCA) on every commit, failing the build if critical CVEs are found. Pin base image by digest, not by tag. Scan base image for CVEs before use. Run secret scanning on every commit and block commits with secrets. Enable license compliance scanning. Require approval before running CI/CD for pull requests from forks. Restrict repository access to team members with MFA enabled. Monitor audit logs for unexpected changes to permissions, branches, or tags.

Summary

Pre-build security is about controlling what code enters your pipeline. It operates on three fronts: source code, dependencies, and base images. Without pre-build controls, malicious code reaches your build with full privileges. With them—lockfiles, branch protection, dependency scanning, base image pinning, and commit signing—you reduce the risk of supply chain attacks to a manageable level.

The pre-build stage establishes a verified baseline. The build stage must then protect that baseline during compilation and image assembly. The runtime stage must then enforce the security properties of that image while it executes.

Next Steps: Secure the Full Supply Chain

Build securely — After pre-build controls, ensure the build itself is secure. Read Build Stage Security to understand secure compilation and image generation. Study The Continuous Trust Loop for automated rebuilds when vulnerabilities are detected. Explore The 11 Build Artifacts to see what a secure build generates (SBOMs, signatures, provenance).

Deploy safely — After building, deploy with verification. Read End-to-End Secure Deployment to deploy with scanning and verification. Study Dockerfile to YAML Migration to modernize your build specifications. Learn Helm Charts & Kubernetes to deploy at scale securely.

Operate securely — Maintain security in production. Read Upgrade & Patching Playbook to keep images patched. Study Supply Chain Disaster Recovery to protect your supply chain. Explore What is Supply Chain Security? for a full overview of the entire pipeline.