The Central Tension
Two philosophies dominate container image design. The subtractive approach, known as strip-down, starts with a general-purpose OS image such as Ubuntu, CentOS, or Debian, which can reach 500 MB to 1 GB or more once typical tooling is installed. You then identify the packages you consider unnecessary and remove them, hoping you have eliminated enough. This approach works, but it rests on trusting the entire base operating system. Remove the wrong package and your application breaks. Leave the wrong package in place and you carry unnecessary attack surface.
The alternative is the reconstructive approach, or source-built method. This strategy starts from verified source code. You download the minimal set of packages needed from trusted repositories, compile them, and assemble them into an image. Nothing is included unless explicitly required. This is more work upfront, but it gives you documented provenance and a minimal attack surface.
Most organizations use some combination of both strategies. This guide explains what each approach offers, where it fails, and when to choose each one.
The Strip-Down Philosophy: Why It Seems Appealing
Strip-down is the intuitive approach that most teams have likely used dozens of times:
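```dockerfile
# Start with Ubuntu
FROM ubuntu:22.04

# Install Python and your app dependencies
RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip \
    curl \
    git

# Copy your app
COPY app.py /app/
WORKDIR /app
CMD ["python3", "app.py"]

# Now optimize: remove build tools
# "We don't need these at runtime"
RUN apt-get remove -y git curl
# "Remove orphaned dependencies"
RUN apt-get autoremove -y
# "Clean package cache"
RUN apt-get clean
```

The result contains fewer packages than the image you started optimizing, and the obviously unnecessary components are gone. One caveat: each RUN instruction creates a new layer, so packages removed in a later layer disappear from the final filesystem but still occupy space in the earlier layer. To shrink the image itself, install, use, and remove the tools within a single RUN instruction, or use a multi-stage build.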
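The effect of layering on size can be made concrete with a small model. This is a sketch only: the `Layer` type and every size in it are hypothetical numbers chosen for illustration, not measurements of any real image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One image layer: megabytes it adds, and megabytes it hides from earlier layers."""
    name: str
    added_mb: int
    hidden_mb: int = 0  # files deleted here that were written by an earlier layer


def stored_size_mb(layers):
    """Size that is pulled and stored: deletions in later layers do not reduce it."""
    return sum(layer.added_mb for layer in layers)


def visible_size_mb(layers):
    """Size of the merged filesystem that the running container sees."""
    return sum(layer.added_mb - layer.hidden_mb for layer in layers)


# Hypothetical sizes, for illustration only.
separate_runs = [
    Layer("FROM ubuntu:22.04", 80),
    Layer("RUN apt-get install ...", 300),
    Layer("RUN apt-get remove ... && apt-get clean", 1, hidden_mb=120),
]
single_run = [
    Layer("FROM ubuntu:22.04", 80),
    Layer("RUN install && use && remove && clean", 181),
]

for label, layers in (("separate RUNs", separate_runs), ("single RUN", single_run)):
    print(label, "stored:", stored_size_mb(layers), "visible:", visible_size_mb(layers))
```

In this model both variants expose the same filesystem to the container, but only the single-RUN variant is actually smaller to store and pull.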
Why Strip-Down Feels Right
Strip-down feels right for several reasons. It's reversible: if you remove something and the application breaks, you can simply add it back. It's familiar because it mirrors how people clean up their own laptops by removing software they don't use. It enables incremental progress, since you can optimize an existing image without rewriting its entire build process. And vendor images work out of the box, since official base images are maintained and relatively secure.
Where Strip-Down Fails
But strip-down has a fatal flaw: you cannot remove what you do not know is there.
Case Study 1: The Hidden Exploit Chain
Consider a company that builds an application on ubuntu:22.04, installs the packages it needs, removes a few with apt-get remove, and feels confident the image is "optimized". A scan finds zero critical CVEs. However, the image still carries software the application never uses. Suppose, for illustration, that one of the installed packages pulled in the CUPS printing subsystem (cups, libcups, cups-filters), alongside a DNS resolver library that is needed but exposes more functionality than the application uses, and rarely used utilities such as man-db.
Now suppose that in 2024 a vulnerability is discovered in cups-filters that allows arbitrary code execution. The company's application does not use CUPS at all, but the package is in the image because a dependency brought it in and nobody removed it. The company must then wait for Ubuntu to patch the issue (typically 3-7 days), remove CUPS manually and rebuild (1-2 hours plus regression testing), or live with the vulnerability (unacceptable). A source-built image would not contain CUPS because it was never included in the first place.
Case Study 2: Transitive Dependency Hell
When you install Python and pip to deploy your Flask application, Python depends on OpenSSL for HTTPS connections. The distribution's packaging, however, can pull in far more than the library itself: for example Perl (used by some build and maintenance scripts), which depends on gdbm (the GNU database library), which depends on further components, and so on recursively. You end up with 50+ packages in the image, many of which you will never run. One of those indirect dependencies may have a vulnerability your team does not know about, because it does not appear in your application's dependency list (requirements.txt or poetry.lock). A scanner that reads only your declared dependencies will miss it; only a scan of the image filesystem sees OS-level packages.
A source-built image would include only OpenSSL and its true dependencies, not the entire Perl ecosystem.
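The gap between what you ask for and what gets installed is a transitive closure over the dependency graph. The sketch below walks a made-up graph that mirrors the chain described above; the package names and edges in `DEPENDS_ON` are hypothetical and do not reflect Ubuntu's real metadata.

```python
from collections import deque


def transitive_closure(requested, depends_on):
    """Return every package that ends up installed when `requested` is installed."""
    installed = set()
    queue = deque(requested)
    while queue:
        package = queue.popleft()
        if package in installed:
            continue
        installed.add(package)
        # Follow each dependency edge; leaf packages have no entry in the graph.
        queue.extend(depends_on.get(package, ()))
    return installed


# Hypothetical dependency graph, for illustration only.
DEPENDS_ON = {
    "python3": ["openssl", "libffi"],
    "python3-pip": ["python3", "ca-certificates"],
    "openssl": ["libssl", "perl"],
    "perl": ["gdbm", "perl-modules"],
    "gdbm": ["libgdbm"],
}

requested = ["python3", "python3-pip"]
installed = transitive_closure(requested, DEPENDS_ON)
undeclared = sorted(installed - set(requested))

print(f"requested {len(requested)} packages, installed {len(installed)}")
print("never asked for:", undeclared)
```

Everything in `undeclared` is code your team did not choose but still has to patch.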
Case Study 3: Inherited Trust
Ubuntu's image is maintained by Canonical and is generally secure. However, when you docker pull ubuntu:22.04, you trust Canonical's security practices, Ubuntu's dependency resolution, every transitive dependency's security, and the entire supply chain's integrity. A vulnerability in a transitive dependency gets fixed when Canonical pushes an update, but Canonical might not know about the vulnerability (it could be in a small library), might deprioritize it (not critical for Ubuntu's primary use cases), or the patch might not be released for weeks.
With a source-built image, you choose exactly which packages to include and can patch your specific image immediately, rather than waiting for Canonical to act.
The Source-Built Philosophy: Maximum Control, Maximum Responsibility
Source-built images start from scratch or from minimal verified sources:
```dockerfile
# Alpine-based approach (minimal Linux)
FROM alpine:3.19

# Explicitly install only what you need
RUN apk add --no-cache \
    python3=3.11.8-r1 \
    py3-pip=23.3.1-r0

COPY app.py /app/
WORKDIR /app
CMD ["python3", "app.py"]
```

Or, taken to the extreme, with a fully reconstructive approach:
```dockerfile
# Start with a builder stage
FROM debian:bookworm AS builder

# zlib1g-dev: without zlib, the ensurepip step of `make install` fails
RUN apt-get update && apt-get install -y \
    build-essential \
    wget \
    zlib1g-dev

# Download Python source and build it
RUN wget https://python.org/ftp/python/3.11.8/Python-3.11.8.tgz
RUN tar xzf Python-3.11.8.tgz && cd Python-3.11.8 && \
    ./configure --prefix=/opt/python && make install

# Build stage complete: /opt/python contains the compiled Python binary

# Final stage: a minimal base with the same C library as the builder.
# A glibc build from Debian will not run on musl-based Alpine, so use a
# slim Debian or glibc-based distroless image (or do the build on Alpine).
FROM debian:bookworm-slim

# Copy ONLY the compiled Python from builder
COPY --from=builder /opt/python /opt/python

ENV PATH="/opt/python/bin:$PATH"
COPY app.py /app/
WORKDIR /app
CMD ["python3", "app.py"]
```

The result is small and contains exactly what you specified, nothing more and nothing less.
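Starting from verified source implies checking each download against a digest you pinned in advance. A minimal sketch of that check follows; the function names are illustrative, and the tarball bytes are a stand-in rather than a real Python release.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(data: bytes, expected_sha256: str) -> bool:
    """True only if the downloaded bytes hash to the digest pinned in the build."""
    return sha256_hex(data) == expected_sha256.strip().lower()


# Stand-in bytes; a real build would read the downloaded tarball from disk.
tarball = b"pretend this is Python-3.11.8.tgz"
# In practice the pinned value is copied from a trusted release page and
# committed to the build repository, not computed from the download itself.
pinned = sha256_hex(tarball)

print("untouched download accepted:", matches_pinned_digest(tarball, pinned))
print("tampered download accepted:", matches_pinned_digest(tarball + b"!", pinned))
```

Inside a Dockerfile the same check is usually done with `sha256sum -c` before the extract step, so a mismatch fails the build.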
Where Source-Built Excels
Source-built approaches offer several significant advantages. You know exactly what is in your image because you assembled it layer by layer, which gives you clear provenance. With pinned versions and a deterministic build process, running the same build twice can produce bit-for-bit identical images, which gives you reproducibility. The attack surface is minimal because fewer packages mean fewer places for vulnerabilities to hide. Rapid patching becomes possible: update a dependency version in your build and redeploy within minutes. Compliance becomes easier to document for auditors, since the image is exactly what you compiled.
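Reproducibility is something you can test rather than assume. The sketch below compares two builds by hashing their file contents in a fixed order. The `tree_digest` helper and the in-memory "builds" are illustrative assumptions; a real check would compare image digests or walk the extracted filesystem.

```python
import hashlib


def tree_digest(files):
    """Digest a {path: bytes} mapping deterministically.

    Paths are visited in sorted order and every field is length-prefixed,
    so the result depends only on content, not on insertion order.
    """
    digest = hashlib.sha256()
    for path in sorted(files):
        encoded_path = path.encode("utf-8")
        content = files[path]
        digest.update(len(encoded_path).to_bytes(8, "big"))
        digest.update(encoded_path)
        digest.update(len(content).to_bytes(8, "big"))
        digest.update(content)
    return digest.hexdigest()


# Hypothetical build outputs, for illustration only.
build_a = {"/opt/python/bin/python3": b"\x7fELF...", "/app/app.py": b"print('hi')\n"}
build_b = dict(reversed(list(build_a.items())))  # same files, different order
build_c = {**build_a, "/app/app.py": b"print('hi')  # built 2024-01-01\n"}

print("a == b:", tree_digest(build_a) == tree_digest(build_b))
print("a == c:", tree_digest(build_a) == tree_digest(build_c))
```

The third build differs only by an embedded date, which is the kind of detail (timestamps, build paths, unpinned versions) that breaks bit-for-bit reproducibility in practice.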
Where Source-Built Is Hard
However, source-built approaches come with challenges. Build complexity increases significantly, since you maintain build scripts for every dependency. Compatibility can be an issue: a Python compiled from source might behave slightly differently from the Ubuntu-provided Python. Performance may suffer: without the distribution's tuned build options, you may end up with larger or slower binaries. The maintenance burden is high, because every time a dependency releases a patch you need to rebuild and test. Finally, the skill requirement is higher: you need expertise in build systems, compilation, and dependency management.
Side-by-Side Comparison: 10+ Dimensions
| Dimension | Strip-Down | Source-Built | Winner |
|---|---|---|---|
| Attack Surface | Larger (100s of packages) | Minimal (10-50 packages) | Source-Built |
| Provenance Documentation | Assumed trust in base | Full build recipe visible | Source-Built |
| Reproducibility | Base image tag only | Full source control | Source-Built |
| Image Size | 100-300 MB | 30-100 MB | Source-Built |
| Build Time | 2-5 min (pull + optimize) | 10-30 min (compile from source) | Strip-Down |
| Maintenance Effort | Low (trust upstream) | High (manage all deps) | Strip-Down |
| Hidden Vulnerability Risk | High (unknown unknowns) | Low (explicit only) | Source-Built |
| FIPS/Compliance | Depends on base; limited control | Full control over crypto libs | Source-Built |
| Patch Response Time | 3-7 days (wait for base patch) | 15 minutes (rebuild + test) | Source-Built |
| Operational Complexity | Low (use vendor images) | High (maintain build pipeline) | Strip-Down |
| Regression Risk | Medium (removing packages can break) | Low (explicit composition) | Source-Built |
Why Strip-Down Leaves Inherited Trust
The Problem with "Good Enough" Removals
Consider the typical strip-down workflow:
```dockerfile
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl git python3
# ... use curl and git in build ...
RUN apt-get remove -y curl git
RUN apt-get autoremove -y
RUN apt-get clean
```

You have removed curl and git, and autoremove tried to clean up their dependencies. However, autoremove only removes packages that were installed automatically as dependencies and are no longer needed by any installed package. If curl depends on libcurl4 and another package you keep also depends on libcurl4, apt-get autoremove will not remove it. Your image still has libcurl even though you explicitly removed curl. Moreover, libcurl4 in turn depends on libssl3 and other libraries, each with dependencies of its own. You have removed 2 packages but may have left 30 behind.
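The autoremove rule described above can be modelled in a few lines. This is a simplified sketch, not apt's actual resolver: it ignores Recommends, version constraints, and reverse-dependency removal, and the package graph (including the `app-runtime` package that stays installed) is hypothetical.

```python
def reachable(roots, depends_on):
    """All packages reachable from `roots` by following dependency edges."""
    seen = set()
    stack = list(roots)
    while stack:
        package = stack.pop()
        if package in seen:
            continue
        seen.add(package)
        stack.extend(depends_on.get(package, ()))
    return seen


def remove_then_autoremove(manual, installed, depends_on, to_remove):
    """Model `apt-get remove X` followed by `apt-get autoremove`.

    manual:    packages the user installed explicitly
    installed: everything currently installed (manual plus automatic)
    """
    manual = set(manual) - set(to_remove)
    installed = set(installed) - set(to_remove)
    # An automatic package survives if any remaining manual package needs it.
    still_needed = reachable(manual, depends_on)
    return installed & still_needed


# Hypothetical dependency graph, for illustration only.
DEPENDS_ON = {
    "curl": ["libcurl"],
    "git": ["libcurl", "git-man"],
    "app-runtime": ["libcurl"],
    "libcurl": ["libssl", "libnghttp2"],
}

manual = {"curl", "git", "app-runtime"}
installed = reachable(manual, DEPENDS_ON)
remaining = remove_then_autoremove(manual, installed, DEPENDS_ON, {"curl", "git"})

print("before:", sorted(installed))
print("after: ", sorted(remaining))
```

In this model only `git-man` is cleaned up; `libcurl` and everything beneath it stay because one remaining package still depends on them.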
The Unknown Unknowns
A critical challenge is this: you do not know what is in the base image. Running docker run ubuntu:22.04 dpkg -l | wc -l reports 123 for Ubuntu's base image (the exact figure varies between image builds, and dpkg -l adds five header lines to the count). Ubuntu developers included these packages because they are needed for some common workload. For your specific application, however, you may need only 10 of them. Over time, Ubuntu might add packages (new releases bring new dependencies) or remove some (packages become obsolete). Your image inherits these changes, and your application might break on a base image update you do not control.
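Counting lines with `wc -l` overstates the package count because `dpkg -l` prints header lines first. The sketch below parses the output instead and compares it with what an application needs. The sample output, its version strings, and the `NEEDED` set are made up for illustration.

```python
def installed_packages(dpkg_l_output):
    """Extract package names from `dpkg -l` output, skipping the header lines."""
    packages = []
    for line in dpkg_l_output.splitlines():
        fields = line.split()
        # Data rows for installed packages start with the status code "ii".
        if len(fields) >= 2 and fields[0] == "ii":
            # Drop an architecture qualifier such as ":amd64" if present.
            packages.append(fields[1].split(":")[0])
    return packages


# Abbreviated, made-up sample in the layout that `dpkg -l` prints.
SAMPLE = """\
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version       Architecture Description
+++-==============-=============-============-==========================
ii  adduser        3.118ubuntu5  all          add and remove users and groups
ii  apt            2.4.11        amd64        commandline package manager
ii  libc6:amd64    2.35-0ubuntu3 amd64        GNU C Library: Shared libraries
"""

# Hypothetical allowlist of what this application actually needs.
NEEDED = {"libc6", "adduser"}

packages = installed_packages(SAMPLE)
print("lines counted by wc -l:", len(SAMPLE.splitlines()))
print("packages actually installed:", len(packages))
print("installed but not needed:", sorted(set(packages) - NEEDED))
```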
Real-World Examples: Why Source-Built Wins in Crisis
Example 1: The Apache Struts Vulnerability
In 2017, Apache Struts had a critical remote code execution vulnerability (CVE-2017-5638). Organizations using Struts had roughly 48 hours to patch before active exploitation began.
With the strip-down approach, a team must wait for the operating system vendor (Ubuntu or RHEL, for example) to patch the Struts package in its repositories (2-7 days), rebuild the image with apt-get update (15 minutes), test the image (2-4 hours), and deploy (1-2 hours after approval), for a total of at least 2-7 days.
With the source-built approach, a team updates the Struts version in its build manifest (5 minutes), rebuilds the image from source (10-20 minutes), runs tests (15-30 minutes), and deploys (15 minutes), for a total of 45-70 minutes.
Organizations using source-built approaches could patch in about an hour. Those using strip-down images had to wait for vendor patches.
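Adding up the per-step estimates quoted above makes the comparison explicit. The sketch below only sums those ranges; the step labels and the `total_minutes` helper are illustrative.

```python
HOUR = 60
DAY = 24 * HOUR


def total_minutes(steps):
    """Sum (label, low, high) minute ranges across all pipeline steps."""
    low = sum(step_low for _, step_low, _ in steps)
    high = sum(step_high for _, _, step_high in steps)
    return low, high


# Durations in minutes, taken from the estimates in the text.
strip_down = [
    ("wait for vendor patch", 2 * DAY, 7 * DAY),
    ("rebuild with apt-get update", 15, 15),
    ("test the image", 2 * HOUR, 4 * HOUR),
    ("deploy after approval", 1 * HOUR, 2 * HOUR),
]
source_built = [
    ("update build manifest", 5, 5),
    ("rebuild from source", 10, 20),
    ("run tests", 15, 30),
    ("deploy", 15, 15),
]

for label, steps in (("strip-down", strip_down), ("source-built", source_built)):
    low, high = total_minutes(steps)
    print(f"{label}: {low}-{high} minutes ({low / HOUR:.1f}-{high / HOUR:.1f} hours)")
```

The vendor wait dominates the strip-down total, so shortening the rebuild or test steps barely changes it.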
Example 2: The Python Supply Chain Attack
In 2023, attackers compromised a legitimate Python package in PyPI and injected malware. The malicious package was available for 6 hours before removal.
With the strip-down approach, the security team notices the malicious package, checks whether it is in any production images, and if so, waits for the next base image update or triggers a rebuild (4-24 hours), then redeploys (1-2 hours).
With the source-built approach, the team updates its package manifest to remove the malicious version (5 minutes), rebuilds the image (5-20 minutes), deploys (15 minutes), and optionally audits the build history to see which images included it (automated logs).
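The build-history audit mentioned above amounts to a lookup over recorded manifests. A minimal sketch follows, assuming each build stored a mapping of package name to pinned version; the image names, the `examplepkg` package, and its versions are all hypothetical.

```python
def affected_images(build_records, package, bad_versions):
    """Return image references whose recorded manifest pins a compromised version."""
    return sorted(
        image
        for image, manifest in build_records.items()
        if manifest.get(package) in bad_versions
    )


# Hypothetical build history: image reference -> {package: pinned version}.
BUILD_RECORDS = {
    "registry.example/api:1.4.0": {"flask": "3.0.0", "examplepkg": "2.1.0"},
    "registry.example/api:1.4.1": {"flask": "3.0.0", "examplepkg": "2.1.1"},
    "registry.example/worker:0.9.2": {"examplepkg": "2.1.1"},
    "registry.example/web:2.0.0": {"flask": "3.0.0"},
}

compromised = affected_images(BUILD_RECORDS, "examplepkg", {"2.1.1"})
print("rebuild and redeploy:", compromised)
```

This only works if every build records its resolved versions, which is one reason pinned manifests and SBOMs matter during an incident.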
Hybrid Approaches: Best of Both Worlds?
Few organizations use pure source-built images for everything; the cost is too high. However, you do not have to choose one approach universally.
Strategy 1: Source-Built Runtime, Strip-Down Build Images
```dockerfile
# Build stage: Use standard images (faster builds)
FROM node:18 AS builder
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build

# Runtime stage: Source-built or minimal image
FROM gcr.io/distroless/nodejs18-debian12:nonroot
WORKDIR /app
COPY --from=builder /app/dist /app
CMD ["app.js"]
```

You accept some risk in the build stage (it is not deployed, so there are fewer concerns) but eliminate risk in the runtime.
Strategy 2: Strip-Down with Aggressive Scanning
Do not rely on apt-get autoremove. Use explicit scanning to find hidden packages:
```shell
# Find all installed packages (names only, no header lines)
docker run --rm myimage dpkg-query -W -f='${Package}\n' > installed.txt

# Scan against vulnerability databases
trivy image myimage

# Audit shared-library dependencies of the interpreter
docker run --rm myimage ldd /usr/bin/python3 | grep '\.so' | awk '{print $1}'
```

You are still using a strip-down base, but you are explicitly validating what is in it.
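Raw `ldd` output becomes more useful once it is compared against what you expect to see. The sketch below parses it and flags unexpected or unresolved libraries. The sample output, the `libmystery.so.2` and `libmissing.so.1` names, and the `EXPECTED` set are made up for illustration.

```python
def shared_libraries(ldd_output):
    """Map library name -> resolved path (or None) from `ldd` output lines."""
    libraries = {}
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if not arrow:
            continue  # lines without "=>" (vdso, dynamic loader) are skipped
        fields = target.split()
        # A resolved entry starts with an absolute path; "not found" does not.
        resolved = fields[0] if fields and fields[0].startswith("/") else None
        libraries[name] = resolved
    return libraries


# Made-up sample in the layout that `ldd` prints.
SAMPLE_LDD = (
    "\tlinux-vdso.so.1 (0x00007ffd5a5f2000)\n"
    "\tlibm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1c2a000000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29c00000)\n"
    "\tlibmystery.so.2 => /usr/lib/libmystery.so.2 (0x00007f1c29a00000)\n"
    "\tlibmissing.so.1 => not found\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a400000)\n"
)

# Hypothetical allowlist of libraries the team expects the binary to load.
EXPECTED = {"libm.so.6", "libc.so.6"}

libraries = shared_libraries(SAMPLE_LDD)
unexpected = sorted(set(libraries) - EXPECTED)
unresolved = sorted(name for name, path in libraries.items() if path is None)

print("unexpected:", unexpected)
print("unresolved:", unresolved)
```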
Strategy 3: Source-Built with Caching
Use multi-stage builds to cache compiled binaries, reducing rebuild time:
```dockerfile
# Dependencies stage (rarely changes)
FROM alpine:3.19 AS dependencies
# Alpine's compiler toolchain meta-package is build-base
RUN apk add --no-cache build-base
RUN apk add --no-cache python3=3.11.8-r1 ...

# Application stage (changes frequently)
FROM dependencies
COPY app.py /app/
WORKDIR /app
CMD ["python3", "app.py"]
```

Rebuild the dependencies stage infrequently; rebuild the application stage often.
When to Choose Each Approach
Choose Strip-Down When:
- Development images need fast iteration (rebuild time less than 5 minutes is critical).
- Non-critical services have low security requirements (internal tools, demo apps).
- Complex applications require libraries that are hard to compile from source (certain ML frameworks, proprietary codecs).
- Compliance overhead of source-built is not justified (low-risk internal services).
- Team expertise is weak in build systems and you would rather trust vendor patching.
Choose Source-Built When:
- Production services handling sensitive data need minimal attack surface.
- Compliance requirements demand full provenance documentation (SOC 2, ISO 27001, regulated industries).
- Patch response time is critical and you cannot wait for vendor updates.
- Vulnerability risk is high (security tools, authentication systems, public APIs).
- Your team has expertise in build systems and is willing to maintain the complexity.
- Reproducibility is essential (financial services, security-critical systems).
Choose Hybrid When:
- Most of your images are non-critical, so strip-down is fine for 80% of services.
- Sensitive services (authentication, payment processing) need source-built rigor.
- Build time versus runtime security trade-off allows you to optimize build images while securing runtime images.
CleanStart as a Source-Built Example
CleanStart implements the source-built philosophy: every image is composed from verified source code, compiled deterministically, and published with full provenance (SBOM, build attestations, signatures).
A CleanStart Python image:
```dockerfile
FROM cleanstart/python:3.12-prod

COPY app.py /app/
CMD ["python3", "app.py"]
```

This is simpler than managing your own source-built image, yet it keeps the source-built benefits. The image has minimal surface area, with only the packages needed for the Python runtime. Verifiable provenance comes from the image SBOMs, build logs, and signatures. Rapid patching is possible: if a CVE is detected, CleanStart rebuilds within hours, not days. Compliance-ready outputs include FIPS 140-3 variants, SBOMs, and attestations. There are no hidden dependencies; what the Dockerfile and the base image's SBOM declare is what is in the image.
Other organizations implement similar approaches. Distroless images (Google's approach) are pre-built source-composed images without package managers. Wolfi (Chainguard's approach) is an Alpine-inspired minimal Linux with source-built guarantees. UBI (Red Hat's approach) is strip-down, but with strong vendor support and rapid patching. Distroless and Wolfi apply the source-built philosophy but pre-package it so you do not have to manage the complexity yourself; UBI offers a vendor-maintained middle ground.
The Security Principle: Provenance Over Trust
The underlying security principle is simple: provenance (knowing how something was made) is more secure than trust (assuming it was made correctly).
With strip-down, you trust that the vendor knows what packages are safe, the vendor's dependency resolution is correct, no hidden packages are included, and patches will be released quickly. That is a lot of trust.
With source-built, you know exactly what is in the image, can verify the components independently, can patch immediately if needed, and can reproduce the image identically. These are not about having stronger security tools; they are about reducing the number of assumptions you are making.
Decision Framework
Before choosing, ask yourself these questions:
- How security-critical is this service? For high security, choose source-built or vendor source-built (CleanStart, Distroless, Wolfi). For medium security, choose strip-down with aggressive scanning and rapid patching capability. For low security, strip-down is fine.
- How fast does your team need to patch? If you need to patch within hours, source-built keeps the rebuild and deployment under your control. If you can accept days, strip-down is acceptable while you wait for vendor patches.
- Do you have compliance requirements? If yes, source-built or vendor source-built is needed to meet provenance requirements. If no, strip-down is simpler.
- What is your risk tolerance for unknown vulnerabilities? Low risk tolerance calls for source-built. Medium to high risk tolerance makes strip-down acceptable.
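The four questions can be read as a simple rule set. The sketch below is one possible encoding, not a definitive policy: the function name, the parameter names, and the 24-hour cutoff between "hours" and "days" are assumptions added for illustration.

```python
# Assumed cutoff between "patch within hours" and "can accept days".
PATCH_DEADLINE_CUTOFF_HOURS = 24

SOURCE_BUILT = "source-built or vendor source-built"
STRIP_DOWN_SCANNED = "strip-down with aggressive scanning"
STRIP_DOWN = "strip-down"


def recommend(security_level, patch_deadline_hours,
              compliance_required, low_tolerance_for_unknowns):
    """Map answers to the four questions onto an image strategy."""
    if security_level not in ("high", "medium", "low"):
        raise ValueError(f"unknown security level: {security_level!r}")

    # Any single strict requirement is enough to call for source-built.
    needs_source_built = (
        security_level == "high"
        or patch_deadline_hours < PATCH_DEADLINE_CUTOFF_HOURS
        or compliance_required
        or low_tolerance_for_unknowns
    )
    if needs_source_built:
        return SOURCE_BUILT
    if security_level == "medium":
        return STRIP_DOWN_SCANNED
    return STRIP_DOWN


print(recommend("high", 72, False, False))
print(recommend("medium", 72, False, False))
print(recommend("low", 72, False, False))
print(recommend("low", 4, False, False))
```

Treating the strict answers as an "any of" condition is a design choice; a team could equally weight the questions or require two or more triggers before paying the source-built cost.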
Conclusion: The Spectrum, Not Binary Choice
Strip-down versus source-built is not a binary choice; it is a spectrum. Most organizations use source-built or vendor source-built for security-critical production services, hybrid approaches for medium-security services (source-built runtime, strip-down build), and strip-down for internal and non-critical services.
The key insight is this: inherited trust (strip-down) scales poorly. As your containerized infrastructure grows, the attack surface grows, hidden dependencies multiply, and patching becomes reactive. Source-built approaches scale better operationally because they are explicit and reproducible.
If your organization is building a security-critical supply chain, source-built or vendor source-built images should be the foundation. Strip-down is fine for development, testing, and non-critical workloads.
Next Steps
Understand the technical foundations of minimal images and how they are built. Read Standard vs Distroless Images to compare standard Linux distributions with stripped-down variants. Explore What is a Container Image? to understand the structure and layers that make images composable. Delve into Container Image Layers: Deep Dive to learn how layers are used in both approaches.
Then explore how organizations implement source-built approaches in practice. Read Build Stage Security to understand securing the build process itself. Study Verified Source Philosophy for the architectural foundation of source-built images. Review The 11 Build Artifacts to see what gets produced during a secure build and why each artifact matters.
