When you run docker build against a single-stage Dockerfile, you create one image that includes everything: the source code, the compiler, the debugger, the package manager, and the compiled binary. This is convenient for development but a liability for production security. A 2 GB development image includes numerous tools you'll never use in production, such as the C compiler, Python test frameworks, git, curl, and man pages. An attacker who compromises the running application can use these tools to escalate the attack. A bloated image is also slower to pull across networks, takes longer to scan for vulnerabilities, and exposes more surface area to exploitation.
The solution is the two-image pattern: compile in a large development image containing all necessary build tools, then extract only the binary and run it in a tiny production image containing just the runtime dependencies.
The One-Image Anti-Pattern
A naive Dockerfile builds directly in the final image by starting with ubuntu:22.04, running apt-get to install build-essential, python3-dev, git, curl, and vim (100+ MB), copying app.py, requirements.txt, and main.py, installing dependencies with pip, and using python as the entrypoint.
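The Dockerfile described above might look roughly like the following. This is a sketch, not a recommended setup: the python3 and python3-pip packages and the choice of main.py as the entry script are assumptions added so that the example builds.

```dockerfile
# Anti-pattern: build tools and the application share the image that ships.
FROM ubuntu:22.04

# Compilers and conveniences the running application never needs.
# python3 and python3-pip are assumed here so that pip3 is available.
RUN apt-get update && apt-get install -y \
    build-essential python3 python3-pip python3-dev git curl vim

WORKDIR /app

# Code and requirements are copied together, so any code change
# invalidates the pip layer below.
COPY app.py requirements.txt main.py ./
RUN pip3 install -r requirements.txt

# Assumption: main.py is the entry script.
ENTRYPOINT ["python3", "main.py"]
```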
Several serious problems result from this approach. The image is huge, at 500 MB or more, because it includes tools that are never used at runtime; every deployment pulls 500 MB instead of 50 MB, which makes pulls unnecessarily slow. Scanning 500 MB of packages for vulnerabilities is slow and resource-intensive, and it strains security scanning infrastructure. The attack surface expands significantly, because an attacker who compromises the application has access to gcc, git, and other tools for escalation. Source files and build inputs that the runtime does not need leak into the image, which for compiled code risks exposing intellectual property. Finally, because the application files are copied in the same step as requirements.txt, any change to application code invalidates the cached dependency layer and forces pip to reinstall everything, making rebuilds slow and inefficient.
This pattern is unfortunately common in small projects and tutorials, often because developers prioritize simplicity over production readiness.
The Two-Image Pattern (Multi-Stage Build)
The solution separates build and runtime images. The builder stage includes everything needed to compile or build the application, including the compiler, build tools, and development libraries. The runtime stage includes only what's needed to execute the application at runtime. The result is dramatic: the build stage creates a 500 MB image containing Python, the compiler, and all build tools. The artifacts are then extracted from this stage and placed into the runtime stage, which requires only 50 MB to contain Python, the production dependencies, and the application code. The final image is just 50 MB—10 times smaller than the builder stage—with zero development tools present to complicate security.
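A minimal sketch of this split for a Python application follows. The virtual environment at /opt/venv and the entry script main.py are assumptions chosen for illustration; copying a virtual environment between stages relies on both images providing the same Python version at the same path.

```dockerfile
# Stage 1: builder. The full image carries compilers for native extensions.
FROM python:3.11 AS builder
WORKDIR /app

# Install dependencies into an isolated directory that is easy to copy.
# /opt/venv is an assumed location.
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: runtime. Only the interpreter, dependencies, and app code.
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY app.py main.py ./

# Assumption: main.py is the entry script.
CMD ["python", "main.py"]
```

Everything in the builder stage that is not named in a COPY --from instruction stays behind when the final image is produced.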
Compiled Languages: The Extreme Case
For compiled languages (Go, Rust, C), the benefit is even more dramatic.
Go Example
The builder stage includes the Go compiler (500+ MB), a C compiler (100+ MB), build tools (git, make, etc.), the Go standard library, and dependencies. The runtime stage includes only the compiled binary (5-10 MB for a typical application). Using scratch (an empty base image) as the runtime works when the Go binary is statically linked, which is the case when cgo is disabled (CGO_ENABLED=0); such a binary doesn't need glibc or any other OS libraries.
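As a rough sketch, a Go build along these lines could look like the following. The golang:1.22 tag, the /out/app output path, and the presence of a go.sum file are assumptions. Note that scratch contains no CA certificates or timezone data, so an application that needs them would have to copy them in as well.

```dockerfile
# Stage 1: builder with the full Go toolchain.
FROM golang:1.22 AS builder
WORKDIR /src

# Download dependencies first so this layer can be cached.
COPY go.mod go.sum ./
RUN go mod download

COPY . .

# CGO_ENABLED=0 disables cgo, producing a statically linked binary.
RUN CGO_ENABLED=0 GOOS=linux go build -o /out/app .

# Stage 2: empty base image holding only the binary.
FROM scratch
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
```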
Final image size: 5-10 MB (down from 500+ MB).
Rust Example
The builder stage includes the Rust compiler, the Cargo package manager, and all dependencies. The runtime stage uses debian:bookworm-slim (a minimal Debian) or distroless. The final image is ~100 MB (Debian slim plus the binary). It could be smaller with distroless, or with scratch if the binary is statically linked (for example, built against a musl target), but Debian slim provides glibc for compatibility.
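A sketch of the Rust variant, under a few assumptions: the binary is named myapp (hypothetical), and the builder uses the rust:1-bookworm tag so that the binary is linked against the same glibc generation that debian:bookworm-slim provides.

```dockerfile
# Stage 1: builder with rustc and Cargo.
FROM rust:1-bookworm AS builder
WORKDIR /src
COPY . .
RUN cargo build --release

# Stage 2: minimal Debian providing glibc for the dynamically linked binary.
FROM debian:bookworm-slim

# "myapp" is a hypothetical crate name; use your own binary name.
COPY --from=builder /src/target/release/myapp /usr/local/bin/myapp
ENTRYPOINT ["/usr/local/bin/myapp"]
```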
What Goes in Each Image
Development Image
The development image is large and includes everything needed to successfully build the application. This encompasses the compiler, interpreter, or SDK for the language; build tools such as make, gradle, npm, or cargo for orchestrating compilation; package managers like apt, yum, npm, or pip for installing dependencies; version control tools like git for accessing source code; debugging tools such as gdb or lldb for troubleshooting; testing frameworks for running unit and integration tests; documentation tools; and optionally IDE tools, Vim, curl, and similar utilities for developer convenience.
For Node.js, a development image installs the Node.js runtime, npm, and build-essential. This image is 500+ MB and also carries Python (needed to build some native modules) and everything else a developer might want.
Runtime (Production) Image
The runtime image is minimal and includes only what's strictly needed to execute the application. This means a minimal base operating system such as Alpine, distroless, or scratch as the foundation; the language runtime if applicable (such as Python, Node.js, or Java runtime environments, not the development SDKs); the application binary or compiled code; essential configuration files; and nothing else—no build tools, no development libraries, and no unnecessary utilities.
For Node.js, the runtime image can use node:18-alpine and contain only the compiled JavaScript, package.json, package-lock.json, and production dependencies. This image is 150-200 MB, much smaller than 500+ MB.
For Python, the development stage includes python:3.11, build tools, and requirements-dev.txt with pytest, black, and mypy. The runtime stage uses python:3.11-slim, copies only production dependencies, and includes app.py. The final image excludes test frameworks (pytest gone), linting tools (black, mypy gone), and build tools (gcc, etc. gone), with only Python runtime plus production dependencies plus app code at ~150 MB total.
The Build Artifact Flow
Multi-stage builds follow a predictable pattern that maximizes both build efficiency and runtime security. Stage 1 (Builder) starts with a large base image containing all necessary compilers and tools. The builder stage copies the source code, runs the build command(s) to compile the application, and generates the final artifact such as a binary, bundle, or compiled code. The artifact is then carefully extracted and copied into Stage 2 (Runtime), which uses a minimal base image and performs only the essential setup needed for runtime operation. The result is a final image that is small, clean, and production-ready.
The key advantage is that builder-stage contents, including source code, intermediate compiler output, and build logs, are discarded and do not appear in the final image. Only what you explicitly copy from the builder with COPY --from is kept in the runtime image, so production carries no unnecessary files or tools.
Distinguishing Dev and Prod Dependencies
Applications often have two distinct types of dependencies that serve different purposes and should be handled differently in multi-stage builds. Production dependencies are code that runs in production and must be shipped with the application for it to function. Development dependencies are code only needed during development for testing, linting, code formatting, type checking, and other development-time activities that do not need to be present in production.
Python
The builder stage includes pytest, black, mypy, and coverage. The builder runs tests with pytest, type checking with mypy, and compilation to bytecode. The runtime stage includes only flask, requests, and pydantic from requirements.txt. The final image has zero testing/linting tools.
Node.js
The builder stage installs ALL dependencies (production + dev) using npm ci. The builder runs the linter, the tests, and TypeScript compilation. The runtime stage installs only production dependencies using npm ci --omit=dev (the older --only=production flag is deprecated in current npm versions). The final image excludes all devDependencies, such as jest, typescript, @types/express, and eslint.
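One way this might be laid out is sketched below. The npm script names (lint, test, build), the dist output directory, and the dist/index.js entry point are assumptions about the project, not something the build requires.

```dockerfile
# Stage 1: builder with all dependencies, including devDependencies.
FROM node:18 AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .

# Assumed script names defined in package.json.
RUN npm run lint && npm test && npm run build

# Stage 2: runtime with production dependencies only.
FROM node:18-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Assumption: the TypeScript compiler writes its output to dist/.
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/index.js"]
```

If a production dependency compiles a native module at install time, the Alpine stage may need extra packages; in that case installing in the builder and copying node_modules across is an alternative worth considering.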
Java
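The builder uses maven:3.8-openjdk-17 and runs mvn clean package -DskipTests. The runtime uses openjdk:17-jdk-slim. The builder includes Maven and the full JDK; the runtime includes only the slim JDK. Note that a slim JDK image still ships the Java compiler, so a JRE-only image such as eclipse-temurin:17-jre follows the runtime-only principle more closely.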
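A sketch of the Java flow follows. It deliberately swaps in eclipse-temurin:17-jre as a JRE-only runtime instead of a JDK image, and it assumes the build produces target/app.jar (a hypothetical artifact name that depends on your pom.xml).

```dockerfile
# Stage 1: builder with Maven and the full JDK.
FROM maven:3.8-openjdk-17 AS builder
WORKDIR /build

# Resolve dependencies first so they are cached until pom.xml changes.
COPY pom.xml .
RUN mvn -B dependency:go-offline

COPY src ./src
RUN mvn -B clean package -DskipTests

# Stage 2: JRE-only runtime (an assumption; the text names a slim JDK image).
FROM eclipse-temurin:17-jre
WORKDIR /app

# "app.jar" is a hypothetical artifact name.
COPY --from=builder /build/target/app.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```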
Caching and Build Performance
Multi-stage builds interact with Docker's layer caching in ways that can significantly affect build performance. Each RUN, COPY, and ADD instruction creates a layer that Docker can cache independently. In a typical build, copying go.mod and go.sum followed by RUN go mod download creates Layer 1, which is cached unless go.mod or go.sum changes; COPY . . creates Layer 2, which is invalidated if any file changes; and RUN go build ... creates Layer 3, which must re-run if Layer 2 changed.
When you only change main.go, Layer 1 (go mod download) achieves a cache hit because go.mod hasn't changed, Layer 2 (COPY) misses the cache because main.go changed, and Layer 3 (go build) must re-run because Layer 2 was invalidated. The go mod download is cached, so the overall build is fast despite the source code change.
When you change a dependency in go.mod, Layer 1 (go mod download) misses the cache because go.mod changed, Layer 2 (COPY) must re-run, and Layer 3 (go build) must re-run. This is slower overall, but correct—the dependencies must be re-downloaded and the application must be recompiled to incorporate the dependency change.
Ordering for Build Speed
Put steps whose inputs rarely change early, and steps whose inputs change often late. A fast build iteration example starts by copying go.mod and go.sum, then runs go mod download, then copies the application code, then runs go build. When you change application code (common during development), only the final COPY and go build layers rebuild. The dependency layer is cached.
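The fast ordering can be sketched as a builder stage like this one. The golang:1.22 tag and the /out/app output path are assumptions.

```dockerfile
FROM golang:1.22 AS builder
WORKDIR /src

# Rarely changes: these layers stay cached until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download

# Changes often: only the layers from here down rebuild after a source edit.
COPY . .
RUN go build -o /out/app .
```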
If you reverse the order and copy all of the source before downloading dependencies, every change to application code invalidates the dependency layer and re-runs go mod download, which is a slow operation.
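For contrast, the slow ordering might look like the sketch below, using the same assumed golang:1.22 tag and /out/app output path. The only difference is where the source is copied.

```dockerfile
FROM golang:1.22 AS builder
WORKDIR /src

# Copying everything first means any source edit invalidates this layer.
COPY . .

# This download therefore re-runs on every code change.
RUN go mod download
RUN go build -o /out/app .
```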
When NOT to Use Multi-Stage Builds
Multi-stage builds are almost always beneficial for production images, but a few edge cases exist where they provide minimal benefit. For trivial applications consisting of a single interpreted script with no dependencies or build step, multi-stage doesn't help much because there's nothing to discard—the script itself is the artifact. Similarly, for development images used exclusively in local development environments, you might want a single image with all tools pre-installed for convenience, allowing developers to compile, test, and debug without additional setup. This approach is fine for development containers where size and security are less critical, but such development images should absolutely never be used in production deployments.
Variant Base Images: -dev, -slim, -alpine
Major base images offer variants optimized for different purposes in the build lifecycle. The python:3.11 full variant (500+ MB) includes build tools, documentation, and extras, making it suitable for builder stages where compilation might require additional utilities. The python:3.11-slim variant (150 MB) keeps the apt package manager but omits compilers and most documentation, making it appropriate for runtime stages if your application needs glibc and standard system libraries. The python:3.11-alpine variant (50 MB) uses musl libc instead of glibc and ships only the small apk package manager and minimal content, making it suitable for production runtime if your application and its dependencies are compatible with musl.
For multi-stage builds, the builder stage should use the full image, and the runtime stage should use the slim image if needed, or Alpine/distroless if compatible.
The -dev Variant Pattern
Some images offer -dev variants explicitly for development:
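    # Local development
    docker run -it python:3.11-dev bash

    # Production
    FROM python:3.11-slim

This approach lets developers get a full development environment while keeping production images small. The python:3.11-dev tag above is illustrative: the official python image does not publish a -dev tag, so check which variants your base image actually provides.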
Security Benefits of Separation
Separating development and runtime improves security in several ways. The production image has a smaller attack surface: fewer packages, fewer potential CVEs, and a smaller footprint overall, which directly limits what an attacker can accomplish. An attacker in production also has fewer capabilities: they cannot compile code with a compiler that doesn't exist, install packages with a package manager that isn't included, or debug with tools that were never shipped. Without a compiler, the attacker cannot build new native binaries inside the container, although any interpreter the application needs can still run code. Finally, a minimal image is harder to tamper with, because the tools an attacker would use to modify files or fetch replacements are not present.
Consider a concrete example: An attacker compromises your Python application and executes arbitrary code. In a large image with development tools, the attacker can run commands like gcc -o payload payload.c to compile malicious code, pip install backdoor-package to install a compromised package, execute python -c "import socket; ..." to create a reverse shell, or git clone malicious-repo to download additional tools.
In a minimal production image, several of those commands fail outright: gcc: command not found and git: command not found. pip fails too if the image omits it, as distroless-style images typically do; python:3.11-slim still ships pip unless you remove it. The Python interpreter itself remains, so python -c still works and a minimal image is not a complete defense. Even so, the attacker's options are severely limited, which contains the damage and buys time for incident response.
Practical Example: Complete Python Application
A realistic Python web application using the two-image pattern includes a builder stage starting from python:3.11, installing build dependencies, copying dependency files, installing all dependencies (prod + dev), copying source code, and running tests and linting. The runtime stage starts from python:3.11-slim, creates a non-root user, copies only production dependencies, copies application code, sets non-root user, enables health checks, and runs the application.
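The stages described above could be sketched as follows. Several details are assumptions made for illustration: the requirements-dev.txt file name, the /opt/venv location, the appuser account, and a health endpoint at http://localhost:8000/health.

```dockerfile
# Stage 1: builder. Installs everything, then runs tests and linting.
FROM python:3.11 AS builder
WORKDIR /app

# Build dependencies for packages with native extensions.
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt requirements-dev.txt ./

# Production dependencies go into a venv that will be shipped.
RUN python -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir -r requirements.txt

# Dev dependencies go into the builder's own site-packages and are never copied.
RUN pip install --no-cache-dir -r requirements.txt -r requirements-dev.txt

COPY . .
RUN pytest && black --check . && mypy .

# Stage 2: runtime. Slim base, non-root user, production dependencies only.
FROM python:3.11-slim
RUN useradd --create-home --uid 1000 appuser
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY --from=builder /app/app.py ./
USER appuser

# Assumed endpoint; urlopen raises on connection or HTTP errors,
# so the check exits non-zero when the app is unhealthy.
HEALTHCHECK --interval=30s --timeout=3s \
  CMD ["python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]

CMD ["python", "app.py"]
```

Because a failing test, lint, or type check aborts the builder stage, the runtime stage is only produced from code that passed those checks.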
The final image includes no test frameworks (pytest gone), no linting tools (black, mypy gone), and no build tools (gcc, etc. gone). It contains only the Python runtime, the production dependencies, and the application code; it runs as a non-root user, has health checks enabled, and is ~150 MB total.
Summary: Two Images for Success
The two-image pattern offers multiple significant benefits that compound to improve production reliability and security. It separates concerns by keeping build complexity contained in the builder stage separate from runtime simplicity in the production stage. It reduces image size by five to ten times in production images, directly improving deployment speed and registry efficiency. It improves security with fewer packages, fewer potential CVEs, and a limited attack surface for adversaries. It speeds deployment through faster image pulls across networks and faster container startup times. It simplifies vulnerability scanning with less surface area to scan and fewer false positives from development-only packages. It enables testing by allowing comprehensive test suites to run in the build stage while the runtime image can be deployed with confidence.
Build images are intentionally large and complex, including all tools needed for compilation and testing. Runtime images are small and focused, containing only what's needed to execute. Use both together as part of a complete, secure build and deployment pipeline.
For any application with compilation or a build step, multi-stage builds are not optional—they're essential for production readiness and security.
