Container Image Layers: A Deep Dive Into What's Inside Your Images

You run docker build, and each instruction creates a new layer. RUN apt-get install, layer. COPY app.py, layer. ENV VAR=value, no layer (it's metadata). But what exactly IS a layer? What data does it contain? How does the operating system stack them into a single filesystem? And why does the order of your Dockerfile instructions matter so much? This guide explains the engineering behind image layers — the foundation of container efficiency, caching, and security.

Table of Contents

What a Layer Actually Is
How Dockerfile Instructions Create Layers
Understanding UnionFS and OverlayFS
The Writable Container Layer
Layer Sharing Between Images
Layer Caching in Builds
Whiteout Files: How Deletions Work
The Security Implication of Layers
Tools for Inspecting Layers
Size Optimization Strategies
Next Steps

[Diagram: how Dockerfile instructions map to layers, and how layer caching and inheritance work. Panels: (1) Build Process, (2) Layer Caching]

What a Layer Actually Is

A layer is a compressed tarball containing filesystem changes — files added, modified, or removed since the previous layer.

Here's what actually exists in a registry: each layer is stored as a compressed tarball (a local runtime typically unpacks each one into its own directory). As an illustration, the file layer-1.tar.gz (approximately 80 MB when compressed) contains the base OS filesystem with directories like /bin/, /lib/, /usr/, /etc/, /var/, and thousands of files that make up the base operating system.

The file layer-2.tar.gz (approximately 50 MB compressed) contains Python runtime files including /usr/bin/python3, the entire /usr/lib/python3.9/ directory, /usr/share/doc/python3/, and all the libraries and runtime components that Python requires.

The file layer-3.tar.gz (approximately 2 MB compressed) contains your application code including /app/app.py, /app/requirements.txt, and other application-specific files.
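To make the idea of a layer as a tarball concrete, here is a minimal Python sketch that packs a couple of files into a gzip-compressed tarball and lists them again. The file contents and the helper names build_layer and list_layer are made up for illustration; real layers are produced by the image builder, not by code like this.

```python
import io
import tarfile

def build_layer(files: dict[str, bytes]) -> bytes:
    """Pack a mapping of path -> content into a gzip-compressed tarball."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def list_layer(blob: bytes) -> list[str]:
    """Return the file paths stored in a layer tarball."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return tar.getnames()

# Hypothetical contents standing in for the application layer
app_layer = build_layer({
    "app/app.py": b"print('hello')\n",
    "app/requirements.txt": b"flask\n",
})
layer_files = list_layer(app_layer)
print(layer_files)
```

The tarball holds only the two application files, which mirrors how the application layer above is tiny compared with the base OS layer.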

Whole files, but only the changed ones: each layer's tarball holds complete copies of the files that its instruction added or modified, not binary diffs and not a full snapshot of the filesystem. Layer 1 has the OS files. Layer 2 has only the Python files; the OS files are not repeated because they haven't changed. Layer 3 has only your app.

So how does a container see the OS, Python, and your app all at once? When the runtime stacks layers, it uses the kernel's union filesystem support (OverlayFS on modern Linux) to present them as a single unified view, without copying or duplicating anything.

How Dockerfile Instructions Create Layers

When you build an image, Docker processes your Dockerfile line by line:

FROM ubuntu:22.04              # ← Pulls existing image (base layers 1-20)
RUN apt-get update             # ← Creates layer 21
RUN apt-get install -y python3 # ← Creates layer 22
COPY app.py /app/              # ← Creates layer 23
WORKDIR /app                   # ← No layer (metadata only)
ENV PYTHONUNBUFFERED=1         # ← No layer (metadata only)
CMD ["python3", "app.py"]      # ← No layer (metadata only)

Which instructions create layers?

Instruction	Creates Layer	Why
`FROM`	No	Pulls existing layers from base image
`RUN`	Yes	Executes command, captures filesystem changes
`COPY`	Yes	Adds files; creates layer with those files
`ADD`	Yes	Like COPY but with remote URL support
`WORKDIR`	Usually no	Updates metadata (working directory); may add a small layer if it has to create the directory
`ENV`	No	Updates metadata (environment variables)
`EXPOSE`	No	Updates metadata (exposed ports)
`LABEL`	No	Updates metadata (labels)
`CMD`	No	Updates metadata (default command)
`ENTRYPOINT`	No	Updates metadata (entrypoint)
`USER`	No	Updates metadata (user)
`VOLUME`	No	Updates metadata (volumes)

Practical example:

FROM python:3.11-slim          # Base image: 5 layers

RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*  # Layer 6 (one RUN = one layer)

COPY requirements.txt /tmp/     # Layer 7
RUN pip install -r /tmp/requirements.txt && \
    rm /tmp/requirements.txt    # Layer 8

COPY app.py /app/               # Layer 9
COPY config.yaml /app/          # Layer 10

WORKDIR /app
ENV DEBUG=false
CMD ["python3", "app.py"]       # Metadata only

Result: 10 layers (5 from base + 5 new).
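The count of new layers can be checked mechanically. The following is a rough Python sketch that counts the RUN, COPY, and ADD instructions in a Dockerfile, treating backslash-continued lines as part of one instruction. The function name count_new_layers is hypothetical, and the parsing is deliberately naive (no heredocs, no parser directives).

```python
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}

def count_new_layers(dockerfile: str) -> int:
    """Count instructions that add a filesystem layer (RUN, COPY, ADD)."""
    count = 0
    continuing = False  # True while inside a backslash-continued instruction
    for raw in dockerfile.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not continuing:
            keyword = line.split(maxsplit=1)[0].upper()
            if keyword in LAYER_INSTRUCTIONS:
                count += 1
        continuing = line.endswith("\\")
    return count

EXAMPLE = r"""
FROM python:3.11-slim
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt && \
    rm /tmp/requirements.txt
COPY app.py /app/
COPY config.yaml /app/
WORKDIR /app
ENV DEBUG=false
CMD ["python3", "app.py"]
"""
new_layers = count_new_layers(EXAMPLE)
print(new_layers)  # 5 new layers on top of the base image
```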

Understanding UnionFS and OverlayFS

[Diagram: how OverlayFS stacks read-only lower layers with a writable upper layer to create a unified filesystem view. Panels: (1) Unified View (Container Sees), (2) OverlayFS Stack]

Here's where layers become elegant. The kernel provides filesystem features that stack multiple directories into a single unified view.

UnionFS (Generic)

UnionFS is the old approach that stacks multiple directories to create a unified view. From the container's perspective, a single filesystem is presented with /bin/bash, /lib/libc.so, /usr/bin/python3, /etc/passwd, /app/app.py, and other files all appearing in one view.

Behind the scenes, these files come from different layers stacked in order from top to bottom. Layer 3 (writable) at /tmp/layer3/ contains files added by COPY operations such as /app/app.py. Layer 2 at /tmp/layer2/ contains Python runtime files added by RUN operations like /usr/bin/python3. Layer 1 at /tmp/layer1/ is the base OS layer containing /bin/bash, /lib/libc.so, and other system files.

When you ask for /app/app.py, the kernel searches top to bottom. It first checks layer 3 (writable) and finds it there, returning it. If not found, it would check layer 2, then layer 1, and if not found anywhere, return "file not found". When you write a file to /tmp/newfile.txt, it's written to layer 3 (the writable layer) only.
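The top-to-bottom search can be sketched in a few lines of Python. This is only a model of the lookup order, using dictionaries in place of directories; the paths and contents are illustrative, and the real work is done inside the kernel.

```python
from typing import Optional

# Layers listed top to bottom; paths and contents are illustrative only
layers = [
    {"/app/app.py": "application code"},             # layer 3 (top)
    {"/usr/bin/python3": "python runtime"},          # layer 2
    {"/bin/bash": "shell", "/lib/libc.so": "libc"},  # layer 1 (base)
]

def lookup(path: str) -> Optional[str]:
    """Return the content from the first (topmost) layer holding the path."""
    for layer in layers:
        if path in layer:
            return layer[path]
    return None  # not found in any layer

found_app = lookup("/app/app.py")
found_bash = lookup("/bin/bash")
missing = lookup("/no/such/file")
print(found_app, found_bash, missing)
```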

OverlayFS (Modern)

OverlayFS is the modern approach, built into the mainline Linux kernel since version 3.18. It works on the same principle as UnionFS but is simpler and more efficient. From the container's perspective, a unified view is presented with /bin/bash, /lib/libc.so, /usr/bin/python3, /app/app.py, /tmp/newfile.txt (created by the container), and other files all appearing as one filesystem.

The backing structure for OverlayFS organizes layers as follows: /lower0 contains the read-only base OS layer, /lower1 contains the read-only Python layer, /lower2 contains the read-only app.py layer added via COPY, /upper is a writable layer for container modifications, and /work is an internal work directory. The overlay filesystem driver stacks /lower0, /lower1, /lower2, and /upper and mounts them as a single merged filesystem, with the kernel deciding which layer each file comes from.
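For readers who want to see what such a mount looks like, here is a small Python sketch that builds the option string for an overlay mount from the directories named above. The mount point /merged and the helper name overlay_options are assumptions for illustration. The one detail worth noticing is ordering: in the lowerdir option the topmost lower directory comes first.

```python
def overlay_options(lower_dirs: list[str], upper: str, work: str) -> str:
    """Build the option string for an overlay mount.

    lower_dirs is given bottom-to-top (base layer first); OverlayFS expects
    the topmost lower directory first, so the list is reversed.
    """
    lowerdir = ":".join(reversed(lower_dirs))
    return f"lowerdir={lowerdir},upperdir={upper},workdir={work}"

opts = overlay_options(["/lower0", "/lower1", "/lower2"], "/upper", "/work")
# Running the printed command needs root and an existing /merged directory
print(f"mount -t overlay overlay -o {opts} /merged")
```

The script only prints the command; it does not mount anything.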

The key insight: Lower layers are read-only and shared between containers. The upper layer (writable) is per-container. If 100 containers run the same image, they share all the lower layers (read-only) and each has its own upper layer (writable).

The Writable Container Layer

When you run docker run, Docker creates a new writable layer on top of the image layers. The image itself contains readonly layers: Layer 1 is the base OS, Layer 2 contains Python, and Layer 3 contains app.py. These three image layers are readonly and shared across all containers that use this image. On top of these read-only image layers, OverlayFS stacks a container layer (upper), which is writable and specific to each running container instance.

Every time the container writes a file, it goes to the container layer. Reading a file checks the container layer first, then image layers.

Example: If the base image has /etc/hostname with content "docker", and the container reads it, the kernel checks the container layer first (not found), then the image layers (found in layer 1), and returns "docker".

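If the container writes /etc/hostname with "myapp", OverlayFS first copies the file up into the container layer (copy-on-write) and applies the change there. The file now exists in both places, but the container sees its own version from the container layer, and the image layer is untouched.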
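Python's collections.ChainMap behaves much like this read/write rule, which makes it a convenient way to model it: lookups search the maps in order, while writes always go to the first map. The sketch below is an analogy under that assumption, not how Docker is implemented; the file contents are illustrative.

```python
from collections import ChainMap

# Read-only image layers merged into one dict for brevity (illustrative content)
image_layers = {"/etc/hostname": "docker", "/bin/bash": "shell"}

def new_container() -> ChainMap:
    """Give each container its own empty writable layer over the shared image."""
    return ChainMap({}, image_layers)

c1 = new_container()
c2 = new_container()

before = c1["/etc/hostname"]        # falls through to the image layer
c1["/etc/hostname"] = "myapp"       # the write lands in c1's own top layer

print(before, c1["/etc/hostname"], c2["/etc/hostname"])
print(image_layers["/etc/hostname"])  # the shared image layer is unchanged
```

The second container still sees the original value, because the write never touched the shared image data.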

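Critical: The container layer is tied to the container's lifetime. It survives a stop and restart, but it is deleted when the container is removed (docker rm, or automatically with docker run --rm). If you write data there expecting it to outlive the container, it's lost.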

This is why volumes exist: you mount a path outside the container layer to make it persistent.

Layer Sharing Between Images

Here's the power of the layer architecture. Image A (ubuntu + Python + Flask) contains three layers: the base OS, Python, and Flask. Image B (ubuntu + Python + Django) contains the same base OS layer and Python layer but with Django instead of Flask. Image C (ubuntu + Node.js) shares the same base OS layer but uses Node.js instead of Python.

On disk, registries use content-addressable storage where blobs are identified by sha256 hash. The base OS layer (sha256:aaa...) is referenced by all three images (A, B, and C). The Python layer (sha256:bbb...) is referenced by images A and B. Flask (sha256:ccc...) is used only by image A. Django (sha256:ddd...) is used only by image B. Node.js (sha256:eee...) is used only by image C. This structure means the base OS layer is stored once on disk but reused by all three images, and identical layers between images automatically share storage.

Storage efficiency: The base OS layer is stored once but used by three images. If you pull all three images, you only download the base OS once.

This is one reason Docker and registries use content-addressable storage: layers are identified by sha256 digest, not by name. Identical content produces identical digests, enabling automatic deduplication.
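Content-addressable deduplication is easy to demonstrate with a toy store. In this Python sketch, short byte strings stand in for real layer tarballs, and the BlobStore class is a hypothetical stand-in for a registry or local image store.

```python
import hashlib

class BlobStore:
    """Toy content-addressable store: blobs are keyed by their sha256 digest."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # identical content is stored once
        return digest

store = BlobStore()
# Byte strings stand in for real layer tarballs
base, python, flask, django, node = b"base-os", b"python", b"flask", b"django", b"nodejs"

image_a = [store.put(x) for x in (base, python, flask)]
image_b = [store.put(x) for x in (base, python, django)]
image_c = [store.put(x) for x in (base, node)]

print(len(image_a) + len(image_b) + len(image_c))  # 8 layer references
print(len(store.blobs))                            # 5 blobs actually stored
```

Three images reference eight layers in total, yet only five distinct blobs are kept, because the digest of identical content is always the same.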

Layer Caching in Builds

When you run docker build, Docker uses cache to speed up rebuilds.

First build:

FROM ubuntu:22.04              # Pull base (20 layers)
RUN apt-get install -y curl    # Create layer 21
RUN apt-get install -y python3 # Create layer 22
COPY app.py /app/              # Create layer 23

Build time: 2 minutes (downloading, installing, copying).

Second build (no changes):

FROM ubuntu:22.04              # Reuse base (20 layers) - instant
RUN apt-get install -y curl    # Reuse layer 21 - instant
RUN apt-get install -y python3 # Reuse layer 22 - instant
COPY app.py /app/              # Reuse layer 23 - instant

Build time: 0.1 seconds (only reading cache).

Third build (only app.py changed):

FROM ubuntu:22.04              # Reuse base (20 layers) - instant
RUN apt-get install -y curl    # Reuse layer 21 - instant
RUN apt-get install -y python3 # Reuse layer 22 - instant
COPY app.py /app/              # Invalidate! (file changed, rebuild layer 23)

Build time: 0.5 seconds (reuse 22 layers, rebuild 1).

Fourth build (installed a new package):

FROM ubuntu:22.04              # Reuse base (20 layers) - instant
RUN apt-get install -y curl    # Reuse layer 21 - instant
RUN apt-get install -y python3 && \
    apt-get install -y vim     # Invalidate! (RUN command changed, rebuild)
                               # Now layer 22 is new
COPY app.py /app/              # Invalidate! (layer 22 changed, must rebuild)

Build time: about 2 minutes (reuse 21 layers, rebuild 2).

Why instruction order matters: If you put COPY app.py before RUN pip install, every time you change app.py, the build invalidates the pip install cache. You'll spend 2 minutes reinstalling packages for a one-line code change.

Better order:

FROM python:3.11-slim
COPY requirements.txt /tmp/    # Relatively static
RUN pip install -r /tmp/requirements.txt  # Cached until requirements change
COPY app.py /app/              # Frequently changes
CMD ["python3", "app.py"]

Now changing app.py doesn't invalidate the pip cache.
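The effect of instruction order on the cache can be modeled with a chain of hashes. The Python sketch below is a simplification, not Docker's actual cache-key algorithm: it assumes each step's key depends on the parent step's key, the instruction text, and the content of any copied file. The file contents and function names are made up.

```python
import hashlib

def cache_keys(steps: list[tuple[str, bytes]]) -> list[str]:
    """Derive one key per step from the parent key, the instruction text,
    and the content of any files the instruction copies in."""
    keys, parent = [], ""
    for instruction, file_content in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        h.update(file_content)
        parent = h.hexdigest()
        keys.append(parent)
    return keys

def reused(old: list[str], new: list[str]) -> int:
    """Count leading steps whose keys match, i.e. steps served from cache."""
    n = 0
    for a, b in zip(old, new):
        if a != b:
            break
        n += 1
    return n

def build(order: str, app: bytes) -> list[tuple[str, bytes]]:
    reqs = b"flask\n"  # hypothetical requirements.txt content
    if order == "bad":   # app code copied before dependencies are installed
        return [("COPY app.py /app/", app),
                ("COPY requirements.txt /tmp/", reqs),
                ("RUN pip install -r /tmp/requirements.txt", b"")]
    return [("COPY requirements.txt /tmp/", reqs),
            ("RUN pip install -r /tmp/requirements.txt", b""),
            ("COPY app.py /app/", app)]

bad = reused(cache_keys(build("bad", b"v1")), cache_keys(build("bad", b"v2")))
good = reused(cache_keys(build("good", b"v1")), cache_keys(build("good", b"v2")))
print(bad, good)  # 0 steps reused with the bad order, 2 with the good order
```

Because every key includes its parent's key, one changed step invalidates everything after it, which is why the frequently changing COPY belongs last.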

Whiteout Files: How Deletions Work

Here's a tricky aspect: you can't actually delete files from a lower layer.

Imagine layer 1 contains a 500 MB file you don't need:

Layer 1: /usr/share/doc/big-library/docs.tar (500 MB)

You want to remove it:

FROM ubuntu:22.04
RUN rm /usr/share/doc/big-library/docs.tar

Does the 500 MB file disappear? No. It's still in layer 1. You can't delete from a readonly layer.

Instead, Docker creates a whiteout file:

Layer 1: /usr/share/doc/big-library/docs.tar (500 MB)
Layer 2: /usr/share/doc/big-library/.wh.docs.tar (0 bytes, special whiteout marker)

When the kernel stacks layers, it sees the whiteout marker and hides the file from layer 1. The file is still stored (waste), but hidden.

This has profound implications for image design and security. If you have a secret in layer 1 and delete it in layer 2, the secret is still in the image. Anyone with access to the image can extract layer 1 and read it. Similarly, size optimization fails if you download a 500 MB package and delete it. The image is still 500 MB larger, with the file just hidden, not actually removed.
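A short Python model shows both halves of the problem: the file disappears from the merged view, yet the stored size does not shrink. Layers are represented here as dictionaries of path to size; the sizes are illustrative and the function name visible_files is hypothetical.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."

def visible_files(layers: list[dict[str, int]]) -> dict[str, int]:
    """Apply layers bottom to top; a '.wh.<name>' entry hides <name> below it."""
    merged: dict[str, int] = {}
    for layer in layers:
        for path, size in layer.items():
            directory, name = posixpath.split(path)
            if name.startswith(WHITEOUT_PREFIX):
                hidden = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
                merged.pop(hidden, None)
            else:
                merged[path] = size
    return merged

MB = 1024 * 1024
layer1 = {"/usr/share/doc/big-library/docs.tar": 500 * MB, "/bin/bash": 1 * MB}
layer2 = {"/usr/share/doc/big-library/.wh.docs.tar": 0}

seen = visible_files([layer1, layer2])
stored = sum(size for layer in (layer1, layer2) for size in layer.values())

print(sorted(seen))   # only /bin/bash is visible
print(stored // MB)   # 501: the hidden file still counts toward image size
```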

Solution: Multi-stage builds:

# Build stage
FROM ubuntu:22.04 AS builder
# build-essential adds roughly 200 MB
RUN apt-get update && apt-get install -y build-essential
COPY source.tar /src/
RUN cd /src && tar -xf source.tar && make

# Runtime stage
FROM ubuntu:22.04
# build-essential is not included (0 MB)
COPY --from=builder /src/app /app/

Build artifacts are in the builder stage, which is discarded. The runtime stage is small because it doesn't inherit build tools.

The Security Implication of Layers

Critical: Do not assume that deleting a secret makes it gone.

If you do:

FROM base
RUN echo "SECRET_KEY=abc123" >> /app/config.sh
RUN rm /app/config.sh

The secret is in one of your layers. Anyone with access to the image can extract that layer and read it.

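# Attacker does this:
docker pull myimage:v1.0
docker save myimage:v1.0 -o image.tar
mkdir image && tar -xf image.tar -C image
# Layers sit inside the archive as tarballs; -a makes grep search them as text
# (if the layer blobs are gzip-compressed, unpack each one first)
grep -ra "SECRET_KEY" image/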

The secret is found.
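The same search can be sketched in Python without Docker at all. The example below builds two in-memory layer tarballs that mimic the Dockerfile above (one writes the secret, one records only a whiteout) and then scans every layer for the string. The helper names are hypothetical, and real layers would come from an exported image rather than from make_layer.

```python
import io
import tarfile

def make_layer(files: dict[str, bytes]) -> bytes:
    """Build an uncompressed layer tarball in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def find_in_layers(layers: list[bytes], needle: bytes) -> list[tuple[int, str]]:
    """Return (layer index, path) for every file containing the needle."""
    hits = []
    for index, blob in enumerate(layers):
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
            for member in tar.getmembers():
                if not member.isfile():
                    continue
                content = tar.extractfile(member).read()
                if needle in content:
                    hits.append((index, member.name))
    return hits

# Layer 0 writes the secret; layer 1 only records a whiteout for the deleted file
layers = [
    make_layer({"app/config.sh": b"SECRET_KEY=abc123\n"}),
    make_layer({"app/.wh.config.sh": b""}),
]
hits = find_in_layers(layers, b"SECRET_KEY")
print(hits)
```

The whiteout in the second layer changes nothing about the first: the secret is still sitting in layer 0 for anyone who reads the tarball directly.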

Best practices:

Several practices protect image security:

Never include secrets in images. Supply them at run time through environment variables, mounted volumes, or a secret management service. Values set with ENV in a Dockerfile are stored in the image config, so they count as baked in.
Use multi-stage builds so that only the final stage's layers ship in the image.
Scan images for hardcoded secrets with tools such as Trivy, Snyk, and TwistLock (now part of Prisma Cloud).
Keep build tools and intermediate artifacts out of the runtime image.

Tools for Inspecting Layers

Several tools help you understand what's in your image's layers:

dive

dive shows you every file in each layer and what's added or removed:

dive myimage:latest

Output shows the layer list (size, command that created it), file tree (what's in the current layer), and detailed file view (filesystem changes per layer).

crane

crane comes from Google's go-containerregistry project and lets you inspect manifests and layers:

# See manifest
crane manifest myimage:latest

# See config
crane config myimage:latest

# See layer diff IDs (digests of the uncompressed layers)
crane config myimage:latest | jq '.rootfs.diff_ids'

skopeo

skopeo is a multi-registry tool for inspecting, copying, and signing images:

# Inspect image
skopeo inspect docker://myimage:latest

# Copy image between registries
skopeo copy docker://docker.io/ubuntu:22.04 docker://myregistry/ubuntu:22.04

# See layers
skopeo inspect --config docker://ubuntu:22.04 | jq '.rootfs.diff_ids'

docker inspect

docker inspect is built into Docker and provides layer information:

# See all layers
docker inspect myimage:latest | jq '.[0].RootFS.Layers'

# See image config
docker inspect myimage:latest | jq '.[0].Config'

# See image history (which Dockerfile commands created which layers)
docker history myimage:latest
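If you prefer to post-process this output in a script, a rough Python sketch might look like the following. It reads the RootFS.Layers field from docker inspect JSON and counts how many leading layers two images share. The sample JSON and its shortened digests are made up for illustration; in practice the input would be the saved output of docker inspect.

```python
import json

def layer_digests(inspect_json: str) -> list[str]:
    """Return RootFS.Layers from `docker inspect` JSON output."""
    return json.loads(inspect_json)[0]["RootFS"]["Layers"]

def common_base(a: list[str], b: list[str]) -> int:
    """Count leading layers that two images share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count

# Made-up, shortened digests; real ones are 64 hex characters
image_a = json.dumps([{"RootFS": {"Type": "layers",
                                  "Layers": ["sha256:aaa", "sha256:bbb", "sha256:ccc"]}}])
image_b = json.dumps([{"RootFS": {"Type": "layers",
                                  "Layers": ["sha256:aaa", "sha256:bbb", "sha256:ddd"]}}])

shared = common_base(layer_digests(image_a), layer_digests(image_b))
print(shared)  # 2 shared leading layers
```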

Size Optimization Strategies

Understanding layers is key to optimizing image size.

Strategy 1: Combine RUN Commands

Bad (multiple layers):

RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y python3
RUN apt-get install -y vim
RUN rm -rf /var/lib/apt/lists/*

Each RUN is a layer. The rm command creates a separate layer with whiteouts, not actual deletions.

Good (single layer):

RUN apt-get update && \
    apt-get install -y curl python3 vim && \
    rm -rf /var/lib/apt/lists/*

One RUN = one layer. The cleanup happens before the layer is committed, so the deleted files never make it into the layer at all (no whiteouts needed).
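A back-of-the-envelope model makes the size difference visible. The Python sketch below assumes made-up sizes (120 MB of packages, 40 MB of apt package lists) and treats whiteout entries as taking no space; it is arithmetic for illustration, not a measurement.

```python
MB = 1  # sizes below are in megabytes and purely illustrative

def layer_size(added: int, removed_in_same_step: int) -> int:
    """A layer stores what exists when its instruction finishes, so files
    created and deleted within the same RUN never reach the tarball."""
    return added - removed_in_same_step

packages, apt_lists = 120 * MB, 40 * MB

# Separate RUNs: the lists are baked into an early layer; the later rm
# layer only adds whiteouts (counted as 0 here)
separate = [layer_size(apt_lists, 0), layer_size(packages, 0), layer_size(0, 0)]

# Single RUN: lists are downloaded and removed before the layer is committed
combined = [layer_size(packages + apt_lists, apt_lists)]

print(sum(separate), sum(combined))  # 160 vs 120
```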

Strategy 2: Multi-Stage Builds

Bad (build tools in final image):

FROM golang:1.20
COPY . /src/
RUN cd /src && go build -o app
CMD ["/src/app"]
# Final image: roughly 900 MB (includes Go compiler)

Good (build tools discarded):

FROM golang:1.20 AS builder
COPY . /src/
# CGO_ENABLED=0 yields a static binary, which scratch requires
RUN cd /src && CGO_ENABLED=0 go build -o app

FROM scratch
COPY --from=builder /src/app /app
CMD ["/app"]
# Final image: roughly 10 MB (only the binary)

Strategy 3: Use Lean Base Images

Base Image	Approx. Size	Use Case
`scratch`	0 MB	Static binaries only
`alpine`	7 MB	Minimal everything
`debian:bookworm-slim`	50 MB	Lightweight + tools
`ubuntu:22.04`	77 MB	Standard tooling
`python:3.11-slim`	150 MB	Python + lean OS
`python:3.11`	900 MB	Python + full toolchain

Choose based on what you need.

Strategy 4: Pin Versions and Use Layer Caching

# Bad (cache busted, rebuilds everything):
FROM ubuntu:latest
RUN apt-get update && apt-get install -y curl
COPY . /app/

# Good (versioned, cacheable):
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl=7.81.0-1ubuntu1.15
COPY . /app/

Pinned versions mean layers don't change unless you explicitly change them. The cache stays valid.

Strategy 5: Minimize Layer Count

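Every layer is one more blob to download and verify and one more directory for OverlayFS to stack, so images with many dozens of layers tend to pull and start a little slower.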

Combine related operations into fewer RUNs and use multi-stage builds to keep the final image lean.

Next Steps

Layers are fundamental to container efficiency, security, and reproducibility.

Key takeaways:

Layers are compressed tarballs of filesystem changes, stacked via OverlayFS or UnionFS.
Each Dockerfile RUN, COPY, or ADD instruction creates a layer.
Docker caches layers; instruction order matters for build speed.
Deletions don't remove files (whiteouts hide them); use multi-stage builds instead.
Multiple images share identical layers through content-addressable storage.
Secrets in layers are extractable; never hardcode secrets.
Use dive, crane, skopeo, and docker inspect to investigate layers.

Practical next steps:

Run docker history on an image to see which command created each layer
Use dive to explore a real image's layer contents
Rebuild an image with fewer RUN commands and measure the size difference
Create a multi-stage build for an application you maintain
Use docker inspect to compare layer digests across two builds of the same image (matching digests indicate a reproducible build)

Key Concepts Summary

Understanding container image layers requires grasping several interconnected ideas. A layer is a compressed tarball containing the filesystem changes made by one step of the build. Layers are stacked using OverlayFS, where read-only lower layers combine with a writable upper layer to create a unified view. RUN, COPY, and ADD instructions create layers, while ENV, WORKDIR, and CMD only update metadata. Layer caching speeds up rebuilds, making instruction order critical for build performance. Deletions in a later layer don't remove files from earlier ones; whiteout markers only hide them, which is why cleanup has to happen in the same RUN or through a multi-stage build. Multiple images share identical layers automatically through content-addressable storage, which saves disk space and bandwidth. Secrets written into a layer are never truly deleted and can be extracted by anyone with image access, so inject them at run time through environment variables or a secrets manager instead. Tools like dive, crane, and skopeo let you inspect layer contents and understand what's inside your images. Understanding layers is essential for optimization, security, and debugging container builds and deployments.