What is a Container Image?

What Makes Container Images Repeatable and Auditable

Without immutable images, deployment becomes fragile: the same "name" could refer to different artifacts at different times, which makes debugging and security audits unreliable. Container images solve this by being frozen at build time. You build an image once, it is identified by a digest (a hash of its content), and that exact artifact can be retrieved, scanned, audited, and verified for as long as it is stored.

A container image is a static, immutable snapshot of everything needed to run an application: code, dependencies, libraries, and filesystem configuration. Once built, it never changes. Every container created from that image starts from the same filesystem, with no surprises at runtime and no configuration drift, which is the core promise of consistent, reproducible deployments.

    graph TB
        Dockerfile["Dockerfile<br/>Build Instructions"] --> Build["Build Process"]
        Build --> Image["Container Image<br/>Immutable<br/>Versioned"]
        Image --> Tag["Tagged<br/>myapp:v1.0.0<br/>sha256:abc123"]
        Tag --> Container1["Container 1<br/>Identical Copy"]
        Tag --> Container2["Container 2<br/>Identical Copy"]
        Tag --> Container3["Container 3<br/>Identical Copy"]
        style Tag fill:#fff9c4
        style Container1 fill:#c8e6c9
        style Container2 fill:#c8e6c9
        style Container3 fill:#c8e6c9

The Layer Structure

Understanding layers is essential because layers are what make images reusable and efficient. A container image is like a stack of read-only folders. In an illustrative Python image, Layer 1 is the base OS (Ubuntu) at 80MB, Layer 2 adds system libraries from apt (100MB), Layer 3 adds the Python interpreter (250MB), Layer 4 holds configuration files (1MB), and Layer 5 at the top is your application code (5MB). Each layer sits on top of the previous one and records a set of filesystem changes. The total image size is the sum of all layers, 436MB in this example.

When you run a container, Docker stacks the layers using a union filesystem, adds a thin writable layer on top (the only mutable layer), and presents your application with a single, complete filesystem view.

This layered design has four main benefits. Reusability: 1,000 images can share Layer 1 (the Ubuntu base) without duplication. Efficiency: only changed layers need to be downloaded or stored. Speed: builds reuse cached layers. Transparency: you can inspect exactly what changed in each layer.
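
The stacking behavior described above can be modeled in a few lines of Python. This is a rough sketch of the idea only, not how Docker's storage drivers are implemented: the layer sizes come from the text, while the file names, file contents, and the helper names `image_size_mb` and `start_container` are made up for illustration.

```python
from collections import ChainMap

# Illustrative layers, bottom to top. Sizes (MB) are the figures from the text;
# file names and contents are invented for this sketch.
LAYERS = [
    {"name": "base OS", "size_mb": 80, "files": {"/etc/os-release": "Ubuntu"}},
    {"name": "system libraries", "size_mb": 100, "files": {"/usr/lib/libssl.so": "libssl"}},
    {"name": "Python interpreter", "size_mb": 250, "files": {"/usr/bin/python3": "python"}},
    {"name": "configuration", "size_mb": 1, "files": {"/etc/app.conf": "debug=false"}},
    {"name": "application code", "size_mb": 5, "files": {"/app/app.py": "print('hi')"}},
]


def image_size_mb(layers):
    """Total image size is the sum of its layer sizes."""
    return sum(layer["size_mb"] for layer in layers)


def start_container(layers):
    """Return (merged filesystem view, writable layer) for a new container."""
    writable = {}
    # ChainMap looks up keys left to right, so the writable layer comes first,
    # followed by the read-only image layers from top to bottom.
    read_only = [layer["files"] for layer in reversed(layers)]
    return ChainMap(writable, *read_only), writable


view, writable = start_container(LAYERS)
view["/etc/app.conf"] = "debug=true"  # writes land in the writable layer only

print(image_size_mb(LAYERS))                # 436
print(view["/etc/app.conf"])                # debug=true
print(LAYERS[3]["files"]["/etc/app.conf"])  # debug=false (image layer untouched)
```

A second container started from the same layers would again see `debug=false`, since the change lives only in the first container's writable layer.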

From Dockerfile to Image: The Build Process

A Dockerfile is a text file with instructions to build an image. Instructions that change the filesystem each add a layer; the rest only record metadata.

    # Step 1: Start from a base image
    FROM ubuntu:22.04

    # Step 2: Run commands to install dependencies
    RUN apt-get update && \
        apt-get install -y python3 python3-pip

    # Step 3: Copy your application code
    COPY app.py /app/app.py
    COPY requirements.txt /app/requirements.txt

    # Step 4: Install Python dependencies
    RUN pip install -r /app/requirements.txt

    # Step 5: Set the default command
    CMD ["python3", "/app/app.py"]

Each Dockerfile instruction serves a specific purpose in building the image. The FROM instruction chooses the base image, whose layers form the bottom of the stack. The RUN instruction executes commands, such as installing packages or compiling code, and creates a layer. COPY copies files from the build context into the image and also creates a layer. ADD is similar to COPY but can additionally extract local tarballs and fetch from URLs. ENV sets environment variables and WORKDIR sets the working directory for subsequent instructions; both are recorded in the image configuration rather than adding file content (WORKDIR creates the directory if it does not already exist). EXPOSE documents which ports the application listens on and only adds metadata. CMD specifies the default command to run when the container starts, and ENTRYPOINT sets the executable that the container always runs; both add only metadata.
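
As a rough illustration, the sketch below sorts the instructions of the sample Dockerfile into those that add filesystem layers and those that only record metadata. It assumes the rule stated in Docker's documentation that only RUN, COPY, and ADD create layers, and it treats WORKDIR as metadata for simplicity. The helper names `instructions` and `summarize` are hypothetical, and the parsing is deliberately minimal.

```python
# Assumption: per Docker's documentation, only these instructions add layers.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}

DOCKERFILE = r"""
FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y python3 python3-pip
COPY app.py /app/app.py
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
CMD ["python3", "/app/app.py"]
"""


def instructions(dockerfile_text):
    """Yield instruction keywords, joining backslash-continued lines."""
    pending = ""
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pending += line
        if pending.endswith("\\"):
            pending = pending[:-1] + " "  # instruction continues on next line
            continue
        yield pending.split()[0].upper()
        pending = ""


def summarize(dockerfile_text):
    """Return (instructions that add layers, instructions that add metadata)."""
    added_layers, metadata = [], []
    for keyword in instructions(dockerfile_text):
        if keyword in LAYER_INSTRUCTIONS:
            added_layers.append(keyword)
        elif keyword != "FROM":  # FROM reuses the base image's existing layers
            metadata.append(keyword)
    return added_layers, metadata


print(summarize(DOCKERFILE))
# (['RUN', 'COPY', 'COPY', 'RUN'], ['CMD'])
```

Under these assumptions the sample Dockerfile adds four layers on top of whatever layers ubuntu:22.04 already contains.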

Build command:

    # Build the image
    docker build -t myapp:1.0.0 .

    # Flags explained:
    # -t myapp:1.0.0 = tag (name and version)
    # . = build context (the current directory, where Docker looks for the Dockerfile by default)

What Docker does:

Docker reads the Dockerfile from top to bottom and executes each instruction in order, caching the result of each step so that unchanged steps can be reused on the next build. It then tags the final image with your name and version and stores it locally (use docker push to upload it to a registry).
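
One way to picture build caching is as a chain of keys, where each step's key depends on its own instruction and on the key of the step before it. The sketch below is a simplified model under that assumption, not Docker's actual cache implementation (which also takes the checksums of copied files into account). The function names and the `#content=v2` marker, which stands in for a changed app.py, are hypothetical.

```python
import hashlib


def cache_keys(instructions):
    """Return one cache key per instruction; each key depends on all earlier ones."""
    keys = []
    parent = ""
    for instruction in instructions:
        key = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(key)
        parent = key
    return keys


def reusable_steps(old, new):
    """Count how many leading steps of `new` can be served from `old`'s cache."""
    count = 0
    for old_key, new_key in zip(cache_keys(old), cache_keys(new)):
        if old_key != new_key:
            break  # the first miss invalidates every later step as well
        count += 1
    return count


original = [
    "FROM ubuntu:22.04",
    "RUN apt-get update && apt-get install -y python3 python3-pip",
    "COPY app.py /app/app.py",
    "COPY requirements.txt /app/requirements.txt",
    "RUN pip install -r /app/requirements.txt",
    'CMD ["python3", "/app/app.py"]',
]

# Simulate editing app.py: the third step changes, everything before it does not.
edited = list(original)
edited[2] = "COPY app.py /app/app.py #content=v2"

print(reusable_steps(original, original))  # 6
print(reusable_steps(original, edited))    # 2
```

In this model, changing app.py keeps the first two steps cached but forces the dependency install to run again, which is why Dockerfiles often copy requirements.txt and install dependencies before copying frequently edited code.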

The FROM Instruction: Base Images

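The FROM instruction is the first instruction in almost every Dockerfile (only comments, parser directives, and ARG may come before it). It specifies the base image, which is the starting point for everything that follows.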

Several families of base images are available. Official images are maintained and receive regular security updates: ubuntu:22.04 is a full Ubuntu OS at approximately 80MB, python:3.11-slim is Python on a minimal OS at approximately 200MB, python:3.11-alpine is Python on Alpine Linux at approximately 50MB, and node:18-bullseye is Node.js on Debian at approximately 1GB.

Cloud providers publish images for their own platforms, such as Google Cloud's gcr.io/cloud-builders/docker and AWS's public.ecr.aws/docker/library/python:3.11.

CleanStart secure images are hardened, pre-patched, and cryptographically signed for production deployments; they include cleanstart/python:3.11-prod, cleanstart/node:18-prod, and cleanstart/ubuntu:22.04-prod.

Base image selection matters for four reasons. Size: -slim and -alpine variants are smaller but include fewer troubleshooting tools. Security: official images receive regular patches, while CleanStart images undergo additional verification. Features: some images include dev tools while others are minimal. Maintainability: prefer long-term support versions such as Ubuntu 22.04 LTS over shorter-lived versions like 23.04.

Multi-stage builds (separate base images for build vs runtime):

    # Stage 1: Build environment (large, has compilers)
    FROM golang:1.21 AS builder
    COPY . /src
    RUN cd /src && go build -o app

    # Stage 2: Runtime environment (tiny, just the binary)
    FROM cleanstart/ubuntu:22.04-prod
    COPY --from=builder /src/app /app/app
    CMD ["/app/app"]

    # Result: Final image is tiny (~150MB instead of 2GB)
    # Only Stage 2 layers end up in the final image

Tags vs Digests: Versions and Guarantees

Every image has a name and version, but there are two ways to reference them.

Tags (e.g., myapp:1.0.0) are human-readable but mutable: you can rebuild and push a different image under the same tag. The latest tag is especially risky because it can change without warning. Tags are the easiest way to reference an image, but they give no guarantee about which image you get.

    # Push with tag
    docker tag myapp:1.0.0 myregistry/myapp:1.0.0
    docker push myregistry/myapp:1.0.0

    # Pull by tag (gets whatever image the tag points to right now)
    docker pull myregistry/myapp:1.0.0

Digests (e.g., sha256:a3f5c6b...) are cryptographic hashes computed from the image's content, so they are immutable: the same image always has the same digest, and any change to the image produces a different one. A reference like myregistry/myapp@sha256:a3f5c6b... always pulls the exact same image. Digests are longer to type, but they are cryptographically verifiable.

When you push an image with docker push myregistry/myapp:1.0.0, the output shows the digest like sha256:a3f5c6b7e8f9.... To pull by digest and guarantee the exact same image, use docker pull myregistry/myapp@sha256:a3f5c6b7e8f9....
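
The difference between the two reference styles can be sketched with a toy in-memory registry. This is a simplified model, not a real registry API: `ToyRegistry` and its methods are hypothetical, and the raw bytes stand in for an image (a real digest is computed over the image manifest).

```python
import hashlib


def digest(content: bytes) -> str:
    """Content-addressed identifier: the SHA-256 of the bytes themselves."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


class ToyRegistry:
    """Hypothetical registry: tags are movable pointers, digests are identities."""

    def __init__(self):
        self.blobs = {}  # digest -> content
        self.tags = {}   # "name:tag" -> digest

    def push(self, tag: str, content: bytes) -> str:
        d = digest(content)
        self.blobs[d] = content
        self.tags[tag] = d  # re-pushing a tag silently moves the pointer
        return d

    def pull(self, reference: str) -> bytes:
        if "@" in reference:  # "name@sha256:..." addresses content directly
            return self.blobs[reference.split("@", 1)[1]]
        return self.blobs[self.tags[reference]]


registry = ToyRegistry()
first = registry.push("myapp:1.0.0", b"build A")
second = registry.push("myapp:1.0.0", b"build B")  # same tag, different content

print(registry.pull("myapp:1.0.0"))     # b'build B'
print(registry.pull("myapp@" + first))  # b'build A'
```

The tag now resolves to the second build, while the digest recorded at the first push still retrieves exactly what was pushed then.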

Best practice: Tag for convenience in development and verify with digest for production deployments.

    # Development: Use tags
    FROM myregistry/myapp:latest

    # Production: Use digests (immutable guarantee)
    FROM myregistry/myapp@sha256:a3f5c6b7e8f9...
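
A team that wants to enforce digest pinning for production could add a small check to its pipeline. The following is a minimal sketch of such a check, assuming full 64-character sha256 digests. The function names are hypothetical, the Dockerfile parsing is intentionally simple, and the all-`a` digest is a placeholder rather than a real image.

```python
import re

# A pinned reference looks like "name@sha256:" followed by 64 hex characters.
DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_pinned(reference: str) -> bool:
    """True if the image reference is pinned to a full sha256 digest."""
    return DIGEST_REF.match(reference) is not None


def unpinned_base_images(dockerfile_text: str):
    """Return FROM references that are not pinned by digest."""
    unpinned = []
    stages = set()  # names of earlier build stages (FROM ... AS name)
    for line in dockerfile_text.splitlines():
        words = line.split()
        if not words or words[0].upper() != "FROM":
            continue
        args = [w for w in words[1:] if not w.startswith("--")]  # drop flags
        if not args:
            continue
        ref = args[0]
        if ref != "scratch" and ref not in stages and not is_pinned(ref):
            unpinned.append(ref)
        if len(args) == 3 and args[1].upper() == "AS":
            stages.add(args[2])
    return unpinned


PLACEHOLDER_DIGEST = "a" * 64  # stands in for a real digest

example = f"""
FROM golang:1.21 AS builder
RUN go build -o app
FROM myregistry/myapp@sha256:{PLACEHOLDER_DIGEST}
COPY --from=builder /src/app /app/app
"""

print(unpinned_base_images(example))  # ['golang:1.21']
```

A build script could fail when the returned list is non-empty, so that tag-only base images never reach a production build.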

Container Registries: Where Images Live

A registry is a server that stores and serves container images. It's like GitHub for images.

Popular registries:

Registry	URL	Best For
Docker Hub	docker.io	Public open-source images, free tier
Google Artifact Registry	`<region>-docker.pkg.dev/<project>/...`	Google Cloud, enterprise
AWS ECR	`<account>.dkr.ecr.<region>.amazonaws.com/...`	AWS, private repos
GitHub Packages	`ghcr.io/username/...`	GitHub repos, integration
CleanStart	`registry.cleanstart.io/...`	Signed, scanned, secure images

How registries work:

    # 1. Build locally
    docker build -t myapp:1.0.0 .

    # 2. Tag it for the registry
    docker tag myapp:1.0.0 gcr.io/my-project/myapp:1.0.0

    # 3. Authenticate to registry (one time per registry)
    gcloud auth configure-docker gcr.io

    # 4. Push to registry
    docker push gcr.io/my-project/myapp:1.0.0

    # 5. Now anyone with access to the registry can pull it
    docker pull gcr.io/my-project/myapp:1.0.0

How Images Are Built and Stored

Build process (your laptop):

    $ docker build -t myapp:1.0.0 .
    Step 1/5 : FROM ubuntu:22.04
     ---> a1c4b0e7f9c2
    Step 2/5 : RUN apt-get update && apt-get install -y python3
     ---> Running in 4f5c6d7e8f9a
     ---> 3b2c4d5e6f7a
    Step 3/5 : COPY app.py /app/app.py
     ---> 2a3b4c5d6e7f
    Step 4/5 : RUN pip install -r requirements.txt
     ---> 1z2y3x4w5v6u
    Step 5/5 : CMD ["python3", "/app/app.py"]
     ---> 0a1b2c3d4e5f
    Successfully built 0a1b2c3d4e5f
    Successfully tagged myapp:1.0.0

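Each step produces an intermediate image, and steps that change the filesystem (here, RUN and COPY) add a layer. Docker caches the result of every step, so rebuilding is fast when nothing has changed.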

Storage (registry): When stored in a registry under a name like gcr.io/my-project/myapp:1.0.0, the image is a list of layers, each identified by the hash of its content. Layer a1c4b0e7f9c2 is the base image, shared with every other image built on ubuntu:22.04. Layer 3b2c4d5e6f7a contains system packages and can be shared with other images whose package layer is byte-for-byte identical. Layer 2a3b4c5d6e7f is your code and is unique to this image. Finally, layer 1z2y3x4w5v6u contains the Python dependencies; it is shared only with images that produce exactly the same layer, such as other tags of this app. Because layers are addressed by content, the registry stores each one only once, even if many images use it.
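
The store-once behavior follows naturally from addressing layers by the hash of their content. The sketch below models it with a hypothetical `LayerStore` class, using short byte strings as stand-ins for layer archives and made-up image names. It illustrates the idea rather than any particular registry's implementation.

```python
import hashlib


class LayerStore:
    """Hypothetical content-addressed store: identical layers are kept once."""

    def __init__(self):
        self.blobs = {}      # layer digest -> layer bytes
        self.manifests = {}  # image name -> ordered list of layer digests

    def push_image(self, name, layers):
        digests = []
        for layer in layers:
            d = "sha256:" + hashlib.sha256(layer).hexdigest()
            self.blobs.setdefault(d, layer)  # no-op if the layer already exists
            digests.append(d)
        self.manifests[name] = digests

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


# Stand-ins for real layer archives.
base = b"ubuntu base layer"
packages = b"python packages layer"
code_a = b"code for app A"
code_b = b"code for app B"

store = LayerStore()
store.push_image("app-a:1.0.0", [base, packages, code_a])
store.push_image("app-b:1.0.0", [base, packages, code_b])

print(len(store.blobs))  # 4 layers stored, although the two images list 6
print(store.manifests["app-a:1.0.0"][:2] == store.manifests["app-b:1.0.0"][:2])  # True
```

Both images reference the same two base digests, so the shared layers take up space only once.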

Example Dockerfile: Python Flask App

Here's a complete, production-ready Dockerfile:

    # Multi-stage build
    FROM cleanstart/python:3.11-dev AS builder
    WORKDIR /build

    # Copy requirements and install into a plain directory (not /root/.local,
    # which the non-root runtime user below typically cannot read)
    COPY requirements.txt .
    RUN pip install --no-cache-dir --target=/build/deps -r requirements.txt

    # Production image: tiny, no build tools
    FROM cleanstart/python:3.11-prod
    WORKDIR /app

    # Copy installed packages from builder and make them importable
    COPY --from=builder /build/deps /app/deps
    ENV PYTHONPATH=/app/deps

    # Copy application code
    COPY app.py .

    # Run as non-root user
    USER 65532

    # Expose the port
    EXPOSE 5000

    # Health check (uses Python rather than curl, on the assumption that a
    # minimal runtime image may ship neither curl nor a shell)
    HEALTHCHECK --interval=10s --timeout=3s --start-period=5s \
      CMD ["python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:5000/health', timeout=2)"]

    # Start the app
    CMD ["python3", "-m", "flask", "--app", "app", "run", "--host", "0.0.0.0"]

Why this is good:

This Dockerfile uses multi-stage building so the final image has no compiler or build tools. It uses CleanStart base images that are scanned, signed, and hardened. It runs as a non-root user to reduce privilege escalation risk. It includes a health check so orchestrators can detect failures. The final image size is only approximately 300MB.

In Practice

Images are built from layers: instructions that change the filesystem add a layer, and Docker caches each build step for reuse. The base image you choose with FROM sets the balance between security and size. Tags are mutable while digests are immutable, so use tags for convenience and digests for guarantees. Multi-stage builds reduce image size by separating build dependencies from runtime, and registries store and serve images much like GitHub does for code.

Next Steps

Read What is a Container Registry? for registry details. Read What is a CVE? to understand vulnerabilities in images. Try End-to-End Secure Deployment to build your first image.

Common mistakes to avoid: Don't use the latest tag in production, since it's mutable. Don't pack everything into one huge layer, because a single change then invalidates the whole cache; keep rarely changing steps in separate, earlier layers. Don't run as root; set a non-root user with the USER directive. Don't include build tools in the production image; use multi-stage builds instead.