Overview: From Imperative to Declarative
A traditional Dockerfile is imperative: a series of commands that build an image step by step. A CleanStart YAML spec is declarative: a specification of the desired final image, with the build process handled by CleanStart's deterministic build system.
In a Dockerfile (imperative), you specify a sequence of commands: FROM ubuntu:22.04, RUN apt-get update followed by package installation, RUN pip install, COPY app.py, WORKDIR, EXPOSE, and CMD.
In CleanStart YAML (declarative), you specify what should be in the final image: base_image, copy_packages, copy_files, working_directory, entrypoint, command, and environment variables.
The key difference is that you specify what should be in the image; CleanStart handles how it gets there. This approach enables reproducible builds, automatic patching, and SLA-backed security guarantees.
Decision Framework: When to Migrate
When to Migrate to CleanStart YAML
Migrate to CleanStart YAML when production security is critical, as with public APIs and environments that handle sensitive data. Migration is also appropriate when image size matters (microservices or serverless functions), when rapid patching is important (CVE patches within hours rather than days), and when reproducibility is required for compliance, as in financial services or government. It suits standard runtimes like Python, Node.js, Go, or Java, and works best for long-lived images rather than images rebuilt from scratch multiple times per day.
When to Keep Dockerfiles
Keep Dockerfiles for development and test images where frequent iteration and rebuild speed matter, for highly custom builds involving your own compiled binaries, specialized toolchains, complex compilation steps, or patched source builds, and for experimental workloads where you're exploring new approaches or prototyping. Dockerfiles are also appropriate while your team has little experience with CleanStart.
Hybrid Approach
Many organizations migrate critical production images to CleanStart while keeping Dockerfiles for development and non-critical services. This gives you the best of both approaches: security and reproducibility in production, flexibility in development.
Side-by-Side Migration Examples
Example 1: Simple Python Web Application
Original Dockerfile
A generic Python image is used with dependencies installed inline during build.
Migrated CleanStart YAML
A hardened, pre-scanned base image is used with dependencies declared separately and metadata specified as top-level fields.
Key Changes
The base image migrates from python:3.11-slim to cleanstart/python:3.11-prod, which is pre-scanned for vulnerabilities, minimized, and FIPS-ready. RUN commands for dependency installation are eliminated. The pip install is removed since package versions are managed separately. File copying becomes explicit with each file listed. Metadata becomes declarative top-level fields.
Example 2: Multi-Stage Build with Node.js
Original Dockerfile
Multi-stage builds separate the build stage from the runtime stage, reducing final image size.
Migrated CleanStart YAML
CleanStart YAML doesn't support multi-stage builds directly. Instead, the build stage should run separately, either locally or in your CI/CD pipeline, and the YAML specifies only the runtime configuration.
Key Changes
The build stage is removed and executed in CI/CD. The YAML focuses on runtime configuration and security hardening. Node packages that were installed globally become entries in the copy_packages section.
Example 3: Go Microservice with Custom Config
Original Dockerfile
A multi-stage build compiles a Go binary, sets up a non-root user, and includes security configurations.
Migrated CleanStart YAML
The build stage is removed and executed separately in your CI/CD. The YAML focuses on runtime configuration and security hardening.
Key Changes
The build stage is extracted and run separately. User creation commands move to the run_commands section, and file ownership is also set through run_commands. The non-root user that the container runs as is specified in a dedicated user section. Health checks become declarative configuration.
Example 4: Java Application with Maven
Original Dockerfile
A multi-stage build uses Maven in the build stage to compile and package a JAR file.
Migrated CleanStart YAML
The Maven build stage is removed and executed in CI/CD. The YAML specifies the Java runtime and JAR file location.
Key Changes
Maven is removed from the runtime image. The Java runtime is explicitly specified. Environment variables for Java configuration become top-level fields. The multi-stage Docker pattern is completely eliminated.
Example 5: Application with Custom Database Initialization
Original Dockerfile
The Dockerfile sets up PostgreSQL with initialization scripts.
Migrated CleanStart YAML
The migration maintains the same functionality while using a hardened base image.
Key Changes
The hardened PostgreSQL base image is used. Initialization scripts are copied to the standard location. Standard PostgreSQL environment variables are preserved. Health checks are declared in the spec rather than scripted.
Common Dockerfile Patterns and YAML Equivalents
Pattern 1: Environment Variables
In a Dockerfile, ENV defines environment variables. In YAML, the environment section lists them.
Pattern 2: User and Permissions
In a Dockerfile, RUN commands create users and change ownership. In YAML, run_commands handle user creation and a dedicated user section specifies the UID/GID.
Pattern 3: Port Exposure and Health Checks
In a Dockerfile, the EXPOSE and HEALTHCHECK instructions document ports and define health checks. In YAML, the expose_ports and healthcheck sections do the same declaratively.
Pattern 4: Volume Mounts
In a Dockerfile, VOLUME declares mount points. In YAML, a volumes section lists mount points along with their mode.
Pattern 5: Working Directory
In a Dockerfile, WORKDIR sets the working directory. In YAML, working_directory is a top-level field.
Pattern 6: Package Installation
In a Dockerfile, RUN with apt-get installs packages. In YAML, copy_packages lists the packages, and installation and cleanup happen automatically.
Pattern 7: Build Arguments
In a Dockerfile, ARG defines build-time arguments, which are often used to populate LABEL metadata. In YAML, the labels section handles metadata, and CleanStart automatically adds standard labels.
Pattern 8: Entrypoint with Shell Wrapper
In a Dockerfile, COPY adds a wrapper script and ENTRYPOINT invokes it. In YAML, the copy_files and entrypoint fields do the same.
Pattern 9: Security Context
In a Dockerfile, security hardening is spread across RUN commands. In YAML, security_context specifies non-root execution, a read-only filesystem, and capability dropping in one place.
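The pattern mappings above can be collected into a small lookup table. The sketch below is illustrative only: the YAML field names are the ones mentioned in this guide, but the flat one-to-one mapping, the helper name yaml_field_for, and the list of package-manager commands beyond apt-get are assumptions, not part of any CleanStart tooling.

```python
# Dockerfile instruction -> CleanStart YAML field, using field names from this guide.
# Treating this as a flat lookup is a simplification made for illustration.
INSTRUCTION_TO_FIELD = {
    "FROM": "base_image",
    "ENV": "environment",
    "WORKDIR": "working_directory",
    "EXPOSE": "expose_ports",
    "HEALTHCHECK": "healthcheck",
    "VOLUME": "volumes",
    "LABEL": "labels",
    "COPY": "copy_files",
    "ENTRYPOINT": "entrypoint",
    "CMD": "command",
    "USER": "user",
}

# Commands that suggest a package install (assumed list; the guide only names apt-get).
PACKAGE_MANAGERS = ("apt-get install", "apk add", "yum install", "dnf install")


def yaml_field_for(instruction: str, argument: str = "") -> str:
    """Return the YAML field that most likely replaces a Dockerfile instruction."""
    keyword = instruction.upper()
    if keyword == "RUN":
        # Package installs map to copy_packages; everything else to run_commands.
        if any(pm in argument for pm in PACKAGE_MANAGERS):
            return "copy_packages"
        return "run_commands"
    # Instructions with no equivalent named in this guide are reported as unmapped.
    return INSTRUCTION_TO_FIELD.get(keyword, "unmapped")
```

For example, yaml_field_for("RUN", "useradd app") returns "run_commands", while a RUN line that contains "apt-get install" returns "copy_packages". Anything reported as "unmapped" needs a manual decision during migration.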
Step-by-Step Migration Process
Phase 1: Analyze Your Dockerfile
Understand what your Dockerfile does by listing all RUN commands, COPY instructions, and environment variables, and by checking for multi-stage builds.
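This inventory can be automated with a short script. The following is a minimal sketch, not a full Dockerfile parser: it joins backslash-continued lines and groups instructions by keyword, but it does not handle heredocs or parser directives. The function names and the sample Dockerfile are made up for illustration.

```python
from collections import defaultdict


def parse_dockerfile(text: str) -> list[tuple[str, str]]:
    """Return (INSTRUCTION, arguments) pairs, joining backslash-continued lines."""
    instructions = []
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments, including those inside a continuation.
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
            continue
        parts = (pending + line).split(None, 1)
        pending = ""
        args = parts[1].strip() if len(parts) > 1 else ""
        instructions.append((parts[0].upper(), args))
    return instructions


def summarize(text: str) -> dict:
    """Collect the items Phase 1 asks for: RUN, COPY, ENV, and stage count."""
    grouped = defaultdict(list)
    for keyword, args in parse_dockerfile(text):
        grouped[keyword].append(args)
    stages = len(grouped["FROM"])
    return {
        "run": grouped["RUN"],
        "copy": grouped["COPY"],
        "env": grouped["ENV"],
        "stages": stages,
        "multi_stage": stages > 1,  # more than one FROM means a multi-stage build
    }


# Hypothetical two-stage Dockerfile used only to exercise the sketch.
SAMPLE = """\
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN go build -o /out/app .

FROM alpine:3.19
ENV APP_ENV=production
RUN adduser -D app && \\
    chown -R app /srv
COPY --from=build /out/app /srv/app
CMD ["/srv/app"]
"""

if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

A multi_stage value of True signals that the build stage has to be moved out of the image definition before the YAML spec is written.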
Phase 2: Prepare Your Files
If your Dockerfile has a build stage, run it separately before creating the CleanStart YAML. Pre-build compiled artifacts locally or in CI/CD. Export pinned dependency lists. Organize your project with clear structure.
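For a Python project, one way to confirm that an exported dependency list is actually pinned is to flag every requirement without an exact version. This is a rough check under the assumption that dependencies live in a requirements.txt-style file; it does not understand URLs, extras, or environment markers, and the function name is hypothetical.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        # Skip blank lines and pip options such as "-r base.txt" or "-e ."
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned
```

An empty result means every listed requirement carries an exact version, which is what a reproducible build needs as input.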
Phase 3: Write CleanStart YAML Spec
Start with a template, fill in the base image, add package dependencies, copy your files with explicit source and destination, set metadata about the image, and add security context.
Phase 4: Test and Validate
Validate your YAML syntax before building. Build a test image locally, run it, and verify that the entrypoint works. Scan for vulnerabilities before deploying.
Phase 5: Deploy and Monitor
Push the image to your container registry. Deploy to a staging environment. Monitor for errors during deployment. Run your integration test suite. Deploy to production once staging validation is complete.
Handling Special Cases
Compiled Binaries and Custom Tools
If your Dockerfile compiles custom binaries, you must build them separately before creating the YAML. Build locally first, then copy the pre-built binary into the CleanStart YAML.
Complex Build Steps with Multiple Dependencies
For complex build logic involving multiple compilation steps, consider either pre-building everything in CI/CD or using custom run commands.
Secrets Management
Avoid baking secrets into images. Never include API keys or sensitive data. Use BuildKit secrets for build-time access or fetch secrets at runtime.
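Fetching secrets at runtime might look like the sketch below. The lookup convention (a NAME_FILE variable pointing at a mounted file, with a fallback to the NAME variable) is an assumption chosen for illustration, as is the function name; adapt it to whatever your orchestrator or secrets manager provides.

```python
import os
from pathlib import Path


def read_secret(name: str) -> str:
    """Fetch a secret at runtime instead of baking it into the image.

    Lookup order (a convention assumed for this sketch):
      1. the file named by the <NAME>_FILE environment variable
      2. the <NAME> environment variable itself
    """
    file_path = os.environ.get(f"{name}_FILE")
    if file_path:
        # Mounted secret files often end with a newline, so strip it.
        return Path(file_path).read_text(encoding="utf-8").strip()
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"secret {name!r} is not configured")
    return value
```

Failing loudly when the secret is missing keeps a misconfigured container from starting with an empty credential.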
Fallback: Keeping Your Dockerfile
If you can't fully migrate, use a Dockerfile with CleanStart base images to get some of the security benefits while maintaining flexibility.
Migration Checklist
The following items should be completed during your migration process to ensure a thorough and successful transition from Dockerfile to YAML format:
- Analyze all RUN, COPY, ENV, and LABEL commands in your original Dockerfile.
- Identify any multi-stage builds and build artifacts separately.
- Select an appropriate CleanStart base image that matches your runtime requirements.
- Create a complete list of all package dependencies needed at runtime.
- Document all files to copy with their source and destination paths.
- Configure the working_directory, entrypoint, and command fields appropriately.
- Add all required environment variables to the YAML specification.
- Configure the security context with appropriate user permissions and filesystem settings.
- Set up health checks if your application requires them.
- Validate the YAML syntax before attempting to build the image.
- Test the image locally using docker run to verify functionality.
- Scan the built image for vulnerabilities using CleanStart tools.
- Deploy the image to a staging environment for integration testing.
- Run your complete integration test suite against the staged image.
- Compare the original and migrated images for size, functionality, and performance.
- Document any special configurations or deviations from standard patterns.
Verification: Comparing Original and Migrated Images
Once migrated, verify that the new image works correctly by comparing behavior.
Build the original image and run it to capture version and filesystem information, then build the migrated image and capture the same information. Compare the two outputs to confirm the behavior matches. Finally, check image sizes to confirm the migrated image is smaller, and compare vulnerability scan results to confirm it is more secure.
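The comparison can be scripted. The sketch below shells out to the standard docker CLI (docker run --rm and docker image inspect) and diffs the captured output with Python's difflib. The helper names and the image tags in the usage note are placeholders, and hardened images may lack a shell or the tools needed to run the same command in both containers.

```python
import difflib
import subprocess


def capture(image: str, command: list[str]) -> str:
    """Run a command in a throwaway container and return its stdout."""
    result = subprocess.run(
        ["docker", "run", "--rm", image, *command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def image_size_bytes(image: str) -> int:
    """Return the image size in bytes as reported by docker image inspect."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", image],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def diff_outputs(original: str, migrated: str) -> list[str]:
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        original.splitlines(), migrated.splitlines(),
        fromfile="original", tofile="migrated", lineterm="",
    ))
```

With two locally built images tagged, say, myapp:dockerfile and myapp:cleanstart (hypothetical names), calling capture on each with the same command and passing both results to diff_outputs shows any behavioral drift. If an image defines an ENTRYPOINT, the command is passed to it as arguments, so an --entrypoint override may be needed.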
Next Steps
To understand the benefits of migrated images and how to maintain them, see Build Stage Security, The 11 Build Artifacts, Helm Charts and Kubernetes, and Reproducible Builds.
To learn how to manage migrated images at scale, see The Continuous Trust Loop and Supply Chain DR Plan.
