Shell-Less Containers and How Initialization Works

Initialization Without Shells: How cleanimg-init Replaces Shell Scripts

Shell-less containers eliminate entire attack vectors by removing the shell binary and all shell utilities, but they require a different approach to initialization. Shell scripts cannot run in these environments, so initialization must be performed by a compiled binary designed to run without a shell. This article explains how that initialization model works and why it matters for security.

[Diagram: Traditional shell-based init scripts (bash/sh) lead to problems: bloated images, complex logic, shell exploits, slow startup. Shell-less binary init (cleanimg-init, direct execution) brings benefits: minimal images, fast startup, no shell attacks, simple and safe.]

Part 1: The Shell-Less Contract

What "No Shell" Means Precisely

A shell-less container strictly enforces the absence of shell interpreters and related utilities. The container must not contain /bin/sh (the POSIX shell), /bin/bash (bash), /bin/dash (dash), /bin/ash (busybox shell), /bin/zsh (zsh), or any other shell variant. This is not merely a configuration choice; the shells must be completely absent from the image.

Beyond the shells themselves, shell-less containers exclude the utilities that shell scripts depend on. The env command for environment variable manipulation is absent. The test command and its [ alias used for conditionals are missing. File and text utilities such as find, grep, sed, and awk are removed. Control flow constructs like if, while, and for are unavailable because no interpreter exists to run them.
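The contract above can be checked mechanically against an unpacked image filesystem. The following is a minimal sketch, not a CleanStart tool: the function name find_shells, the list of directories searched, and the idea of pointing it at an extracted root filesystem are all assumptions made for illustration.

```python
import os
from pathlib import Path

# Shell interpreters named in the text; extend the tuple for other variants.
SHELL_NAMES = ("sh", "bash", "dash", "ash", "zsh")
# Directories searched relative to the image root (an assumption of this sketch).
BIN_DIRS = ("bin", "usr/bin", "usr/local/bin", "sbin", "usr/sbin")


def find_shells(rootfs: str) -> list[str]:
    """Return image-relative paths of any shell binaries found under rootfs."""
    root = Path(rootfs)
    found = []
    for bin_dir in BIN_DIRS:
        for name in SHELL_NAMES:
            candidate = root / bin_dir / name
            # lexists also catches symlinks such as ash -> busybox,
            # even when the link target is missing.
            if os.path.lexists(candidate):
                found.append("/" + candidate.relative_to(root).as_posix())
    return sorted(found)
```

An empty result means none of the listed shells is present in the searched directories. It does not prove that no interpreter exists elsewhere in the image.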

What remains is only what the application needs to function. The application binary or runtime (python, node, java, and so on) is included along with its runtime libraries. CleanStart images include the GLIBC C standard library, OpenSSL or BoringSSL for TLS operations, libz for compression, and a CA certificate bundle for validating TLS connections to external services. The cleanimg-init binary handles initialization and signal management. Nothing else.

What IS in a CleanStart -prod Image

A CleanStart production image contains a carefully curated set of components. Base OS components include GLIBC (the C standard library), OpenSSL/BoringSSL (for TLS/SSL operations), libz (for compression), and a CA certificate bundle at /etc/ssl/certs/ca-bundle.crt. The runtime environment includes the appropriate interpreter (Python 3.11 if it's a Python image), pip or the equivalent package manager, essential libraries for the runtime, and virtual environment tools. The initialization system includes cleanimg-init (a compiled Go binary approximately 10MB), configuration parsing logic, signal handlers for graceful shutdown, and health check capabilities. Critically, there is nothing shell-like: no /bin/sh, no /bin/bash, no entrypoint.sh, and no shell-based init scripts.

What Is NOT There (and Why That Matters)

The absence of a shell command interpreter fundamentally changes the threat model. An attacker cannot open an interactive shell: kubectl exec with sh as the command fails with an error such as "exec: sh: not found". Payloads that depend on a shell, such as curl | bash download-and-run one-liners or shell-based reverse shells, cannot run because neither bash nor sh exists. Command injection (for example, input such as "; rm -rf /" reaching a call that builds a shell command line) cannot execute system commands because there's no shell to interpret the injected string.

Because interactive tools are absent, an attacker who gains code execution has far fewer ways to inspect or modify the running application. Installing backdoors or rootkits at runtime is harder because OS package managers such as apt and yum are not present, and package install scripts have no shell to run in.

Combining a read-only root filesystem with a shell-less design compounds the protection. Even if an attacker gains code execution, they cannot write log files, config files, or malware to the image filesystem. They can write only to explicitly mounted writable volumes, such as one mounted at /tmp.

This matters for supply chain security. The SolarWinds compromise, disclosed in December 2020, showed how much damage tampering with a build pipeline can cause. Shell-less containers do not prevent that class of attack on their own, but they remove one commonly abused component from the runtime image. Every shell eliminated reduces the attack surface, making supply chain security tangible rather than theoretical.

Part 2: Why Traditional Init Requires Shell (and Why That's a Problem)

The docker-entrypoint.sh Pattern

Traditional containerized applications use shell-based entry point scripts that have become industry standard. A typical entrypoint script creates required directories, sets ownership and permissions, optionally initializes databases, waits for dependent services, runs migrations or setup scripts, and executes the application while ensuring it becomes PID 1.

This pattern exists because many applications do not create their own directories at startup, because the container runtime delivers signals to PID 1 (so the application, not a wrapper, should end up there), and because complex initialization logic doesn't fit in a single command. However, this approach creates security risks.

Shell injection vulnerabilities arise when environment variables are substituted unsafely. A vulnerable pattern evaluates a custom init command taken from an untrusted source: an attacker who can set CUSTOM_INIT to "rm -rf /" gets that command executed when the script runs. Command injection via configuration files occurs when the script sources a configuration file from an untrusted location.
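The difference between interpolating a value into a shell command line and passing it as a plain argument can be shown without any shell. This is a minimal sketch under stated assumptions: echo_argument is a hypothetical helper, and the child process is simply the current Python interpreter standing in for an init step.

```python
import subprocess
import sys


def echo_argument(untrusted: str) -> str:
    """Pass `untrusted` to a child process as a single argv element.

    No shell is involved, so metacharacters such as ; | $( ) are never
    interpreted. The child prints exactly the string it was handed.
    """
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, untrusted],
        shell=False,  # the default, stated explicitly for emphasis
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

With an argument vector, a hostile value like "data; rm -rf /" arrives in the child as inert text. The same value spliced into a string for sh -c would be parsed as two commands.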

PATH substitution vulnerabilities occur when scripts invoke commands by bare name and an attacker places a malicious binary of the same name in a directory that appears earlier in PATH. Signal handling failures are common in naive shell scripts that neither forward signals like SIGTERM to child processes nor reap them, causing zombie processes and pod termination timeouts.

The SolarWinds incident demonstrated the danger of a compromised build pipeline. Attackers gained access to SolarWinds' build environment and inserted malicious code into Orion software updates during the build process; the compromise was disclosed in December 2020. Build scripts, shell-based ones included, are attractive targets because they're readable, modifiable, and rarely audited thoroughly. Roughly 18,000 customers installed the affected updates.

The modern lesson is clear: every shell in the supply chain (build, initialization, deployment) is an attack surface. The resulting practice is to replace shell scripts in the build pipeline with compiled tools or pure configuration, replace shell scripts in runtime initialization with init systems like cleanimg-init, and remove the shell from container images entirely.

Part 3: How cleanimg-init Replaces Shell-Based Initialization

The Initialization Model: Before and After

cleanimg-init is a compiled, signed, auditable Go binary that covers the initialization tasks a shell script usually handles, without the attack surface of a shell. Its configuration is declarative rather than imperative, which makes intent easier to read. It's type-safe: configuration is validated at parse time, so errors surface before any initialization step runs rather than partway through. Signal handling is explicit and reliable.

Example 1: PostgreSQL Initialization

Traditional Approach: A shell script creates directories, sets ownership, initializes the database if needed, ensures the socket directory exists, and waits for dependencies. This approach requires /bin/bash in the image and relies on implicit shell semantics for error handling.

CleanStart Approach: Configuration in TOML format declares the user to run as (postgres), the directories to create with specific permissions (/var/lib/postgresql/data with 0700), conditional initialization (only if PG_VERSION doesn't exist), dependency waiting (optional TCP connection check with 120 second timeout), and health check configuration (TCP on localhost:5432).

The Dockerfile becomes simpler: a FROM cleanstart/postgresql:15-prod line, a COPY of the configuration file, and an exec-form ENTRYPOINT pointing at /opt/cleanimg/cleanimg-init.

Advantages: No shell required, declarative configuration makes intent explicit, type-safe file paths and UIDs validated at parse time, signal handling guaranteed reliable for graceful shutdown.

Example 2: Redis Initialization

The traditional shell approach creates the data directory, ensures it's writable by the redis user, copies a default configuration if none exists, and starts Redis. The CleanStart approach uses TOML to declare the user (redis), directory creation with permissions, a conditional file copy, and application startup.
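The two file operations in this example, creating a directory with fixed permissions and copying a default file only when none exists, are both idempotent. A minimal sketch of what such steps might do follows. The function names ensure_dir and copy_if_missing are hypothetical, and ownership changes are left out because they require root.

```python
import os
import shutil


def ensure_dir(path: str, mode: int) -> None:
    """Create path (and parents) if needed, then set the exact permission bits."""
    os.makedirs(path, exist_ok=True)
    # Explicit chmod: the mode passed to makedirs would be filtered by the umask.
    os.chmod(path, mode)


def copy_if_missing(source: str, destination: str) -> bool:
    """Copy source to destination only when destination does not exist yet.

    Returns True if a copy was made, False if an existing file was kept.
    """
    if os.path.exists(destination):
        return False
    shutil.copyfile(source, destination)
    return True
```

Because both steps are safe to repeat, a container that restarts with its data volume already populated keeps any configuration the operator has edited.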

Example 3: Kafka Initialization

The traditional approach creates log directories, derives the broker ID from the hostname using shell substring operations, generates the configuration from a template with environment variable substitution, and starts Kafka. The CleanStart approach uses TOML to declare directory creation, template rendering with context variables (pod_ordinal), and application startup with arguments.
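Deriving an ordinal from a hostname and rendering a template are the two pieces that shell scripts usually handle with substring expansion and envsubst. A minimal sketch follows, assuming StatefulSet-style hostnames that end in a dash and a number. The function names and the template text are hypothetical, and the $name placeholder syntax is Python's string.Template, not necessarily what cleanimg-init uses.

```python
import re
from string import Template


def pod_ordinal(hostname: str) -> int:
    """Extract the trailing ordinal from a hostname such as 'kafka-2'."""
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"hostname has no ordinal suffix: {hostname!r}")
    return int(match.group(1))


def render(template_text: str, context: dict) -> str:
    """Substitute $name placeholders; a missing key raises KeyError."""
    return Template(template_text).substitute(context)


# Hypothetical template using two Kafka broker settings.
BROKER_TEMPLATE = "broker.id=$pod_ordinal\nlog.dirs=$log_dir\n"
```

For a pod named kafka-2, calling render(BROKER_TEMPLATE, {"pod_ordinal": pod_ordinal("kafka-2"), "log_dir": "/var/lib/kafka"}) produces a properties fragment with broker.id=2. A placeholder with no matching context key fails loudly instead of rendering as an empty string.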

Example 4: Python/Node/Java Application

The traditional approach creates log directories, waits for database readiness using curl, runs database migrations, and starts the application. The CleanStart approach uses TOML to declare directory creation, HTTP waiting with a timeout, migration execution as a task, and application startup.

Part 4: The ENTRYPOINT Contract

exec Form vs. Shell Form

Dockerfiles support two forms of ENTRYPOINT, with critical security implications. (A Kubernetes command field is always an argument array, which corresponds to the exec form.)

Shell Form (ENTRYPOINT postgres -D /var/lib/postgresql/data) is problematic because the runtime executes /bin/sh -c "postgres -D /var/lib/postgresql/data", making the shell PID 1 instead of postgres. When SIGTERM arrives for graceful shutdown, the shell may not forward it to postgres. Graceful shutdown then fails, and the pod sits through the whole termination grace period until SIGKILL is sent. This form cannot work at all in a shell-less image, because /bin/sh does not exist, and it introduces signal handling risks wherever a shell is present.

Exec Form (ENTRYPOINT ["/opt/cleanimg/cleanimg-init"]) executes the binary directly without shell wrapper. The binary becomes PID 1, signals are sent directly, and graceful shutdown works as intended.
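The practical difference between the two forms is the argument vector the runtime ends up starting. The sketch below models that translation for a Docker-style runtime on Linux. container_argv is a hypothetical helper written for this illustration, not part of any runtime's API.

```python
from typing import Union


def container_argv(entrypoint: Union[str, list]) -> list:
    """Return the argv a Docker-style runtime starts for an ENTRYPOINT value.

    A plain string models the shell form; a list models the exec (JSON array) form.
    """
    if isinstance(entrypoint, str):
        # Shell form: the whole string is handed to /bin/sh, which becomes PID 1.
        return ["/bin/sh", "-c", entrypoint]
    # Exec form: the array is used as argv unchanged, so argv[0] becomes PID 1.
    return list(entrypoint)
```

In a shell-less image the first element of the shell-form result, /bin/sh, does not exist, so the container cannot start at all.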

cleanimg-init as ENTRYPOINT

Using cleanimg-init as ENTRYPOINT follows a clean execution model. The container starts and the runtime executes /opt/cleanimg/cleanimg-init, which becomes PID 1. It reads its configuration from /etc/cleanimg/cleanimg-init.toml and runs the initialization tasks (mkdir, copy, template, wait, and so on). It then execs the application binary (postgres, redis, python, etc.), which replaces cleanimg-init and takes over PID 1. From that point the application handles signals directly.

Key to this approach is that cleanimg-init uses exec to replace itself with the application. The application becomes PID 1, no extra wrapper process remains, and signals go directly to the application.
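The init-then-exec pattern itself is small. The following is a minimal sketch of the idea using the POSIX exec call from Python. It is not cleanimg-init's source, and init_then_exec and its parameters are hypothetical names.

```python
import os


def init_then_exec(data_dirs: list, app_argv: list) -> None:
    """Run setup steps, then replace this process with the application.

    app_argv[0] must be an absolute path to the application binary.
    """
    for path in data_dirs:
        os.makedirs(path, mode=0o700, exist_ok=True)
    # execv does not return on success: the process ID stays the same while
    # the program image is replaced, so no wrapper process remains.
    os.execv(app_argv[0], app_argv)
```

Because the process is replaced rather than forked, whatever PID the init step had (PID 1 in a container) now belongs to the application, and signals sent to it arrive at the application with nothing in between.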

Application as ENTRYPOINT (Alternative)

When no initialization is needed, the application can be the ENTRYPOINT directly (e.g., ENTRYPOINT ["python", "-m", "uvicorn", "main:app"]). This is simpler, with fewer moving parts, and the application is PID 1 from the start. The trade-off is that no initialization can happen (no mkdir, copy, wait, or template steps). Use this only for stateless applications that need no initialization.

Signal Chain: SIGTERM → cleanimg-init → Application → Graceful Shutdown

In configurations where cleanimg-init remains PID 1 and runs the application as a child process, rather than replacing itself via exec, shutdown works as follows. When Kubernetes sends SIGTERM to a pod (with the default 30-second terminationGracePeriodSeconds), the signal arrives at cleanimg-init, which has registered a SIGTERM handler that forwards it to the application. The application receives SIGTERM, initiates graceful shutdown (closing connections, flushing buffers, syncing data), and typically exits within 5-10 seconds. cleanimg-init detects that the child process has exited and exits itself. The container terminates normally.

Without proper signal handling (shell-based entry), the shell becomes PID 1, SIGTERM arrives at the shell, the shell may not forward to postgres, postgres may not receive SIGTERM, and the pod hangs until SIGKILL is sent.

With cleanimg-init, SIGTERM reaches the application either directly (after exec) or through the forwarding handler, so the application exits cleanly and the pod terminates on time.
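For the case where an init process stays resident as PID 1, the forwarding logic looks roughly like the sketch below. It illustrates the general supervisor pattern and is not cleanimg-init's implementation. The name supervise is hypothetical, and the function must be called from the main thread because Python only allows signal handlers to be installed there.

```python
import signal
import subprocess


def supervise(app_argv: list) -> int:
    """Start the application, forward SIGTERM/SIGINT to it, return its exit code."""
    child = None

    def forward(signum, _frame):
        # Relay the signal to the application instead of acting on it here.
        if child is not None:
            child.send_signal(signum)

    # Install handlers before starting the child so no signal is missed.
    forwarded = (signal.SIGTERM, signal.SIGINT)
    previous = {signum: signal.signal(signum, forward) for signum in forwarded}
    try:
        child = subprocess.Popen(app_argv)
        # wait() also reaps the child, so no zombie process is left behind.
        return child.wait()
    finally:
        for signum, handler in previous.items():
            signal.signal(signum, handler)
```

A naive shell entrypoint typically lacks both the handler and the wait, which is why the application never hears about SIGTERM and the pod runs out its grace period.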

Part 5: Kubernetes initContainers as Shell Replacement

The initContainer Pattern

Kubernetes supports initContainers: containers that run to completion before the main container starts. These are ideal for database migrations, configuration generation, dependency checks, and data preparation.

A complete example is a Python app deployment with two initContainers, which Kubernetes runs one at a time in the order listed. The wait-deps initContainer ensures postgres is ready, the migrate initContainer runs migrations to completion, and the main container starts only after both have succeeded.

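Flow: Kubernetes schedules the pod. The wait-deps initContainer starts and cleanimg-init waits for postgres:5432 until the wait completes (or times out). The migrate initContainer then starts and runs migrations. If the migration succeeds, the initContainer exits and the main container starts. If the migration fails, the initContainer is retried according to the pod's restartPolicy and, under the default policy, the pod shows Init:CrashLoopBackOff while the main container stays unstarted.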
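The dependency wait in this flow amounts to retrying a TCP connection until it succeeds or a deadline passes. Here is a minimal sketch of that loop. wait_for_tcp and its parameters are hypothetical names, and the retry interval is an assumption, since the text specifies only the target and a timeout.

```python
import socket
import time


def wait_for_tcp(host: str, port: int, timeout: float, interval: float = 1.0) -> bool:
    """Retry a TCP connect until it succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Cap each attempt so one slow connect cannot overshoot the deadline.
            with socket.create_connection((host, port), timeout=min(interval, remaining)):
                return True
        except OSError:
            # Refused, unreachable, or timed out: pause briefly, then retry.
            time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
```

An init container built on this idea would exit with a nonzero status when the function returns False, which is what keeps the main container from starting against a missing dependency.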

Advantages: Separation of concerns (init logic separate from app), resource efficiency (init containers consume resources only until they exit), failure handling (if init fails, pod doesn't start), auditability (init logic visible in pod spec), no shell required (init containers are also shell-less).

Part 6: Health Checks Without Shell

Why Shell-Based Health Probes Don't Work

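Kubernetes probes can use several mechanisms; three matter here. Exec Probes run a command inside the container. A command such as sh -c "curl http://localhost:8000/health" fails in shell-less containers with "exec: sh: not found", and calling curl directly fails the same way because curl is not in the image. Avoid exec probes in shell-less containers unless they invoke a binary that actually ships in the image.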

HTTP Probes (Recommended) have the kubelet send an HTTP GET to a path and port on the pod, for example /health on port 8000, with no shell involved. The application must implement the health endpoint. The endpoint should return 200 if healthy and 503 if unhealthy, and should respond in under 1 second.
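An application-side health endpoint with the 200/503 behavior described above can be very small. The sketch below uses only the Python standard library. The names healthy, HealthHandler, and start_health_server are hypothetical, and a real service would normally add the route to its existing web framework instead of running a second server.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set means healthy. The application clears it when, say, a dependency is lost.
healthy = threading.Event()
healthy.set()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status = 200 if healthy.is_set() else 503
        body = b"ok\n" if status == 200 else b"unhealthy\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the application log


def start_health_server(port: int = 8000) -> ThreadingHTTPServer:
    """Serve /health on a background thread and return the server handle."""
    # Bind all interfaces: probes come from outside the container's loopback.
    server = ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The handler does no I/O beyond reading a flag, which keeps response time well under the one-second target.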

TCP Socket Probes attempt to open a TCP connection to a port on the pod, for example 5432. If the connection succeeds, the service is assumed healthy. No command execution or shell is required. PostgreSQL, Redis, Kafka, MySQL, and any other TCP-listening service can be probed this way.

Complete Health Check Examples

PostgreSQL uses TCP probes on port 5432. Python web apps use HTTP probes on /health/live and /health/ready endpoints. Redis uses TCP probes on port 6379. Proper probe configuration includes initialDelaySeconds (how long to wait before first probe), periodSeconds (how often to check), timeoutSeconds (how long to wait for response), and failureThreshold (how many failures before action).
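The interaction between periodSeconds and failureThreshold is easier to reason about with a small model: action is taken only after failureThreshold consecutive failures, and any success resets the count. The function below is a hypothetical illustration of that counting rule, not Kubernetes code.

```python
from typing import Optional


def first_action_index(results: list, failure_threshold: int) -> Optional[int]:
    """Return the index of the probe result that triggers action, or None.

    `results` holds one boolean per probe attempt (True = success). Action is
    triggered once `failure_threshold` consecutive attempts have failed.
    """
    consecutive_failures = 0
    for index, succeeded in enumerate(results):
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= failure_threshold:
            return index
    return None
```

With a threshold of 3, the sequence success, fail, fail, success, fail, fail, fail triggers on the last attempt only, because the success in the middle resets the counter.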

cleanimg-init can expose a health check endpoint at a configurable port and path, so probes can target cleanimg-init directly rather than the application.

Part 7: Troubleshooting Shell-Less Containers

Error: "exec: sh: not found"

This error occurs when attempting kubectl exec -it postgres-0 -- sh on a shell-less container (the exact wording varies by container runtime). The image has no shell, so an interactive shell is not an option. Instead, use kubectl logs to view logs, kubectl describe pod to view events, health probes or HTTP endpoints for diagnosis, and, if you must exec, kubectl exec -it postgres-0 -- /opt/cleanimg/cleanimg-init healthcheck.

Error: "Exec probe failed"

This error occurs when health probes try to execute commands like curl that don't exist in shell-less images. The solution is to use HTTP or TCP probes instead of exec probes.

Error: "Probe timeout during startup"

If probes time out while the application is still starting, initialDelaySeconds is probably too short to cover application startup plus cleanimg-init initialization. Increase initialDelaySeconds, for example from 5 to 30 or higher.

Summary: Shell-Less + cleanimg-init

Shell-less containers with cleanimg-init offer a hardened baseline for production workloads. The combination removes a whole class of shell-dependent attacks while keeping full initialization capability.

| Aspect | Traditional (Shell) | CleanStart (Shell-Less) |
| --- | --- | --- |
| ENTRYPOINT | `/bin/bash docker-entrypoint.sh` | `/opt/cleanimg/cleanimg-init` |
| PID 1 | Shell (risky) | cleanimg-init or the application |
| Signal handling | Implicit, fragile | Explicit, reliable |
| Initialization | Bash scripts | TOML configuration |
| Health checks | Exec probes (need curl/sh) | HTTP or TCP probes |
| Attack surface | Shell + script | cleanimg-init binary only |
| Auditability | Shell semantics are complex | Configuration is declarative |
| Maintainability | Hand-maintained bash scripts | Versioned cleanimg-init releases |
| Compliance | Harder (the shell counts as attack surface) | Easier (meets hardening baselines that call for no shell) |