Difficulty: Intermediate | Time: 30 minutes | Focus: Read-only FS, security hardening, tmpfs mounts
Lab flow: (1) understand the benefits of a read-only filesystem (tamper detection, drift prevention); (2) run a container with a read-only root filesystem and observe the failures; (3) add tmpfs for /tmp and /var; (4) fix application errors; (5) deploy to Kubernetes with a securityContext; (6) verify the non-root UID. Result: a hardened read-only container.

Objectives
By the end of this lab, you will be able to run containers with a read-only root filesystem and explain why doing so improves security. You will identify what breaks when read-only mode is enabled, fix those failures by adding tmpfs mounts for directories that must stay writable, deploy a fully read-only container to Kubernetes, and verify that the application runs as a non-root user (UID 65532, the distroless convention that the CleanStart base image follows).
Prerequisites
Required: Docker 20.10 or newer, a Linux operating system (the read-only root filesystem relies on Linux kernel features), Bash or a compatible shell, and the docker inspect and docker run commands.
Optional: kubectl 1.27+ and a local Kubernetes cluster (kind or minikube) for the Kubernetes section.
Verify setup:
docker --version
uname -a # Should show Linux

Background: Read-Only Root Filesystem
A read-only root filesystem prevents attackers from modifying the system files, ensuring system integrity even if application code is compromised. However, many applications still need to write to temporary locations, including /tmp for temporary files and /var/tmp for additional temporary storage. Applications also need space for cache directories to optimize performance and log directories for operation tracking. For these scenarios, tmpfs mounts are used, which are in-memory and non-persistent, allowing the application to write data that exists only during the container's lifetime.
Step 1: Create Working Directory and Application
mkdir -p ~/labs/lab-04-read-only-filesystem
cd ~/labs/lab-04-read-only-filesystem

Create a Python application that requires temporary storage:
cat > app.py << 'EOF'
import os
import json
import tempfile
from pathlib import Path


def main():
    print("=== CleanStart Lab 04: Read-Only Filesystem ===")

    # Test 1: Try to write a temporary file (lands in /tmp by default)
    try:
        with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
            f.write(json.dumps({"test": "data"}))
            f.flush()
        print(f"✅ Successfully wrote to {f.name}")
        os.unlink(f.name)
    except Exception as e:
        print(f"❌ Failed to write to /tmp: {e}")

    # Test 2: Try to create a cache directory
    try:
        cache_dir = Path("/tmp/app-cache")
        cache_dir.mkdir(parents=True, exist_ok=True)
        cache_file = cache_dir / "cache.json"
        cache_file.write_text(json.dumps({"cached": "value"}))
        print(f"✅ Successfully wrote to {cache_file}")
    except Exception as e:
        print(f"❌ Failed to create cache: {e}")

    # Test 3: Check current user
    uid = os.getuid()
    print(f"ℹ️ Running as UID: {uid}")
    if uid == 0:
        print("⚠️ WARNING: Running as root (UID 0)")
    elif uid == 65532:
        print("✅ Running as non-root (UID 65532 - distroless convention)")
    else:
        print(f"✅ Running as non-root (UID {uid})")

    # Test 4: Report whether the root filesystem is mounted read-only.
    # Options are split on commas so that values such as
    # "errors=remount-ro" are not mistaken for the "ro" flag.
    try:
        with open('/proc/mounts', 'r') as f:
            for line in f:
                parts = line.split()
                if len(parts) < 4:
                    continue
                mount_point = parts[1]
                options = parts[3].split(',')
                if mount_point == '/' and 'ro' in options:
                    print(f"ℹ️ {mount_point} is read-only")
    except Exception as e:
        print(f"Could not read /proc/mounts: {e}")

    print("\nAll tests complete!")


if __name__ == '__main__':
    main()
EOF

Create a Dockerfile:
cat > Dockerfile << 'EOF'
FROM registry.cleanstart.com/cleanstart/python:3.12

WORKDIR /app
COPY app.py .

CMD ["python", "app.py"]
EOF

Step 2: Build the Image

docker build -t lab-04-readonly:latest .

Expected output:

[+] Building 5.3s (6/6) FINISHED
...

Step 3: Run Container WITHOUT Read-Only (Baseline)
Run the container normally (writable filesystem):
docker run --rm lab-04-readonly:latest

Expected output:

=== CleanStart Lab 04: Read-Only Filesystem ===
✅ Successfully wrote to /tmp/tmpXXXXXX
✅ Successfully wrote to /tmp/app-cache/cache.json
ℹ️ Running as UID: 65532
✅ Running as non-root (UID 65532 - distroless convention)

All tests complete!

Key observations: Writing to /tmp succeeds, and the process runs as UID 65532 (non-root). Without --read-only, the container's root filesystem is still mounted writable; an image cannot make it read-only on its own, which is why the following steps pass the flag explicitly.
Step 4: Run Container WITH Read-Only Flag (Breaks)
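Run with the --read-only flag:

docker run --rm --read-only lab-04-readonly:latest

Expected output (abbreviated):

=== CleanStart Lab 04: Read-Only Filesystem ===
❌ Failed to write to /tmp: [Errno 30] Read-only file system
❌ Failed to create cache: [Errno 30] Read-only file system
ℹ️ Running as UID: 65532
...

Problem: /tmp is now read-only because it is part of the container's root filesystem, and --read-only mounts that root filesystem read-only. Separately mounted volumes and tmpfs mounts are not affected by the flag. The exact error text can vary; for example, Python's tempfile module may report that no usable temporary directory was found instead of Errno 30.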
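To identify which directories an application can still write to under --read-only, a small probe can help. The following is a minimal sketch, not part of the lab's app.py: the function name probe_writable and the candidate directory list are assumptions chosen for illustration.

```python
import tempfile


def probe_writable(paths):
    """Return {path: bool} by trying to create a temporary file in each path."""
    results = {}
    for path in paths:
        try:
            # The file is created and deleted again when the block exits.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
            results[path] = True
        except OSError:
            # Covers read-only file systems, missing directories,
            # and permission errors alike.
            results[path] = False
    return results


if __name__ == "__main__":
    # Illustrative candidates; adjust the list for your application.
    candidates = ["/tmp", "/var/tmp", "/run", "/app"]
    for path, ok in probe_writable(candidates).items():
        print(f"{path}: {'writable' if ok else 'NOT writable'}")
```

Running a probe like this inside the container, with and without --read-only, gives a quick list of the directories that need a tmpfs mount.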
Step 5: Add tmpfs Mount for /tmp (Fix)
Add a tmpfs mount for /tmp:
docker run --rm --read-only --tmpfs /tmp lab-04-readonly:latest

Expected output:

=== CleanStart Lab 04: Read-Only Filesystem ===
✅ Successfully wrote to /tmp/tmpXXXXXX
✅ Successfully wrote to /tmp/app-cache/cache.json
ℹ️ Running as UID: 65532
✅ Running as non-root (UID 65532 - distroless convention)
...

All tests complete!

Success: Writing to /tmp works again because we added a writable tmpfs mount.
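One way to confirm that / is read-only while /tmp is a separate writable tmpfs is to parse /proc/mounts. The sketch below is an illustration under stated assumptions: parse_mounts and is_read_only are hypothetical helper names, and SAMPLE is made-up text in the /proc/mounts format rather than output captured from this lab.

```python
def parse_mounts(text):
    """Parse /proc/mounts-style text into {mount_point: (fs_type, options)}."""
    mounts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        # Fields: device, mount point, filesystem type, comma-separated options.
        mounts[parts[1]] = (parts[2], set(parts[3].split(",")))
    return mounts


def is_read_only(mounts, mount_point):
    """True only when 'ro' is an exact option, not a substring of another one."""
    _fs_type, options = mounts[mount_point]
    return "ro" in options


# Hypothetical sample; real entries differ between hosts and storage drivers.
SAMPLE = """\
overlay / overlay ro,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,noexec,relatime 0 0
/dev/sda1 /etc/hosts ext4 rw,relatime,errors=remount-ro 0 0
"""

if __name__ == "__main__":
    mounts = parse_mounts(SAMPLE)
    for point, (fs_type, _options) in mounts.items():
        state = "read-only" if is_read_only(mounts, point) else "writable"
        print(f"{point}: {fs_type}, {state}")
```

Splitting the options on commas matters: a plain substring test for "ro" would wrongly flag the /etc/hosts entry because of errors=remount-ro.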
Step 6: Run with Multiple tmpfs Mounts
For applications needing multiple writable directories, add multiple tmpfs mounts:
docker run --rm --read-only \
  --tmpfs /tmp:size=100m \
  --tmpfs /run:size=50m \
  --mount type=tmpfs,destination=/var/tmp,tmpfs-size=100m \
  lab-04-readonly:latest

Expected output:

✅ Successfully wrote to /tmp/tmpXXXXXX
✅ Successfully wrote to /tmp/app-cache/cache.json
...
All tests complete!

Options explained: The --tmpfs /tmp:size=100m option creates a tmpfs mount at /tmp with a 100 MB size limit. The --tmpfs /run:size=50m option creates a tmpfs mount at /run with a 50 MB size limit. The --mount type=tmpfs,... form is an alternative syntax that gives more control over the mount configuration.
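When the list of writable directories grows, it can be easier to generate the command than to type it. The following sketch assembles the same --read-only and --tmpfs flags used above from a dictionary; build_run_args is a hypothetical helper name, and the script only prints the command rather than running Docker.

```python
import shlex


def build_run_args(image, tmpfs_mounts, read_only=True):
    """Build a `docker run` argument list with one --tmpfs flag per mount.

    tmpfs_mounts maps a container path to a size such as "100m",
    or to None for a mount without a size limit.
    """
    args = ["docker", "run", "--rm"]
    if read_only:
        args.append("--read-only")
    for path, size in tmpfs_mounts.items():
        spec = f"{path}:size={size}" if size else path
        args += ["--tmpfs", spec]
    args.append(image)
    return args


if __name__ == "__main__":
    args = build_run_args("lab-04-readonly:latest", {"/tmp": "100m", "/run": "50m"})
    # shlex.join produces a shell-safe, copy-pasteable command line.
    print(shlex.join(args))
```

Keeping the mounts in one data structure also makes it straightforward to review which paths are writable and how much memory each may consume.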
Step 7: Create a Docker Compose File for Read-Only Container
Create docker-compose.yml:
cat > docker-compose.yml << 'EOF'
version: '3.8'

services:
  app:
    image: lab-04-readonly:latest
    read_only: true
    tmpfs:
      - /tmp:size=100m
      - /run:size=50m
    environment:
      - PORT=8000
    ports:
      - "8000:8000"
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true
EOF

Deploy with Docker Compose:
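docker-compose up --detach

Expected output:

Creating network "lab-04-read-only-filesystem_default" with the default driver
Creating lab-04-read-only-filesystem_app_1 ... done

Verify it's running:

docker-compose ps

Expected output:

NAME                                COMMAND           STATUS         PORTS
lab-04-read-only-filesystem_app_1   "python app.py"   Up 2 seconds   0.0.0.0:8000->8000/tcp

Note: app.py exits as soon as its checks finish, so the container may already have stopped by the time you run this command. In that case, docker-compose ps -a lists it as exited and docker-compose logs app shows the test output.

Stop the container:

docker-compose down

Step 8: Create a Kubernetes Deployment with Read-Only FS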
Create kubernetes-deployment.yaml:
cat > kubernetes-deployment.yaml << 'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: lab-04
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lab-04-readonly
  namespace: lab-04
spec:
  replicas: 1
  selector:
    matchLabels:
      app: lab-04-readonly
  template:
    metadata:
      labels:
        app: lab-04-readonly
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65532
        fsGroup: 65532
      containers:
        - name: app
          image: lab-04-readonly:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 65532
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: run
              mountPath: /run
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "500m"
      volumes:
        - name: tmp
          emptyDir:
            medium: Memory
            sizeLimit: 100Mi
        - name: run
          emptyDir:
            medium: Memory
            sizeLimit: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: lab-04-readonly
  namespace: lab-04
spec:
  selector:
    app: lab-04-readonly
  ports:
    - protocol: TCP
      port: 8000
      targetPort: 8000
  type: ClusterIP
EOF

Key security settings: readOnlyRootFilesystem: true makes the container's root filesystem read-only, and runAsNonRoot: true makes Kubernetes refuse to start the container as root. runAsUser: 65532 sets the UID (following the CleanStart convention). capabilities.drop: [ALL] drops all Linux capabilities, and allowPrivilegeEscalation: false prevents the process from gaining additional privileges. The emptyDir volumes with medium: Memory provide the tmpfs mounts; data written to them counts toward the container's memory limit, so size them with that limit in mind.
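The settings above can also be checked programmatically before a manifest is applied. The following is a minimal sketch that works on a container spec already loaded into a Python dictionary (for example by a YAML parser, which is not shown). The function name check_container_security is hypothetical, and the sketch only inspects container-level fields, so values inherited from the pod-level securityContext would be reported as missing.

```python
def check_container_security(container):
    """Return a list of findings; an empty list means every check passed."""
    sc = container.get("securityContext", {})
    findings = []
    if sc.get("readOnlyRootFilesystem") is not True:
        findings.append("readOnlyRootFilesystem is not true")
    if sc.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not true")
    if sc.get("runAsUser") in (None, 0):
        findings.append("runAsUser is unset or 0")
    if sc.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation is not false")
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        findings.append("capabilities.drop does not include ALL")
    return findings


if __name__ == "__main__":
    # Mirrors the container-level securityContext from the Deployment above.
    hardened = {
        "name": "app",
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "readOnlyRootFilesystem": True,
            "runAsNonRoot": True,
            "runAsUser": 65532,
            "capabilities": {"drop": ["ALL"]},
        },
    }
    print(check_container_security(hardened) or "all checks passed")
    print(check_container_security({"name": "unhardened"}))
```

A check like this could run in CI so that a manifest missing one of the controls is caught before deployment.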
Step 9: Test with Kubernetes (Optional)
If you have a Kubernetes cluster (kind, minikube):
# Create a kind cluster if you don't have one
kind create cluster --name lab-04

# Load the image into the cluster
kind load docker-image lab-04-readonly:latest --name lab-04

# Apply the deployment
kubectl apply -f kubernetes-deployment.yaml

# Wait for pod to be ready
# Note: app.py exits after its checks, so Kubernetes restarts the container;
# this wait may time out, but the logs still show the test results.
kubectl wait --for=condition=ready pod \
  -l app=lab-04-readonly \
  -n lab-04 \
  --timeout=60s

# Check pod status
kubectl get pods -n lab-04

# View logs
kubectl logs -n lab-04 -l app=lab-04-readonly

# Cleanup
kubectl delete namespace lab-04
kind delete cluster --name lab-04

Step 10: Create a Security Hardening Checklist
Create HARDENING_CHECKLIST.md to verify that all security controls are in place. It should cover the following checks.

Container runtime:
- The container runs with the --read-only flag.
- tmpfs mounts are added for /tmp and application cache directories.
- The container runs as non-root (UID 65532).
- No shell access is available.

Kubernetes:
- The pod sets readOnlyRootFilesystem: true.
- The pod sets runAsNonRoot: true and runAsUser: 65532.
- The pod sets capabilities.drop: [ALL].
- The pod sets allowPrivilegeEscalation: false.
- Writable volumes use emptyDir with medium: Memory.
- Volume size limits are set.

Docker Compose:
- The service sets read_only: true.
- tmpfs mounts are configured.
- cap_drop: [ALL] is configured.
- no-new-privileges: true is set.

Testing:
- The application writes to /tmp successfully with the read-only filesystem.
- The application writes to cache directories successfully.
- Attempts to modify the root filesystem fail.
- Logs show UID 65532 (non-root).

Security benefits to document:
- Immutability: attackers cannot modify system files.
- No persistence: malware cannot install backdoors that survive a restart.
- Auditability: blocked write attempts surface as errors that can be logged and reviewed.
- Reduced attack surface: the impact of a compromise is limited.
- Least privilege: the damage from any compromise is limited.
Step 11: Verify Non-Root Execution
Confirm the container runs as non-root by default:
docker run --rm lab-04-readonly:latest id

Expected output:

uid=65532 gid=65532 groups=65532

The CleanStart base image uses UID 65532 (distroless convention) by default.
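If this check is automated, for example in a CI job, the output of id can be parsed instead of read by eye. The following is a minimal sketch; parse_id_output is a hypothetical helper, and it assumes output in the usual `uid=... gid=... groups=...` shape shown above, with or without names in parentheses.

```python
import re


def parse_id_output(text):
    """Extract numeric uid, gid, and group IDs from `id` command output."""
    uid = int(re.search(r"\buid=(\d+)", text).group(1))
    gid = int(re.search(r"\bgid=(\d+)", text).group(1))
    groups = []
    groups_match = re.search(r"\bgroups=(\S+)", text)
    if groups_match:
        for entry in groups_match.group(1).split(","):
            # Each entry looks like "65532" or "65532(nonroot)".
            groups.append(int(re.match(r"\d+", entry).group(0)))
    return {"uid": uid, "gid": gid, "groups": groups}


if __name__ == "__main__":
    info = parse_id_output("uid=65532 gid=65532 groups=65532")
    if info["uid"] == 0:
        raise SystemExit("container runs as root")
    print(f"non-root UID {info['uid']}")
```

In a pipeline, the string passed to parse_id_output would be the captured output of the docker run command above.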
Verification Checklist
Confirm all of the following are complete:

- The directory ~/labs/lab-04-read-only-filesystem has been created.
- app.py has been created with tmpfs write tests.
- Dockerfile has been created using the CleanStart base image.
- The Docker image has been built successfully.
- The container runs successfully without the --read-only flag.
- The container fails to write to /tmp with the --read-only flag alone.
- The container succeeds when --tmpfs /tmp is added.
- The container runs successfully with multiple tmpfs mounts configured.
- docker-compose.yml has been created with read-only configuration.
- kubernetes-deployment.yaml has been created with an appropriate security context.
- HARDENING_CHECKLIST.md has been created.
- The command docker run --rm lab-04-readonly:latest id shows UID 65532 (non-root execution).
If all items are checked, you have successfully completed Lab 04.
What You Learned
This lab covered the following security concepts:

- Read-only root filesystem: prevents modification of system files, so attackers cannot persist changes there.
- tmpfs mounts: in-memory, non-persistent writable directories for temporary data that disappears when the container stops.
- UID 65532: the distroless convention for non-root applications, a fixed and predictable UID rather than an arbitrary one that might carry unintended permissions.
- Security context: the Kubernetes mechanism for configuring pod and container security through API fields.
- Least privilege: running applications as non-root with dropped capabilities to minimize the damage from a compromise.
- Volume strategies: emptyDir with medium: Memory provides temporary storage without persistent disk.
- Multi-layer defense: a read-only filesystem, non-root execution, and capability dropping are independent controls, and an attacker must bypass all of them to compromise the system.
Cleanup
Stop Docker Compose (if running):
docker-compose down 2>/dev/null || true

Remove images and lab directory:

docker rmi lab-04-readonly:latest
rm -rf ~/labs/lab-04-read-only-filesystem

Next Lab
Proceed to Lab 05: Kubernetes Deployment to deploy a complete application to Kubernetes with security best practices.
Real-World Application
In production environments, you should always use read-only root filesystem as a baseline security control. Add tmpfs mounts for known writable directories rather than allowing any writes to persistent storage, run all containers as non-root to limit privilege escalation impact, and drop all unnecessary Linux capabilities to reduce the available attack surface. Use Kubernetes security contexts to enforce these policies consistently across your cluster, and monitor write attempts to detect abnormal behavior that might indicate a compromised container.
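At the application level, one simple way to make blocked writes visible is to log them where they occur. The sketch below is an assumption-laden illustration, not a monitoring product: try_write and the logger name are hypothetical, and it only covers writes made through this helper. It distinguishes errno.EROFS (the "Read-only file system" error, Errno 30 on Linux) from other failures.

```python
import errno
import logging
import os
import tempfile

logger = logging.getLogger("write-monitor")  # hypothetical logger name


def try_write(path, data):
    """Write data to path; log a warning and return False if the write fails."""
    try:
        with open(path, "w") as f:
            f.write(data)
        return True
    except OSError as e:
        if e.errno == errno.EROFS:
            # The write hit a read-only mount: unexpected in a hardened
            # container and therefore worth alerting on.
            logger.warning("blocked write to read-only filesystem: %s", path)
        else:
            logger.warning("write to %s failed: %s", path, e)
        return False


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    scratch = tempfile.mkdtemp()
    print(try_write(os.path.join(scratch, "ok.txt"), "hello"))
    print(try_write(os.path.join(scratch, "missing-dir", "x.txt"), "hello"))
```

Warnings emitted this way can be forwarded to whatever log pipeline the cluster already uses, which gives a basic signal for unexpected write attempts.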
Estimated Time: 30 minutes | Hands-on: ~25 minutes | Reading: ~5 minutes
