Purpose
Every time you upgrade a CleanStart base image version—whether a patch, minor, or major version bump—your applications need verification that functionality remains intact. This document provides a systematic, layered regression testing strategy that ensures production readiness while avoiding unnecessary test delays.
Target audience: QA leads, DevOps engineers, platform teams responsible for upgrading container images in production.
Expected testing time: 15 minutes (patch), 45 minutes (minor), 2 hours (major), 24–72 hours (major + soak).
When Regression Testing is Required
Regression testing must run before promoting a new image version to production, with scope determined by upgrade type.
Mandatory Regression Testing Triggers
| Trigger | Scope | Timeline |
|---|---|---|
| Monthly scheduled patch (e.g., py3.12.1 → py3.12.2) | Smoke + security verification | 15 min |
| Minor version bump (e.g., py3.12 → py3.13, node20.11 → node20.12) | Full regression suite | 45–60 min |
| Major version upgrade (e.g., Python 3 → 4, Node 18 → 22) | Full regression + soak test | 24–72 hrs |
| Security hotfix release (emergency CVE patch) | Targeted tests only | 20 min |
| Custom image build (first deployment) | Full regression | 60 min |
| GLIBC-based distro update (Ubuntu 22.04 → 24.04) | Full regression + extended soak | 48+ hrs |
Skippable Scenarios
- Same version, same build date: No testing required (already verified).
- Dependency update without version change: Smoke tests only.
- Documentation-only update: No testing required.
- Third-party library patch (with version lock): Smoke tests only.
Risk Assessment Matrix by Upgrade Type
This matrix guides how much testing to run based on the change risk profile.
| Upgrade Type | Risk Level | Regression Scope | Testing Hours | Go/No-Go Criteria |
|---|---|---|---|---|
| Security patch (py3.12.5 → py3.12.6) | 🟢 Low | Smoke + security scan | 0.25 | Zero new vulns, startup OK |
| Maintenance patch (py3.12.1 → py3.12.2) | 🟢 Low | Smoke + perf baseline | 0.5 | No perf regression >10% |
| Minor version bump (py3.12 → py3.13, node20.11 → node20.12) | 🟡 Medium | Full functional regression | 1 | All functional tests pass |
| Major version bump (Python 3 → 4, Node 18 → Node 22) | 🔴 High | Full regression + 24–72hr soak | 8–16 | All metrics within 15% of baseline |
| Distro upgrade (Ubuntu 22.04 → 24.04, Alpine 3.19 → 3.20) | 🔴 High | Full regression + extended soak | 12–24 | Memory leak checks, no stdlib incompatibilities |
| Emergency hotfix | 🟡 Medium | Targeted regression only | 1–2 | Hotfix issue resolved, no new regressions |
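The mapping from upgrade type to test scope can be automated once the old and new tags are known. The following is a minimal sketch, assuming tags follow a dotted MAJOR.MINOR.PATCH scheme; the function names `classify_bump` and `scope_for` are hypothetical, and the scope strings are copied from the Mandatory Regression Testing Triggers table.

```shell
#!/usr/bin/env bash
# Classify an upgrade as major, minor, patch, or same by comparing
# dotted numeric versions (for example 3.12.1 and 3.12.2).
# Assumption: tags follow MAJOR.MINOR.PATCH; missing parts count as 0.
classify_bump() {
  local old_major old_minor old_patch new_major new_minor new_patch
  IFS=. read -r old_major old_minor old_patch <<< "$1"
  IFS=. read -r new_major new_minor new_patch <<< "$2"

  if [ "${old_major:-0}" != "${new_major:-0}" ]; then
    echo major
  elif [ "${old_minor:-0}" != "${new_minor:-0}" ]; then
    echo minor
  elif [ "${old_patch:-0}" != "${new_patch:-0}" ]; then
    echo patch
  else
    echo same
  fi
}

# Map the upgrade type to the regression scope from the trigger table.
scope_for() {
  case "$1" in
    patch) echo "smoke + security verification" ;;
    minor) echo "full regression suite" ;;
    major) echo "full regression + soak test" ;;
    same)  echo "no testing required" ;;
    *)     echo "unknown upgrade type: $1" >&2; return 1 ;;
  esac
}

# Example: a patch-level change
scope_for "$(classify_bump 3.12.1 3.12.2)"
```

Hotfixes, custom builds, and distro updates are not derivable from the version string alone, so a real pipeline would still need an explicit override for those triggers.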
Regression Test Suite Structure: Six-Layer Approach
Regression testing proceeds through six automated and manual layers, each with specific objectives and time budgets. Most upgrades stop at Layer 3 or 4; major version changes proceed through Layer 6.
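Because each layer gates the next, a small driver can run them in order and stop at the first failure. This is a sketch only: `run_layers` is a hypothetical helper, and the demo uses stand-in commands rather than real layer scripts.

```shell
#!/usr/bin/env bash
# Run layer commands in order and stop at the first failure.
# Each argument has the form "label=command".
run_layers() {
  local entry label cmd
  for entry in "$@"; do
    label=${entry%%=*}   # text before the first "="
    cmd=${entry#*=}      # text after the first "="
    echo "==> $label"
    if ! bash -c "$cmd"; then
      echo "❌ $label failed; later layers skipped"
      return 1
    fi
  done
  echo "✅ all requested layers passed"
}

# Demo with stand-in commands. In a real run each command would be a layer
# script, for example: "Layer 2=./scripts/layer2-smoke-tests.sh my-image"
run_layers "Layer 1=true" "Layer 2=true" "Layer 3=true"
```

Passing only the layers required for a given upgrade type keeps patch upgrades fast while still enforcing the ordering.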
Layer 1: Image Diff Analysis (2 minutes)
Objective: Understand what changed before running any tests.
Before executing tests, analyze the differences between old and new image versions. This prevents surprises and focuses testing on what actually changed.
What to Check
- Package manifest comparison: Which packages were added, removed, or updated? Are there critical package changes (a different version of OpenSSL, glibc, etc.)?
- SBOM comparison (Software Bill of Materials): Old image SBOM vs. new image SBOM. Any unexpected new dependencies? Any removed dependencies your app depends on?
- Vulnerability delta: Old image security scan results vs. new. Are there fewer vulnerabilities? (expected) Any new vulnerabilities? (red flag, likely a packaging error)
- Base layer changes: Did the underlying GLIBC version change? Did OpenSSL or other core library versions update?
Tools and Commands
Pull and inspect both images:
```bash
# Get image digests
OLD_DIGEST=$(docker pull registry.cleanstart.com/py:3.12.1 2>&1 | grep "Digest:" | awk '{print $2}')
NEW_DIGEST=$(docker pull registry.cleanstart.com/py:3.12.2 2>&1 | grep "Digest:" | awk '{print $2}')

echo "Old: $OLD_DIGEST"
echo "New: $NEW_DIGEST"
```
Extract and compare SBOMs:
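```bash
# Extract the SBOM attached to each image with Cosign.
# Note: "cosign download sbom" is deprecated in recent Cosign releases in
# favor of SBOM attestations; adjust if your images publish attestations.
cosign download sbom registry.cleanstart.com/py:3.12.1 > old-sbom.json
cosign download sbom registry.cleanstart.com/py:3.12.2 > new-sbom.json

# Compare package lists. The .components[] path is the CycloneDX JSON layout;
# for SPDX JSON use '.packages[] | .name + ":" + .versionInfo' instead.
jq -r '.components[] | .name + ":" + .version' old-sbom.json | sort > old-packages.txt
jq -r '.components[] | .name + ":" + .version' new-sbom.json | sort > new-packages.txt

diff old-packages.txt new-packages.txt
```
Scan for vulnerabilities: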
```bash
# Scan both images with Grype (-o selects the output format)
grype registry.cleanstart.com/py:3.12.1 -o json > old-vulns.json
grype registry.cleanstart.com/py:3.12.2 -o json > new-vulns.json

# Count critical/high findings in each scan
jq '[.matches[] | select(.vulnerability.severity == "Critical" or .vulnerability.severity == "High")] | length' old-vulns.json
jq '[.matches[] | select(.vulnerability.severity == "Critical" or .vulnerability.severity == "High")] | length' new-vulns.json
```
Pass/Fail Criteria
- ✅ PASS: The new image has the same number of vulnerabilities or fewer. New critical/high findings are acceptable only when they are an expected consequence of an intentional major upgrade.
- ❌ FAIL: The new image has unexpected new critical/high vulnerabilities.
- ⚠️ WARN: Packages were removed unexpectedly; proceed to Layer 2 with caution.
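These criteria can be turned into a small gate. The sketch below is an approximation under stated assumptions: it compares critical/high counts rather than individual CVE IDs, it treats any removed package as unexpected, and the names `layer1_verdict` and `removed_count` are hypothetical.

```shell
#!/usr/bin/env bash
# Turn Layer 1 findings into PASS, WARN, or FAIL.
# Arguments: old critical/high count, new critical/high count,
# number of packages removed in the new image (default 0).
layer1_verdict() {
  local old_vulns=$1 new_vulns=$2 removed_pkgs=${3:-0}
  if [ "$new_vulns" -gt "$old_vulns" ]; then
    echo FAIL   # more critical/high findings than before
  elif [ "$removed_pkgs" -gt 0 ]; then
    echo WARN   # removals detected: proceed to Layer 2 with caution
  else
    echo PASS
  fi
}

# Count packages present in the old list but missing from the new one.
# Both files hold one "name:version" entry per line.
removed_count() {
  comm -23 <(cut -d: -f1 "$1" | sort -u) <(cut -d: -f1 "$2" | sort -u) \
    | wc -l | tr -d ' '
}
```

With the files produced by the SBOM comparison, a call might look like `layer1_verdict "$OLD_COUNT" "$NEW_COUNT" "$(removed_count old-packages.txt new-packages.txt)"`, where the two counts come from the vulnerability scan.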
Layer 2: Smoke Tests (5 minutes)
Objective: Verify the container starts, health checks pass, and basic connectivity works.
Smoke tests are the fastest way to catch fatal incompatibilities (missing libraries, broken entrypoints, network issues).
Smoke Test Checklist
- [ ] Container starts without errors (docker run).
- [ ] No segmentation faults or core dumps in startup logs.
- [ ] Application binaries are present and executable (e.g., python, node, java).
- [ ] Health check endpoint responds (HTTP 200 or equivalent) within 30 seconds.
- [ ] Can connect to required external services (database, cache, message queue).
- [ ] No "library not found" or linker errors in logs.
- [ ] Application responds to at least one real request.
Smoke Test Script (Generic)
```bash
#!/bin/bash
set -euo pipefail

IMAGE=${1:?usage: $0 <image>}
CONTAINER_NAME="smoke-test-$(date +%s)"

# On exit: print container logs if the test failed, then remove the container
cleanup() {
  status=$?
  if [ "$status" -ne 0 ]; then docker logs "$CONTAINER_NAME" 2>&1 || true; fi
  docker rm -f "$CONTAINER_NAME" > /dev/null 2>&1 || true
}
trap cleanup EXIT

echo "🧪 Starting smoke test for $IMAGE..."

# Start container (docker run -d already detaches, so no trailing & is needed).
# Health is polled from the host: a shell-less image cannot run a --health-cmd
# that depends on /bin/sh and curl inside the container.
docker run -d --name "$CONTAINER_NAME" -p 8080:8080 "$IMAGE" > /dev/null
sleep 2

# Check startup logs (2>&1 because docker logs relays the app's stderr)
if docker logs "$CONTAINER_NAME" 2>&1 | grep -Ei "error|segfault|panic"; then
  echo "❌ FAIL: Errors in startup logs"
  exit 1
fi

# Wait up to 30 seconds for the health endpoint to return HTTP 200
RESPONSE=000
for i in {1..30}; do
  RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health || true)
  if [ "$RESPONSE" = "200" ]; then
    echo "✅ Health check passed"
    break
  fi
  sleep 1
done

if [ "$RESPONSE" != "200" ]; then
  echo "❌ FAIL: Health endpoint returned $RESPONSE after 30s"
  exit 1
fi

echo "✅ Smoke test PASSED"
```
Pass/Fail Criteria
- ✅ PASS: Container starts, health check passes, no fatal errors.
- ❌ FAIL: Any startup error, health check timeout, missing core binaries.
- Action on FAIL: Stop testing, report to image maintainer.
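The "health check within 30 seconds" rule is a retry-with-timeout pattern that recurs in later layers. A minimal, reusable sketch follows; `wait_until` and `health_ok` are hypothetical names, and the URL is the same example endpoint used in the smoke test script.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the timeout (in seconds) elapses.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      return 1   # timed out without a successful attempt
    fi
    sleep "$interval"
  done
}

# Example probe: succeeds only when the health endpoint returns HTTP 200.
health_ok() {
  [ "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/health)" = "200" ]
}

# In a smoke test:
#   wait_until 30 1 health_ok || echo "❌ FAIL: Health check timeout"
```

Separating the probe from the retry loop lets the same helper wait on a database port or a message queue by swapping in a different probe function.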
Layer 3: Functional Regression Suite (15–30 minutes)
Objective: Run the full application test suite against the new image.
This is your main regression test battery. Execute all automated tests that validate application behavior.
What Tests to Include
- API endpoint tests (all CRUD operations)
- Database integration tests (query performance, connection pooling)
- Authentication and authorization tests
- Data pipeline tests (ETL, streaming, batch processing)
- File I/O tests (read, write, permissions)
- Cache integration tests (Redis, Memcached)
- Scheduled job execution tests
- Error handling tests (expected exceptions, graceful degradation)
- Concurrent request tests (thread safety, race conditions)
- Backward compatibility tests (old data format support)
Execution Pattern
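```bash
#!/bin/bash
set -euo pipefail

NEW_IMAGE=${1:?usage: $0 <image>}
TEST_CONTAINER="test-$(date +%s)"

# Always remove the test container on exit
trap 'docker rm -f "$TEST_CONTAINER" > /dev/null 2>&1 || true' EXIT

# Start container with test-friendly environment
# (the test-network Docker network must already exist)
docker run -d \
  --name "$TEST_CONTAINER" \
  -e "NODE_ENV=test" \
  -e "LOG_LEVEL=debug" \
  -p 8080:8080 \
  --network test-network \
  "$NEW_IMAGE" > /dev/null

sleep 5  # Wait for startup

# Run test suite. The fields parsed below (numPassedTests, ...) are Jest's
# JSON output, so this uses Jest's --json and --outputFile flags. A failing
# suite exits non-zero; "|| true" lets the script go on to parse the results.
docker exec -w /app "$TEST_CONTAINER" \
  npm test -- --coverage --json --outputFile=/tmp/test-results.json || true

# Copy results
docker cp "$TEST_CONTAINER":/tmp/test-results.json ./test-results-new.json

# Parse results
PASS=$(jq '.numPassedTests' test-results-new.json)
FAIL=$(jq '.numFailedTests' test-results-new.json)
SKIPPED=$(jq '.numPendingTests' test-results-new.json)

echo "Test Results: $PASS passed, $FAIL failed, $SKIPPED skipped"

if [ "$FAIL" -gt 0 ]; then
  echo "❌ FAIL: $FAIL test(s) failed"
  # List only the test files that contain failures
  jq -r '.testResults[] | select(.status == "failed") | .name' test-results-new.json
  exit 1
fi

echo "✅ Functional regression PASSED"
```
Pass/Fail Criteria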
- ✅ PASS: >95% of tests pass; failures are pre-known/skipped for upgrade.
- ⚠️ WARN: 90–95% pass rate; investigate failures before proceeding.
- ❌ FAIL: <90% pass rate or new failures not explained.
- Action on WARN/FAIL: Review failed test logs, assess risk, consider rollback to previous version.
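The pass-rate thresholds can be applied mechanically to the counts a test runner reports. The sketch below assumes skipped tests are excluded from the rate and that a run with zero executed tests should fail; `pass_rate_verdict` is a hypothetical helper, and it does not check whether failures are "pre-known", which still needs human review.

```shell
#!/usr/bin/env bash
# Classify a test run by pass rate: PASS above 95%, WARN from 90% to 95%,
# FAIL below 90%. Arguments: passed count, failed count.
pass_rate_verdict() {
  local passed=$1 failed=$2
  local total=$(( passed + failed ))
  if [ "$total" -eq 0 ]; then
    echo FAIL   # nothing ran; do not treat an empty run as a pass
    return
  fi
  # Integer rate scaled by 100 to keep two decimals (9512 means 95.12%)
  local rate=$(( passed * 10000 / total ))
  if [ "$rate" -gt 9500 ]; then
    echo PASS
  elif [ "$rate" -ge 9000 ]; then
    echo WARN
  else
    echo FAIL
  fi
}

pass_rate_verdict 197 3   # 98.47% of executed tests passed
```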
Layer 4: Performance Regression Baseline (30 minutes)
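Objective: Verify that performance metrics have not degraded beyond their thresholds (10% for most metrics, 5% for memory footprint and garbage collection pauses).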
Run the same workload under both old and new images, comparing startup time, memory usage, request latency, and throughput.
Metrics to Collect
| Metric | Tool | Threshold | Action if Exceeded |
|---|---|---|---|
| Startup time (cold start) | `date` + `curl` timing | <10% increase | Investigate initialization |
| Memory footprint (RSS) | `docker stats` | <5% increase | Check for memory leaks |
| Request latency (p95) | `wrk` | <10% increase | Profile hotspots |
| Throughput (RPS) | `wrk` | <10% decrease | Check for CPU/syscall overhead |
| Garbage collection pause (if applicable) | | <5% increase | Heap size may need tuning |
Performance Benchmark Script
```bash
#!/bin/bash
set -euo pipefail

OLD_IMAGE=${1:?usage: $0 <old-image> <new-image>}
NEW_IMAGE=${2:?usage: $0 <old-image> <new-image>}

echo "📊 Performance baseline comparison: $OLD_IMAGE → $NEW_IMAGE"

# Convert a docker stats memory value (e.g. "256MiB", "1.2GiB") to whole MiB
to_mib() {
  awk -v v="$1" 'BEGIN { n = v + 0; if (v ~ /GiB/) n *= 1024; else if (v ~ /KiB/) n /= 1024; printf "%d", n }'
}

# Run one benchmark. Prints "<startup_ms> <memory_mib>" on stdout;
# progress messages go to stderr so they do not pollute the result.
run_benchmark() {
  local image=$1 label=$2
  local container="perf-test-${label}-$(date +%s)"
  local start end tries=0 mem

  echo "Starting $label..." >&2

  # Cold-start time: from docker run until the health endpoint first returns 200
  start=$(date +%s%N)
  docker run -d --name "$container" -p 8080:8080 --cpus=2 --memory=1g "$image" > /dev/null
  until [ "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/health || true)" = "200" ]; do
    tries=$(( tries + 1 ))
    if [ "$tries" -ge 600 ]; then
      echo "❌ FAIL: $label never became healthy" >&2
      docker rm -f "$container" > /dev/null
      exit 1
    fi
    sleep 0.1
  done
  end=$(date +%s%N)
  local startup_ms=$(( (end - start) / 1000000 ))

  sleep 10  # Warmup

  # Load test: 30 s, 4 threads, 100 connections. Plain wrk has no -R rate
  # flag (that belongs to wrk2); --latency prints the latency distribution.
  wrk -t4 -c100 -d30s --latency http://localhost:8080/ > "/tmp/perf-${label}.txt"

  # Memory after the load test (a point-in-time reading, not a true peak)
  mem=$(to_mib "$(docker stats --no-stream --format '{{.MemUsage}}' "$container" | awk '{print $1}')")

  # Cleanup
  docker rm -f "$container" > /dev/null

  echo "$label: startup=${startup_ms}ms, memory=${mem}MiB" >&2
  echo "$startup_ms $mem"
}

# Run benchmarks
OLD_RESULT=$(run_benchmark "$OLD_IMAGE" OLD)
NEW_RESULT=$(run_benchmark "$NEW_IMAGE" NEW)
read -r OLD_STARTUP OLD_MEMORY <<< "$OLD_RESULT"
read -r NEW_STARTUP NEW_MEMORY <<< "$NEW_RESULT"

# Compare: percentage change relative to the old image (integer arithmetic)
STARTUP_DELTA=$(( (NEW_STARTUP - OLD_STARTUP) * 100 / OLD_STARTUP ))
MEMORY_DELTA=$(( (NEW_MEMORY - OLD_MEMORY) * 100 / OLD_MEMORY ))

echo "📈 Results:"
printf "  Startup: %sms → %sms (%+d%%)\n" "$OLD_STARTUP" "$NEW_STARTUP" "$STARTUP_DELTA"
printf "  Memory:  %sMiB → %sMiB (%+d%%)\n" "$OLD_MEMORY" "$NEW_MEMORY" "$MEMORY_DELTA"

if [ "$STARTUP_DELTA" -gt 10 ] || [ "$MEMORY_DELTA" -gt 10 ]; then
  echo "⚠️ WARN: Metrics degraded >10%"
  exit 1
fi

echo "✅ Performance regression PASSED"
```
Pass/Fail Criteria
- ✅ PASS: All metrics within thresholds.
- ⚠️ WARN: 1–2 metrics exceed threshold by <20%; proceed with caution.
- ❌ FAIL: >2 metrics exceed thresholds or >20% degradation.
- Action on WARN: Document regression, proceed if acceptable; monitor in production.
- Action on FAIL: Investigate root cause; consider version downgrade.
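One way to apply these rules across several metrics is sketched below. It is an interpretation, not a definitive rule set: it reads "exceed threshold by <20%" as "degradation stays under 20%", and the helpers `pct_change` and `perf_verdict` are hypothetical.

```shell
#!/usr/bin/env bash
# Percentage change from old to new, to one decimal place.
pct_change() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f", (new - old) * 100 / old }'
}

# Read "metric degradation_pct threshold_pct" lines from stdin and print
# PASS, WARN, or FAIL. Degradation is positive when the metric got worse
# (for throughput, pass the percentage decrease).
perf_verdict() {
  awk '
    $2 > $3 { exceeded++ }   # metric is over its own threshold
    $2 > 20 { severe++ }     # metric degraded by more than 20%
    END {
      if (exceeded > 2 || severe > 0) print "FAIL"
      else if (exceeded > 0) print "WARN"
      else print "PASS"
    }'
}

# Example: startup is 12% slower (threshold 10), memory is 2% higher (threshold 5)
printf 'startup 12 10\nmemory 2 5\n' | perf_verdict
```

Keeping the threshold next to each metric in the input lets memory and GC pauses use the tighter 5% limit without special cases in the code.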
Layer 5: Security Regression Verification (10 minutes)
Objective: Confirm hardening features (shell-less, read-only FS, non-root) are intact.
CleanStart images ship with security features enabled by default. This layer verifies they haven't been accidentally disabled or bypassed in the new version.
Security Hardening Checklist
```bash
#!/bin/bash
set -uo pipefail

IMAGE=${1:?usage: $0 <image>}

echo "🔒 Security hardening verification for $IMAGE"
docker pull -q "$IMAGE" > /dev/null

# 1. Verify non-root user (read from the image config, so the check does not
#    depend on an id binary existing inside a minimal image)
echo -n "Checking non-root user... "
IMAGE_USER=$(docker image inspect --format '{{.Config.User}}' "$IMAGE")
case "$IMAGE_USER" in
  65532|65532:*) echo "✅" ;;
  *)
    echo "❌ FAIL: Not running as UID 65532 (configured user: '${IMAGE_USER}')"
    exit 1
    ;;
esac

# 2. Verify no shell: launching /bin/sh only succeeds if a shell exists
echo -n "Checking shell-less mode... "
if docker run --rm --entrypoint /bin/sh "$IMAGE" -c 'exit 0' > /dev/null 2>&1; then
  echo "⚠️ WARN: Shell found (not required for all images)"
else
  echo "✅"
fi

# 3. Verify read-only root filesystem
echo -n "Checking read-only root FS... "
if docker run --rm --read-only "$IMAGE" /app/health-check > /dev/null 2>&1; then
  echo "✅"
else
  echo "❌ FAIL: Application requires writable root FS (unexpected)"
  exit 1
fi

# 4. Verify signature
echo -n "Verifying Cosign signature... "
if cosign verify --certificate-identity-regexp='^https://github.com/cleanstart' \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  "$IMAGE" > /dev/null 2>&1; then
  echo "✅"
else
  echo "⚠️ WARN: Signature verification failed (check certificate)"
fi

# 5. Verify SBOM present (CycloneDX layout assumed, as in Layer 1)
echo -n "Verifying SBOM present... "
if cosign download sbom "$IMAGE" 2> /dev/null | jq -e .components > /dev/null 2>&1; then
  echo "✅"
else
  echo "❌ FAIL: SBOM not found"
  exit 1
fi

echo "✅ Security hardening PASSED"
```
Pass/Fail Criteria
- ✅ PASS: All hardening features verified.
- ⚠️ WARN: Signature verification failed (check certificate chain).
- ❌ FAIL: Non-root user missing, read-only FS fails, SBOM missing.
- Action on FAIL: Do not promote to production; escalate to security team.
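The non-root check can also be made on the image configuration alone, which avoids depending on tools inside a minimal image. The sketch below assumes the user is read from the image's `.Config.User` field; `is_nonroot_user` is a hypothetical helper, and it is looser than a strict UID 65532 check because it accepts any non-root user.

```shell
#!/usr/bin/env bash
# Decide whether an image's configured user is non-root.
# Input is the value of .Config.User, for example:
#   ""  "0"  "root"  "65532"  "65532:65532"
is_nonroot_user() {
  local user=${1%%:*}   # drop an optional ":group" suffix
  case "$user" in
    ""|0|root) return 1 ;;   # an empty value means the default user, root
    *)         return 0 ;;
  esac
}

# With Docker available, the value can be read without starting a container:
#   is_nonroot_user "$(docker image inspect --format '{{.Config.User}}' "$IMAGE")"
```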
Layer 6: Soak Testing (24–72 hours)
Objective: Detect subtle issues (memory leaks, thread leaks, connection pool exhaustion) that only appear under sustained load.
Soak testing is mandatory for major version upgrades and distro changes. Run the application under continuous load for 24–72 hours, monitoring for degradation.
Soak Test Setup
```bash
#!/bin/bash
set -uo pipefail

IMAGE=${1:?usage: $0 <image> [duration-hours]}
DURATION_HOURS=${2:-24}
INTERVAL_SECONDS=60
CONTAINER="soak-test-$(date +%s)"

# Convert a docker stats memory value (e.g. "256MiB", "1.2GiB") to whole MiB
to_mib() {
  awk -v v="$1" 'BEGIN { n = v + 0; if (v ~ /GiB/) n *= 1024; else if (v ~ /KiB/) n /= 1024; printf "%d", n }'
}

echo "🔄 Starting ${DURATION_HOURS}-hour soak test for $IMAGE"

# Start container. This script only monitors; generate the sustained load
# separately (for example with wrk).
docker run -d \
  --name "$CONTAINER" \
  -e "SOAK_TEST=true" \
  --cpus=2 \
  --memory=2g \
  -p 8080:8080 \
  "$IMAGE" > /dev/null

sleep 10

# Monitor loop (elapsed time comes from the shell's SECONDS counter)
START=$SECONDS
MAX_SECONDS=$(( DURATION_HOURS * 3600 ))
BASELINE_MB=""

while [ $(( SECONDS - START )) -lt "$MAX_SECONDS" ]; do
  ELAPSED=$(( SECONDS - START ))

  # Collect metrics; MemUsage looks like "256MiB / 2GiB"
  STATS=$(docker stats --no-stream --format '{{.MemUsage}} {{.CPUPerc}}' "$CONTAINER")
  read -r MEMORY _ _ CPU <<< "$STATS"
  CURRENT_MB=$(to_mib "$MEMORY")

  # Check application health
  HEALTH=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health || true)

  # Check for memory growth (should stabilize after warmup)
  if [ -z "$BASELINE_MB" ]; then
    BASELINE_MB=$CURRENT_MB
    echo "Baseline memory: ${BASELINE_MB}MiB"
  else
    GROWTH=$(( (CURRENT_MB - BASELINE_MB) * 100 / BASELINE_MB ))
    # Alert if memory grows >30% (likely leak)
    if [ "$GROWTH" -gt 30 ]; then
      echo "⚠️ Memory leak suspected: grew ${GROWTH}% (${BASELINE_MB}MiB → ${CURRENT_MB}MiB)"
    fi
  fi

  # Log progress
  printf "[%2dh:%02dm] Health: %s | Memory: %s | CPU: %s\n" \
    $(( ELAPSED / 3600 )) $(( (ELAPSED % 3600) / 60 )) "$HEALTH" "$MEMORY" "$CPU"

  # Check health endpoint
  if [ "$HEALTH" != "200" ]; then
    echo "❌ FAIL: Health check returned $HEALTH"
    docker logs "$CONTAINER" 2>&1 | tail -20
    docker rm -f "$CONTAINER" > /dev/null
    exit 1
  fi

  sleep "$INTERVAL_SECONDS"
done

# Final diagnostics
echo ""
echo "📊 Soak test complete. Final diagnostics:"
docker logs "$CONTAINER" 2>&1 | grep -Ei "error|warning" | tail -20

# Cleanup
docker rm -f "$CONTAINER" > /dev/null

echo "✅ Soak test PASSED"
```
Soak Test Metrics to Monitor
| Metric | Check Frequency | Threshold | Action |
|---|---|---|---|
| Memory (RSS) | Every 60s | <30% growth from baseline | Stop test if >50% growth |
| CPU utilization | Every 60s | <80% sustained | Reduce load if exceeding |
| File descriptor count | Every 300s | <1024 open | Stop test if approaching limit |
| Connection pool size | Every 300s | Stable/bounded | Investigate if growing unbounded |
| Health check success rate | Every 60s | >99% | Stop test on failure |
| Application errors | Every 60s | <1 error/minute | Escalate if error rate increasing |
Pass/Fail Criteria
- ✅ PASS: All metrics stable, no memory leaks, 100% health check success.
- ⚠️ WARN: Memory growth <20%, occasional (1–2) health check failures, recoverable.
- ❌ FAIL: Memory growth >30%, sustained health check failures, connection leaks.
- Action on FAIL: Stop test, review logs, consider rolling back to previous version.
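The memory part of these criteria can be evaluated from the samples a soak run records. The following sketch makes two assumptions that the criteria leave open: growth of up to 5% counts as "stable", and the 20–30% band is treated as WARN. The helper names `to_mib` and `soak_memory_verdict` are hypothetical, and the input is one `docker stats` style memory value per line.

```shell
#!/usr/bin/env bash
# Convert a docker stats memory value such as "256MiB" or "1.5GiB" to MiB.
to_mib() {
  awk -v v="$1" 'BEGIN {
    n = v + 0                       # numeric prefix of the value
    if (v ~ /GiB/) n *= 1024
    else if (v ~ /KiB/) n /= 1024
    printf "%d", n
  }'
}

# Read memory samples from stdin; the first sample is the baseline.
# Prints a verdict and the largest growth seen relative to the baseline.
soak_memory_verdict() {
  local sample mib baseline="" max_growth=0 growth
  while read -r sample; do
    mib=$(to_mib "$sample")
    if [ -z "$baseline" ]; then
      baseline=$mib
      continue
    fi
    growth=$(( (mib - baseline) * 100 / baseline ))
    if [ "$growth" -gt "$max_growth" ]; then max_growth=$growth; fi
  done
  if [ "$max_growth" -gt 30 ]; then
    echo "FAIL ${max_growth}%"
  elif [ "$max_growth" -gt 5 ]; then   # assumed tolerance for "stable"
    echo "WARN ${max_growth}%"
  else
    echo "PASS ${max_growth}%"
  fi
}

printf '200MiB\n204MiB\n202MiB\n' | soak_memory_verdict
```

Taking the baseline after warmup rather than at container start avoids flagging normal heap growth during initialization as a leak.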
Automation Examples: CI/CD Integration
GitHub Actions Workflow
```yaml
name: Image Regression Test

on:
  workflow_dispatch:
    inputs:
      old_image:
        description: 'Previous image tag (for comparison)'
        required: true
        default: 'registry.cleanstart.com/py:3.12.1'
      new_image:
        description: 'New image tag to test'
        required: true
        default: 'registry.cleanstart.com/py:3.12.2'
      soak_hours:
        description: 'Soak test duration (0 = skip)'
        required: false
        default: '0'

jobs:
  regression-test:
    runs-on: ubuntu-latest
    timeout-minutes: 120
    # Pass inputs through the environment instead of interpolating them
    # directly into shell scripts
    env:
      OLD_IMAGE: ${{ inputs.old_image }}
      NEW_IMAGE: ${{ inputs.new_image }}
      SOAK_HOURS: ${{ inputs.soak_hours }}

    steps:
      - name: Checkout tests
        uses: actions/checkout@v4

      - name: Layer 1 - Diff Analysis
        run: ./scripts/layer1-diff-analysis.sh "$OLD_IMAGE" "$NEW_IMAGE"

      - name: Layer 2 - Smoke Tests
        run: ./scripts/layer2-smoke-tests.sh "$NEW_IMAGE"

      - name: Layer 3 - Functional Regression
        run: npm test -- --coverage
        env:
          TEST_IMAGE: ${{ inputs.new_image }}

      - name: Layer 4 - Performance Baseline
        run: ./scripts/layer4-perf-baseline.sh "$OLD_IMAGE" "$NEW_IMAGE"

      - name: Layer 5 - Security Hardening
        run: ./scripts/layer5-security-check.sh "$NEW_IMAGE"

      - name: Layer 6 - Soak Test (if requested)
        if: ${{ inputs.soak_hours != '0' }}
        # Expressions cannot do arithmetic, so a timeout cannot be derived
        # from soak_hours here. The job timeout above also applies, and
        # GitHub-hosted runners cap a job at 6 hours; run long soaks on a
        # self-hosted runner with a larger job timeout.
        run: ./scripts/layer6-soak-test.sh "$NEW_IMAGE" "$SOAK_HOURS"

      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: regression-results
          path: ./test-results/

      - name: Comment on PR
        # workflow_dispatch runs have no pull request; this step only applies
        # if a pull_request trigger is added to the workflow
        if: always() && github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const results = JSON.parse(fs.readFileSync('./test-results/summary.json', 'utf8'));
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Image Regression Test Results\n\n${results.summary}`
            });
```
GitLab CI Pipeline
```yaml
default:
  image: docker:latest
  # Every layer script calls docker, so all jobs get the dind service
  services:
    - docker:dind
  # The docker image is Alpine-based and lacks the tools the scripts use
  before_script:
    - apk add --no-cache bash curl jq

stages:
  - diff-analysis
  - smoke-test
  - functional-test
  - performance-test
  - security-test
  - soak-test

# Defaults; override them when triggering the pipeline. GitLab does not
# expand shell-style ${VAR:-default} expressions in the variables block.
variables:
  OLD_IMAGE: "registry.cleanstart.com/py:3.12.1"
  NEW_IMAGE: "registry.cleanstart.com/py:3.12.2"
  SOAK_HOURS: "0"

layer1-diff:
  stage: diff-analysis
  script:
    - ./scripts/layer1-diff-analysis.sh "$OLD_IMAGE" "$NEW_IMAGE"
  artifacts:
    reports:
      junit: results/diff-report.xml

layer2-smoke:
  stage: smoke-test
  script:
    - ./scripts/layer2-smoke-tests.sh "$NEW_IMAGE"
  artifacts:
    reports:
      junit: results/smoke-report.xml

layer3-functional:
  stage: functional-test
  script:
    - docker run --name functional "$NEW_IMAGE" npm test -- --reporter=junit --outputFile=results/functional.xml
  # The report is written inside the container, so copy it out afterwards
  # (assumes the image's working directory is /app)
  after_script:
    - docker cp functional:/app/results/. results/
  artifacts:
    when: always
    reports:
      junit: results/functional.xml

layer4-perf:
  stage: performance-test
  script:
    - ./scripts/layer4-perf-baseline.sh "$OLD_IMAGE" "$NEW_IMAGE"
  artifacts:
    paths:
      - results/perf-*.json

layer5-security:
  stage: security-test
  script:
    - ./scripts/layer5-security-check.sh "$NEW_IMAGE"
  artifacts:
    reports:
      junit: results/security-report.xml

layer6-soak:
  stage: soak-test
  script:
    - ./scripts/layer6-soak-test.sh "$NEW_IMAGE" "$SOAK_HOURS"
  artifacts:
    paths:
      - results/soak-*.json
  only:
    - schedules
  timeout: 72h
```
Go/No-Go Decision Matrix
Use this matrix to make promotion decisions at each testing stage.
Single-Stage Decision Matrix
| Layer | ✅ GO | ⚠️ PROCEED WITH CAUTION | ❌ NO-GO |
|---|---|---|---|
| 1: Diff | Fewer vulns | Same vulns | More critical/high vulns |
| 2: Smoke | All pass | 1 timeout | Any hard errors |
| 3: Functional | >99% pass | 95–99% pass (known failures) | <95% pass |
| 4: Perf | <5% regression | 5–10% regression | >10% regression |
| 5: Security | All pass | Signature warn | Non-root/SBOM fail |
| 6: Soak | Stable | <20% memory growth | >30% growth or health failures |
Full Promotion Criteria
For patch upgrades (3.12.1 → 3.12.2): ✅ GO if: Layer 2 (smoke) + Layer 5 (security) both pass.
For minor upgrades (3.12 → 3.13): ✅ GO if: Layers 2 + 3 + 4 + 5 all pass.
For major upgrades (Python 3 → 4): ✅ GO if: Layers 2 + 3 + 4 + 5 + 6 all pass, soak ≥24 hours.
For hotfix releases: ✅ GO if: Layer 2 (smoke) + 5 (security) pass + specific hotfix issue resolved.
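The promotion rules above reduce to "every required layer for this upgrade type passed". A minimal sketch of that check follows; `required_layers` and `promotion_decision` are hypothetical helpers. It does not model the two extra conditions in the criteria (soak duration of at least 24 hours for major upgrades, and confirmation that the hotfix issue is resolved), which still need a separate check.

```shell
#!/usr/bin/env bash
# Required layers per upgrade type, taken from the promotion criteria.
required_layers() {
  case "$1" in
    patch|hotfix) echo "2 5" ;;
    minor)        echo "2 3 4 5" ;;
    major)        echo "2 3 4 5 6" ;;
    *)            return 1 ;;
  esac
}

# Usage: promotion_decision <upgrade-type> <passed-layer>...
# Prints GO when every required layer is in the passed list, else NO-GO.
promotion_decision() {
  local upgrade=$1 layer required
  shift
  required=$(required_layers "$upgrade") || {
    echo "NO-GO (unknown upgrade type)"
    return 1
  }
  for layer in $required; do
    case " $* " in
      *" $layer "*) ;;   # this required layer passed
      *) echo "NO-GO (Layer $layer missing or failed)"; return 1 ;;
    esac
  done
  echo GO
}

promotion_decision minor 2 3 4 5
```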
Rollback Procedure
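If regression testing fails before promotion, do not promote the new image; nothing needs to be rolled back. If a regression surfaces after the new image is already deployed, use this procedure to restore the previous image version.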
Immediate Rollback (Production)
```bash
# 1. Identify running pods and the image each one uses
kubectl get pods -l app=myapp \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'

# 2. Redeploy previous image version
kubectl set image deployment/myapp \
  app=registry.cleanstart.com/py:3.12.1@sha256:abc123...

# 3. Monitor rollout
kubectl rollout status deployment/myapp --timeout=5m

# 4. Verify traffic is routing correctly
curl https://api.example.com/health

# 5. Capture logs for diagnostics
kubectl logs -l app=myapp --tail=1000 > rollback-logs.txt
```
Root Cause Analysis (Post-Rollback)
- Review test logs: What layer failed and why?
- Check image metadata: Did SBOM/signature change unexpectedly?
- Inspect error patterns: Is it a library incompatibility or application bug?
- Contact image maintainer: If the new image is the issue, file a bug report with test results
- Document findings: Update your regression test suite to catch this failure in the future
Prevention for Future Upgrades
```bash
# Add custom regression test for this specific issue
cat > tests/regression-py312-issue.test.js << 'EOF'
describe('Python 3.12 specific regression', () => {
  it('should handle X without error', async () => {
    // Test that specifically covers the failed scenario
  });
});
EOF

# Ensure test runs as part of Layer 3 functional suite
git add tests/regression-py312-issue.test.js
git commit -m "Add regression test for Python 3.12 issue"
```
What to Read Next
- Performance Baseline Testing: Detailed guide for establishing performance metrics.
- Security Hardening Reference: Deep-dive into shell-less, read-only, non-root architecture.
- Image Upgrade Checklist: Step-by-step promotion from staging to production.
- Troubleshooting: Regression Test Failures: Solutions for common test failures.
