This reference catalogs all CleanStart container images designed for AI and machine learning workloads. Each image is built from verified source, ships with SBOM and SLSA attestations, and is available in standard and FIPS variants.
Image Naming Convention
CleanStart container images follow a consistent naming pattern for predictable identification and deployment:
Pattern: registry.cleanstart.com/{framework}:{version}-{variant}-{arch}
The component breakdown includes {framework} representing the base framework or runtime (pytorch, tensorflow, ollama, vllm, etc.), {version} representing the framework or application version (e.g., 2.2, 0.3, 1.4), {variant} representing the build configuration (prod, dev, fips), and {arch} representing the CPU architecture (amd64, arm64).
Examples of this naming convention include registry.cleanstart.com/pytorch:2.2-prod-amd64, registry.cleanstart.com/ollama:0.3-fips-arm64, and registry.cleanstart.com/vllm:0.4-dev-amd64.
Variant definitions clarify what each option includes. The prod variant provides a minimal runtime footprint that is production-ready with security patches only. The dev variant includes build tools, compilers, and development headers. The fips variant includes FIPS 140-3 certified cryptographic modules with validated binaries.
Architecture support covers amd64 for x86-64 processors (Intel, AMD) and arm64 for ARM-based processors (Apple Silicon, AWS Graviton, Ampere).
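The naming pattern above is regular enough to parse and validate mechanically. The following is a minimal sketch, assuming the variant and architecture sets are exactly those listed in this section; the `ImageRef` class and `parse_image_ref` function are hypothetical helpers for illustration, not part of any CleanStart tooling.

```python
import re
from dataclasses import dataclass

REGISTRY = "registry.cleanstart.com"
VARIANTS = {"prod", "dev", "fips"}   # build configurations listed above
ARCHES = {"amd64", "arm64"}          # CPU architectures listed above


@dataclass(frozen=True)
class ImageRef:
    """Hypothetical container for the four components of an image reference."""
    framework: str
    version: str
    variant: str
    arch: str

    def __str__(self) -> str:
        return f"{REGISTRY}/{self.framework}:{self.version}-{self.variant}-{self.arch}"


# {framework}:{version}-{variant}-{arch}, anchored to the CleanStart registry.
_REF_RE = re.compile(
    r"^registry\.cleanstart\.com/(?P<framework>[a-z0-9._-]+):"
    r"(?P<version>[0-9][0-9.]*)-(?P<variant>[a-z]+)-(?P<arch>[a-z0-9]+)$"
)


def parse_image_ref(ref: str) -> ImageRef:
    """Split a reference into its components, rejecting unknown variants or arches."""
    match = _REF_RE.match(ref)
    if match is None:
        raise ValueError(f"not a CleanStart image reference: {ref!r}")
    parsed = ImageRef(**match.groupdict())
    if parsed.variant not in VARIANTS:
        raise ValueError(f"unknown variant: {parsed.variant!r}")
    if parsed.arch not in ARCHES:
        raise ValueError(f"unknown architecture: {parsed.arch!r}")
    return parsed
```

Calling `parse_image_ref("registry.cleanstart.com/ollama:0.3-fips-arm64")` yields the components separately, and `str()` on the result rebuilds the original reference.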
Framework Images
PyTorch
Production deep learning framework for training and inference workloads.
| Tag | Size | GPU Support | Included | CUDA Version | Use Case |
|---|---|---|---|---|---|
| | 3.2 GB | Yes | PyTorch, torchvision, torchaudio | 12.1 | Training, inference |
| | 5.1 GB | Yes | + build tools, gcc, g++, cmake | 12.1 | Model development |
| | 3.0 GB | Yes | PyTorch, torchvision, torchaudio | 11.8 | Legacy support |
| | 3.4 GB | Yes | + FIPS crypto, validated libs | 11.8 | Compliance-required |
PyTorch images include cuDNN 8.9, NCCL 2.18, and Flash Attention optimizations for A100/H100 GPUs.
TensorFlow
End-to-end ML platform for classification, regression, and structured data.
| Tag | Size | GPU Support | Included | CUDA Version | Use Case |
|---|---|---|---|---|---|
| | 3.4 GB | Yes | TensorFlow, Keras | 12.1 | Production inference |
| | 6.2 GB | Yes | + TensorFlow source, build deps | 12.1 | Model training |
| | 3.2 GB | Yes | TensorFlow, Keras | 11.8 | Legacy support |
| | 3.7 GB | Yes | + FIPS crypto modules | 12.1 | Compliance-required |
Includes TensorFlow Lite runtime, TensorFlow Serving compatibility, and ONNX converter.
ONNX Runtime
Cross-platform inference engine optimized for deployment and mobile.
| Tag | Size | GPU Support | Included | CUDA Version | Use Case |
|---|---|---|---|---|---|
| | 1.8 GB | Yes | ONNX Runtime, CPU/GPU kernels | 12.1 | Fast inference |
| | 2.4 GB | Yes | + ONNX Tools, model converters | 12.1 | Model optimization |
| | 2.0 GB | Yes | + FIPS crypto | 12.1 | Compliance-required |
ONNX Runtime includes quantization support, graph optimization, and TensorRT integration for NVIDIA GPUs.
JAX
Research-focused framework for differentiable computing and large-scale parallelism.
| Tag | Size | GPU Support | Included | CUDA Version | Use Case |
|---|---|---|---|---|---|
| | 2.1 GB | Yes | JAX, jax.numpy, jax.scipy | 12.1 | Research, training |
| | 4.3 GB | Yes | + NumPy, SciPy, IPython | 12.1 | Development |
| | 2.3 GB | Yes | + FIPS crypto | 12.1 | Compliance-required |
JAX images include the XLA compiler, multi-GPU support via jax.distributed, and the Flax neural network library.
Model Runner Images
Purpose-built images for serving and running inference on pre-trained models.
Ollama
Local model inference with REST API server. Optimized for consumer hardware and edge deployment.
| Tag | Size | GPU Support | Included | Use Case |
|---|---|---|---|---|
| | 1.2 GB | Yes | Ollama runtime, API server | Local inference |
| | 2.1 GB | Yes | + model tools, debugger | Model development |
| | 1.4 GB | Yes | + FIPS crypto | Compliance-required |
Supports Llama 2, Mistral, Phi, and 500+ community models. Automatic model download and GPU acceleration on NVIDIA/AMD.
vLLM
High-throughput LLM inference engine with tensor parallelism and paged attention.
| Tag | Size | GPU Support | Included | Use Case |
|---|---|---|---|---|
| | 4.1 GB | Yes | vLLM, CUDA kernels, paged attn | Multi-GPU inference |
| | 5.8 GB | Yes | + build tools, profiler | Optimization, tuning |
| | 3.9 GB | Yes | vLLM (legacy) | Production (stable) |
| | 4.3 GB | Yes | + FIPS crypto | Compliance-required |
Enables 10-40x higher throughput than standard inference. Supports HuggingFace models, GPTQ quantization, and LoRA adapters.
Text Generation Inference (TGI)
HuggingFace's optimized serving runtime for large language models and encoder models.
| Tag | Size | GPU Support | Included | Use Case |
|---|---|---|---|---|
| | 3.6 GB | Yes | TGI server, flash-attention | Multi-GPU LLM serving |
| | 4.9 GB | Yes | + HF transformers source | Model customization |
| | 3.9 GB | Yes | + FIPS crypto | Compliance-required |
Includes dynamic batching, token streaming, automatic quantization, and bitsandbytes integration for 8-bit and 4-bit inference.
llama.cpp
CPU-optimized inference for Llama and compatible models. No GPU required.
| Tag | Size | GPU Support | Included | Use Case |
|---|---|---|---|---|
| | 680 MB | Optional | llama.cpp binary, ggml libs | CPU inference |
| | 1.2 GB | Optional | + source, build tools | Custom builds |
| | 750 MB | Optional | + FIPS crypto | Compliance-required |
Runs GGUF-quantized models on CPUs with 4-bit and 8-bit quantization support. Memory-efficient for edge and resource-constrained environments.
Triton Inference Server
Multi-framework model serving with ensemble pipelines and auto-scaling.
| Tag | Size | GPU Support | Included | Use Case |
|---|---|---|---|---|
| | 4.8 GB | Yes | Triton server, TensorRT | Multi-model serving |
| | 6.1 GB | Yes | + source, examples | Server customization |
| | 5.1 GB | Yes | + FIPS crypto | Compliance-required |
Supports TensorFlow, PyTorch, ONNX, and custom backends. Includes metrics export, model versioning, and dynamic batching.
GPU Support Matrix
Verified GPU compatibility across CleanStart AI images.
| Image | H100 | A100 | L40S | A10 | T4 | CPU Only | CUDA Version | cuDNN |
|---|---|---|---|---|---|---|---|---|
| PyTorch 2.2 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 12.1 | 8.9 |
| PyTorch 2.1 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 11.8 | 8.9 |
| TensorFlow 2.15 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 12.1 | 8.8 |
| TensorFlow 2.14 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 11.8 | 8.8 |
| ONNX 1.17 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 12.1 | — |
| JAX 0.4 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 12.1 | — |
| vLLM 0.4 | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | 12.1 | — |
| TGI 1.4 | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | 12.1 | 8.9 |
| Ollama 0.3 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 11.8 | — |
| llama.cpp | — | — | — | — | — | ✓ | — | — |
| Triton 2.42 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 12.1 | 8.9 |
H100 and A100 GPUs support dynamic tensor parallelism and multi-GPU scaling. L40S, A10, and T4 GPUs are optimized for single-GPU inference.
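When choosing images for a cluster pinned to one CUDA version, or for nodes without GPUs, the matrix can be treated as a lookup table. The sketch below transcribes the CUDA, cuDNN, and CPU-only columns from the matrix above; the `MATRIX` dictionary and both helper functions are hypothetical names used only for illustration.

```python
from typing import Optional

# image -> (CUDA version, cuDNN version, runs CPU-only), copied from the matrix above.
# None stands for a "—" entry in the table.
MATRIX: dict[str, tuple[Optional[str], Optional[str], bool]] = {
    "PyTorch 2.2": ("12.1", "8.9", True),
    "PyTorch 2.1": ("11.8", "8.9", True),
    "TensorFlow 2.15": ("12.1", "8.8", True),
    "TensorFlow 2.14": ("11.8", "8.8", True),
    "ONNX 1.17": ("12.1", None, True),
    "JAX 0.4": ("12.1", None, True),
    "vLLM 0.4": ("12.1", None, False),
    "TGI 1.4": ("12.1", "8.9", False),
    "Ollama 0.3": ("11.8", None, True),
    "llama.cpp": (None, None, True),
    "Triton 2.42": ("12.1", "8.9", True),
}


def images_for_cuda(cuda_version: str) -> list[str]:
    """Return the images built against the given CUDA version, sorted by name."""
    return sorted(name for name, (cuda, _, _) in MATRIX.items() if cuda == cuda_version)


def gpu_required_images() -> list[str]:
    """Return the images that the matrix marks as unable to run CPU-only."""
    return sorted(name for name, (_, _, cpu_ok) in MATRIX.items() if not cpu_ok)
```

For example, `images_for_cuda("11.8")` lists the images that still target CUDA 11.8, which is useful when planning around the end of CUDA 11.8 support.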
Image Size Comparison
CleanStart prod images reduce deployment footprint by 21% to 35% compared to standard upstream images.
| Framework | Standard Size | CleanStart Prod | CleanStart Dev | Reduction |
|---|---|---|---|---|
| PyTorch | 4.8 GB | 3.2 GB | 5.1 GB | 33% |
| TensorFlow | 5.2 GB | 3.4 GB | 6.2 GB | 35% |
| ONNX Runtime | 2.6 GB | 1.8 GB | 2.4 GB | 31% |
| JAX | 3.1 GB | 2.1 GB | 4.3 GB | 32% |
| vLLM | 5.3 GB | 4.1 GB | 5.8 GB | 23% |
| TGI | 4.9 GB | 3.6 GB | 4.9 GB | 27% |
| Ollama | 1.8 GB | 1.2 GB | 2.1 GB | 33% |
| Triton | 6.1 GB | 4.8 GB | 6.1 GB | 21% |
Size reduction is achieved through layer deduplication, removal of documentation and examples, and a hardened base image. Dev variants include full build chains at the cost of size.
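The Reduction column appears to be the prod image's saving relative to the standard image, rounded to the nearest percent. The short sketch below recomputes it from the sizes in the table under that assumption; `SIZES_GB` and `reduction_percent` are illustrative names.

```python
# framework -> (standard size, CleanStart prod size) in GB, from the table above.
SIZES_GB = {
    "PyTorch": (4.8, 3.2),
    "TensorFlow": (5.2, 3.4),
    "ONNX Runtime": (2.6, 1.8),
    "JAX": (3.1, 2.1),
    "vLLM": (5.3, 4.1),
    "TGI": (4.9, 3.6),
    "Ollama": (1.8, 1.2),
    "Triton": (6.1, 4.8),
}


def reduction_percent(standard_gb: float, prod_gb: float) -> int:
    """Saving of the prod image relative to the standard image, in whole percent."""
    return round((standard_gb - prod_gb) / standard_gb * 100)


if __name__ == "__main__":
    for framework, (standard, prod) in SIZES_GB.items():
        print(f"{framework}: {reduction_percent(standard, prod)}%")
```

For PyTorch this is (4.8 - 3.2) / 4.8, or about 33%, which matches the table.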
Security Artifacts Per Image
Every CleanStart AI image includes cryptographic proof of origin, composition, and known vulnerabilities.
Every release ships with five artifacts. The first is an SPDX 3.0 SBOM—a complete software bill of materials in SPDX JSON format—which can be downloaded with cosign download sbom registry.cleanstart.com/pytorch:2.2-prod-amd64 --output-file sbom.spdx.json. The second is a CycloneDX 1.4 SBOM, an alternative SBOM format for tool compatibility, available via cosign download sbom registry.cleanstart.com/pytorch:2.2-prod-amd64 --predicate-type cyclonedx --output-file sbom.xml.
Third is the SLSA Level 4 Provenance, the complete build history with source verification, obtained by cosign download attestation registry.cleanstart.com/pytorch:2.2-prod-amd64 --predicate-type slsaprovenance > provenance.json. Fourth is the Cosign Signature, a cryptographic signature verifying image authenticity, which can be verified with cosign verify registry.cleanstart.com/pytorch:2.2-prod-amd64 --certificate-identity-regexp "https://github.com/cleanstart" --certificate-oidc-issuer https://token.actions.githubusercontent.com. Finally, there is a VEX Document containing known-not-affected CVEs with justification, downloadable via cosign download attestation registry.cleanstart.com/pytorch:2.2-prod-amd64 --predicate-type vex > vex.json.
The verification workflow has two steps. First, verify image integrity and signature with cosign verify registry.cleanstart.com/pytorch:2.2-prod-amd64, passing the --certificate-identity-regexp and --certificate-oidc-issuer flags shown above. Then pull the SBOM and scan its dependencies with cosign download sbom registry.cleanstart.com/pytorch:2.2-prod-amd64 | grype -.
All artifacts are stored in the image registry alongside the container image. Verification keys are published at https://registry.cleanstart.com/.well-known/cosign.pub.
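In a CI pipeline it can help to assemble these cosign invocations in one place so that every image is checked with the same identity constraints. The sketch below only builds argument lists from the commands quoted in this section and runs them with `subprocess`; it assumes cosign is on the PATH, and the function names are hypothetical.

```python
import shlex
import subprocess

# Identity constraints copied from the cosign verify command quoted above.
IDENTITY_REGEXP = "https://github.com/cleanstart"
OIDC_ISSUER = "https://token.actions.githubusercontent.com"


def verify_cmd(image: str) -> list[str]:
    """Argument list for keyless signature verification of one image."""
    return [
        "cosign", "verify", image,
        "--certificate-identity-regexp", IDENTITY_REGEXP,
        "--certificate-oidc-issuer", OIDC_ISSUER,
    ]


def attestation_cmd(image: str, predicate_type: str) -> list[str]:
    """Argument list for downloading one attestation (for example slsaprovenance)."""
    return ["cosign", "download", "attestation", image,
            "--predicate-type", predicate_type]


def run(cmd: list[str]) -> str:
    """Run a command, raising CalledProcessError on failure, and return its stdout."""
    print("+", shlex.join(cmd))
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


if __name__ == "__main__":
    image = "registry.cleanstart.com/pytorch:2.2-prod-amd64"
    run(verify_cmd(image))                                  # fails fast if unsigned
    provenance = run(attestation_cmd(image, "slsaprovenance"))
    with open("provenance.json", "w", encoding="utf-8") as fh:
        fh.write(provenance)
```

Running verification first means attestations are only fetched for images whose signature already checks out.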
Helm Chart Values for AI Images
Deploy CleanStart AI images using Helm charts. Example values for common scenarios.
PyTorch training job:
    image:
      repository: registry.cleanstart.com/pytorch
      tag: "2.2-prod-amd64"
      pullPolicy: IfNotPresent
    resources:
      limits:
        nvidia.com/gpu: "2"
      requests:
        memory: "16Gi"
        cpu: "8"
    gpu:
      enabled: true
      type: "a100"
      count: 2

vLLM inference server:

    image:
      repository: registry.cleanstart.com/vllm
      tag: "0.4-prod-amd64"
    replicaCount: 2
    resources:
      limits:
        nvidia.com/gpu: "4"
      requests:
        memory: "24Gi"
        cpu: "16"
    service:
      port: 8000
    env:
      VLLM_TENSOR_PARALLEL_SIZE: "4"

Ollama inference:

    image:
      repository: registry.cleanstart.com/ollama
      tag: "0.3-prod-amd64"
    resources:
      limits:
        nvidia.com/gpu: "1"
      requests:
        memory: "8Gi"
        cpu: "4"
    persistence:
      enabled: true
      size: "50Gi"
      mountPath: /root/.ollama

TensorFlow serving:

    image:
      repository: registry.cleanstart.com/tensorflow
      tag: "2.15-prod-amd64"
    config:
      model_server_args:
        - "--model_config_file=/models/models.config"
        - "--enable_batching"
    volumes:
      - name: models
        persistentVolumeClaim:
          claimName: tf-models

The full chart repository is at https://helm.cleanstart.dev. Install with helm repo add cleanstart https://helm.cleanstart.dev && helm install ai-stack cleanstart/ai-stack.
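In the vLLM values above, the GPU limit and VLLM_TENSOR_PARALLEL_SIZE are both 4. Assuming the two are meant to stay equal, generating the values from a single GPU count avoids them drifting apart. This is a minimal sketch: `vllm_values` is a hypothetical helper, the keys mirror the example above, and the output is JSON, which YAML parsers also accept.

```python
import json


def vllm_values(gpus: int, tag: str = "0.4-prod-amd64") -> dict:
    """Build a values mapping whose GPU limit and tensor-parallel size agree."""
    if gpus < 1:
        raise ValueError("vLLM needs at least one GPU")
    return {
        "image": {"repository": "registry.cleanstart.com/vllm", "tag": tag},
        "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
        "service": {"port": 8000},
        # Assumption: one tensor-parallel shard per requested GPU.
        "env": {"VLLM_TENSOR_PARALLEL_SIZE": str(gpus)},
    }


if __name__ == "__main__":
    # Write the result to a file and pass it to helm with -f.
    print(json.dumps(vllm_values(4), indent=2))
```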
Bitnami Compatibility for AI Infrastructure
CleanStart AI images work seamlessly with Bitnami charts for supporting services.
Redis for inference caching uses Bitnami chart version 17.11+ with Redis 7.0+ and module support. It caches LLM completions, embedding results, and model metadata, and is compatible with vLLM, TGI, Ollama, and Triton.
PostgreSQL for model metadata uses Bitnami chart version 12.1+ with PostgreSQL 14+ and the pgvector extension. It stores model configs, experiment metadata, and inference logs, and is compatible with all AI images.
NGINX for API gateway uses Bitnami chart version 13.1+ with NGINX 1.24+ and the stream module. It provides load balancing of inference servers, SSL termination, and request routing, and is compatible with vLLM, TGI, Ollama, and Triton.
Example stack values.yaml:
    redis:
      enabled: true
      replica:
        replicaCount: 2
        persistence:
          size: "20Gi"
    postgresql:
      enabled: true
      auth:
        postgresPassword: "secure-password"
      primary:
        persistence:
          size: "50Gi"
      extensions:
        pgvector: true
    nginx:
      enabled: true
      replicaCount: 2
      resources:
        limits:
          memory: "512Mi"
          cpu: "500m"

Bitnami charts use CleanStart hardened base images by default when available.
Version Support Policy
CleanStart maintains security patches and CUDA compatibility guarantees for each AI image version.
The support lifecycle follows three phases. The Active phase lasts 18 months from release and includes security patches, CUDA updates, and dependency upgrades. The Maintenance phase follows for 12 months after the active phase and includes critical security patches only, with no new features. Finally, Deprecated images are removed from the registry with no support.
Framework versions include PyTorch 2.2 active until October 2026, PyTorch 2.1 in maintenance until April 2026, TensorFlow 2.15 active until November 2026, TensorFlow 2.14 in maintenance until May 2026, ONNX 1.17 active until January 2027, JAX 0.4 active until September 2026, vLLM 0.4 active until September 2026, TGI 1.4 active until April 2027, Ollama 0.3 active until March 2027, and Triton 2.42 active until June 2027.
CUDA compatibility is guaranteed through December 2027 for CUDA 12.1 (NVIDIA LTS), through June 2026 for CUDA 11.8, with automatic updates to latest point releases (12.1.0 → 12.1.1, etc.).
Under the deprecation timeline, deprecation is announced 6 months before end-of-life, a final patch release ships 3 months before removal, and images are removed from the registry 12 months after the last patch.
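The phase boundaries described in this section are simple month arithmetic: 18 months of active support from release, then 12 months of maintenance. The sketch below works at month granularity and assumes each phase ends at the start of its end month. The release month in the usage example is hypothetical, chosen so that the active phase ends in October 2026 like the PyTorch 2.2 entry.

```python
from datetime import date

ACTIVE_MONTHS = 18        # active phase length, from the policy above
MAINTENANCE_MONTHS = 12   # maintenance phase length, from the policy above


def add_months(month_start: date, months: int) -> date:
    """Return the first day of the month that is `months` after month_start's month."""
    index = month_start.year * 12 + (month_start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def support_phase(release: date, today: date) -> str:
    """Classify an image version as active, maintenance, or deprecated."""
    active_end = add_months(release, ACTIVE_MONTHS)
    maintenance_end = add_months(active_end, MAINTENANCE_MONTHS)
    if today < active_end:
        return "active"
    if today < maintenance_end:
        return "maintenance"
    return "deprecated"


if __name__ == "__main__":
    hypothetical_release = date(2025, 4, 1)  # assumed, not stated in this document
    print(add_months(hypothetical_release, ACTIVE_MONTHS))
    print(support_phase(hypothetical_release, date.today()))
```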
Related Documentation
AI/ML Container Stack Explained — Architecture and design rationale. Deploying AI Containers to Production — End-to-end deployment guide. Helm Chart Reference — Chart configuration and values. Image Catalog — All CleanStart images including non-AI workloads.
