What a Container Registry Does: Core Responsibilities
A container registry is a service that stores, serves, and manages container images. It is the central repository where images are pushed after building and pulled before running.
A registry has six core responsibilities. It persists image layers, configurations, and manifests; layers are content-addressed by SHA256 digest, so a layer shared by several images is stored only once. It serves manifests and layers in response to pulls, and because clients skip layers they already hold, a pull transfers only the layers that changed. It authenticates users through mechanisms such as username/password, API tokens, or cloud IAM. It maps tags (like myapp:v1.0, myapp:latest) to specific SHA256 digests; tags are mutable pointers, while digests are immutable. It exposes image metadata: configuration, manifests describing layer ordering, and annotations. Finally, it authorizes each operation, typically through role-based access control (RBAC).
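The storage and tag-mapping behavior described above can be illustrated with a small in-memory sketch. This is a toy model, not how any particular registry is implemented; the class name BlobStore and its methods are hypothetical names chosen for the example.

```python
import hashlib


class BlobStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes
        self.tags = {}   # "repo:tag" -> manifest digest (a mutable pointer)

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        # Storing identical bytes a second time is a no-op: this is the deduplication.
        self.blobs.setdefault(digest, data)
        return digest

    def tag(self, name: str, digest: str) -> None:
        if digest not in self.blobs:
            raise KeyError(f"unknown digest: {digest}")
        self.tags[name] = digest

    def resolve(self, name: str) -> str:
        return self.tags[name]


store = BlobStore()
base = store.put(b"shared base layer")
same = store.put(b"shared base layer")  # a second image reusing the same layer
v1 = store.put(b"manifest for v1.0")
v2 = store.put(b"manifest for v1.1")

store.tag("myapp:latest", v1)
store.tag("myapp:latest", v2)  # the tag moves; both digests stay valid and unchanged
```

Because the key is derived from the content, pushing the same layer twice yields the same digest and a single stored copy, while the tag is simply a name that can be repointed.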
Most modern registries implement the OCI Distribution Specification, a standardized protocol for image transfer. This means the same client tooling works against any compliant registry, and an image pulled from one registry can be pushed to another unchanged.
Registry Architecture: The Two-Component Model
Modern registries follow a two-component architecture that explains their behavior and performance characteristics.
The API Server (stateless) handles authentication requests, serves manifests and layer data, responds to registry API calls, can be load-balanced and replicated, and usually listens on HTTPS port 443 in production (port 5000 is a common default in development).
The Storage Backend (stateful) stores image layers and configurations, uses a local filesystem, cloud object storage (S3, GCS, Azure Blob), or NFS, provides a single source of truth for image data, and is replicated and backed up for durability.
API servers are stateless, making them horizontally scalable. Storage is the bottleneck: it must be durable, accessible from all API servers, and performant. Push/pull speed depends on storage backend speed. Multiple registry instances can serve the same storage backend. Storage availability is critical—if storage is down, the entire registry is down.
Public Registries: Shared Services
Docker Hub
Docker Hub was the first widely used container registry and remains the best known. It is the default registry for docker pull: if you specify no registry, Docker Hub is assumed. It hosts approximately 6 million public images. Official images (nginx, node, python, postgres, etc.) are curated, security-scanned, and documented. The free tier is limited to public images and rate-limited pulls, while paid tiers offer private images and higher pull limits. The rate limits introduced in November 2020 allow 100 pulls per 6 hours per IP address for anonymous users and 200 pulls per 6 hours for authenticated free accounts, with paid tiers receiving much higher limits; this makes large-scale pulls difficult without authentication or paid access.
Security: Docker offers image scanning for known vulnerabilities through third-party integration.
GitHub Container Registry (ghcr.io)
GitHub's container registry integrates with GitHub Actions and is free for public images, with paid plans for private ones. Access control ties to GitHub permissions. It integrates with GitHub releases and packages. Rate limits are reasonable for open source, with higher limits for paid organizations. The registry does not scan pushed images itself; Dependabot can propose updates to base images referenced in Dockerfiles, and image scanning is typically added as a step in a GitHub Actions workflow.
Quay.io
Quay.io is Red Hat's container registry, used by OpenShift and enterprises. It includes the Clair security scanner and detailed vulnerability reports, and provides powerful search, filtering, organization management, and Kubernetes integration.
Cloud-Provider Registries: Integrated with Cloud Services
Amazon ECR (Elastic Container Registry)
ECR is AWS's container registry, deeply integrated with ECS, EKS, and Lambda. Authentication uses IAM, integrating with AWS access control. Image scanning is available through ECR basic scanning (free) or enhanced scanning (paid, powered by Amazon Inspector). Features include multi-region replication, lifecycle policies (which automatically delete old images), and CodeBuild integration. Pricing is per GB stored and transferred.
Strengths: Deep AWS integration, strong image scanning, lifecycle management. Limitations: AWS-specific authentication, less flexible namespace structure.
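As an illustration of the lifecycle policies mentioned above, the sketch below generates a policy document that expires untagged images after a number of days. The field names follow the ECR lifecycle policy JSON format as commonly documented; the function name and the 14-day threshold are assumptions for the example, and the output should be checked against the current AWS documentation before use.

```python
import json


def expire_untagged_policy(days: int) -> str:
    """Build an ECR-style lifecycle policy that expires old untagged images."""
    policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Expire untagged images older than {days} days",
                "selection": {
                    "tagStatus": "untagged",
                    "countType": "sinceImagePushed",
                    "countUnit": "days",
                    "countNumber": days,
                },
                "action": {"type": "expire"},
            }
        ]
    }
    return json.dumps(policy, indent=2)


# 14 days is an arbitrary illustrative threshold, not a recommendation.
policy_text = expire_untagged_policy(14)
```

The resulting text is the kind of document a repository's lifecycle policy setting accepts; untagged images are a common first target because they are usually superseded build outputs.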
Google Artifact Registry (successor to GCR)
Google Cloud's unified artifact repository handles Docker images, Helm charts, Maven packages, and more. It integrates with GKE, Cloud Run, and Cloud Build. It includes vulnerability scanning via Artifact Analysis and supports multi-region repositories. Binary Authorization, a separate service, can enforce deploy-time policy based on scan results and attestations.
Strengths: Unified artifact management, seamless GKE integration, strong scanning. Limitations: Google Cloud specific, authentication requires gcloud or service account.
Azure Container Registry (ACR)
Microsoft's container registry integrates with AKS and Azure DevOps. ACR Tasks enable scheduled builds and image imports. Geo-replication is available on the Premium tier, and vulnerability scanning is provided through Microsoft Defender for Containers.
Strengths: Azure integration, strong build automation via ACR Tasks, flexible pricing. Limitations: Azure-specific, globally unique registry name requirement.
Self-Hosted Registries: Running Your Own
Harbor
Harbor is an open-source registry with enterprise features including RBAC, image replication, webhooks, and image scanning. It was originally created at VMware and is now a CNCF graduated project. Features include fine-grained permissions, image replication to other Harbor instances or cloud registries, content trust and image signing, vulnerability scanning (Trivy is the default scanner in current releases; earlier releases used Clair), webhooks for image events, and an intuitive web UI.
Deployment options include Docker Compose (single host) or Kubernetes (high availability).
Strengths: Enterprise features, good scanning and compliance integration, intuitive UI, community support. Limitations: Stateful service requiring persistent storage, operational complexity, infrastructure costs.
Zot (standalone OCI registry)
Zot is a minimal, zero-dependency OCI registry implementation. It ships as a single binary with a minimal resource footprint, full OCI Distribution spec compliance, image signing and SBOM support, and YAML/environment variable configuration.
Strengths: Minimal resource footprint, single binary deployment, no database required. Limitations: Fewer enterprise features (no RBAC, replication via external tools), smaller community.
Distribution (the reference implementation)
Distribution is the reference implementation of the OCI Distribution Specification. It offers a minimal, focused codebase, is configured via YAML, supports pluggable storage backends (filesystem, S3, Azure Blob, etc.), and is API-only (no web UI).
Strengths: Minimal, focused, direct OCI spec compliance, well-suited for air-gapped/edge. Limitations: No web UI, limited features, requires additional tooling for production.
Feature Comparison
Comparing features across these registries: Docker Hub and ghcr.io are the usual homes for public images, while cloud and self-hosted options focus on private images. Image scanning is near-universal except in minimal implementations such as Distribution. RBAC is available in cloud registries and Harbor. Image replication is available in cloud registries and Harbor. Image signing is supported in Google Artifact Registry, Harbor, and Zot. A web UI is available everywhere except Distribution. Webhook support is available in all except Zot. Self-hosting is possible with Harbor, Zot, and Distribution. Cloud integration varies by provider.
OCI Distribution Specification: Why Registries Are Interoperable
The OCI Distribution Specification defines how registries talk to clients and each other. All OCI-compliant registries implement identical API endpoints: GET /v2/ checks API v2 support, HEAD/GET /v2/<name>/manifests/<reference> gets image manifests, GET /v2/<name>/blobs/<digest> gets layer blobs, and manifest/blob push endpoints accept images.
Because compliant registries implement this protocol, the same client can push to Harbor and pull from ECR, and you can mirror images between registries and use registry pull-through caches.
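The endpoint layout listed above is regular enough to sketch as a few URL helpers. The host registry.example.com and the repository name are placeholders, and fetch_manifest is a hypothetical helper that is defined but not called here; real registries usually also require a token exchange before the request succeeds.

```python
import json
import urllib.request

# Media types a client typically offers when asking for a manifest.
MANIFEST_ACCEPT = ", ".join([
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
])


def base_url(registry: str) -> str:
    """GET on this URL checks that the registry speaks the v2 API."""
    return f"https://{registry}/v2/"


def manifest_url(registry: str, name: str, reference: str) -> str:
    """reference is either a tag (v1.0) or a digest (sha256:...)."""
    return f"https://{registry}/v2/{name}/manifests/{reference}"


def blob_url(registry: str, name: str, digest: str) -> str:
    return f"https://{registry}/v2/{name}/blobs/{digest}"


def fetch_manifest(registry: str, name: str, reference: str, token: str = "") -> dict:
    """Hypothetical helper: fetch and decode a manifest (performs network I/O)."""
    request = urllib.request.Request(manifest_url(registry, name, reference))
    request.add_header("Accept", MANIFEST_ACCEPT)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


url = manifest_url("registry.example.com", "team/myapp", "v1.0")
```

Since only the host changes between registries, the same helpers apply whether the target is a self-hosted or a managed registry, which is the practical meaning of interoperability here.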
OCI Artifacts: Beyond Container Images
The OCI spec has expanded beyond container images to support SBOMs, Helm Charts, and digital signatures.
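Helm Charts: Packages for Kubernetes applications pushed to OCI registries. SBOMs: Component inventories stored alongside the images they describe. Signatures: Cryptographic signatures verifying image authenticity, for example via Cosign. Attestations: Claims about images (build provenance, test results, scan results).
Modern registries store these artifacts as OCI image manifests that carry an artifactType field, with a subject field linking the artifact to the image it describes. A separate OCI Artifact Manifest type was proposed but dropped before version 1.1 of the image specification was released.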
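To make the artifact layout concrete, here is a sketch of a manifest that attaches an SBOM to an image, following the OCI image-spec 1.1 approach of an image manifest with artifactType and subject fields. The SBOM bytes and the subject image manifest are placeholders, and the helper names are hypothetical.

```python
import hashlib
import json


def descriptor(media_type: str, data: bytes) -> dict:
    """Describe a blob by media type, digest, and size."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }


def sbom_artifact_manifest(sbom: bytes, subject: dict) -> dict:
    """Build a manifest that stores an SPDX SBOM and points at its subject image."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "artifactType": "application/spdx+json",
        # Artifacts without a real config use the empty JSON object "{}".
        "config": descriptor("application/vnd.oci.empty.v1+json", b"{}"),
        "layers": [descriptor("application/spdx+json", sbom)],
        "subject": subject,
    }


# Placeholder content standing in for a real image manifest and a real SBOM.
image_manifest_bytes = b'{"schemaVersion": 2}'
subject = descriptor("application/vnd.oci.image.manifest.v1+json", image_manifest_bytes)
manifest = sbom_artifact_manifest(b'{"spdxVersion": "SPDX-2.3"}', subject)
manifest_json = json.dumps(manifest, indent=2)
```

The subject descriptor is what lets a registry answer the question "which artifacts refer to this image", so signatures and SBOMs can be discovered from the image digest alone.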
Registry Security: Image Scanning, Signing, and Admission Control
Image Scanning
Scanning detects known CVEs by matching packages against CVE databases. Tools include Clair (open source, used by Harbor/Quay), Trivy (lightweight, from Aqua), Grype (Syft ecosystem), and Anchore. Most registries offer built-in scanning or easy integration.
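Limitations: Scanning only detects known CVEs (zero-days are not caught), only covers packages the scanner can identify (system packages and, in many tools, language dependencies, but not flaws in your own application code), and requires maintained CVE databases.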
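The core matching step of a scanner can be sketched in a few lines. This is a deliberately simplified model: the advisory identifiers and package versions are made up for illustration, and real scanners handle distribution-specific version schemes that plain dotted-number comparison does not.

```python
from dataclasses import dataclass


def parse_version(version: str) -> tuple:
    """Parse a dotted numeric version such as "3.0.7" into (3, 0, 7)."""
    return tuple(int(part) for part in version.split("."))


@dataclass(frozen=True)
class Advisory:
    id: str        # hypothetical identifier, not a real CVE
    package: str
    fixed_in: str  # first version that contains the fix


def scan(installed: dict, advisories: list) -> list:
    """Return advisory IDs whose package is installed at a version below the fix."""
    findings = []
    for advisory in advisories:
        version = installed.get(advisory.package)
        if version is not None and parse_version(version) < parse_version(advisory.fixed_in):
            findings.append(advisory.id)
    return findings


# Made-up inventory and advisory data for illustration only.
installed = {"openssl": "3.0.1", "zlib": "1.3.1"}
advisories = [
    Advisory("EXAMPLE-0001", "openssl", "3.0.7"),
    Advisory("EXAMPLE-0002", "zlib", "1.2.12"),
    Advisory("EXAMPLE-0003", "curl", "8.0.0"),
]
findings = scan(installed, advisories)
```

The sketch also shows why the limitations hold: a vulnerability with no advisory entry, or a component missing from the inventory, can never produce a finding.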
Image Signing and Content Trust
Cosign (part of Sigstore) lets you prove that images come from trusted builders and haven't been tampered with. It signs images with a private key and verifies them with the matching public key, and the signature is stored in the registry next to the image. A Kubernetes admission controller can then enforce signature verification, preventing unsigned images from being deployed.
SBOM Generation and Attestation
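An SBOM (software bill of materials) is a detailed component inventory of an image. Syft generates SBOMs in SPDX or CycloneDX formats. Cosign can attach SBOMs to images as attestations and verify them. SBOMs support regulatory compliance, supply chain security frameworks such as SLSA (level 2 and above), and rapid vulnerability response, since an inventory lets you find affected images without rescanning them.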
Registry Deployment Scenarios
Small Team, Public Open Source: Use Docker Hub (free, minimal management, widely known).
Enterprise, Cloud-Native: Use a cloud provider registry (ECR, GCR, ACR) for deep integration, managed operations, and strong scanning.
Air-Gapped, Private Infrastructure: Use self-hosted Harbor or Zot for full control, no internet dependency, and offline support.
High Security, Compliance-Driven: Use Harbor with additional scanning/signing for RBAC, image signing, scanning, and audit logs.
Minimal Footprint, Edge/IoT: Use Zot for a single binary with minimal resources.
Kubernetes-Only, Maximum Compatibility: Use cloud Artifact Registry (Google) or Harbor for native integration and ecosystem richness.
Pull-Through Caches and Mirror Registries
A pull-through cache intercepts pulls, fetches the image from upstream, caches it locally, and serves subsequent pulls from cache. Benefits include faster pulls from local cache, reduced bandwidth to external registries, image mirroring for air-gapped environments, and single point of control for image sources.
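The pull-through behavior described above reduces to a cache-on-miss lookup. The sketch below models it in memory; the class name, the digests, and the dictionary standing in for the upstream registry are all hypothetical.

```python
class PullThroughCache:
    """Serve blobs from a local cache, fetching from upstream only on a miss."""

    def __init__(self, fetch_upstream):
        self.fetch_upstream = fetch_upstream  # callable: digest -> bytes
        self.cache = {}
        self.upstream_fetches = 0

    def pull(self, digest: str) -> bytes:
        if digest not in self.cache:
            # Cache miss: go to the upstream registry once and keep the result.
            self.cache[digest] = self.fetch_upstream(digest)
            self.upstream_fetches += 1
        return self.cache[digest]


# A dictionary stands in for the upstream registry in this sketch.
upstream = {"sha256:aaa": b"layer-a", "sha256:bbb": b"layer-b"}
cache = PullThroughCache(upstream.__getitem__)

first = cache.pull("sha256:aaa")   # fetched from upstream
second = cache.pull("sha256:aaa")  # served from the local cache
third = cache.pull("sha256:bbb")   # fetched from upstream
```

Keying the cache by digest is safe because digests are immutable; caching by tag would additionally need an expiry or revalidation step, since tags can move.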
Registry Bandwidth and Cost Optimization
Push bandwidth is used relatively rarely (only during builds and deployments) and is best optimized by building close to the registry. Pull bandwidth usually dominates cost, because every node that does not already have the image cached must pull it before starting a container. It is reduced through multi-stage builds, distroless or Alpine base images, pull-through caches, regional registry replication, and cleanup policies.
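Storage cost (per GB stored) is reduced through deduplication, since identical layers are stored once. A cost example: 500 images at 1 GB average, replicated to 3 regions, is 500 × 1 GB × 3 = 1.5 TB of storage. At an assumed rate of about $0.02 per GB-month that comes to roughly $30/month for storage, with bandwidth charged on top; managed registries often charge more per GB, so check your provider's pricing.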
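The cost example can be written as a small calculation. The rate of $0.02 per GB-month and the 40% deduplication ratio are assumptions chosen for illustration; actual prices and savings vary by provider and by how much images share layers.

```python
def monthly_storage_cost(images, avg_gb, regions, usd_per_gb_month, dedup_ratio=0.0):
    """Return (stored_gb, monthly_usd) for a replicated image set.

    dedup_ratio is the fraction of bytes saved by shared layers (0.0 = none).
    """
    stored_gb = images * avg_gb * regions * (1.0 - dedup_ratio)
    return stored_gb, stored_gb * usd_per_gb_month


# 500 images x 1 GB x 3 regions at an assumed $0.02 per GB-month.
stored_gb, cost = monthly_storage_cost(500, 1.0, 3, 0.02)

# Same image set if shared base layers removed an assumed 40% of the bytes.
dedup_gb, dedup_cost = monthly_storage_cost(500, 1.0, 3, 0.02, dedup_ratio=0.4)
```

With these inputs the first call gives 1,500 GB at about $30 per month, and the deduplicated case about 900 GB, which shows how directly layer sharing feeds into the storage bill.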
Registry Authentication and Credential Management
Docker Hub: Username/password or personal access tokens. Cloud Registries: AWS IAM, GCP service accounts, Azure managed identity. Self-Hosted: Basic auth, token auth, or OAuth. Kubernetes: Image pull secrets for private registries.
Best practices: Never hardcode credentials in Dockerfiles/manifests. Use cloud provider IAM/managed identities where possible. Rotate tokens regularly. Use read-only service accounts for pulling. Store credentials in Kubernetes secrets or cloud secret managers.
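For the Kubernetes case, an image pull secret wraps a Docker-style config document. The sketch below builds that document from credentials passed in at runtime; the registry host, user name, and token are placeholders, and in practice the values would come from a secret manager or the environment rather than from source code.

```python
import base64
import json


def docker_config_json(registry: str, username: str, token: str) -> str:
    """Build the JSON payload used by a kubernetes.io/dockerconfigjson secret."""
    # The "auth" field is the base64 encoding of "username:token".
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    config = {
        "auths": {
            registry: {
                "username": username,
                "password": token,
                "auth": auth,
            }
        }
    }
    return json.dumps(config)


# Placeholder values for illustration; never commit real credentials.
payload = docker_config_json("registry.example.com", "ci-bot", "not-a-real-token")
```

Note that base64 is an encoding, not encryption, so the secret itself still needs access controls, and a read-only token limits the damage if it leaks.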
Next Steps: Understand how images are built and pushed. See "Multi-Stage Builds and Image Optimization" and "Container Image Building Layers Explained" for deeper technical details.
