Container registries store and distribute Docker and OCI images for your infrastructure. As teams push more images, registries grow rapidly — a few hundred images can consume hundreds of gigabytes. Effective registry management includes garbage collection, image lifecycle policies, vulnerability scanning, and access control. Three open-source registry solutions address these needs: Harbor, Distribution (Docker Registry), and Zot.
Why Self-Host a Container Registry?
Public registries like Docker Hub impose rate limits and bandwidth costs, and give you no control over image retention. Self-hosting provides:
- No rate limits — pull images as fast as your network allows
- Full control — set retention policies, scan for vulnerabilities, manage access
- Air-gapped support — operate without internet connectivity
- Cost savings — eliminate egress fees for large image pulls
- Compliance — keep images within your network boundary for regulatory requirements
- Faster builds — local registry eliminates network latency for image pulls
Harbor
Harbor is an open-source container registry with enterprise features including vulnerability scanning, image signing, replication, and role-based access control. It is a CNCF graduated project.
GitHub: goharbor/harbor — actively maintained, CNCF graduated project
Key Features
- Vulnerability scanning — integrated Trivy scanning for every pushed image
- Image garbage collection — built-in GC with scheduling and soft-delete support
- Image retention policies — automatically delete old images based on tags, age, or count
- Replication — push/pull images between multiple Harbor instances or to external registries
- RBAC — fine-grained access control with project-level permissions
- Image signing — Cosign and Notation signatures for content trust (recent releases dropped the earlier Notary integration)
- Audit logging — track all registry operations
- LDAP/AD integration — enterprise authentication
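The retention idea in the list above (deleting old images based on age or count) can be sketched in a few lines of Python. This is an illustration of the general rule, not Harbor's implementation: the `Tag` type, the `select_for_deletion` function, and the sample tag names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Tag:
    """Hypothetical record for one image tag and its push time."""
    name: str
    pushed_at: datetime


def select_for_deletion(tags, keep_latest, max_age, now):
    """Return the names of tags a count-plus-age rule would delete.

    A tag is kept if it is among the `keep_latest` most recently pushed
    tags, or if it is not older than `max_age`. Everything else is returned.
    """
    newest_first = sorted(tags, key=lambda t: t.pushed_at, reverse=True)
    protected = {t.name for t in newest_first[:keep_latest]}
    return [
        t.name
        for t in newest_first
        if t.name not in protected and now - t.pushed_at > max_age
    ]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
tags = [
    Tag("v1.0", now - timedelta(days=90)),
    Tag("v1.1", now - timedelta(days=45)),
    Tag("v1.2", now - timedelta(days=10)),
    Tag("v1.3", now - timedelta(days=1)),
]
# Keep the two newest tags, delete anything else older than 30 days.
print(select_for_deletion(tags, keep_latest=2, max_age=timedelta(days=30), now=now))
```

Deleting a tag this way only removes the reference; the layers are reclaimed later by garbage collection.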
Docker Compose Configuration
Harbor provides an official installer, but it can also be deployed via Docker Compose.
Garbage collection can be configured through the Harbor API.
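As a hedged sketch, a scheduled GC request against the Harbor API might be built like this in Python. The endpoint path (`/api/v2.0/system/gc/schedule`), the payload field names, and the six-field cron string are assumptions based on the Harbor v2 API as commonly documented; check them against the API reference of your own instance. The host `harbor.example.com` is a placeholder, and authentication is deliberately left out.

```python
import json
import urllib.request


def build_gc_schedule_request(base_url, cron, delete_untagged=True, dry_run=False):
    """Build (but do not send) a request that schedules garbage collection.

    Assumed payload shape: a `schedule` object with a cron expression and a
    `parameters` object controlling what the GC run does.
    """
    payload = {
        "schedule": {"type": "Custom", "cron": cron},
        "parameters": {"delete_untagged": delete_untagged, "dry_run": dry_run},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v2.0/system/gc/schedule",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed cron format: seconds first, so this means 02:00:00 every Sunday.
req = build_gc_schedule_request("https://harbor.example.com", "0 0 2 * * 0")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To send it, add an `Authorization` header for an admin account and pass the request to `urllib.request.urlopen`. Setting `dry_run` to true first is a cautious way to see what a run would remove.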
Distribution (Docker Registry)
Distribution is the reference implementation of the OCI Distribution Specification. It is the engine behind Docker Hub and many other registries, providing a minimal, high-performance registry core.
GitHub: distribution/distribution — 10,405 stars
Key Features
- OCI-compliant — fully implements the OCI Distribution Specification
- Minimal footprint — single binary, no database required
- Storage drivers — supports local filesystem, S3, Azure Blob, GCS, Swift, and more
- Garbage collection — built-in GC command for reclaiming unused layers
- Token authentication — integrates with external auth providers
- Notifications — webhook notifications for push/pull events
- Content trust — supports Notary for image signing
Docker Compose Configuration
The garbage collector reads the same configuration file as the registry to locate the storage backend.
Garbage collection is run on demand with the registry's CLI command.
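A small Python helper can assemble that command for a Compose deployment. This is a sketch under assumptions: the Compose service name `registry` and the config path `/etc/docker/registry/config.yml` are placeholders that depend on your setup and image version, and `gc_command` is a hypothetical helper name. The `garbage-collect` subcommand and its `--dry-run` and `--delete-untagged` flags belong to the Distribution `registry` binary.

```python
import shlex


def gc_command(config_path="/etc/docker/registry/config.yml",
               dry_run=True, delete_untagged=False, service="registry"):
    """Return the argv list for running Distribution's GC inside a container.

    Defaults to a dry run so nothing is deleted until explicitly requested.
    """
    cmd = ["docker", "compose", "exec", service, "registry", "garbage-collect"]
    if dry_run:
        cmd.append("--dry-run")           # report what would be removed
    if delete_untagged:
        cmd.append("--delete-untagged")   # also remove manifests with no tag
    cmd.append(config_path)
    return cmd


print(shlex.join(gc_command()))
print(shlex.join(gc_command(dry_run=False, delete_untagged=True)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Distribution's GC is not safe to run while images are being pushed, so run it during a maintenance window or with the registry in read-only mode.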
Zot
Zot is a next-generation OCI-native container registry designed for simplicity and performance. It is written in Go and focuses on being lightweight while providing essential registry features.
GitHub: project-zot/zot — actively maintained, CNCF sandbox project
Key Features
- OCI-native — fully compliant with OCI Distribution and Image specs
- Single binary — no external dependencies, no database required
- Built-in garbage collection — automatic and manual GC modes
- Image scanning — integrated Trivy scanning
- Storage deduplication — shared blob storage across repositories
- Multi-architecture support — native manifest list handling
- S3-compatible storage — direct backend support for S3, MinIO
- Low resource usage — designed for edge and resource-constrained environments
Docker Compose Configuration
Zot reads its settings from a single configuration file.
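As an illustration, the following Python snippet renders a minimal Zot configuration as JSON with garbage collection and deduplication turned on. The field names (`storage.gc`, `gcDelay`, `gcInterval`, `dedupe`) follow Zot's documented configuration format as best understood here; the storage path, port, and interval values are assumptions to adapt, and `zot_config` is a hypothetical helper.

```python
import json


def zot_config(root_dir="/var/lib/registry", port="5000",
               gc_delay="1h", gc_interval="24h"):
    """Return a minimal Zot configuration as a plain dictionary."""
    return {
        "distSpecVersion": "1.1.0",
        "storage": {
            "rootDirectory": root_dir,
            "gc": True,                 # enable garbage collection
            "gcDelay": gc_delay,        # grace period before a blob is eligible
            "gcInterval": gc_interval,  # how often the periodic GC runs
            "dedupe": True,             # share identical blobs across repositories
        },
        "http": {"address": "0.0.0.0", "port": port},
        "log": {"level": "info"},
    }


print(json.dumps(zot_config(), indent=2))
```

Writing the output to a file and mounting it into the container as Zot's config is one way to use it; validate the result against the Zot version you deploy.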
Feature Comparison
| Feature | Harbor | Distribution | Zot |
|---|---|---|---|
| OCI compliant | ✅ Yes | ✅ Yes (reference impl) | ✅ Yes |
| Garbage collection | ✅ Scheduled + API | ✅ CLI command | ✅ Automatic + manual |
| Vulnerability scanning | ✅ Trivy built-in | ❌ External | ✅ Trivy built-in |
| Web UI | ✅ Full-featured | ❌ None | ✅ Basic UI |
| RBAC | ✅ Project-level | ❌ Token-based | ✅ API keys |
| Image retention policies | ✅ Configurable | ❌ Manual GC only | ✅ Configurable |
| Replication | ✅ Multi-registry | ❌ Pull-through only | ✅ Sync extension (mirroring) |
| Storage backends | Local, S3 | Local, S3, Azure, GCS | Local, S3, MinIO |
| Storage deduplication | ✅ | ✅ Content-addressed blobs | ✅ Native |
| Database required | ✅ PostgreSQL | ❌ | ❌ |
| Resource usage | High (multi-service) | Low (single binary) | Low (single binary) |
| LDAP/AD integration | ✅ Yes | Via token auth | ✅ LDAP |
| Audit logging | ✅ Comprehensive | Via notifications | Basic |
| CNCF status | Graduated | Sandbox | Sandbox |
| Best for | Enterprise teams | Minimal setups | Edge/cloud-native |
Why Registry Management Matters
As container adoption grows, registries become critical infrastructure. Without proper management:
- Storage costs explode — each image layer is stored separately. A single multi-stage build can produce dozens of layers. Without garbage collection, deleted tags leave orphaned layers consuming disk space indefinitely.
- Security risks increase — old images may contain known vulnerabilities. Without automated scanning and retention policies, your registry becomes a repository of insecure base images that teams unknowingly deploy.
- Build performance degrades — large registries with millions of blobs slow down manifest lookups and layer downloads. Regular cleanup and deduplication keep response times low.
- Compliance gaps appear — regulatory requirements often mandate image provenance tracking, vulnerability reporting, and access audit trails. Unmanaged registries cannot provide these guarantees.
For container image security, see our container security scanning guide and supply chain security article. If you need container orchestration alongside your registry, our Kubernetes comparison covers the options.
FAQ
What is container registry garbage collection?
Garbage collection (GC) removes unreferenced image layers from the registry storage. When you delete an image tag, the manifest reference is removed, but the underlying layers (blobs) remain on disk. GC identifies blobs that no manifest references and deletes them, reclaiming disk space.
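The mark-and-sweep idea behind this can be shown with a toy model in Python. It is a sketch of the concept only, not any registry's actual code; the `find_orphaned_blobs` function and the shortened digests are made up for illustration.

```python
def find_orphaned_blobs(manifests, stored_blobs):
    """Return blobs on disk that no remaining manifest references.

    manifests:    dict mapping an image reference to its list of layer digests
    stored_blobs: set of every digest currently held in storage
    """
    referenced = set()
    for layers in manifests.values():   # mark: collect everything still in use
        referenced.update(layers)
    return set(stored_blobs) - referenced  # sweep: whatever is left is orphaned


manifests = {
    "app:1.0": ["sha256:base", "sha256:deps", "sha256:app-v1"],
    "app:2.0": ["sha256:base", "sha256:deps", "sha256:app-v2"],
}
stored = {"sha256:base", "sha256:deps", "sha256:app-v1", "sha256:app-v2"}

# Deleting the tag removes only the manifest reference; the blobs stay on disk.
del manifests["app:1.0"]
print(sorted(find_orphaned_blobs(manifests, stored)))  # ['sha256:app-v1']
```

Only the layer unique to the deleted image is reclaimed; the shared base and dependency layers survive because `app:2.0` still references them.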
How often should I run garbage collection?
For active development registries, run GC weekly or daily during low-traffic periods. For production registries with stable images, monthly GC is usually sufficient. The frequency depends on your push/delete volume — high-velocity CI/CD pipelines generate more orphaned layers and need more frequent cleanup.
Does garbage collection affect running containers?
No. Garbage collection only removes layers that are no longer referenced by any manifest in the registry. Running containers have already pulled their layers locally and do not depend on the registry for continued operation. However, if you need to re-pull a deleted image, it will no longer be available.
What is storage deduplication in container registries?
Storage deduplication identifies identical image layers across different repositories and stores only one copy. For example, if 10 different images are based on the same Ubuntu base image, deduplication stores the Ubuntu layers once instead of 10 times. This can reduce storage usage by 30-70% in registries with many related images.
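The mechanism is content addressing: a blob is stored under the hash of its bytes, so identical layers map to the same key. The Python sketch below models this with an in-memory store; the `BlobStore` class and the fake layer contents are hypothetical and stand in for real image layers.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: identical bytes are kept only once."""

    def __init__(self):
        self._blobs = {}
        self.logical_bytes = 0  # total bytes pushed, counting duplicates

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self.logical_bytes += len(data)
        self._blobs.setdefault(digest, data)  # no-op if the digest already exists
        return digest

    @property
    def stored_bytes(self) -> int:
        """Bytes actually held after deduplication."""
        return sum(len(blob) for blob in self._blobs.values())


store = BlobStore()
base_layer = b"ubuntu-base-layer" * 1000  # stand-in for a shared base layer
for i in range(10):
    store.put(base_layer)                 # same base layer in every image
    store.put(f"app-{i}".encode())        # small layer unique to each image

print("pushed:", store.logical_bytes, "bytes")
print("stored:", store.stored_bytes, "bytes")
```

Ten images push the base layer ten times, but the store holds a single copy plus the ten small unique layers.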
Can I use a self-hosted registry with Kubernetes?
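Yes. Configure the container runtime on each Kubernetes node to trust your self-hosted registry: either serve the registry over TLS with certificates the nodes trust, or, for HTTP-only test setups, add it as an insecure registry. In Docker-based setups, add the registry URL to the Docker daemon configuration. For containerd, configure the registry mirror in /etc/containerd/config.toml. If the registry requires authentication, supply the credentials to your workloads through an image pull secret.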
What is the difference between Harbor and Distribution?
Distribution is the minimal OCI registry implementation — it stores and serves images with basic authentication. Harbor is a full-featured registry platform built on top of Distribution, adding a web UI, vulnerability scanning, RBAC, replication, audit logging, and project management. Choose Distribution for minimal setups and Harbor for enterprise requirements.