When running a Kubernetes cluster with dozens or hundreds of nodes, pulling container images from a central registry becomes a major bottleneck. Every node downloads the same multi-gigabyte image simultaneously, saturating network bandwidth and slowing down deployments. Peer-to-peer (P2P) container image distribution solves this by letting nodes share image layers between themselves, dramatically reducing load on the central registry and speeding up rollout times.

In this guide, we compare the three leading open-source P2P image distribution solutions: Spegel, Kraken (by Uber), and Dragonfly 2 (by Alibaba). We’ll cover architecture differences, performance characteristics, deployment complexity, and provide real Docker Compose and Kubernetes manifests to get each one running.

Why Self-Host P2P Container Image Distribution?

Centralized container registries (Harbor, Docker Registry, AWS ECR) work fine for small clusters. But as you scale:

  • Network saturation: 50 nodes pulling a 2 GB image = 100 GB of bandwidth from your registry simultaneously
  • Slow deployments: Rolling updates block on image pulls, increasing deployment time linearly with node count
  • Registry single point of failure: If the registry goes down, no new pods can start
  • WAN edge latency: Edge or multi-region clusters suffer from slow pulls across geographic distances

P2P distribution turns every node into a cache. When one node pulls an image, it serves layers to other nodes that need them. The central registry only sees one download per unique layer, regardless of cluster size. This reduces registry bandwidth by 80-95% and cuts image pull times from minutes to seconds in large clusters.

For related reading on container infrastructure, see our Kubernetes CNI comparison, container sandboxing guide, and self-hosted container registry guide.

Spegel: Stateless P2P OCI Registry Mirror

Spegel (3,617 GitHub stars) is a stateless, lightweight P2P OCI registry mirror designed specifically for Kubernetes. It runs as a DaemonSet on every node and intercepts image pull requests, redirecting them to peers that already have the layers locally.

Architecture

Spegel uses a gossip protocol (via libp2p) for peer discovery and content routing. Each node runs a local OCI registry that mirrors images. When a node requests an image:

  1. Spegel checks if a peer has the layer
  2. If yes, it pulls from the peer via HTTP
  3. If no, it falls back to the upstream registry
  4. The layer is cached locally for serving to other peers

The key differentiator: Spegel is stateless. It doesn’t require any external database, coordination service, or scheduler. It simply discovers peers via Kubernetes Service DNS and gossips about available content.

GitHub Stats

  • Repository: spegel-org/spegel
  • Stars: 3,617
  • Last Updated: April 2026
  • Description: “Stateless cluster local OCI registry mirror”

Deployment

Kubernetes (Helm)

1
2
3
4
5
6
7
helm repo add spegel https://spegel-org.github.io/spegel-chart
helm repo update
helm install spegel spegel/spegel \
  --namespace spegel \
  --create-namespace \
  --set spegel.registry.mirrorPort=5000 \
  --set spegel.registry.registries="docker.io,ghcr.io"

Docker Compose (Single Node Test)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
version: "3.8"
services:
  spegel:
    image: ghcr.io/spegel-org/spegel:latest
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - /var/run/containerd/containerd.sock:/var/run/containerd/containerd.sock
      - /etc/containerd/config.toml:/etc/containerd/config.toml
    environment:
      - SPEGE_REGISTRY_MIRROR_PORT=5000
      - SPEGE_REGISTRY_MIRROR_REGISTRIES=docker.io,ghcr.io
    security_opt:
      - seccomp:unconfined

Pros and Cons

  • Pros: Stateless, zero external dependencies, simple Helm install, native containerd integration, low resource footprint (~50MB RAM per node)
  • Cons: Newer project with smaller community, limited to containerd runtime, no built-in dashboard or monitoring

Kraken (Uber): P2P Docker Registry for Large-Scale Distribution

Kraken (6,683 GitHub stars) is Uber’s open-source P2P Docker registry built for distributing terabytes of data in seconds across massive fleets. It uses a tracker-based architecture where an origin server tracks which peers have which layers.

Architecture

Kraken has three components:

  1. Origin: The authoritative registry that stores all image layers (backed by S3, GCS, or local disk)
  2. Tracker: Maintains a distributed hash table (DHT) mapping image layers to the peers that have them
  3. Peer: Runs on every node, requesting layers from the tracker and downloading from other peers

Unlike Spegel’s gossip-based approach, Kraken uses a centralized tracker for peer coordination. This provides more predictable behavior but adds a dependency on the tracker service.

GitHub Stats

  • Repository: Uber/kraken
  • Stars: 6,683
  • Last Updated: April 2026
  • Description: “P2P Docker registry capable of distributing TBs of data in seconds”

Deployment

Docker Compose

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
version: "3.8"
services:
  tracker:
    image: uber/kraken:latest
    command: kraken tracker --config /etc/kraken/tracker.yaml
    ports:
      - "3000:3000"
    volumes:
      - ./kraken-config:/etc/kraken

  origin:
    image: uber/kraken:latest
    command: kraken origin --config /etc/kraken/origin.yaml
    ports:
      - "3001:3001"
      - "5000:5000"
    volumes:
      - ./kraken-config:/etc/kraken
      - kraken-storage:/data

  peer:
    image: uber/kraken:latest
    command: kraken peer --config /etc/kraken/peer.yaml
    ports:
      - "3002:3002"
    volumes:
      - ./kraken-config:/etc/kraken
    deploy:
      mode: global
    pid: host

volumes:
  kraken-storage:

tracker.yaml Configuration

1
2
3
4
5
6
7
server:
  port: 3000
dht:
  listen_port: 3001
  bootstrap_nodes: []
origin:
  port: 3001

peer.yaml Configuration

1
2
3
4
5
6
7
8
9
server:
  port: 3002
registry:
  proxy_port: 5000
  upstream: "https://registry-1.docker.io"
tracker:
  announce_interval: 30s
  nodes:
    - "tracker:3000"

Pros and Cons

  • Pros: Battle-tested at Uber scale, supports TB-scale distribution, mature project (2019+), S3/GCS backend support
  • Cons: More complex deployment (3 components), requires tracker coordination, Docker Registry v2 protocol only, heavier resource usage

Dragonfly 2: Enterprise-Grade P2P Image Distribution

Dragonfly 2 (3,151 GitHub stars) is a CNCF sandbox project from Alibaba that provides P2P-based data distribution with optional content-addressable filesystem (NYDUS) for accelerated container startup.

Architecture

Dragonfly 2 has four core components:

  1. Scheduler: Central coordinator that assigns download tasks to peers based on topology and load
  2. Peer (dfdaemon): Runs on each node, handles image pull requests and serves layers to other peers
  3. Manager: Administrative interface for configuration, monitoring, and cluster management
  4. NYDUS: Optional container image format that enables lazy pulling (fetch only the blocks needed to start the container)

Dragonfly’s scheduler-based approach provides the most intelligent peer selection, considering network topology, peer load, and data locality.

GitHub Stats

  • Repository: dragonflyoss/Dragonfly2
  • Stars: 3,151
  • Last Updated: April 2026
  • Description: “Efficient, stable, and secure data distribution powered by P2P technology”

Deployment

Docker Compose

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
version: "3.8"
services:
  manager:
    image: dragonflyoss/manager:latest
    ports:
      - "8080:8080"
    environment:
      - DATABASE_TYPE=sqlite
    volumes:
      - manager-data:/var/lib/dragonfly

  scheduler:
    image: dragonflyoss/scheduler:latest
    ports:
      - "8002:8002"
    environment:
      - SCHEDULER_MANAGER_ADDR=manager:8080
    depends_on:
      - manager

  dfdaemon:
    image: dragonflyoss/dfdaemon:latest
    network_mode: host
    pid: host
    environment:
      - DF_SCHEDULER_MANAGER_ADDR=manager:8080
      - DF_DAEMON_REGISTRY_MIRRORS=https://index.docker.io
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - dfdaemon-data:/var/lib/dragonfly
    deploy:
      mode: global

volumes:
  manager-data:
  dfdaemon-data:

Kubernetes (Helm)

1
2
3
4
5
6
7
helm repo add dragonfly https://dragonflyoss.github.io/helm-charts/
helm repo update
helm install dragonfly dragonfly/dragonfly \
  --namespace dragonfly-system \
  --create-namespace \
  --set scheduler.replicaCount=2 \
  --set manager.config.database.type=mysql

Pros and Cons

  • Pros: CNCF project, intelligent scheduler, NYDUS lazy pulling for instant container startup, comprehensive management UI, supports multiple container runtimes (containerd, Docker, CRI-O)
  • Cons: Most complex deployment (4 components), higher resource footprint, NYDUS requires image conversion step, steeper learning curve

Comparison Table

FeatureSpegelKrakenDragonfly 2
GitHub Stars3,6176,6833,151
ArchitectureStateless gossipTracker + DHTScheduler + Manager
Components1 (DaemonSet)3 (Origin, Tracker, Peer)4 (Manager, Scheduler, Peer, NYDUS)
Container Runtimecontainerd onlyDocker Registry v2containerd, Docker, CRI-O
External DependenciesNoneTracker serviceManager + Scheduler
Resource Footprint~50 MB RAM/node~200 MB RAM/node~300 MB RAM/node
Lazy PullingNoNoYes (NYDUS)
Dashboard/UINoNoYes (Manager UI)
CNCF StatusIndependentIndependentCNCF Sandbox
Production MaturityGrowing (2023+)Mature (2019+)Mature (2017+)
Best ForSmall/medium K8s clustersLarge-scale Docker fleetsEnterprise multi-runtime

Performance Benchmarks

Based on published benchmarks and community reports:

  • Spegel: 5-10x faster image pulls in 50+ node clusters. Near-zero registry bandwidth after first pull. Memory overhead ~50MB per node.
  • Kraken: Sub-second distribution of TB-scale images across 1000+ nodes at Uber. Registry bandwidth reduced by 90%+. Tracker adds ~200ms latency for peer discovery.
  • Dragonfly 2: 117x faster container startup with NYDUS lazy pulling (Alibaba benchmarks). Scheduler-aware peer selection reduces cross-datacenter traffic by 80%.

When to Use Each Tool

Choose Spegel if:

  • You run a Kubernetes cluster with containerd
  • You want zero-config, stateless deployment
  • Your cluster is under 200 nodes
  • You prefer simplicity over features

Choose Kraken if:

  • You need to distribute very large images (multi-GB)
  • You already use Docker Registry v2 infrastructure
  • You want battle-tested reliability at scale
  • Your team can manage a 3-component architecture

Choose Dragonfly 2 if:

  • You need the fastest possible container startup (NYDUS)
  • You run multiple container runtimes
  • You want a management dashboard and monitoring
  • You need CNCF-backed tooling for enterprise compliance

Installation Checklist

Before deploying any P2P image distribution tool:

  1. Verify cluster size: P2P benefits kick in at ~10+ nodes pulling the same images
  2. Check runtime compatibility: Ensure your CRI (containerd/Docker/CRI-O) is supported
  3. Test network connectivity: P2P requires open ports between all nodes
  4. Configure registry mirrors: Update CRI config to point to the local P2P endpoint
  5. Monitor initial sync: First pull goes to the upstream registry; subsequent pulls are peer-served
  6. Set up alerting: Monitor peer connectivity and fallback rates to upstream

FAQ

Do P2P image distribution tools replace my container registry?

No. They sit in front of your existing registry (Harbor, Docker Hub, ECR) as a caching layer. The origin registry is still needed for the first pull of each unique layer and for pushing new images.

What happens if a peer goes offline during a P2P download?

All three tools automatically fall back to other peers or the upstream registry. Spegel’s gossip protocol quickly removes dead peers from the routing table. Kraken’s tracker stops announcing offline peers. Dragonfly’s scheduler reassigns tasks to healthy peers.

Can I use these tools with air-gapped or offline environments?

Yes, with configuration. Spegel can seed the initial image set before disconnecting. Kraken’s Origin can be pre-populated with all needed images. Dragonfly supports pre-heating images to the P2P network before deployment.

How much network bandwidth does P2P distribution save?

In typical deployments with 50+ nodes, P2P reduces registry egress bandwidth by 80-95%. The exact savings depend on image reuse patterns — clusters running many replicas of the same image see the highest savings.

Is there a performance penalty for P2P vs direct registry pulls?

For the first node pulling an image, P2P adds negligible overhead (the layer comes from the upstream registry). For subsequent nodes, P2P is significantly faster because layers are served from the local network instead of the central registry. Dragonfly’s NYDUS can even start containers before the full image is downloaded.

Which tool is easiest to set up?

Spegel is the simplest — install via Helm and it works immediately with containerd. Kraken requires configuring three components and a tracker. Dragonfly has the most complex setup with four components but offers the richest feature set.

Do these tools work with private registries requiring authentication?

Yes. Spegel passes through authentication to the upstream registry. Kraken supports registry auth in its peer configuration. Dragonfly supports credential passthrough and can integrate with Harbor’s authentication system.

Can I mix P2P distribution with traditional registry pulls?

Absolutely. All three tools are transparent proxies — if P2P fails or a layer isn’t available from peers, the request automatically falls back to the upstream registry. You can deploy incrementally and monitor fallback rates.