Running Kubernetes across multiple clusters is increasingly common — for disaster recovery, geographic distribution, or separating workloads by team or environment. But connecting services across cluster boundaries introduces networking complexity. This guide compares three approaches to multi-cluster service connectivity: Cilium ClusterMesh, Istio Multi-Primary, and Linkerd Multicluster.

What Is Multi-Cluster Service Connectivity?

Multi-cluster service connectivity allows services running in one Kubernetes cluster to discover and communicate with services in another cluster, as if they were on the same network. Key requirements include:

  • Service discovery — DNS resolution for services across cluster boundaries
  • Secure communication — mTLS or encrypted tunnels between clusters
  • Load balancing — distributing traffic across cluster endpoints
  • Failover — automatic routing when a cluster becomes unavailable

Comparison Overview

| Feature | Cilium ClusterMesh | Istio Multi-Primary | Linkerd Multicluster |
|---|---|---|---|
| GitHub Stars | 24,300+ | 38,200+ | 11,300+ |
| Approach | eBPF-based L3/L4 | Envoy proxy sidecar | Lightweight proxy |
| Cross-cluster DNS | CoreDNS with etcd sync | Istio DNS federation | Mirror services |
| mTLS | Automatic (SPIFFE) | Automatic (Istio CA) | Automatic (Linkerd CA) |
| Service Discovery | Global service catalog | Federated control plane | Service mirror controller |
| Failover | Automatic (via eBPF) | Via traffic policies | Manual (failover script) |
| Network Overlay | eBPF (no overlay) | Envoy mesh | Linkerd proxy |
| Complexity | Medium | High | Low |
| Best For | eBPF-native clusters | Full service mesh needs | Simple, lightweight setups |

Cilium ClusterMesh: eBPF-Based Connectivity

Cilium ClusterMesh uses eBPF to create a transparent network overlay between clusters. Services are automatically discovered and load-balanced using Cilium’s global service catalog, synchronized via etcd.

Architecture

Each cluster runs Cilium CNI with ClusterMesh enabled. A shared etcd cluster (or etcd instances connected via tunnel) synchronizes service endpoints. eBPF programs handle transparent service routing at the kernel level — no sidecar proxies required.
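To expose a service across the mesh, Cilium uses an annotation on an ordinary Kubernetes Service rather than a new resource type. A minimal sketch (the service name `my-service` is illustrative) — applying the same annotated Service in each cluster tells Cilium to merge their endpoints into one global service:

```yaml
# Hypothetical service; apply under the same name and namespace in
# every cluster that should participate in the global service.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: default
  annotations:
    service.cilium.io/global: "true"   # enable cross-cluster load balancing
spec:
  selector:
    app: my-service
  ports:
    - port: 80
```

Clients keep resolving `my-service.default.svc.cluster.local`; Cilium's eBPF datapath load-balances the traffic across endpoints in all connected clusters.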

Installation

# Install Cilium with ClusterMesh enabled
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --namespace kube-system \
  --set cluster.id=1 \
  --set cluster.name=cluster1 \
  --set clustermesh.useAPIServer=true \
  --set clustermesh.apiserver.tls.auto.certValidityDuration=87600h

Docker Compose (for etcd ClusterMesh gateway testing)

version: "3.8"
services:
  clustermesh-apiserver:
    image: quay.io/cilium/clustermesh-apiserver:v1.16.0
    ports:
      - "2379:2379"
      - "2380:2380"
    environment:
      - CLUSTERMESH_CONFIG=/etc/clustermesh
    volumes:
      - ./clustermesh-config:/etc/clustermesh
      - ./certs:/var/lib/cilium/clustermesh/certs

Key Features

  • Global Service Discovery: Services are automatically synchronized across clusters via etcd
  • No Sidecar Overhead: eBPF handles routing at kernel level — zero proxy overhead
  • Network Policies: Cilium’s L3-L7 network policies work seamlessly across clusters
  • Transparent to Applications: No code changes required — standard Kubernetes DNS works
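After installation, the clusters still need to be connected and verified. A sketch using the cilium CLI, assuming kubeconfig contexts named cluster1 and cluster2 (the context names are an assumption):

```shell
# Connect the two clusters (run once; exchanges certificates and
# clustermesh-apiserver endpoints between them)
cilium clustermesh connect --context cluster1 --destination-context cluster2

# Wait for the mesh to come up and verify connectivity
cilium clustermesh status --context cluster1 --wait
```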

Istio Multi-Primary: Full Mesh Across Clusters

Istio Multi-Primary extends Istio’s service mesh across multiple clusters. Each cluster runs its own Istio control plane, and the meshes are connected via shared trust roots and endpoint synchronization.

Architecture

Each cluster has a full Istio installation (Istiod + Envoy sidecars). Control planes are connected through a shared root certificate. Services communicate via Envoy proxies with automatic mTLS.

Installation

# Install Istio multi-primary using istioctl
istioctl install --set profile=default \
  --set values.global.meshID=mesh1 \
  --set values.global.multiCluster.clusterName=cluster1 \
  --set values.global.network=network1
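Multi-primary also requires each control plane to discover endpoints in the other cluster. This is done by exchanging remote secrets that grant each Istiod read access to the peer's API server — a sketch, again assuming contexts named cluster1 and cluster2:

```shell
# Let cluster2's Istiod watch cluster1's endpoints, and vice versa
istioctl create-remote-secret --context=cluster1 --name=cluster1 | \
  kubectl apply -f - --context=cluster2
istioctl create-remote-secret --context=cluster2 --name=cluster2 | \
  kubectl apply -f - --context=cluster1
```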

Cross-Cluster ServiceEntry Example

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-svc
spec:
  hosts:
    - svc.cluster.remote
  location: MESH_INTERNAL
  ports:
    - number: 80
      name: http
      protocol: HTTP
  resolution: DNS
  endpoints:
    - address: 10.0.0.1
      locality: region/zone/cluster2

Key Features

  • Full Service Mesh: Traffic management, observability, and security across clusters
  • Automatic mTLS: All cross-cluster traffic is encrypted by default
  • Advanced Routing: Canary deployments, traffic splitting, and fault injection across clusters
  • High Overhead: Each pod runs an Envoy sidecar (~100MB memory, ~5-10ms latency)
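The "failover via traffic policies" behavior noted in the comparison table is expressed through resources like a DestinationRule. A hedged sketch — the service name, regions, and thresholds are illustrative, and outlier detection must be configured for locality-aware failover to take effect:

```yaml
# Hypothetical policy: prefer local endpoints, fail over to the remote
# region when local endpoints are ejected by outlier detection.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-service-failover
spec:
  host: my-service.default.svc.cluster.local
  trafficPolicy:
    outlierDetection:              # required to trigger locality failover
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 60s
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
          - from: region1          # cluster1's region
            to: region2            # cluster2's region
```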

Linkerd Multicluster: Lightweight Approach

Linkerd’s multicluster takes a simpler approach. A service mirror controller watches services in the remote cluster and creates “mirror” services locally. Traffic is routed through a lightweight Linkerd proxy.

Architecture

Each cluster runs Linkerd. The linkerd-multicluster component installs a service mirror controller that watches services exported from the remote cluster and creates mirror services in the local cluster.
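Roughly, exporting a service named `my-service` from a cluster linked as `cluster2` produces a mirror Service like the following in the local cluster (a sketch of generated output — exact labels may vary by Linkerd version, and the object is created by the controller, never applied by hand):

```yaml
# Created automatically by the service mirror controller.
apiVersion: v1
kind: Service
metadata:
  name: my-service-cluster2        # <original name>-<link cluster name>
  namespace: default
  labels:
    mirror.linkerd.io/mirrored-service: "true"
    mirror.linkerd.io/cluster-name: cluster2
spec:
  ports:
    - port: 80
```

Its endpoints point at the remote cluster's multicluster gateway, so traffic to the mirror name transparently crosses the cluster boundary.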

Installation

# Install Linkerd multicluster
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
linkerd multicluster install | kubectl apply -f -
# Link cluster2 into cluster1 (generate credentials from cluster2,
# apply them to cluster1)
linkerd multicluster link --cluster-name cluster2 | kubectl apply -f -

# On cluster2 (the exporting cluster), mark a service for mirroring
kubectl label svc my-service mirror.linkerd.io/exported=true
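Once linked, the connection can be verified from the consuming cluster with the multicluster extension's own checks (a sketch; both subcommands are part of the linkerd CLI):

```shell
# Confirm the link, credentials, and gateway health
linkerd multicluster check

# List remote gateways and their probe latency
linkerd multicluster gateways
```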

Docker Compose (for local testing)

version: "3.8"
services:
  linkerd-controller:
    image: cr.l5d.io/linkerd/controller:stable-2.16.0
    ports:
      - "8443:8443"
      - "9995:9995"
    environment:
      - LINKERD_CLUSTER_NAME=test-cluster
    volumes:
      - ./linkerd-config:/var/lib/linkerd/config

Key Features

  • Minimal Overhead: Linkerd’s proxy is written in Rust — lower memory and latency than Envoy
  • Simple Setup: linkerd multicluster link command handles the entire connection
  • Service Mirroring: Exported services appear as local services with DNS resolution
  • No Traffic Management: Linkerd multicluster doesn’t support advanced traffic policies across clusters

Why Use Multi-Cluster Service Connectivity?

Running services across multiple Kubernetes clusters provides several critical advantages that single-cluster deployments cannot match.

Disaster recovery is the most compelling use case. When a cluster goes down — whether from a cloud provider outage, misconfiguration, or hardware failure — multi-cluster connectivity enables automatic failover to a healthy cluster. This is essential for services with strict SLA requirements where even minutes of downtime are unacceptable.

Geographic distribution reduces latency for globally distributed user bases. By running clusters in multiple regions and using multi-cluster connectivity, requests can be served from the nearest cluster while still maintaining a unified service catalog. For compliance and data residency requirements, multi-cluster setups allow you to keep user data in specific regions while maintaining centralized service discovery.

Team and workload isolation is another key driver. Organizations often separate clusters by team, environment, or workload type (e.g., production vs. staging). Multi-cluster connectivity allows these isolated clusters to communicate when needed — for example, a frontend service in one cluster calling a backend API in another — without sharing the same control plane.

For teams evaluating broader multi-cluster orchestration, our Karmada vs Liqo vs Submariner comparison covers federation-level management, while our service mesh observability guide explores monitoring across mesh deployments.

Choosing the Right Multi-Cluster Approach

| Scenario | Recommended Tool |
|---|---|
| eBPF-capable clusters | Cilium ClusterMesh — kernel-level performance |
| Full service mesh with observability | Istio Multi-Primary — complete feature set |
| Minimal overhead, simple setup | Linkerd Multicluster — lightweight proxy |
| Existing Cilium CNI | Cilium ClusterMesh — natural extension |
| Existing Istio deployment | Istio Multi-Primary — reuse existing mesh |
| No service mesh yet | Linkerd Multicluster — easiest to adopt |

FAQ

What is the difference between multi-cluster networking and multi-cluster federation?

Multi-cluster networking (the focus of this article) connects services across independent clusters so they can communicate. Multi-cluster federation (like KubeFed or Karmada) manages resources across clusters from a central control plane — deploying, scaling, and scheduling workloads. Networking focuses on connectivity; federation focuses on orchestration.

Can I mix different CNI plugins across clusters?

For Cilium ClusterMesh, all clusters must run Cilium as the CNI. For Istio and Linkerd, the underlying CNI can differ — Istio and Linkerd operate at the service mesh layer, above the CNI. However, using the same CNI across clusters simplifies troubleshooting and configuration.

How does service discovery work across clusters?

Each tool handles DNS differently. Cilium synchronizes service endpoints via etcd, making services appear in CoreDNS globally. Istio uses federated control planes that share endpoint information. Linkerd creates mirror services — when you export a service from cluster B, a mirror service with the same DNS name appears in cluster A.
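Concretely, the names a client resolves differ by tool. An illustrative sketch — the service and cluster names are hypothetical, and the Istio ServiceEntry host typically only resolves through the sidecar unless Istio's DNS proxying is enabled:

```shell
# Cilium: one global name; endpoints are merged across clusters
nslookup my-service.default.svc.cluster.local

# Linkerd: the mirror name embeds the source cluster
nslookup my-service-cluster2.default.svc.cluster.local

# Istio: the host declared in the ServiceEntry
nslookup svc.cluster.remote
```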

What is the performance overhead of each approach?

Cilium ClusterMesh has the lowest overhead because it uses eBPF at the kernel level — no proxy process per pod. Istio adds ~100MB memory and 5-10ms latency per Envoy sidecar. Linkerd’s Rust-based proxy adds ~50MB memory and 2-5ms latency per pod, making it lighter than Istio but heavier than Cilium.

Can I use multi-cluster connectivity for database replication?

Yes, but with caveats. Multi-cluster service connectivity provides network-level connectivity — it does not handle application-level replication. You would use the connectivity to link database replicas (e.g., PostgreSQL streaming replication) across clusters, but the replication logic is handled by the database itself, not the networking layer.

How do I handle secrets across clusters?

Multi-cluster connectivity does not automatically synchronize secrets. You need a separate secrets management solution — see our secrets rotation guide for options like Vault or External Secrets Operator that can distribute secrets across clusters.