Self-Hosted Distributed Locking: etcd vs ZooKeeper vs Consul vs Redis 2026

When multiple services need coordinated access to a shared resource, a distributed lock is the answer. Whether you’re running scheduled jobs across kubernetes pods, managing leader election in a microservices cluster, or preventing double-processing in a distributed queue, picking the right locking primitive matters.

This guide compares the four most popular self-hosted distributed locking solutions: etcd, Apache ZooKeeper, HashiCorp Consul, and Redis (Redlock algorithm). We’ll cover how each works, provide docker deployment configs, and help you choose the right tool for your infrastructure.

Why You Need Distributed Locking

In a single-server application, a mutex or file lock solves contention easily. But when you scale to multiple nodes, local locks become meaningless — two processes on different servers can both think they hold “the lock.” Distributed locking solves this by providing a single source of truth that all nodes agree on.

Common use cases include:

Leader election — ensuring only one instance of a service performs a critical task
Job scheduling — preventing duplicate cron jobs across multiple workers
Rate limiting — coordinating API rate limits across a cluster of application servers
Resource provisioning — ensuring only one process creates a database schema or allocates an IP
Ordering guarantees — serializing writes to shared state in event-driven architectures

If you’re already running distributed systems, you likely need distributed locking. The question is which tool to use — and the answer depends on your existing infrastructure and consistency requirements.

How Distributed Locks Work (The Theory)

Before comparing tools, it’s worth understanding the core pattern. A distributed lock needs three guarantees:

Mutual exclusion — only one holder at a time
Deadlock freedom — locks are eventually released (usually via TTL/lease)
Fault tolerance — the system survives node failures without corrupting lock state

Most implementations use one of two approaches:

Compare-and-swap (CAS) — write a value only if the key doesn’t exist (etcd, Consul)
Ephemeral nodes — create a temporary node that disappears when the client disconnects (ZooKeeper)

Redis uses a variant called Redlock, which acquires locks on multiple independent Redis instances to reduce the probability of split-brain scenarios.

Let’s look at each tool in detail.

etcd — The Kubernetes-Native Lock Store

GitHub: etcd-io/etcd · ⭐ 51,600+ · Updated: April 2026 · Language: Go

etcd is a strongly consistent, distributed key-value store based on the Raft consensus algorithm. It’s best known as the backing store for Kubernetes, and its lock primitives are battle-tested at massive scale.

How etcd Locks Work

etcd provides built-in concurrency primitives through its concurrency package. Locks are implemented as lease-key pairs:

Create a lease with a TTL (time-to-live)
Attempt to write a key with CreateRevision == 0 (meaning it doesn’t exist)
If the write succeeds, you hold the lock
The lease auto-expires if your process crashes, preventing deadlocks

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
import "go.etcd.io/etcd/client/v3/concurrency"

// Create a session with a 10-second lease TTL
sess, _ := concurrency.NewSession(client, concurrency.WithTTL(10))
defer sess.Close()

// Acquire a lock on "my-resource"
mutex := concurrency.NewMutex(sess, "/locks/my-resource")
mutex.Lock(context.TODO())
// Critical section — only one holder at a time
mutex.Unlock(context.TODO())

Docker Deployment

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
version: "3.8"
services:
  etcd1:
    image: bitnami/etcd:latest
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd1
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd1:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd1:2379
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
      - ETCD_INITIAL_CLUSTER=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      - ETCD_INITIAL_CLUSTER_STATE=new
    ports:
      - "2379:2379"
      - "2380:2380"

  etcd2:
    image: bitnami/etcd:latest
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd2
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd2:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd2:2379
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
      - ETCD_INITIAL_CLUSTER=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      - ETCD_INITIAL_CLUSTER_STATE=new
    ports:
      - "2381:2379"
      - "2382:2380"

  etcd3:
    image: bitnami/etcd:latest
    environment:
      - ALLOW_NONE_AUTHENTICATION=yes
      - ETCD_NAME=etcd3
      - ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd3:2380
      - ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd3:2379
      - ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
      - ETCD_INITIAL_CLUSTER=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      - ETCD_INITIAL_CLUSTER_STATE=new
    ports:
      - "2383:2379"
      - "2384:2380"

Deploy with docker compose up -d and connect to etcd1:2379. The three-node cluster tolerates one node failure.

Apache ZooKeeper — The OG Distributed Coordinator

GitHub: apache/zookeeper · ⭐ 12,700+ · Updated: April 2026 · Language: Java

ZooKeeper is the original distributed coordination service, developed at Yahoo and now an Apache top-level project. It uses a hierarchical namespace (like a filesystem) with ephemeral and sequential znodes to implement locks.

How ZooKeeper Locks Work

ZooKeeper’s lock recipe uses ephemeral sequential znodes:

Create an ephemeral sequential znode under a lock path (e.g., /lock/lock-00001)
Get all children of the lock path and sort them
If your znode has the lowest sequence number, you hold the lock
Otherwise, watch the znode immediately before yours
When that znode is deleted (client disconnects or releases), retry

This approach is elegant because ZooKeeper automatically cleans up ephemeral nodes when a client session ends — no TTL management needed.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
import org.apache.curator.framework.recipes.locks.InterProcessMutex;

CuratorFramework client = CuratorFrameworkFactory.newClient(
    "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
client.start();

InterProcessMutex lock = new InterProcessMutex(client, "/locks/my-resource");
lock.acquire();
try {
    // Critical section
} finally {
    lock.release();
}

The Curator library (from Apache) provides a production-ready implementation — don’t write ZooKeeper lock logic from scratch.

Docker Deployment

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
version: "3.8"
services:
  zk1:
    image: zookeeper:3.9
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zk1:2888:3888;2181 server.2=zk2:2888:3888;2181 server.3=zk3:2888:3888;2181
    ports:
      - "2181:2181"

  zk2:
    image: zookeeper:3.9
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zk1:2888:3888;2181 server.2=zk2:2888:3888;2181 server.3=zk3:2888:3888;2181
    ports:
      - "2182:2181"

  zk3:
    image: zookeeper:3.9
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zk1:2888:3888;2181 server.2=zk2:2888:3888;2181 server.3=zk3:2888:3888;2181
    ports:
      - "2183:2181"

HashiCorp Consul — Service Mesh with Built-In Locking

GitHub: hashicorp/consul · ⭐ 29,800+ · Updated: April 2026 · Language: Go

Consul is primarily a service discovery and service mesh tool, but its KV store includes a built-in locking mechanism using sessions. If you already run Consul for service mesh, you get distributed locking for free.

How Consul Locks Work

Consul locks are built on sessions:

Create a session with a TTL and behavior delete (lock released on session expiry)
Attempt a KV PUT with ?acquire=<session-id> — this is atomic
If the return value is true, you hold the lock
Release with ?release=<session-id> or let the TTL expire

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
# Create a session
SESSION_ID=$(curl -s -X PUT http://localhost:8500/v1/session/create \
  -d '{"Name": "my-lock", "Behavior": "delete", "TTL": "30s"}' | jq -r '.ID')

# Acquire lock
curl -s -X PUT "http://localhost:8500/v1/kv/locks/my-resource?acquire=$SESSION_ID"
# Returns: true if acquired, false if held by another session

# Release lock
curl -s -X PUT "http://localhost:8500/v1/kv/locks/my-resource?release=$SESSION_ID"

Docker Deployment

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
version: "3.8"
services:
  consul1:
    image: hashicorp/consul:latest
    command: "agent -server -bootstrap-expect=3 -ui -bind=0.0.0.0 -client=0.0.0.0"
    environment:
      - CONSUL_BIND_ADDRESS=0.0.0.0
    ports:
      - "8500:8500"
      - "8600:8600/udp"
    volumes:
      - consul1-data:/consul/data

  consul2:
    image: hashicorp/consul:latest
    command: "agent -server -join=consul1 -bind=0.0.0.0 -client=0.0.0.0"
    environment:
      - CONSUL_BIND_ADDRESS=0.0.0.0
    ports:
      - "8501:8500"
    volumes:
      - consul2-data:/consul/data

  consul3:
    image: hashicorp/consul:latest
    command: "agent -server -join=consul1 -bind=0.0.0.0 -client=0.0.0.0"
    environment:
      - CONSUL_BIND_ADDRESS=0.0.0.0
    ports:
      - "8502:8500"
    volumes:
      - consul3-data:/consul/data

volumes:
  consul1-data:
  consul2-data:
  consul3-data:

Consul also provides a beautiful web UI at http://localhost:8500 where you can inspect KV locks, sessions, and service health.

Redis Redlock — High-Speed Locking with Trade-Offs

GitHub: redis/redis · ⭐ 73,800+ · Updated: April 2026 · Language: C

Redis is the most popular in-memory data store, and its Redlock algorithm provides distributed locking with sub-millisecond latency. The trade-off: Redlock sacrifices some consistency guarantees for raw speed.

How Redis Redlock Works

The Redlock algorithm (described by Salvatore Sanfilippo) works by:

Attempt to acquire the lock on N independent Redis instances (usually 5)
Use a small timeout (much less than the lock TTL) for each attempt
If you acquire the lock on a majority (N/2 + 1) instances, you hold it
The effective lock TTL is reduced by the time spent acquiring

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
import redis
from redis.lock import Lock

# Single Redis instance (simplified — production Redlock uses 5 instances)
client = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Acquire a lock with 10-second TTL
lock = Lock(client, "my-resource", timeout=10)
if lock.acquire(blocking=False):
    try:
        # Critical section
        pass
    finally:
        lock.release()
else:
    print("Lock held by another process")

For production Redlock, use the Redlock class from redis-py which handles multiple instances automatically.

Docker Deployment

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
version: "3.8"
services:
  redis1:
    image: redis:7-alpine
    command: ["redis-server", "--save", "", "--appendonly", "no"]
    ports:
      - "6379:6379"

  redis2:
    image: redis:7-alpine
    command: ["redis-server", "--save", "", "--appendonly", "no"]
    ports:
      - "6380:6379"

  redis3:
    image: redis:7-alpine
    command: ["redis-server", "--save", "", "--appendonly", "no"]
    ports:
      - "6381:6379"

  redis4:
    image: redis:7-alpine
    command: ["redis-server", "--save", "", "--appendonly", "no"]
    ports:
      - "6382:6379"

  redis5:
    image: redis:7-alpine
    command: ["redis-server", "--save", "", "--appendonly", "no"]
    ports:
      - "6383:6379"

For production, add --appendonly yes and configure min-replicas-to-write to ensure data survives restarts. See our distributed caching guide with Redis alternatives for a deeper look at Redis deployment patterns.

Comparison: Which Distributed Lock Should You Use?

Feature	etcd	ZooKeeper	Consul	Redis (Redlock)
Consensus	Raft	ZAB	Raft + gossip	None (Redlock)
Language	Go	Java	Go	C
GitHub Stars	51,600+	12,700+	29,800+	73,800+
Lock Primitive	Lease + CAS	Ephemeral znodes	Session + KV acquire	SET NX + expiry
Strong Consistency	Yes	Yes	Yes	No (eventual)
Lock TTL	Yes (lease)	Yes (ephemeral)	Yes (session)	Yes (EXPIRE)
Auto-Release on Crash	Yes	Yes	Yes	Yes (with TTL)
Latency (local)	~5ms	~10ms	~5ms	<1ms
Docker Image	`bitnami/etcd`	`zookeeper:3.9`	`hashicorp/consul`	`redis:7-alpine`
Cluster Size	3-5 nodes	3-5 nodes	3-5 nodes	5 instances (Redlock)
Best For	Kubernetes ecosystems	Hadoop/big data stacks	Service mesh users	High-throughput apps
Resource Usage	Low (~100MB)	High (~500MB JVM)	Low (~150MB)	Low (~50MB)
TLS Support	Yes	Yes	Yes	Yes

Decision Matrix

Choose etcd if you run Kubernetes — it’s already in your cluster, and the Go client is kafkalent.
Choose ZooKeeper if you run Hadoop, Kafka, or other JVM-based distributed systems.
Choose Consul if you already use it for service discovery and want a unified coordination layer.
Choose Redis Redlock if you need maximum throughput and can tolerate weaker consistency guarantees (e.g., rate limiting, cache stampede prevention).

For a deeper look at how etcd, Consul, and ZooKeeper compare as service discovery platforms, see our etcd vs Consul vs ZooKeeper service discovery guide.

Practical Lock Patterns

Leader Election Pattern

1
2
3
4
5
# etcd leader election
ETCDCTL_API=3 etcdctl lease grant 30   # 30-second lease
# Use the lease ID to create an election key
# The node holding the key is the leader
# If the leader crashes, the lease expires and another node takes over

Distributed Cron Pattern

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
# Consul-based distributed cron
SESSION=$(curl -s -X PUT http://consul:8500/v1/session/create \
  -d '{"TTL": "60s", "Behavior": "delete"}' | jq -r '.ID')

ACQUIRED=$(curl -s -X PUT \
  "http://consul:8500/v1/kv/cron/daily-backup?acquire=$SESSION" | grep true)

if [ "$ACQUIRED" = "true" ]; then
    echo "Running backup..."
    # Run the job
    curl -s -X PUT "http://consul:8500/v1/kv/cron/daily-backup?release=$SESSION"
else
    echo "Another node is running the backup, skipping."
fi

Redis Semaphore Pattern

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
import redis

client = redis.Redis(host="localhost", port=6379)

# Semaphore: allow at most 3 concurrent workers
semaphore = client.pipeline()
semaphore.set("semaphore:worker", "1", nx=True, ex=30)
result = semaphore.execute()

if result[0]:
    print("Acquired semaphore slot")
    # Do work
    client.delete("semaphore:worker")
else:
    print("No slots available, try again later")

Monitoring and Debugging Locks

Whichever tool you choose, monitoring lock health is critical:

etcd: Use etcdctl endpoint status to check cluster health; watch for lease expiration metrics
ZooKeeper: Monitor zk_avg_latency, zk_outstanding_requests, and watch count via the mntr four-letter command
Consul: Check the /v1/health/state/critical endpoint and monitor session TTL expirations
Redis: Track blocked_clients (clients waiting for locks) and expired_keys metrics

For organizations building distributed systems with high-availability requirements, understanding lock behavior is as important as choosing the right tool. See our database high availability guide for related patterns on distributed coordination.

FAQ

What is the difference between a mutex and a distributed lock?

A mutex (mutual exclusion) works within a single process or machine using shared memory. A distributed lock coordinates across multiple machines over a network, requiring a consensus or coordination protocol to ensure only one holder exists at any time.

Is Redis Redlock safe for production use?

Redlock is debated in the distributed systems community. Martin Kleppmann published a well-known critique showing that Redlock can fail under certain clock-skew and garbage-collection scenarios. For use cases where correctness is critical (e.g., financial transactions), prefer etcd, ZooKeeper, or Consul which provide strong consistency guarantees. For rate limiting or cache stampede prevention, Redlock is perfectly adequate.

Can I use PostgreSQL for distributed locking?

Yes — PostgreSQL supports advisory locks (pg_advisory_lock) which work well for small clusters. However, PostgreSQL isn’t designed as a coordination service: it has higher latency than purpose-built tools, requires careful connection pooling, and doesn’t handle node failures as gracefully as Raft-based systems.

How do I prevent lock contention from becoming a bottleneck?

Design your system to minimize lock scope and duration. Use coarse-grained locks (lock a whole operation, not individual steps), keep critical sections short, and consider lock-free alternatives like partitioning (assign each resource to a specific node). If you’re seeing high contention, measure whether your chosen tool’s latency matches your throughput requirements.

What happens if the lock holder crashes without releasing the lock?

All four tools handle this automatically: etcd leases expire, ZooKeeper ephemeral nodes are deleted on session disconnect, Consul sessions time out, and Redis keys expire via TTL. The key is to set an appropriate TTL — long enough to complete the critical section, but short enough to recover from crashes quickly.

Should I run a dedicated cluster just for distributed locking?

If you already run etcd (for Kubernetes), ZooKeeper (for Kafka), or Consul (for service mesh), reuse the existing cluster — the marginal cost is negligible. If you need locking as a standalone service and don’t have any of these, a small Redis or etcd cluster is the lightest option.

Why You Need Distributed Locking

How Distributed Locks Work (The Theory)

etcd — The Kubernetes-Native Lock Store

How etcd Locks Work

Docker Deployment

Apache ZooKeeper — The OG Distributed Coordinator

How ZooKeeper Locks Work

Docker Deployment

HashiCorp Consul — Service Mesh with Built-In Locking

How Consul Locks Work

Docker Deployment

Redis Redlock — High-Speed Locking with Trade-Offs

How Redis Redlock Works

Docker Deployment

Comparison: Which Distributed Lock Should You Use?

Decision Matrix

Practical Lock Patterns

Leader Election Pattern

Distributed Cron Pattern

Redis Semaphore Pattern

Monitoring and Debugging Locks

FAQ

What is the difference between a mutex and a distributed lock?

Is Redis Redlock safe for production use?

Can I use PostgreSQL for distributed locking?

How do I prevent lock contention from becoming a bottleneck?

What happens if the lock holder crashes without releasing the lock?

Should I run a dedicated cluster just for distributed locking?

Related Posts

Ceph vs GlusterFS vs MooseFS: Best Self-Hosted Distributed Storage 2026

Anthias vs Xibo vs ScreenLite: Best Self-Hosted Digital Signage 2026

Apache Iceberg vs Apache Hudi vs Delta Lake: Best Open Data Lakehouse Formats 2026