Introduction
In distributed systems, transient failures are inevitable. Network timeouts, database deadlocks, temporary service unavailability, and rate limit responses all require applications to handle failure gracefully. Retry libraries provide structured, configurable mechanisms for retrying failed operations with exponential backoff, jitter, and circuit-breaking patterns.
Manually implementing retry logic with for loops and sleep() calls leads to brittle, error-prone code. Dedicated retry libraries handle edge cases like max retry counts, backoff strategies, retry predicates, and integration with monitoring systems. In this article, we compare four leading open-source retry libraries across different language ecosystems: Tenacity (Python, 8,674 ⭐), Polly (.NET, 14,189 ⭐), Spring Retry (Java, 2,268 ⭐), and retry-go (Go, 2,929 ⭐).
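To make those edge cases concrete, here is a minimal plain-Python sketch of what a retry library takes care of: a maximum attempt count, a retry predicate by exception type, and capped exponential backoff with jitter. The function name `retry_call` and all of its parameters are illustrative assumptions, not the API of any library compared here.

```python
import random
import time


def retry_call(func, max_attempts=3, base_delay=0.5, max_delay=30.0,
               retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(), retrying only on the given exception types.

    Hypothetical helper for illustration. Waits use capped exponential
    backoff with full jitter; `sleep` is injectable so tests run instantly.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the last error
            # Ceiling grows as base, 2*base, 4*base, ... capped at max_delay
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))  # full jitter
```

Exceptions outside `retry_on` propagate on the first failure, which is the behavior you want for permanent errors.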
Comparison Table
| Feature | Tenacity (Python) | Polly (.NET) | Spring Retry (Java) | retry-go (Go) |
|---|---|---|---|---|
| Stars | 8,674 | 14,189 | 2,268 | 2,929 |
| Language | Python | C# / .NET | Java | Go |
| Backoff Strategies | fixed, exponential, random, incrementing, chained | constant, linear, exponential | fixed, exponential, random | fixed, exponential, random |
| Jitter Support | Yes (wait_random_exponential, wait_exponential_jitter) | Yes (decorrelated jitter) | Yes (ExponentialRandomBackOffPolicy) or custom BackOffPolicy | Yes (RandomDelay with MaxJitter) |
| Retry Predicates | Exception types, result checks | Exception types, result, custom | Exception types, result | Error types, response check |
| Async Support | Native (asyncio) | Native (async/await) | @Async annotation | Native goroutines |
| Callbacks | before, after, before_sleep | onRetry, onFallback | RecoveryCallback | OnRetry func |
| Statistics | Built-in (attempt_number) | Via PolicyRegistry | RetryContext | Manual tracking |
| Last Updated | Jun 2026 | Jun 2026 | Jun 2026 | Feb 2026 |
Tenacity — Python’s Retry Powerhouse
Tenacity is the gold standard for retry logic in Python, used by projects ranging from data pipelines to API clients and distributed task queues. It’s a fork of the older retrying library with significant improvements in usability and features.
Key Features:
- Decorator-based API: Minimal code changes to add retry behavior
- Wait strategies: wait_fixed, wait_exponential, wait_random, wait_chain
- Stop conditions: stop_after_attempt, stop_after_delay, stop_never, stop_all
- Retry predicates: retry_if_exception_type, retry_if_result, custom callables
- Before/after hooks: before, after, before_sleep, with logging helpers such as before_log
Basic Usage:
Async Usage with asyncio:
Tenacity’s combination of decorator simplicity and fine-grained configuration makes it the most productive choice for Python developers. The wait_chain feature is particularly powerful for implementing multi-phase backoff strategies (e.g., retry every 2 seconds for the first 3 attempts, then switch to exponential backoff).
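The multi-phase schedule described above can be sketched as a plain-Python generator. This is not Tenacity's implementation of wait_chain, only an illustration of the resulting delays; the function name `multi_phase_delays` and its parameters are assumptions made for the example.

```python
from itertools import islice


def multi_phase_delays(fixed_delay=2.0, fixed_retries=3, max_delay=60.0):
    """Yield wait times in seconds (hypothetical helper).

    Phase 1: a fixed delay for the first `fixed_retries` retries.
    Phase 2: the delay doubles on every retry, capped at `max_delay`.
    """
    for _ in range(fixed_retries):
        yield fixed_delay
    n = 1
    while True:
        yield min(max_delay, fixed_delay * 2 ** n)
        n += 1


# First eight waits: three fixed, then doubling until the cap is reached
schedule = list(islice(multi_phase_delays(), 8))
# schedule == [2.0, 2.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0]
```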
Polly — .NET’s Resilience Swiss Army Knife
Polly is much more than a retry library — it’s a comprehensive resilience and transient-fault-handling library for .NET. In addition to retry, it provides circuit breaker, timeout, bulkhead isolation, fallback, and hedging policies.
Key Features:
- Policy composition: Combine retry with circuit breaker, timeout, and fallback
- Policy registry: Centralized management of named policies with metrics
- Fluent API: Intuitive builder pattern for policy construction
- HttpClient integration: Native IHttpClientFactory extensions
- Distributed caching: Polly caching policy with pluggable cache providers
Basic Usage:
Combining Policies (Retry + Circuit Breaker):
Polly’s policy composition architecture is its killer feature: you can layer retry, circuit breaker, timeout, and fallback policies into a single coherent resilience strategy. This makes Polly the best choice for .NET applications that need comprehensive fault tolerance beyond simple retries.
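The layering idea is language-neutral, so here is a rough sketch of it in Python rather than C#. It is not Polly's API: `with_retry`, `with_fallback`, and `fetch_profile` are hypothetical names, and the retry layer catches every exception only to keep the example short.

```python
import functools


def with_retry(max_attempts):
    """Inner layer: re-invoke the wrapped function up to max_attempts times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # hand the failure to the next layer out
        return wrapper
    return decorator


def with_fallback(default):
    """Outer layer: return a default once the inner layers have given up."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                return default
        return wrapper
    return decorator


attempts = []


@with_fallback(default="cached-value")  # outermost layer, consulted last
@with_retry(max_attempts=3)             # innermost layer, wraps the call itself
def fetch_profile():
    attempts.append(1)
    raise ConnectionError("service down")


result = fetch_profile()  # three failed attempts, then the fallback value
```

The order of the layers matters: with the fallback outermost, it only fires after the retry layer has exhausted its attempts.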
Spring Retry — Java’s Declarative Approach
Spring Retry integrates retry capabilities directly into the Spring Framework ecosystem. It supports both annotation-based declarative retry and programmatic RetryTemplate usage.
Key Features:
- @Retryable annotation: Declarative retry on Spring-managed beans
- @Recover fallback: Automatic fallback method invocation after exhaustion
- BackOffPolicy implementations: ExponentialBackOffPolicy, FixedBackOffPolicy, UniformRandomBackOffPolicy
- RetryTemplate: Programmatic retry with callback-based API
- Spring Boot integration: Minimal setup in Spring Boot applications (typically @EnableRetry on a configuration class plus Spring AOP on the classpath)
Basic Usage — Declarative:
Programmatic Usage with RetryTemplate:
Spring Retry’s tight integration with the Spring ecosystem is its main advantage: @Retryable works seamlessly with @Transactional, @Async, and Spring’s AOP proxy system. The @Recover annotation provides a clean fallback pattern that keeps error handling separate from business logic.
retry-go — Go’s Minimalist Solution
retry-go delivers retry capabilities with Go’s characteristic minimalism. It’s a single-file library with zero external dependencies, focused on doing one thing well.
Key Features:
- Functional options pattern: Clean, extensible configuration via variadic options
- Context support: Native context.Context integration for cancellation and deadlines
- Delay type flexibility: FixedDelay, BackOffDelay, CombineDelay
- Max jitter: Built-in jitter to prevent thundering herd
- onRetry callback: Logging and metrics integration
Basic Usage:
Retry with Result Caching:
retry-go’s functional options pattern is idiomatic Go and integrates naturally with Go’s error handling conventions. The library is small enough (~300 lines) to vendor directly into your project, making it ideal for teams that prefer dependency minimization.
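The cancellation behavior listed under context support is worth illustrating, because a retry loop that cannot be interrupted will happily sleep through a shutdown. The sketch below shows the idea in Python, using `threading.Event` as a stand-in for Go's context.Context. It is not retry-go's API; `retry_until_cancelled` and `RetryAborted` are hypothetical names.

```python
import threading


class RetryAborted(Exception):
    """Raised when retries stop due to cancellation or exhaustion."""


def retry_until_cancelled(func, cancel, max_attempts=5, delay=0.1):
    """Retry func() until it succeeds, attempts run out, or cancel is set."""
    last_error = None
    for _ in range(max_attempts):
        if cancel.is_set():
            break
        try:
            return func()
        except Exception as exc:
            last_error = exc
            # Event.wait doubles as an interruptible sleep: it returns True
            # as soon as cancel is set, without waiting out the full delay.
            if cancel.wait(delay):
                break
    raise RetryAborted("cancelled or attempts exhausted") from last_error
```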
Choosing the Right Retry Library
- Choose Tenacity if: You’re building Python services, data pipelines, or async applications. Tenacity’s decorator API is the most ergonomic and its async support is first-class.
- Choose Polly if: You’re on .NET and need a complete resilience strategy (not just retry). Polly’s policy composition enables retry + circuit breaker + fallback in one coherent pipeline.
- Choose Spring Retry if: You’re in the Spring/Java ecosystem and want declarative, annotation-driven retry with automatic @Recover fallback methods. The Spring Boot auto-configuration makes setup trivial.
- Choose retry-go if: You want a lightweight, dependency-free Go library that follows Go idioms. Its functional options pattern and context support make it feel native to the language.
For related resilience patterns, see our guides on circuit breaker libraries for preventing cascading failures, and rate limiter implementations for upstream protection. For broker-based resilience, our message broker HA guide covers retry-aware messaging patterns.
FAQ
What’s the difference between retry and circuit breaker?
Retry re-attempts a failed operation immediately or after a short delay, assuming the failure is transient (network blip, brief overload). Circuit breaker stops calling a failing service entirely after a threshold of failures, preventing cascading failures and giving the downstream service time to recover. They’re complementary — use retry for transient errors and circuit breaker for systemic failures.
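To show how a circuit breaker differs from a retry loop, here is a minimal Python sketch of the state it keeps: a failure counter and the time at which the circuit opened. The class and parameter names are assumptions for illustration, and production implementations add more (thread safety, sliding failure windows, richer half-open handling).

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast without touching the downstream service
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: let one trial call through (half-open)
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count
        self.failures = 0
        self.opened_at = None
        return result
```

Unlike retry, the breaker never calls the function again on its own; it only decides whether the next call is allowed to proceed.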
How do I prevent retry storms (thundering herd)?
Use jitter — random variation in the delay between retries. All four libraries support jitter. Without it, multiple clients hitting the same backoff schedule will synchronize their retries, overloading the recovering service. With jitter, retries are spread across time, giving the service breathing room.
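A small simulation makes the synchronization problem visible. In this sketch (the function name and parameters are assumptions), ten clients that start failing at the same moment either share one exponential schedule or each draw their delays with full jitter.

```python
import random


def retry_schedule(retries, base=1.0, rng=None):
    """Return the cumulative times (seconds) at which a client retries.

    With rng=None the delays are plain exponential (base, 2*base, ...).
    Otherwise each delay is drawn uniformly from [0, exponential ceiling],
    which is the "full jitter" variant.
    """
    elapsed, schedule = 0.0, []
    for n in range(retries):
        ceiling = base * 2 ** n
        elapsed += ceiling if rng is None else rng.uniform(0, ceiling)
        schedule.append(elapsed)
    return schedule


# Ten clients without jitter all retry at exactly the same instants
no_jitter = {tuple(retry_schedule(3)) for _ in range(10)}

# Ten clients with jitter, each with its own random source, spread out
with_jitter = {tuple(retry_schedule(3, rng=random.Random(seed)))
               for seed in range(10)}
```

Here `no_jitter` collapses to a single schedule, `(1.0, 3.0, 7.0)`, while `with_jitter` holds a different schedule per client.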
Should I retry on all errors or only specific ones?
Retry only on transient errors (network timeouts, 429 Too Many Requests, 503 Service Unavailable, temporary deadlocks). Never retry on permanent errors (400 Bad Request, 401 Unauthorized, validation failures) — retrying won’t fix them and wastes resources. All four libraries support error type filtering to enforce this distinction.
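As a sketch of that distinction, a retry predicate can be a small function that the retry loop consults before each re-attempt. The names below are hypothetical, and the status list only covers the codes mentioned above; real services may need more.

```python
# Status codes treated as transient in this sketch (from the text above)
TRANSIENT_STATUS = frozenset({429, 503})


def should_retry(status_code=None, error=None):
    """Return True only for failures that a later attempt might fix."""
    if error is not None:
        # Network-level problems are usually transient
        return isinstance(error, (TimeoutError, ConnectionError))
    # Anything not explicitly listed (400, 401, ...) is treated as permanent
    return status_code in TRANSIENT_STATUS
```

Defaulting unknown failures to "do not retry" is the conservative choice: a missed retry costs one failed request, while retrying a permanent error multiplies load for no benefit.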
How many retry attempts should I configure?
Start with 3 attempts (1 initial + 2 retries) for user-facing operations and 5 for background tasks. Needing more than 5 retries usually indicates a systemic problem that retrying can’t solve, so a circuit breaker or alerting should take over. Also cap the time spent: set a maximum delay per retry and an overall limit on total retry duration (e.g., 30-60 seconds).
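To check whether a given configuration fits inside such a time budget, you can add up the waits between attempts. The helper below is a hypothetical sketch that assumes plain exponential backoff without jitter and ignores the time the calls themselves take.

```python
def worst_case_wait(max_attempts, base_delay, max_delay):
    """Total seconds spent sleeping if every attempt fails.

    There are max_attempts - 1 waits; each doubles from base_delay
    and is capped at max_delay.
    """
    return sum(min(max_delay, base_delay * 2 ** n)
               for n in range(max_attempts - 1))


user_facing = worst_case_wait(3, base_delay=1.0, max_delay=30.0)  # 1 + 2
background = worst_case_wait(5, base_delay=1.0, max_delay=30.0)   # 1 + 2 + 4 + 8
```

With a 1-second base delay, 3 attempts wait at most 3 seconds in total and 5 attempts at most 15 seconds, both comfortably inside a 30-60 second budget.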
Can I persist retry state across application restarts?
None of the four libraries persists retry state out of the box; the state lives in memory and is lost on restart. For durable retry across restarts, use a message queue (RabbitMQ, NATS) or task queue (Celery, BullMQ) with built-in retry mechanisms. Spring Retry can be combined with Spring Batch for checkpointed retry in long-running jobs.
How do I monitor retry behavior in production?
All four libraries provide hooks for logging and metrics. In production, track: retry attempt count per operation (spiking counts indicate growing instability), success-after-retry rate (healthy if >90%), and max-attempts-exhausted events (actionable failures). Export to Prometheus/Grafana or your APM tool for dashboards and alerting.
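The three signals above can be derived from two facts per operation: how many attempts it used and whether it finally succeeded. The sketch below keeps them in an in-memory counter; the class and counter names are assumptions, and in practice you would forward these values to your metrics client from the library's retry hook.

```python
from collections import Counter


class RetryMetrics:
    """In-memory tally of retry outcomes (illustrative only)."""

    def __init__(self):
        self.counts = Counter()

    def record(self, attempts_used, succeeded):
        """Record one finished operation."""
        self.counts["operations"] += 1
        self.counts["retries"] += attempts_used - 1
        if attempts_used > 1:
            self.counts["retried_operations"] += 1
            if succeeded:
                self.counts["success_after_retry"] += 1
        if not succeeded:
            self.counts["exhausted"] += 1  # actionable failure

    def success_after_retry_rate(self):
        """Share of retried operations that eventually succeeded."""
        retried = self.counts["retried_operations"]
        if retried == 0:
            return None  # nothing was retried yet
        return self.counts["success_after_retry"] / retried
```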