Why Self-Host Your Incident Management?
Commercial on-call and incident management platforms like PagerDuty, Opsgenie, and VictorOps have become prohibitively expensive for small teams and homelab operators. A basic PagerDuty plan starts at $21/user/month and quickly escalates once you need features like incident workflows, status pages, and advanced escalation policies. For a team of five, that is over $1,200 per year — money that could go toward better hardware instead.
Self-hosting your incident management stack solves multiple problems at once:
- No per-user pricing — run the software for your entire team without counting seats
- Unlimited alert volume — no artificial caps on how many incidents you can process
- Complete data ownership — incident histories, response times, and post-mortem notes never leave your infrastructure
- Deep integration flexibility — wire in any monitoring tool via webhooks without paying for premium connectors
- On-prem notification routing — route alerts through internal Gotify, Ntfy, or Matrix instances that external SaaS tools cannot reach
- Custom escalation logic — build multi-tier routing policies that match your exact organizational structure
Whether you manage a production Kubernetes cluster, a homelab with dozens of services, or a small business IT stack, self-hosted incident management gives you enterprise-grade alerting without the enterprise-grade price tag. This guide covers the three most capable open-source options available in 2026.
What Is Incident Management and On-Call Alerting?
Incident management is the process of detecting, responding to, and resolving service disruptions. A complete incident management system handles three core functions:
- Alert aggregation — collect alerts from multiple monitoring tools (Prometheus, Zabbix, Uptime Kuma, custom scripts) into a single pane of glass
- On-call rotation — automatically route alerts to the right person based on schedules, escalation policies, and team assignments
- Incident lifecycle tracking — log when incidents start, who acknowledges them, what actions were taken, and when they resolve
On-call alerting is the mechanism that ensures someone is always available to respond. It manages rotation schedules (daily, weekly, custom), escalation chains (if the primary person does not respond within 15 minutes, escalate to the secondary), and notification delivery (phone call, SMS, push notification, email, chat message).
Without proper incident management, alerts get lost in Slack channels, nobody knows who is on call, and critical outages go unnoticed until a customer complains. Self-hosted solutions give you the same functionality as PagerDuty and Opsgenie, but running on your own infrastructure.
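The lifecycle tracking described above can be pictured as a small state machine. The sketch below is illustrative only: the `Incident` class, its state names, and the `transition` method are assumptions made for this example and do not mirror the data model of any of the three tools.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple

# Allowed state changes in this illustrative model (not any tool's real schema).
TRANSITIONS = {
    "triggered": {"acknowledged", "resolved"},
    "acknowledged": {"resolved"},
    "resolved": set(),
}


@dataclass
class Incident:
    title: str
    state: str = "triggered"
    # Timeline of (timestamp, new_state, actor) entries for later review.
    timeline: List[Tuple[datetime, str, str]] = field(default_factory=list)

    def transition(self, new_state: str, actor: str,
                   at: Optional[datetime] = None) -> None:
        """Move to new_state and record who did it and when."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
        self.timeline.append((at or datetime.now(timezone.utc), new_state, actor))
```

Recording the actor and timestamp on every transition is what later makes response-time metrics and post-mortem notes possible.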
Feature Comparison: Grafana OnCall vs Alerta vs OpenDuty
| Feature | Grafana OnCall | Alerta | OpenDuty |
|---|---|---|---|
| License | AGPL-3.0 | Apache 2.0 | MIT |
| Language | Python (Django) + React | Python (Flask) + Vue.js | Python (Django) |
| GitHub Stars | 1,200+ | 1,400+ | 200+ |
| Latest Release | Actively developed | Actively developed | Community maintained |
| On-Call Schedules | ✅ Full rotation management | ⚠️ Manual assignment | ✅ Basic rotations |
| Escalation Policies | ✅ Multi-tier, time-based | ✅ Rule-based routing | ⚠️ Limited |
| Alert Deduplication | ✅ Automatic | ✅ Correlation engine | ❌ None |
| Alert Sources | 30+ integrations | 50+ integrations | ~10 integrations |
| Slack Integration | ✅ Native | ✅ Native | ❌ |
| Telegram Integration | ✅ Native | ⚠️ Via webhook | ❌ |
| Phone/SMS Calls | ✅ Via Twilio, Vonage | ⚠️ Via plugins | ⚠️ Via Twilio |
| Mobile App | ✅ iOS and Android | ❌ Web only | ❌ |
| API | ✅ REST | ✅ REST | ✅ REST |
| Multi-Tenant | ✅ Teams and organizations | ✅ Customers and environments | ❌ |
| Post-Mortem/Blameless | ✅ Built-in notes | ⚠️ Via annotations | ❌ |
| Status Page | ⚠️ Via Grafana | ❌ | ❌ |
| Docker Support | ✅ Official images | ✅ Official images | Community images |
| Prometheus Integration | ✅ Native | ✅ Native | ❌ |
| Database | PostgreSQL, MySQL, SQLite | PostgreSQL, MongoDB | PostgreSQL, MySQL |
| Complexity | Medium-High | Medium | Low |
Choosing the Right Tool
Grafana OnCall — Best overall choice if you already use Grafana or want the most feature-complete PagerDuty replacement. Excellent for teams that need proper on-call schedules, escalation policies, and mobile push notifications.
Alerta — Best for high-volume alert environments that need powerful deduplication and correlation. Ideal if you run multiple monitoring systems and need to consolidate thousands of alerts into actionable incidents.
OpenDuty — Best for simple setups that need basic on-call rotation without complexity. Good for small teams or homelab users who want lightweight PagerDuty-like functionality with minimal infrastructure.
Grafana OnCall: Complete Self-Hosted Setup Guide
Grafana OnCall is the most feature-rich open-source incident management platform. It provides on-call schedules, escalation chains, alert grouping, and direct integration with the Grafana ecosystem. It is the closest open-source equivalent to PagerDuty available today.
Architecture Overview
Grafana OnCall consists of three main components:
- Engine — Django-based backend that handles schedules, escalations, and alert processing
- Celery Workers — Async task processing for notifications, webhooks, and scheduled jobs
- UI — React frontend (can run standalone or embedded in Grafana)
Docker Compose Deployment
Create a directory for the deployment:
Create the docker-compose.yml file:
Generate a secure secret key:
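Any sufficiently long value from a cryptographically secure random source should work here. As one option, a minimal Python sketch using only the standard library; the function name `generate_secret_key` is made up for this example, and this is not necessarily the command the OnCall documentation uses.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


# Print one key so it can be pasted into the compose file.
print(generate_secret_key())
```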
Replace replace-with-your-secure-random-key in the compose file with the output, then start the stack:
The engine will be available at http://localhost:8080. The default admin credentials are created on first run — check the container logs for the initial setup instructions.
Configuring On-Call Schedules
After logging in, navigate to Schedules to create your first on-call rotation:
- Click Create Schedule
- Choose rotation type: Rolling (fixed shifts that cycle) or Custom
- Set shift length (e.g., 24 hours for daily rotation, 168 hours for weekly)
- Add team members to the rotation
- Set the rotation start date and time
You can create multiple layers — for example, a primary on-call layer and a secondary backup layer that only gets notified if the primary does not acknowledge within a specified window.
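To make shift length and rotation order concrete, here is a minimal sketch of how a rolling rotation could resolve who is on call at a given moment. The function name `on_call_at` and its parameters are hypothetical; Grafana OnCall computes this internally, and its real logic (layers, overrides, time zones) is more involved.

```python
from datetime import datetime, timedelta, timezone
from typing import Sequence


def on_call_at(members: Sequence[str], start: datetime,
               shift_hours: int, when: datetime) -> str:
    """Return the member on shift at `when` for a simple rolling rotation."""
    if when < start:
        raise ValueError("rotation has not started yet")
    # Number of whole shifts that have elapsed since the rotation began.
    shifts_elapsed = (when - start) // timedelta(hours=shift_hours)
    return members[shifts_elapsed % len(members)]


team = ["alice", "bob", "carol"]                       # example names
start = datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc)
print(on_call_at(team, start, 168, start + timedelta(days=8)))  # weekly shifts
```

With a 168-hour shift, the second week of the rotation belongs to the second member in the list, and the cycle wraps around after the last one.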
Setting Up Escalation Chains
Escalation chains define what happens when an alert fires:
- Go to Escalation Chains → Create
- Name the chain (e.g., “Production Critical”)
- Add escalation steps in order:
  - Notify user via push: sends a mobile notification to the on-call person
  - Wait 15 minutes: gives them time to acknowledge
  - Notify user via SMS: escalates to phone if there is no acknowledgment
  - Notify another user: escalates to a different team member
  - Repeat: loops back to the first step for continuous alerting
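The chain above can be simulated in a few lines to check that the timing behaves as intended. This is a sketch under stated assumptions: `Step`, `run_chain`, and the simulated clock are invented for illustration and are not part of Grafana OnCall.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    action: str        # "notify" or "wait"
    target: str = ""   # who to notify (for "notify" steps)
    channel: str = ""  # e.g. "push" or "sms"
    minutes: int = 0   # how long to wait (for "wait" steps)


def run_chain(steps: List[Step], ack_after_min: Optional[int],
              max_loops: int = 2) -> List[str]:
    """Walk the chain on a simulated clock; stop once the alert is acknowledged."""
    log: List[str] = []
    clock = 0
    for _ in range(max_loops):  # the "Repeat" step, bounded for the simulation
        for step in steps:
            if ack_after_min is not None and clock >= ack_after_min:
                return log
            if step.action == "notify":
                log.append(f"t+{clock}m {step.channel} -> {step.target}")
            else:
                clock += step.minutes
    return log


chain = [
    Step("notify", "primary", "push"),
    Step("wait", minutes=15),
    Step("notify", "primary", "sms"),
    Step("notify", "secondary", "push"),
]
print(run_chain(chain, ack_after_min=10))
```

If the primary acknowledges ten minutes in, only the first push is sent; without an acknowledgment the SMS and the secondary notification both go out at the 15-minute mark.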
Integrating Prometheus Alerts
Grafana OnCall has native Prometheus integration. Add this to your Prometheus alertmanager.yml:
The integration key is generated within Grafana OnCall under Integrations → Prometheus. Once configured, every Prometheus alert automatically creates an incident in OnCall, respects escalation policies, and notifies the correct on-call engineer.
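Before pointing Alertmanager at the new integration, it can help to send one synthetic alert by hand. The sketch below builds a minimal body shaped like Alertmanager's webhook format (version 4). The function name, the receiver name, and the label values are made up for this example, and the URL to POST it to is whatever OnCall displays for your integration.

```python
import json
from datetime import datetime, timezone


def alertmanager_test_payload(alertname: str, severity: str, summary: str) -> bytes:
    """Build a minimal body shaped like Alertmanager's webhook format (version 4)."""
    labels = {"alertname": alertname, "severity": severity}
    body = {
        "version": "4",
        "status": "firing",
        "receiver": "oncall",  # hypothetical receiver name
        "groupLabels": {"alertname": alertname},
        "commonLabels": labels,
        "commonAnnotations": {"summary": summary},
        "alerts": [{
            "status": "firing",
            "labels": labels,
            "annotations": {"summary": summary},
            "startsAt": datetime.now(timezone.utc).isoformat(),
        }],
    }
    return json.dumps(body).encode("utf-8")


body = alertmanager_test_payload("TestAlert", "critical", "Synthetic test alert")
```

The resulting bytes can be sent with `urllib.request` and a `Content-Type: application/json` header; if an incident appears and the escalation chain starts, the integration is wired correctly.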
Alerta: Complete Self-Hosted Setup Guide
Alerta specializes in high-volume alert consolidation and deduplication. If your monitoring stack generates thousands of alerts per hour from Prometheus, Nagios, Zabbix, and custom scripts, Alerta can reduce noise by correlating related alerts into single incidents.
Architecture Overview
Alerta consists of:
- API Server — Flask-based REST API that receives, processes, and stores alerts
- Web UI — Modern single-page application for alert management
- Database — PostgreSQL (recommended) or MongoDB for alert storage
- Plugins — Extensible plugin system for enrichment, notification, and transformation
Docker Compose Deployment
Create docker-compose.yml:
Start the deployment:
The web UI is available at http://localhost:8081. Log in with the admin email configured above. You will be prompted to set a password on first login.
Alert Deduplication and Correlation
Alerta automatically deduplicates alerts based on resource, event, and environment fields. When the same alert fires repeatedly, Alerta increments a duplicateCount counter and updates the lastReceiveTime instead of creating a new incident.
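The idea behind this deduplication can be sketched with a dictionary keyed on the three fields. This is a simplified illustration, not Alerta's implementation: the `receive` function and the in-memory `store` are assumptions, and the real server handles further cases such as severity changes.

```python
from datetime import datetime, timezone
from typing import Dict, Tuple

Key = Tuple[str, str, str]  # (environment, resource, event)


def receive(store: Dict[Key, dict], alert: dict) -> dict:
    """Store a new alert, or fold a repeat into the existing record."""
    key = (alert["environment"], alert["resource"], alert["event"])
    now = datetime.now(timezone.utc)
    existing = store.get(key)
    if existing is not None:
        # Same alert seen again: bump the counter instead of adding a record.
        existing["duplicateCount"] += 1
        existing["lastReceiveTime"] = now
        return existing
    record = {**alert, "duplicateCount": 0, "lastReceiveTime": now}
    store[key] = record
    return record


store: Dict[Key, dict] = {}
alert = {"environment": "Production", "resource": "web01", "event": "HighCPU"}
receive(store, alert)
print(receive(store, alert)["duplicateCount"])
```

However many times the same alert arrives, the store holds one record for it, which is what keeps a flapping check from flooding the incident list.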
To configure deduplication rules, create a plugin file at /etc/alerta/plugins/custom.py:
Configuring Notification Rules
Alerta sends notifications through its plugin system. To set up email notifications, create /etc/alerta/alerta.conf:
For Slack integration, install the Slack plugin:
Then add to the PLUGINS environment variable:
Integrating Multiple Alert Sources
Alerta accepts alerts from virtually any source via its REST API. Here is how to send an alert using curl:
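An equivalent request can also be built in Python with the standard library. This is a sketch under assumptions: the helper name `build_alert_request`, the base URL `http://localhost:8081/api`, and the API key are placeholders to adapt to your deployment.

```python
import json
import urllib.request


def build_alert_request(base_url: str, api_key: str, resource: str, event: str,
                        environment: str, severity: str, service: str,
                        text: str) -> urllib.request.Request:
    """Prepare a POST to the Alerta /alert endpoint without sending it."""
    payload = {
        "resource": resource,
        "event": event,
        "environment": environment,
        "severity": severity,
        "service": [service],  # Alerta expects a list of service names
        "text": text,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/alert",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Key {api_key}",
        },
        method="POST",
    )


req = build_alert_request(
    "http://localhost:8081/api", "demo-key",  # placeholder URL and key
    resource="web01", event="HttpError", environment="Production",
    severity="major", service="Web", text="HTTP 500 rate above threshold",
)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before it reaches a live server.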
You can also integrate with Prometheus Alertmanager by adding this to alertmanager.yml:
OpenDuty: Lightweight Self-Hosted On-Call Setup
OpenDuty is the simplest of the three options — a Django-based application that provides basic on-call rotation and PagerDuty-compatible escalation. It is ideal for small teams that need straightforward scheduling without the complexity of Grafana OnCall or Alerta.
Docker Compose Deployment
Create docker-compose.yml:
Start the stack:
Configuring On-Call Schedules
OpenDuty provides a straightforward schedule interface:
- Create a Service (e.g., “Production Infrastructure”)
- Define a Schedule with rotation rules (daily, weekly)
- Add Users to the rotation
- Configure Escalation Policies — define how long to wait before escalating and who to escalate to
The interface is less polished than Grafana OnCall, but it covers the essential functionality: who is on call, when they rotate, and who gets notified if the primary does not respond.
PagerDuty-Compatible Webhook
One of OpenDuty’s strengths is its PagerDuty-compatible API. If you have existing scripts or tools that integrate with PagerDuty, you can often point them at OpenDuty with minimal changes:
Reverse Proxy and HTTPS Configuration
All three platforms should run behind a reverse proxy with TLS termination. Here is a Caddy configuration that works for any of them:
Alternatively, using Nginx:
Migrating from PagerDuty
If you are replacing PagerDuty with a self-hosted solution, here is a practical migration path:
Phase 1: Parallel Running (Week 1-2)
Set up your self-hosted instance alongside PagerDuty. Forward alerts to both systems simultaneously to verify that incidents are created correctly and escalation policies work as expected.
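One way to run both systems side by side is a small fan-out step between your monitoring tool and the two destinations. The sketch below is an assumption about how that could look: `forward_to_all` and the sender callables are placeholders, and in practice each sender would wrap the HTTP call for PagerDuty or for your self-hosted instance.

```python
from typing import Callable, Dict

Sender = Callable[[dict], None]


def forward_to_all(alert: dict, senders: Dict[str, Sender]) -> Dict[str, str]:
    """Deliver one alert to every destination; one failure must not block the rest."""
    results: Dict[str, str] = {}
    for name, send in senders.items():
        try:
            send(alert)
            results[name] = "ok"
        except Exception as exc:  # record and continue so the other system still gets it
            results[name] = f"failed: {exc}"
    return results


received = []
print(forward_to_all(
    {"event": "HighCPU", "resource": "web01"},
    {"pagerduty": received.append, "self_hosted": received.append},
))
```

Comparing the per-destination results over the parallel-run period gives a simple record of whether the new system saw every alert that PagerDuty did.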
Phase 2: Schedule Migration (Week 3)
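Recreate your on-call schedules in the new system. Export your PagerDuty schedule data via their API and rebuild it in your chosen platform. Grafana OnCall exposes schedules through its HTTP API, so the import can be scripted; Alerta has no built-in rotation management (see the comparison table), so on-call assignments there are set up by hand.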
Phase 3: Integration Migration (Week 4)
Update your monitoring tools to send alerts to the new system instead of PagerDuty:
- Prometheus Alertmanager — update webhook URLs
- Zabbix — change media type to point to the new API
- Custom scripts — replace PagerDuty API calls with your new endpoint
- Third-party services — update webhook destinations
Phase 4: PagerDuty Decommission (Week 5)
After running in parallel for a full on-call cycle and verifying that all integrations work, decommission PagerDuty. Keep backups of historical incident data before closing the account.
Monitoring Your Incident Management System
It is ironic (but essential) to monitor your monitoring system. Add health checks for your incident management platform:
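A health check can be as simple as polling an HTTP endpoint from a machine that does not depend on the platform itself. The following is a minimal sketch; `check_health` is a name chosen for this example, and the URLs you pass in (for instance the `http://localhost:8080` and `http://localhost:8081` addresses used earlier) depend on your deployment.

```python
import urllib.error
import urllib.request


def check_health(url: str, timeout: float = 5.0,
                 opener=urllib.request.urlopen) -> bool:
    """Return True if the endpoint answers with a 2xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, refused connections, timeouts, and malformed URLs.
        return False
```

Run it from cron or a separate monitoring host and send failures through a channel that does not rely on the incident platform, since a platform that is down cannot page anyone about its own outage.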
Final Recommendation
For most teams in 2026, Grafana OnCall is the best self-hosted incident management platform. It offers the most complete feature set (on-call schedules, escalation policies, mobile apps, and native Grafana integration) under the AGPL-3.0 license. If you already run Grafana for dashboards, OnCall installs as a Grafana plugin alongside them.
Choose Alerta if your primary challenge is alert noise and deduplication. Its correlation engine is unmatched among open-source options, and the plugin architecture lets you build custom alert processing pipelines.
Choose OpenDuty if you need something simple and lightweight — a basic on-call rotation system for a small team that does not justify the infrastructure overhead of the other two options.
All three platforms eliminate the per-user pricing model that makes PagerDuty expensive. All three keep your incident data on your own servers. And all three can replace PagerDuty entirely with proper configuration and migration planning.
Frequently Asked Questions (FAQ)
Which one should I choose in 2026?
The best choice depends on your specific requirements:
- For beginners: Start with the simplest option that covers your core use case
- For production: Choose the solution with the most active community and documentation
- For teams: Look for collaboration features and user management
- For privacy: Prefer fully open-source, self-hosted options with no telemetry
Refer to the comparison table above for detailed feature breakdowns.
Can I migrate between these tools?
Most tools support data import/export. Always:
- Back up your current data
- Test the migration on a staging environment
- Check official migration guides in the documentation
Are there free versions available?
All tools in this guide offer free, open-source editions. Some also provide paid plans with additional features, priority support, or managed hosting.
How do I get started?
- Review the comparison table to identify your requirements
- Visit each project's official documentation
- Start with a Docker Compose setup for easy testing
- Join the community forums for troubleshooting