Introduction
When deploying Internet of Things (IoT) sensors, industrial telemetry systems, or large-scale monitoring infrastructure, you need a database purpose-built for time-stamped data. Traditional relational databases struggle with the write throughput and query patterns of time-series workloads — millions of data points per second, range queries spanning months of data, and automatic downsampling for long-term storage.
Two battle-tested open-source time-series databases have been serving production workloads for over a decade: KairosDB and OpenTSDB. Both are designed for high-ingest, scalable time-series storage, but they take fundamentally different architectural approaches.
KairosDB started as a fork of OpenTSDB in 2013, replacing the HBase backend with Apache Cassandra for better write scalability and operational simplicity. OpenTSDB, originally developed at StumbleUpon and later maintained by the open-source community, remains one of the most widely deployed time-series databases in enterprise environments with its mature HBase-based architecture.
In this guide, we compare KairosDB and OpenTSDB for self-hosted time-series workloads, covering architecture, deployment, query capabilities, and operational considerations.
Architecture Comparison
KairosDB
KairosDB uses Apache Cassandra as its primary storage backend, leveraging Cassandra’s distributed architecture for horizontal scalability. It can also use H2 for single-node deployments. Key architectural features include:
- Cassandra-backed storage: Automatic data distribution across nodes with tunable consistency
- REST API: Full HTTP API for data ingestion and querying
- Tag-based data model: Each data point is identified by a metric name, tags (key-value pairs), timestamp, and value
- Built-in aggregators: Min, max, sum, average, standard deviation, percentile calculations
- Roll-up and downsampling: Automatic data aggregation for long-term retention
- Plugin framework: Custom data point listeners for integration with external systems
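The tag-based model and the built-in aggregators meet in the JSON query body. As a minimal sketch, assuming a hypothetical metric `sensor.temperature` and tag `site` (neither comes from this guide), the following builds a body for KairosDB's `/api/v1/datapoints/query` endpoint that averages the last hour into 5-minute buckets:

```python
import json


def kairosdb_avg_query(metric, tags, hours=1, bucket_minutes=5):
    """Build a KairosDB query body that averages `metric` per time bucket."""
    return {
        "start_relative": {"value": hours, "unit": "hours"},
        "metrics": [{
            "name": metric,
            # In a query, each tag key maps to a list of accepted values.
            "tags": {key: [value] for key, value in tags.items()},
            "aggregators": [{
                "name": "avg",
                "sampling": {"value": bucket_minutes, "unit": "minutes"},
            }],
        }],
    }


# Hypothetical metric and tag names, for illustration only.
body = kairosdb_avg_query("sensor.temperature", {"site": "plant-a"})
print(json.dumps(body, indent=2))
```

POST the printed JSON to the query endpoint of a running instance to get the bucketed averages back.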
OpenTSDB
OpenTSDB uses Apache HBase as its storage backend, which runs on top of HDFS. This ties it to the Hadoop ecosystem but provides proven scalability for petabyte-scale deployments:
- HBase on HDFS: Row-key design optimizes time-range scans
- Telnet-style and HTTP APIs: Dual ingestion interfaces
- UID table compression: Maps metric names, tag keys, and tag values to compact unique identifiers
- Built-in downsampling: Configurable roll-up policies for data retention
- Expression-based querying: Complex time-series arithmetic and grouping
- Annotation support: Attach metadata events to specific timestamps
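Downsampling and aggregation are requested per sub-query in the HTTP API. A minimal sketch, assuming a hypothetical metric `sensor.temperature` and tag `site`, of a request body for OpenTSDB's `/api/query` endpoint that averages the last hour into 5-minute buckets:

```python
import json


def opentsdb_avg_query(metric, tags, start="1h-ago", downsample="5m-avg"):
    """Build an OpenTSDB /api/query body with one averaged, downsampled sub-query."""
    return {
        "start": start,
        "queries": [{
            "aggregator": "avg",       # how series are combined across tags
            "metric": metric,
            "downsample": downsample,  # "<interval>-<function>", here 5-minute averages
            "tags": dict(tags),
        }],
    }


# Hypothetical metric and tag names, for illustration only.
body = opentsdb_avg_query("sensor.temperature", {"site": "plant-a"})
print(json.dumps(body, indent=2))
```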
Feature Comparison
| Feature | KairosDB | OpenTSDB |
|---|---|---|
| Storage Backend | Apache Cassandra / H2 | Apache HBase (HDFS) |
| GitHub Stars | 1,760+ | 5,070+ |
| Last Updated | March 2026 | December 2024 |
| Data Model | Metric + Tags + Timestamp + Value | Metric + Tags + Timestamp + Value |
| API | REST (HTTP/JSON) | HTTP + Telnet |
| Query Language | JSON-based query DSL | TSD query expressions |
| Downsampling | Built-in roll-ups | Built-in aggregators |
| Clustering | Via Cassandra ring | Via HBase regions |
| Aggregation | Min, Max, Sum, Avg, StdDev, Percentile | Min, Max, Sum, Avg, Rate, Percentile |
| Authentication | HTTP basic auth | None (relies on network security) |
| Monitoring Integration | Graphite protocol, direct REST | Grafana datasource, TCollector |
| License | Apache 2.0 | LGPLv2.1+ / GPLv3+ |
Docker Compose Deployment
KairosDB with Cassandra
OpenTSDB with HBase
Quick Ingestion Test
Once deployed, test data ingestion for KairosDB:
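A minimal sketch of such a test, assuming KairosDB is reachable at `localhost:8080` and using a hypothetical metric `sensor.temperature` with a `site` tag. It builds the JSON payload expected by the `/api/v1/datapoints` endpoint and leaves the actual HTTP call commented out so the snippet runs without a server:

```python
import json
import time
import urllib.request

# Assumed host and port; adjust to your deployment.
KAIROSDB_URL = "http://localhost:8080/api/v1/datapoints"


def build_payload(metric, value, tags, ts_ms=None):
    """Return a one-point KairosDB payload; timestamps are in milliseconds."""
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)
    return [{"name": metric, "datapoints": [[ts_ms, value]], "tags": tags}]


def send(payload, url=KAIROSDB_URL):
    """POST the payload and return the HTTP status (a 2xx code means accepted)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Hypothetical metric and tag, with a fixed timestamp for reproducibility.
payload = build_payload("sensor.temperature", 21.5, {"site": "plant-a"},
                        ts_ms=1700000000000)
print(json.dumps(payload))
# send(payload)  # uncomment once a KairosDB instance is running
```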
For OpenTSDB, use the telnet-style API:
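A minimal sketch, assuming a TSD listening on `localhost:4242` and a hypothetical metric `sensor.temperature` with a `site` tag. The telnet-style protocol takes one `put <metric> <timestamp> <value> <tag=value ...>` line per data point. The socket call is left commented out so the snippet runs without a server:

```python
import socket
import time


def put_line(metric, value, tags, ts=None):
    """Format one telnet-style `put` line; the timestamp is in seconds."""
    if ts is None:
        ts = int(time.time())
    tag_str = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"put {metric} {ts} {value} {tag_str}\n"


def send(line, host="localhost", port=4242):
    """Write the line to the TSD port (assumed host and port)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric and tag, with a fixed timestamp for reproducibility.
line = put_line("sensor.temperature", 21.5, {"site": "plant-a"}, ts=1700000000)
print(line, end="")  # put sensor.temperature 1700000000 21.5 site=plant-a
# send(line)  # uncomment once an OpenTSDB instance is running
```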
Why Self-Host Your Time-Series Database?
Self-hosting a time-series database gives you complete control over your telemetry and IoT data pipeline. Cloud-hosted time-series services like AWS Timestream, Google Cloud Monitoring, or InfluxDB Cloud typically bill for data ingested, queries executed, and storage consumed, so costs grow with your sensor fleet. Even a modest deployment of 500 IoT sensors generating one data point per second produces about 1.3 billion points per month, which can push a cloud bill past $1,000/month depending on the provider's pricing.
Data sovereignty is equally critical for industrial and infrastructure monitoring. When your time-series data represents factory floor operations, building management systems, or power grid telemetry, sending every data point to a third-party cloud introduces compliance risks and latency. Self-hosted KairosDB or OpenTSDB keeps all data on-premises, under your security controls, and queries never pay a WAN round trip to a remote service.
Vendor independence is the third pillar. With open-source time-series databases, you are not locked into a specific cloud provider’s API, pricing model, or deprecation schedule. Your data lives in standard formats (Cassandra SSTables or HBase HFiles), portable across any infrastructure — bare metal, VMs, or Kubernetes clusters. If you are already running monitoring infrastructure like Prometheus, check out our guide on self-hosted Prometheus long-term storage.
For infrastructure monitoring dashboards, pair your time-series database with a visualization layer — see our self-hosted infrastructure monitoring comparison for the full stack. If you need a lighter-weight time-series solution with a built-in query language, our self-hosted time-series database comparison covers GreptimeDB, InfluxDB, and VictoriaMetrics.
Operational Considerations for Production Deployments
Running a time-series database in production requires more than just starting containers. Here are the operational aspects you need to plan for:
Backup and Recovery
KairosDB stores data in Cassandra, which supports snapshot-based backups via nodetool snapshot. You can automate daily snapshots with a cron job and ship them to offsite storage. OpenTSDB data lives in HBase, which provides its own table-level snapshots (the snapshot command in the HBase shell) and can copy them to another cluster with the ExportSnapshot tool. Both approaches require testing your restore procedure: a backup you have not restored is not a backup. Budget at least 4 hours to validate your restore workflow before going to production.
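As a minimal sketch of the daily Cassandra snapshot job, the script below builds a dated `nodetool snapshot` command. The keyspace name `kairosdb` and the `daily-YYYYMMDD` tag format are assumptions; check your KairosDB configuration for the actual keyspace. The command is only printed here and must be run on each Cassandra node to take effect:

```python
import datetime
import subprocess


def snapshot_command(keyspace="kairosdb", day=None):
    """Return the nodetool invocation for a dated snapshot of one keyspace."""
    day = day or datetime.date.today()
    tag = f"daily-{day:%Y%m%d}"  # assumed naming scheme for snapshot tags
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def take_snapshot(command):
    """Run the snapshot; raises CalledProcessError if nodetool fails."""
    subprocess.run(command, check=True)


cmd = snapshot_command(day=datetime.date(2025, 1, 31))
print(" ".join(cmd))  # nodetool snapshot -t daily-20250131 kairosdb
# take_snapshot(cmd)  # uncomment on a Cassandra node, e.g. from a cron job
```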
Monitoring the Database Itself
Your time-series database should be monitored like any other infrastructure component. For KairosDB, expose JMX metrics to Prometheus using the JMX exporter. For OpenTSDB, enable the built-in stats endpoint at /api/stats and scrape it with your monitoring stack. Track write latency (P99 should stay under 100ms for most workloads), compaction queue depth, and disk usage growth rate. Set alerts for when storage exceeds 80% capacity.
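The two thresholds above can be expressed as a small check. This is a sketch under stated assumptions: the latency samples and byte counts are supplied by the caller (how you collect them depends on your exporter), the function names are hypothetical, and the percentile uses the simple nearest-rank method:

```python
import shutil


def p99(samples_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def disk_fraction(path):
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def check(samples_ms, used_fraction, latency_limit_ms=100, storage_limit=0.80):
    """Return a list of alert messages; an empty list means all is well."""
    alerts = []
    latency = p99(samples_ms)
    if latency > latency_limit_ms:
        alerts.append(f"P99 write latency {latency} ms exceeds {latency_limit_ms} ms")
    if used_fraction > storage_limit:
        alerts.append(f"storage at {used_fraction:.0%} exceeds {storage_limit:.0%}")
    return alerts


# Illustrative numbers only: 100 latency samples and a disk that is 85% full.
alerts = check([20] * 98 + [150, 400], used_fraction=0.85)
for message in alerts:
    print(message)
```

In a real deployment, `disk_fraction("/var/lib/cassandra")` or a similar data directory would supply the second argument; that path is an example, not a requirement.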
Capacity Planning
Estimate your storage needs using this formula: daily_ingestion = data_points_per_second × seconds_per_day × bytes_per_datapoint. KairosDB with Cassandra compression typically uses 12-20 bytes per data point. OpenTSDB with HBase compression uses 8-15 bytes per data point. A modest IoT deployment of 10,000 data points per second produces 864 million points per day, which comes to approximately 10-17 GB of compressed data per day on KairosDB and roughly 7-13 GB on OpenTSDB. Plan your retention policy accordingly: keep high-resolution data for 30 days, 5-minute roll-ups for 6 months, and hourly roll-ups indefinitely.
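The formula is easy to check in a few lines. The sketch below plugs in the bytes-per-point ranges quoted above for a 10,000 points/second workload; the function name is illustrative:

```python
SECONDS_PER_DAY = 86_400


def daily_bytes(points_per_second, bytes_per_point):
    """daily_ingestion = data_points_per_second x seconds_per_day x bytes_per_datapoint"""
    return points_per_second * SECONDS_PER_DAY * bytes_per_point


# Bytes-per-point ranges as quoted in the text.
ranges = {"KairosDB": (12, 20), "OpenTSDB": (8, 15)}

for name, (low, high) in ranges.items():
    low_gb = daily_bytes(10_000, low) / 1e9
    high_gb = daily_bytes(10_000, high) / 1e9
    print(f"{name}: {low_gb:.1f}-{high_gb:.1f} GB/day")
# KairosDB: 10.4-17.3 GB/day
# OpenTSDB: 6.9-13.0 GB/day
```

Multiply the daily figure by your raw-data retention window (and by the replication factor of your cluster) to size the disks.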
Security Hardening
KairosDB offers nothing beyond basic HTTP auth, and OpenTSDB relies on network security alone. In production, always place your time-series database behind a reverse proxy with TLS termination and IP allowlisting. For KairosDB, use Nginx with auth_basic and proxy_pass. OpenTSDB serves its telnet-style and HTTP APIs on the same TSD port (4242 by default), so bind or firewall that port to localhost or a private network and let only HTTP traffic reach it through an authenticated proxy. If your deployment spans multiple data centers, use WireGuard or Tailscale to encrypt inter-node traffic between Cassandra or HBase nodes.
Choosing Between KairosDB and OpenTSDB
Choose KairosDB if:
- You already run Cassandra or prefer its operational model over HBase
- You want a simpler deployment with fewer moving parts
- You need a pure REST/JSON API for modern toolchain integration
- Your team has more experience with CQL than HBase
Choose OpenTSDB if:
- You are already invested in the Hadoop/HBase ecosystem
- You need the maturity of a 12+ year production-tested codebase
- You require the telnet-style API for legacy collector compatibility
- You are deploying at petabyte scale with existing HDFS infrastructure
Both tools are mature, proven at scale, and well-suited for self-hosted IoT and telemetry workloads. Your choice should align with your existing infrastructure stack — Cassandra vs HBase is often the deciding factor.
FAQ
Can KairosDB and OpenTSDB handle millions of data points per second?
Both are designed for high-throughput ingestion, but reaching millions of points per second takes a sizable cluster. KairosDB scales horizontally by adding more Cassandra nodes. Assuming each node sustains ~50,000 writes/second and throughput scales linearly, a 10-node cluster reaches about 500,000 writes/second, and one million writes/second needs roughly 20 nodes. OpenTSDB scales with HBase regions and can reach similar throughput on adequately provisioned hardware. For extreme write loads, consider pre-splitting HBase regions or tuning Cassandra's write path.
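For a rough node count, the per-node figure can be turned into a one-line estimate. This sketch assumes throughput scales linearly with node count, which is optimistic once replication and compaction overhead are included:

```python
def nodes_needed(target_writes_per_s, per_node_writes_per_s=50_000):
    """Ceiling division: smallest node count that meets the target write rate."""
    return -(-target_writes_per_s // per_node_writes_per_s)


print(nodes_needed(500_000))    # 10
print(nodes_needed(1_000_000))  # 20
```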
Do I need to run a full Hadoop cluster for OpenTSDB?
OpenTSDB requires HBase, which requires HDFS and ZooKeeper. For production deployments, this means at least 5 nodes (3 ZooKeeper, 2+ HBase RegionServers). However, for development and small-scale deployments, the Docker Compose configuration above runs a single-node HBase instance sufficient for testing and low-volume production use (up to ~10,000 data points/second).
How do these compare to InfluxDB or TimescaleDB?
InfluxDB and TimescaleDB are more modern time-series databases with built-in SQL-like query languages (InfluxQL/Flux and full PostgreSQL SQL, respectively). They are generally easier to deploy and operate than KairosDB or OpenTSDB. However, KairosDB and OpenTSDB excel in environments already running Cassandra or HBase, where adding a time-series capability without introducing a new database system is the primary requirement. See our time-series database comparison guide for alternatives.
Is OpenTSDB still actively maintained?
The main OpenTSDB repository saw its last commit in December 2024, indicating the project is in maintenance mode. The 5,070+ GitHub stars and thousands of production deployments mean the codebase is stable and battle-tested, but new features are unlikely. KairosDB, with its most recent update in March 2026, has more active development.
Can I migrate data between KairosDB and OpenTSDB?
There is no built-in migration tool between the two, as they use fundamentally different storage backends (Cassandra vs HBase). Migration requires exporting data via each system’s API, transforming to the target format, and re-importing. For large datasets, consider running both systems in parallel during a transition period rather than attempting a bulk migration.
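The transform step can be sketched as follows, assuming the export comes from KairosDB's query endpoint and the import goes to OpenTSDB's `/api/put`. The function name and the sample response are illustrative. Because KairosDB returns each tag as a list of values, the sketch rejects any series whose tags are not single-valued. Group the export query by tag so that each result is one series:

```python
def kairos_to_opentsdb(response):
    """Convert a KairosDB query response into a list of OpenTSDB put objects."""
    points = []
    for query in response.get("queries", []):
        for result in query.get("results", []):
            tags = {}
            for key, values in result.get("tags", {}).items():
                if len(values) != 1:
                    raise ValueError(f"tag {key!r} is not single-valued: {values}")
                tags[key] = values[0]
            for ts_ms, value in result.get("values", []):
                points.append({
                    "metric": result["name"],
                    "timestamp": ts_ms,  # milliseconds, as exported by KairosDB
                    "value": value,
                    "tags": tags,
                })
    return points


# Hand-written sample in the shape of a KairosDB query response.
sample = {"queries": [{"results": [{
    "name": "sensor.temperature",
    "tags": {"site": ["plant-a"]},
    "values": [[1700000000000, 21.5], [1700000060000, 21.7]],
}]}]}

converted = kairos_to_opentsdb(sample)
print(converted[0])
```

Send the converted list in batches rather than in one request, and verify point counts on both sides before decommissioning the source.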