Get more from your time series database: OpenTSDB vs InfluxDB
As the number of metrics that teams collect and act on grows, developers need a time series database that is fast and efficient enough to keep up with the demands of their applications.
In this technical paper, we’ll compare key performance metrics for InfluxDB and OpenTSDB for common time series workloads, specifically looking at the rates of data ingestion, on-disk data compression, and query performance. We’ll also look at a feature comparison and the resulting time required to build a complete time series solution with each tool.
InfluxDB is an open source NoSQL time series database built by InfluxData and written in Go. At its core is a custom-built storage engine, the Time-Structured Merge (TSM) Tree, which is optimized for time series data. It is queried through InfluxQL, a custom SQL-like query language, and provides out-of-the-box support for mathematical and statistical functions across time ranges. It is well suited to custom monitoring and metrics collection, real-time analytics, and IoT and sensor data workloads.
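To make "statistical functions across time ranges" concrete, here is a minimal sketch of what a windowed mean computes. It is plain Python, not InfluxDB code; the function name `mean_by_window` and the sample points are invented for illustration. In InfluxQL the comparable query would look roughly like `SELECT MEAN("value") FROM "cpu" WHERE time >= now() - 1h GROUP BY time(1m)`, where the measurement and field names are assumptions.

```python
from collections import defaultdict
from statistics import mean


def mean_by_window(points, window_seconds):
    """Average (unix_timestamp, value) points per fixed-size time window.

    Illustrative only: this mimics the idea behind a GROUP BY time(...)
    aggregation and says nothing about how InfluxDB implements it.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its window.
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    # Return windows in chronological order.
    return {start: mean(vals) for start, vals in sorted(buckets.items())}


# Hypothetical samples: (seconds since start, CPU usage percent).
samples = [(0, 10.0), (20, 20.0), (60, 30.0), (90, 50.0), (125, 40.0)]
result = mean_by_window(samples, 60)
print(result)  # {0: 15.0, 60: 40.0, 120: 40.0}
```

A time series database performs this kind of bucketing inside the query engine, so the client receives one aggregated value per window instead of every raw point.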
OpenTSDB is a scalable, distributed time series database written in Java and built on top of HBase. It was originally authored by Benoît Sigoure at StumbleUpon beginning in 2010 and open-sourced under the LGPL. Unlike InfluxDB, OpenTSDB depends on standalone third-party products: it relies on HBase as its data storage layer, so the OpenTSDB Time Series Daemons (TSDs in OpenTSDB parlance) effectively act as a query engine with no shared state between instances. Running and managing that additional infrastructure can add significant operational cost and overhead in a production deployment.
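The "no shared state between instances" design can be pictured with a toy model. The sketch below is not OpenTSDB or HBase code: the classes `SharedStore` and `Daemon`, their methods, and the metric name are all hypothetical, and an in-memory dictionary stands in for the storage layer. It only shows why any daemon can answer a query for data written through another one.

```python
class SharedStore:
    """Hypothetical stand-in for the external storage layer."""

    def __init__(self):
        self.rows = {}

    def append(self, metric, ts, value):
        self.rows.setdefault(metric, []).append((ts, value))

    def scan(self, metric, start, end):
        # Return points in the inclusive time range, ordered by timestamp.
        return sorted(p for p in self.rows.get(metric, []) if start <= p[0] <= end)


class Daemon:
    """Hypothetical stateless front end: it holds no data of its own."""

    def __init__(self, store):
        self.store = store

    def put(self, metric, ts, value):
        self.store.append(metric, ts, value)

    def query(self, metric, start, end):
        return self.store.scan(metric, start, end)


store = SharedStore()
tsd_a = Daemon(store)
tsd_b = Daemon(store)

# Write through one daemon, read through another.
tsd_a.put("sys.cpu.user", 20, 0.7)
tsd_a.put("sys.cpu.user", 10, 0.5)
points = tsd_b.query("sys.cpu.user", 0, 100)
print(points)  # [(10, 0.5), (20, 0.7)]
```

The trade-off this illustrates is the one described above: the daemons are simple to add or remove, but all durability and scaling concerns move to the separate storage system, which has to be deployed and operated alongside them.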
Our goal with these benchmarks was to create a consistent, up-to-date comparison that reflects the latest developments in both InfluxDB and OpenTSDB. We’ll periodically re-run the benchmarks and update this document with our findings. All of the code for these benchmarks is available on GitHub. Feel free to open issues or pull requests on that repository if you have any questions, comments, or suggestions.
This comparison should prove valuable to developers and architects evaluating the suitability of these technologies for their use case, especially those building DevOps monitoring (infrastructure monitoring, application monitoring, cloud monitoring), IoT monitoring, and real-time analytics applications.