Benchmarking InfluxDB vs. Elasticsearch for Time Series: Why Time Series Databases are Better for Metrics
In this technical paper, we’ll compare the performance and features of InfluxDB 1.7.2 vs. Elasticsearch 6.5.0 for common time series workloads, specifically looking at the rates of data ingestion, on-disk data compression, and query performance. This data should prove valuable to developers and architects evaluating the suitability of these technologies for their use case.
InfluxDB is an open source time series database written in Go. At its core is a custom-built storage engine called the Time-Structured Merge (TSM) Tree, which is optimized for time series data. Queried through a custom SQL-like language named InfluxQL, InfluxDB provides out-of-the-box support for mathematical and statistical functions across time ranges and is well suited to custom monitoring and metrics collection, real-time analytics, and IoT and sensor data workloads.
Elasticsearch is an open source search server written in Java and built on top of Apache Lucene. It provides a distributed, full-text search engine suitable for enterprise workloads. While not a time series database per se, Elasticsearch employs Lucene’s column indexes, which are used to efficiently aggregate numeric values. Combined with query-time aggregations and the ability to index on timestamp fields (which is also important for storing and retrieving log data), Elasticsearch provides the primitives for storing and querying time series data.
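To make the two query models concrete, here is a minimal sketch of the same question ("average value per minute over the last hour") expressed once as an InfluxQL statement and once as an Elasticsearch aggregation request body. It only builds the queries and does not contact either server. The measurement name "cpu", the field name "usage_user", and the timestamp field name "timestamp" are illustrative assumptions, not names taken from the benchmark code.

```python
import json


def influxql_mean_per_interval(measurement, field, interval="1m", lookback="1h"):
    """Build an InfluxQL statement that averages `field` in fixed time buckets.

    The measurement and field names passed in are hypothetical examples.
    """
    return (
        f'SELECT MEAN("{field}") FROM "{measurement}" '
        f"WHERE time >= now() - {lookback} GROUP BY time({interval})"
    )


def es_mean_per_interval(field, timestamp_field="timestamp",
                         interval="1m", lookback="1h"):
    """Build an Elasticsearch search body for the same per-bucket average.

    Uses a date_histogram bucket aggregation with a nested avg metric.
    The "interval" key is the Elasticsearch 6.x spelling of the bucket width.
    """
    return {
        "size": 0,  # return aggregation results only, no individual documents
        "query": {"range": {timestamp_field: {"gte": f"now-{lookback}"}}},
        "aggs": {
            "per_interval": {
                "date_histogram": {"field": timestamp_field, "interval": interval},
                "aggs": {"mean_value": {"avg": {"field": field}}},
            }
        },
    }


if __name__ == "__main__":
    print(influxql_mean_per_interval("cpu", "usage_user"))
    print(json.dumps(es_mean_per_interval("usage_user"), indent=2))
```

In this sketch the InfluxQL version expresses the time bucketing directly with GROUP BY time(...), while the Elasticsearch version assembles it from a range query on the timestamp field plus a nested aggregation, which reflects the "primitives" approach described above.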
Our goal with this benchmarking test was to create a consistent, up-to-date comparison that reflects the latest developments in both InfluxDB and Elasticsearch. We’ll periodically re-run these benchmarks and update this document with our findings. All of the code for these benchmarks is available on GitHub. Feel free to open issues or pull requests on that repository if you have any questions, comments, or suggestions.