Category Archives: InfluxDB

Assessing Write Performance of InfluxDB’s Clusters w/ AWS


While conducting the various benchmark tests against InfluxData, we decided to also explore the aspects of scaling clusters of InfluxDB with our closed-source InfluxEnterprise product, primarily through the lens of write performance.

This data should prove valuable to developers and architects evaluating the suitability of InfluxEnterprise for their use case, in addition to helping establish some rough guidelines for what those users should expect in terms of write performance in a real-world environment.

To read the complete details of the benchmarks and methodology, download the “Assessing Write Performance of InfluxDB’s Clusters w/ AWS” technical paper or watch the recorded video titled: “How cluster creation and differences impact performance.”


InfluxDB Markedly Outperforms OpenTSDB in Time-Series Data & Metrics Benchmark


This is an update in a series of detailed benchmarking tests comparing InfluxDB vs other databases for time-series data and metrics workloads. Previously, we completed benchmarking tests comparing InfluxDB vs Elasticsearch, Cassandra, and MongoDB.

At InfluxData, one of the common questions we’ve been getting asked by developers and architects alike over the last few months is, “How does InfluxDB compare to OpenTSDB for time-series workloads?” This question might be prompted for a few reasons. First, if they’re starting a brand new project and doing the due diligence of evaluating a few solutions head-to-head, it can be helpful in creating their comparison grid. Second, they might already be using OpenTSDB for ingesting logs in an existing monitoring setup, but would like to now see how they can integrate metrics collection into their system and believe there might be a better solution than OpenTSDB for this task.

InfluxDB 1.1 released with up to 60% performance increase and new query functionality


Roughly 2 months after the release of InfluxDB 1.0, we are already releasing v1.1 that includes a number of key performance and stability improvements as well as some new query capabilities. This highlights our commitment to iterate quickly based on the feedback of the community.

Performance Improvements

There are many changes throughout the code that reduce memory allocations. Reducing memory allocations alleviates pressure on the Go garbage collector and improves performance throughout the system. The majority of changes occurred in the write, query and compaction code paths. These changes should help reduce RSS and heap usage as well as CPU utilization for some users.


Getting to 1M values per second on an InfluxDB cluster


In distributed databases it’s a common benchmark goal to get to 1 million writes per second. With InfluxDB, we’ve been optimizing towards that goal over the last year as we built a storage engine from scratch, had multiple releases of clustering in our InfluxCloud and InfluxEnterprise offerings and performed optimizations specific to our implementation language, Go. Over this time we’ve been testing clusters and improving performance and stability along the way. With the upcoming release of InfluxDB 1.1, we’ve achieved writes greater than 1 million values per second on very modest configurations. In this post we’ll look at some of the configurations and give pointers to our test code.


Announcing Multi-tenant Grafana support for InfluxCloud


Due to the great feedback from our customers, we are excited to announce that we are adding dedicated instances of Grafana to InfluxCloud. We know that the data our customers collect about their applications and services is powerful, and that allowing configurable access to different groups and users in their organization to view this information with their own dashboards extends this power.

Multi-tenant support for Grafana is available to all InfluxCloud customers, regardless of their plan, at only $200/month.


Monitoring and alerting with Kapacitor now available on InfluxCloud


Today we’ve made Kapacitor, the InfluxData project for monitoring and alerting on time series data, available on our AWS-backed InfluxCloud offering. Existing and new InfluxCloud customers can now add a fully managed instance of Kapacitor starting at $200 per month.

Using Kapacitor’s API, users can create and enable TICKscripts on our cloud. Here’s an example that will send an alert to Slack if CPU utilization is > 95% for more than two minutes. It performs this check every 10 seconds.

```
stream
    |from()
        .measurement('cpu')
    |window()
        .period(2m)
        .every(10s)
    |alert()
        .crit(lambda: "value" > 95)
        // Only alert if all points in the window match the criteria.
        .all()
        .slack()
        .channel('#alerts')
```
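
To make the windowing behavior easier to reason about, here is a minimal Python sketch that approximates what the script checks. It is not Kapacitor code and does not model Kapacitor's internals (for example, how partially filled windows are handled at startup). The names `window_all_exceeds` and `evaluate` are hypothetical, and points are assumed to be `(timestamp_seconds, value)` tuples.

```python
from typing import Iterable, List, Tuple

# (timestamp in seconds, CPU utilization in percent) -- an assumed point shape
Point = Tuple[float, float]


def window_all_exceeds(points: Iterable[Point], now: float,
                       period: float = 120.0, threshold: float = 95.0) -> bool:
    """Return True if the window (now - period, now] is non-empty and
    every point in it is above the threshold (the `.all()` condition)."""
    window = [v for (t, v) in points if now - period < t <= now]
    return bool(window) and all(v > threshold for v in window)


def evaluate(points: List[Point], start: float, end: float,
             every: float = 10.0) -> List[float]:
    """Run the check every `every` seconds from start to end (inclusive)
    and return the evaluation times at which it would have fired."""
    fired = []
    now = start
    while now <= end:
        if window_all_exceeds(points, now):
            fired.append(now)
        now += every
    return fired
```

A single reading at or below 95 anywhere in the two-minute window suppresses the alert, which is why a brief dip delays firing until that point has aged out of the window.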

Alerts can be configured based on moving averages, outliers, missing data (known as a dead man’s switch) and many other criteria. See the Kapacitor documentation for more examples and details on how it works.


InfluxDB is 27x Faster vs MongoDB for Time-Series Workloads


This is the third in a series of detailed benchmarking tests comparing InfluxDB vs Elasticsearch, Cassandra, MongoDB and other databases for time-series data and metrics workloads.

At InfluxData, one of the common questions we’ve been getting asked by developers and architects alike over the last few months is, “How does InfluxDB compare to MongoDB for time-series workloads?” This question might be prompted for a few reasons. First, if they’re starting a brand new project and doing the due diligence of evaluating a few solutions head-to-head, it can be helpful in creating their comparison grid. Second, they might already be using MongoDB for ingesting data in an existing application, but would like to now see how they can integrate metrics collection into their system and believe there might be a better solution than MongoDB for this task.

Over the last few weeks a few members of the InfluxData engineering and QA teams set out to compare the performance and features of InfluxDB and MongoDB for common time-series workloads, specifically looking at the rates of data ingestion, on-disk data compression, and query performance. InfluxDB outperformed MongoDB in all three tests with 27x greater write throughput, while using 84x less disk space, and delivering relatively equal performance when it came to query speed.

To read the complete details of the benchmarks and methodology, download the “Benchmarking InfluxDB vs. MongoDB for Time-Series Data & Metrics Management” technical paper.

Our overriding goal was to create a consistent, up-to-date comparison that reflects the latest developments in both InfluxDB and MongoDB with later coverage of other databases and time-series solutions. We will periodically re-run these benchmarks and update our detailed technical paper with our findings. All of the code for these benchmarks is available on GitHub. Feel free to open issues or pull requests on that repository if you have any questions, comments, or suggestions.

Now, let’s take a look at the results…

Versions Tested

InfluxDB v1.0.0

InfluxDB is an open-source time-series database written in Go. At its core is a custom-built storage engine called the Time-Structured Merge (TSM) Tree, which is optimized for time-series data. Controlled by a custom SQL-like query language named InfluxQL, InfluxDB provides out-of-the-box support for mathematical and statistical functions across time ranges and is perfect for custom monitoring and metrics collection, real-time analytics, plus IoT and sensor data workloads.

MongoDB v3.3.11

MongoDB is an open-source, document-oriented database, colloquially known as a NoSQL database, written in C and C++. Though it’s not generally considered a true time series database per se, its creators often promote its use for time-series workloads. It offers modeling primitives in the form of timestamps and bucketing, which give users the ability to store and query time series data.

About the Benchmarks

In building a representative benchmark suite, we identified the most commonly evaluated characteristics for working with time-series data. We looked at performance across three vectors:

  • Data ingest performance – measured in values per second
  • On-disk storage requirements – measured in MB
  • Mean query response time – measured in milliseconds

About the Dataset

For this benchmark, we focused on a dataset that models a common DevOps monitoring and metrics use case, where a fleet of servers periodically reports system and application metrics at a regular interval. We sampled 100 values across 9 subsystems (CPU, memory, disk, disk I/O, kernel, network, Redis, PostgreSQL, and Nginx) every 10 seconds. For the key comparisons, we looked at a dataset that represents 1,000 servers over a 6-hour period, which represents a relatively modest deployment.

  • Number of Servers: 1000
  • Values measured per Server: 100
  • Measurement Interval: 10s
  • Dataset duration(s): 6h
  • Total values in dataset: 216,000,000
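
As a quick sanity check, the total can be reproduced from the other figures in the list. The following is a minimal Python sketch; the function name `dataset_total_values` is hypothetical and is not part of the benchmark code.

```python
def dataset_total_values(servers: int, values_per_server: int,
                         interval_s: int, duration_s: int) -> int:
    """Total values = servers x values per server x number of reporting intervals."""
    samples_per_series = duration_s // interval_s
    return servers * values_per_server * samples_per_series


# Parameters taken from the list above.
SERVERS = 1000
VALUES_PER_SERVER = 100
INTERVAL_S = 10
DURATION_S = 6 * 60 * 60  # 6 hours

total = dataset_total_values(SERVERS, VALUES_PER_SERVER, INTERVAL_S, DURATION_S)
print(total)  # 216000000
```

Each series contributes 2,160 samples (6 hours at one sample every 10 seconds), and there are 100,000 series in total (1,000 servers × 100 values).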

This is only a subset of the entire benchmark suite, but it’s a representative example. If you’re interested in additional detail, you can read more about the testing methodology on GitHub.

Write Performance

InfluxDB outperformed MongoDB by 27x when it came to data ingestion.

[Figure: write throughput, InfluxDB vs. MongoDB]

On-Disk Compression

InfluxDB outperformed MongoDB by delivering 84x better compression.

[Figure: on-disk storage, InfluxDB vs. MongoDB]

Query Performance

InfluxDB and MongoDB had relatively equal performance characteristics when it came to query speed.

[Figure: query response time, InfluxDB vs. MongoDB]

Summary

The benchmarking tests and resulting data demonstrated that InfluxDB outperformed MongoDB in data ingestion and on-disk storage by a significant margin. Specifically:

  • InfluxDB outperformed MongoDB by 27x when it came to data ingestion
  • InfluxDB outperformed MongoDB by delivering 84x better compression
  • InfluxDB and MongoDB performed similarly on query response time as concurrency increased.

It’s also important to note that configuring MongoDB to work with time series data wasn’t trivial. It requires up-front decisions about how to structure your collections and data types, which can be very time consuming and will have long-lasting impacts on how you can interact with your data and what types of queries you can run. InfluxDB, on the other hand, is ready to use for time series workloads out-of-the-box with no additional configuration.

In conclusion, we highly encourage developers and architects to run these benchmarks themselves to independently verify the results on their hardware and data sets of choice. However, for those looking for a valid starting point on which technology will give better time-series data ingestion, compression and query performance “out-of-the-box”, InfluxDB is the clear winner across many dimensions, especially when the data sets become larger and the system runs over a longer period of time.

What’s next

  • Download: 1.0 GA downloads for the TICK-stack are live on our “downloads” page
  • Deploy on the Cloud: Get started with a FREE trial of InfluxCloud featuring fully-managed clusters, Kapacitor and Grafana.
  • Deploy on Your Servers: Want to run InfluxDB clusters on your servers? Try a FREE 14-day trial of InfluxEnterprise featuring an intuitive UI for deploying, monitoring and rebalancing clusters, plus managing backups and restores.
  • Tell Your Story: Over 100 companies have shared their story on how InfluxDB is helping them succeed. Submit your testimonial and get a limited edition InfluxDB hoodie as a thank you.

InfluxDB 1.0 GA Released: A Retrospective and What’s Next


Today we’re excited to announce the 1.0 release of the open source time series database, InfluxDB, and our commercial offering, InfluxEnterprise, which supports high availability deployments and scale-out clustering for increased throughput. This makes today the biggest day in our company’s history. This release has been almost 3 years in the making and on this occasion I’d like to take a look back at the project’s history, let users know what compatibility guarantees the 1.x line of releases will have and talk about what’s next for InfluxDB.

As we announce the release today, there are tens of thousands of organizations around the globe using InfluxDB. They’re using it to monitor their network infrastructure, security, container infrastructure, solar panels, agriculture, scientific experiments, user analytics, business intelligence, home automation, and countless other specific use cases. To learn more about how companies big and small are using InfluxDB to manage their time-series data, check out our testimonials page, which currently has over 100 companies listed. What’s your story?

Why a time-series database?

In September of 2013 Todd Persen, John Shahid, and I were working on Errplane, a SaaS application for doing real-time metrics and monitoring. Todd and I had started the company in 2012 and despite getting a leg up from taking it through the Winter 2013 batch of Y Combinator, it wasn’t working as we’d hoped. We had raised a modest seed round of funding so we weren’t in danger of imminent demise, but we weren’t having success gaining real customer traction.

Announcing InfluxDB, Telegraf, Kapacitor and Enterprise 1.0 RC1


The InfluxData team is excited to announce the 1.0 release candidate (RC1) of the Telegraf, InfluxDB, Chronograf, and Kapacitor projects and of InfluxEnterprise, our clustering and monitoring product. These projects have been years in the making, so we are pleased to have finalized the API and all the features for the 1.0 release. From now until the 1.0 final release we’ll only merge bug fixes into the 1.0 branch.

Robust & Stable APIs

We’ve already done extensive testing on both the single server and clustered versions of InfluxDB, so these releases should be considered very stable. We’ll be rolling out the 1.0 RC to all of our customers in InfluxCloud over the next week or two.

Purpose Built for Time-Series

InfluxDB 1.0 will be shipping with our purpose-built storage engine for time series data, the Time Structured Merge tree. It gives incredible write performance while delivering compression better than generalized solutions. Depending on the shape of your data, each individual (value, timestamp) pair can take as little as 2 bytes, including all the tag and measurement metadata. Continuous queries and retention policies give users the ability to have the database manage the data lifecycle automatically by aggregating data for longer term trends while dropping older high precision data.
