How Companies Are Using InfluxDB and Kafka in Production


This article was originally published in The New Stack.

Hulu, the entertainment streaming platform, needed a solution to scale up its internal application and infrastructure monitoring platform as it grew beyond 1 million metrics per second.

The solution it created combines two open source tools — InfluxDB, a time series database, and Kafka, an event-streaming platform.

It’s not just global enterprises like Hulu that have access to world-class tools and infrastructure to achieve their business goals. Even startups can acquire the right tools “off the shelf” rather than spending developers’ time and resources building them in-house.

For many companies, success lies in knowing how to make the most of these tools to solve their teams’ stickiest problems. In the lead-up to Kafka Summit 2022, it’s worth exploring how two particular open source tools work together: Kafka and InfluxDB.

In the following article, you’ll learn a bit about both projects and then get some real-world examples of how major companies are using these tools in production to solve problems.

What Is InfluxDB?

InfluxDB is an open source time series database, designed to work with time series data (also known as “time-stamped data”). It is optimized for ingesting massive volumes of writes while still allowing that data to be queried in real time. A general-purpose database would struggle at a similar scale because of design tradeoffs, such as how data is compressed and how it is indexed for queries over a specific time range.

Beyond its performance advantages, InfluxDB offers developer experience benefits that make common time series workloads easier to implement. These include built-in features for downsampling data and creating custom alerts, as well as Flux, a query language designed explicitly for working with time series data.
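To make the idea of downsampling concrete, here is a minimal sketch in plain Python. It is not InfluxDB's implementation and does not use Flux; it only illustrates the concept of averaging raw points into fixed-width time windows. The function name and the sample readings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed-width time windows.

    Returns a list of (window_start, mean_value) tuples sorted by window_start.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each timestamp to the start of the window it falls into.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical CPU readings taken every 20 seconds (Unix timestamps in seconds).
readings = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 40.0), (80, 60.0)]
print(downsample(readings, 60))  # [(0, 20.0), (60, 50.0)]
```

Five raw points collapse into two one-minute averages, which is the kind of reduction that keeps long-term storage and queries manageable.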

What Is Kafka?

Kafka, the event-streaming platform, allows applications to publish and subscribe to streams of events. What separates Kafka from tools that provide similar functionality is its built-in scalability, fault tolerance, and other usability features that abstract away complexity to make things easier for developers.

Kafka was originally developed at LinkedIn to track user activity events in real time. After being open sourced, it came to be used for a broad range of use cases, such as log aggregation, stream processing, and metrics monitoring, and as a message broker for distributed systems.

How companies use InfluxDB and Kafka together

InfluxDB and Kafka have become a popular combination because of the need for a data store that can scale alongside Kafka. They can be seen as complementary tools: Kafka handles many of an organization’s real-time processing needs, while InfluxDB serves long-term analytics queries or combines real-time data with historical data to give more context when needed.

As a result, a number of tools have been created to make integrating InfluxDB and Kafka easier. Confluent has created a connector that allows InfluxDB to be used as a data store, as well as a source of events that can be pushed into Kafka.

The Telegraf metrics collector agent also has a dedicated Kafka plugin that can be used to pull in messages from specified Kafka topics and store them in InfluxDB. You can also use Telegraf’s processor plugins to transform or filter the data before storage.
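In practice, Telegraf does the parsing and writing for you. Purely to illustrate the shape of that conversion, the sketch below turns a JSON-encoded message, such as one consumed from a Kafka topic, into a single line of InfluxDB line protocol. The JSON layout (`name`, `tags`, `fields`, `timestamp_ns`) and the function names are assumptions made for this example, not a format defined by Kafka, Telegraf, or InfluxDB. Only float fields are handled.

```python
import json


def _escape_key(text):
    # Line protocol requires commas, equals signs, and spaces to be escaped
    # in tag keys, tag values, and field keys.
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _escape_measurement(text):
    # Measurement names only need commas and spaces escaped.
    return text.replace(",", "\\,").replace(" ", "\\ ")


def to_line_protocol(message_bytes):
    """Convert one JSON-encoded metric (hypothetical schema) to line protocol."""
    metric = json.loads(message_bytes)
    # Tags are optional; each one is appended as ",key=value" in sorted order.
    tags = "".join(
        f",{_escape_key(key)}={_escape_key(str(value))}"
        for key, value in sorted(metric.get("tags", {}).items())
    )
    # At least one field is required; this sketch writes every field as a float.
    fields = ",".join(
        f"{_escape_key(key)}={float(value)}"
        for key, value in sorted(metric["fields"].items())
    )
    return f"{_escape_measurement(metric['name'])}{tags} {fields} {int(metric['timestamp_ns'])}"


# Hypothetical message payload, as it might arrive from a Kafka topic.
message = json.dumps({
    "name": "cpu",
    "tags": {"host": "web 01", "dc": "us-east"},
    "fields": {"usage": 42.5},
    "timestamp_ns": 1650000000000000000,
}).encode("utf-8")

print(to_line_protocol(message))
# cpu,dc=us-east,host=web\ 01 usage=42.5 1650000000000000000
```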

Here are a few examples of how companies are using Kafka and InfluxDB together:

Hulu

To scale its internal monitoring platform, Hulu’s teams built a system that uses InfluxDB as the storage layer, with Kafka running locally in each data center to hold metrics in the event of local outages. Once issues are resolved, the data persisted by Kafka can be written to any InfluxDB cluster that is out of sync with clusters in other data centers.
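The buffer-and-replay pattern described here can be sketched with in-memory stand-ins. This is not Hulu's code and it talks to neither Kafka nor InfluxDB; `MetricLog` and `Store` are hypothetical classes that mimic an append-only log read by offset and a storage cluster that can be temporarily unavailable.

```python
class MetricLog:
    """In-memory stand-in for a Kafka topic partition: an append-only log."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def read_from(self, offset):
        # Return every record at or after the given offset.
        return self._records[offset:]


class Store:
    """Stand-in for an InfluxDB cluster that may be temporarily unavailable."""

    def __init__(self):
        self.available = True
        self.points = []
        self.next_offset = 0  # offset of the first log record not yet written

    def catch_up(self, log):
        """Write all records the store has not seen yet; return how many."""
        if not self.available:
            return 0  # records stay in the log until the store recovers
        pending = log.read_from(self.next_offset)
        self.points.extend(pending)
        self.next_offset += len(pending)
        return len(pending)


log, store = MetricLog(), Store()
log.append("cpu=1")
store.catch_up(log)        # store is healthy: one record written

store.available = False    # simulated outage
log.append("cpu=2")
log.append("cpu=3")
store.catch_up(log)        # nothing written, nothing lost

store.available = True     # outage resolved
store.catch_up(log)        # the two buffered records are replayed
print(store.points)        # ['cpu=1', 'cpu=2', 'cpu=3']
```

Because the log keeps records independently of the store, an out-of-sync store only needs to remember the last offset it wrote in order to catch up.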

CERN

CERN is a research organization that operates the largest particle physics lab in the world, including the world’s largest and highest-energy particle collider, the Large Hadron Collider. The organization uses InfluxDB to store data from its ALICE experiment, in which CERN scientists study how quarks and gluons interact under conditions similar to the Big Bang.

The ALICE experiment involves monitoring how atoms interact with each other at extreme energy densities. ALICE produces raw data at a rate of 3.4 TB per second. This data is compressed; then the metrics are aggregated and stored with InfluxDB. Kafka is used as part of the stream-processing pipeline to aggregate those metrics as well as send the raw data to be archived.

Robinhood

Robinhood, the online financial-services company, uses Kafka and InfluxDB to power its anomaly detection platform. Data flows from Kafka through Telegraf into InfluxDB, where it is aggregated and queried to generate predictions that are compared with actual observed values. The results are sent back to Kafka, where other services can listen for messages and act on them.
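The article does not describe the models Robinhood uses, so the following is only a minimal sketch of the general idea of comparing a prediction with an observed value. It swaps in a deliberately simple technique: the prediction is the mean of the previous few values, and a point is flagged when it deviates from that prediction by more than a multiple of the recent standard deviation. The function name, parameters, and sample latencies are all hypothetical.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Flag values that deviate strongly from a rolling-mean prediction.

    Returns a list of (index, observed, predicted) tuples.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        predicted = mean(history)   # naive prediction: recent average
        spread = stdev(history)     # how noisy the recent values were
        if spread > 0 and abs(values[i] - predicted) > threshold * spread:
            anomalies.append((i, values[i], predicted))
    return anomalies


# Hypothetical request latencies in milliseconds, with one obvious spike.
latencies = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 250.0, 101.0]
for index, observed, predicted in detect_anomalies(latencies):
    print(f"index {index}: observed {observed}, predicted {predicted}")
```

In a pipeline like the one described, each flagged result would be published back to a Kafka topic so that downstream services can react to it.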

Wrapping up

InfluxDB and Kafka are both versatile tools that complement each other well in many application architectures. Because both projects are open source, each has a strong ecosystem of tools and libraries that extend its value beyond what the core project provides.

To learn more about Kafka and InfluxDB, check out our resource page.