Flux Join Tutorial - Enrich Time Series Data with Data from PostgreSQL

In this tutorial you will learn how to use the Flux query language to enrich time series data stored in InfluxDB by combining it with metadata stored in a relational database.

Tutorial requirements

To follow this tutorial you will need a few things. The first is a running instance of InfluxDB, either Cloud or a local open source version. You will need to create a bucket in your InfluxDB instance and store the air sensor sample dataset in that bucket. You will also need to download the paired relational SQL dataset containing sensor metadata that is on the same page and then store that data in your preferred relational database. Flux natively provides support for many of the most popular databases like:

  • PostgreSQL
  • MySQL
  • Amazon RDS
  • Microsoft SQL Server
  • SQLite
  • Google BigQuery

Check out the documentation for the full list of supported external data stores.

InfluxDB - Supported external datastores

Using Flux to query PostgreSQL

Here’s a basic example of how you could join together relation and time series data using the air sensor sample data provided by InfluxDB:

import "sql"

//postgres query
sensorInfo = sql.from(
  driverName: "postgres",
  dataSourceName: "postgresql://username:[email protected]:5432",
  query: "SELECT * FROM table_name"

//  InfluxDB query
sensorMetrics = from(bucket: "example-bucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "airSensors")

// Join results from previous two queries
join(tables: {metric: sensorMetrics, info: sensorInfo}, on: ["sensor_id"])

The result of this query is that the data will be combined into a single table with rows based on the sensor ID of each dataset.

When working with databases or any other resource that requires authentication credentials, It is a good idea to take advantage of InfluxDB Secrets rather than using plaintext in your Flux query. This will also make it easier to reuse credentials across multiple Flux scripts and obviously reduce your chances of major security issues. Here’s an example of how the Postgres query would look using secrets and importing the credentials into your script:

import "influxdata/influxdb/secrets"
import "sql"

username = secrets.get(key: "POSTGRES_USERNAME")
password = secrets.get(key: "POSTGRES_PASSWORD")

  driverName: "postgres",
  dataSourceName: "postgresql://${username}:${password}@localhost",
  query:"SELECT * FROM table_name"

For more information about the sql.from method parameters, you can read the documentation.

Real-world use cases for using external data with InfluxDB

So now you’ve got an understanding of how you can join data from external sources with data stored inside InfluxDB using Flux. But what are some actual practical real-world use cases for this?

An obvious use would be to enhance your data visualizations using external data to provide more context for anybody viewing your charts or dashboards. Being able to pull in data like the actual human readable name for which room a sensor is located in is helpful compared to a relatively meaningless sensor IDl. Depending on your data set there are many different ways you could improve your data visualizations. A more advanced use case would be for making more flexible dashboards or templates. In this situation SQL queries can be used to create dashboard variables for dropdown lists and other fields which can make it faster and easier for end users to filter the data they are looking at without having to write their own query.

Another use case would be creating alerts based on external data. Each sensor might have a different defined threshold that can be used to compare the current value to. If that threshold is out of range, an automated action could be taken. For something like application monitoring, you might have the service owner’s contact information stored, and this data could be used to create an automated SMS or slack message.

Flux also provides the ability to output calculated data to an external database with the sql.to method. After running a calculation in Flux, you could export the result to be stored in a relational database.

Next steps

Hopefully this article gave you a decent introduction to how you can use Flux to combine the strengths of a time series database like InfluxDB with the relational database of your choice to get the best of both worlds. Below are some additional resources for readers who want to get a deeper understanding of InfluxDB and Flux:

  • InfluxDB Time to Awesome book - A detailed guide for learning all things InfluxDB. This free book will teach you how to structure your data inside InfluxDB, how to query it with Flux, and how to take advantage of the entire InfluxDB ecosystem
  • Telegraf - An open source tool for collecting, transforming, and storing data without having to write any code. Has over 250 input plugins and can output to over 40 different data stores.
  • What is time series data - A deep dive into what exactly time series data is and when you need it
  • Flux join tutorial - A more detailed look at how to join data using Flux from the docs
  • Time Series Databases explained - A resource explaining what a time series database is, when they should be used, and why they matter
  • How to Group, Shape, and Analyze data with Flux - This tutorial will teach you some more advanced features of Flux and provides some visualizations to help you see how you are transforming the data
  • How to Pivot and Join time series data with InfluxDB - Another tutorial showing off the various features of Flux that can be used to transform your data