How to Pivot Your Data in Flux: Working with Columnar Data

By Michael Hall / Product, Use Cases, Developer
Oct 12, 2021

Navigate to:

Relational databases are by far the most common type of database, and as software developers it’s safe to say that they are the kind of database most of us got started on, and probably still use on a regular basis. And one thing that they all have in common is the way they structure data. InfluxDB, however, structures data a little bit differently.

RDBMS default table structure

With a relational database you have a collection of tables, each with its own fixed layout of columns, and you read and write data in rows. Each row has a column that holds a value for each field, and you get all of those values at the same time, giving you a table that looks like this:

time	field1	field2	host
<data_tstamp>	123	789	myhost1
<data_tstamp>	456	101	myhost2

InfluxDB result data

InfluxDB doesn’t follow this pattern. It returns data in what’s known as a columnar format, which just means that you will get the data one column at a time, rather than one row at a time. This is often a source of confusion for developers who are used to the row format they worked with from a relational database, but it has many benefits when working with time series data.

When you run a basic Flux query like this:

from(bucket: "telegraf")
  |> range(start: -1d)

You will get data returned like this:

table	_measurement*	_field*	_value	_start*	_stop*	_time	host*
0	mydata	field1	123	<query_start>	<query_stop>	<data_tstamp>	myhost1
0	mydata	field1	456	<query_start>	<query_stop>	<data_tstamp>	myhost2
1	mydata	field2	789	<query_start>	<query_stop>	<data_tstamp>	myhost1
1	mydata	field2	101	<query_start>	<query_stop>	<data_tstamp>	myhost2

The first thing you’ll notice is that the table’s column headers are NOT your field names like they would be if you were querying a SQL database. Instead your query will return your results with these same columns, and each record (represented by r in the query below) in the result will represent a single field’s value at that point in time.

This is why when you filter your Flux query, it looks like this:

|> filter(fn: (r) => r["_measurement"] == "mydata")
|> filter(fn: (r) => r["_field"] == "field1")

This data structure allows for faster queries and aggregations in the most common usage patterns for InfluxDB. Because InfluxDB is processing one column’s values at a time, it can perform calculations on the data for just that field instead of performing multiple calculations at the same time.

Pivoting between column-like and row-like data layouts

There are times, however, when you are going to want your data to look more like the results of a traditional database query. To do that you need to perform a “pivot” operation. Pivoting data is the act of flipping it from having the fields in separate rows to having the fields in separate columns.

Flux provides a pivot function that does precisely this:

|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")

This function will flip your data so that every row represents a moment in time (rowKey:["_time"]), your fields will each become a column (columnKey: ["_field"]), and the value in each column will be your field’s value (valueColumn: "_value"). Giving you a result that looks like this:

table	_measurement*	_start*	_stop*	_time	host*	field1	field2
0	mydata	<query_start>	<query_stop>	<data_tstamp>	myhost1	123	789
0	mydata	<query_start>	<query_stop>	<data_tstamp>	myhost2	456	101

Using fieldsAsCols instead

This is the most common way people want to pivot their data, because it puts it into the same shape as they’re used to working with from a traditional database. It’s so common, in fact, that we provide a shortcut function in the schema package just for that use case.

After importing schema into your Flux script, the above line of code can be replaced with:.

Flux
  |> fieldsAsCols()

Other ways to pivot

Tags as columns

Even though it’s the most common way to pivot, fields are not the only thing you can flip up into column headers. Suppose you want to compare values not against different fields, but against the same field from different sources. For example, if you want to compare readings from different server hosts taken at the same time.

In our previous example, our data had a tag called host that contained this value.

table	_measurement*	_field*	_value	_start*	_stop*	_time	host*
0	mydata	field1	123	<query_start>	<query_stop>	<data_tstamp>	myhost1
0	mydata	field1	456	<query_start>	<query_stop>	<data_tstamp>	myhost2
1	mydata	field2	789	<query_start>	<query_stop>	<data_tstamp>	myhost1
1	mydata	field2	101	<query_start>	<query_stop>	<data_tstamp>	myhost2

If we wanted to combine the values of each field from all of our hosts into a single record, we could do that with:

|> group()
  |> pivot(rowKey:["_time", "_field"], columnKey: ["host"], valueColumn: "_value")

The first line re-groups our data because InfluxDB by default will put values from different hosts (or any difference in tag values) into separate result tables. Since pivot works on one result table at a time, we have to call group first to put the values from both hosts into a single table, then we can pivot that combined table around the host values.

This query gives us a table structure that looks like this:

table	_field	_start	_stop	_time	myhost1	myhost2
0	field1	<query_start>	<query_stop>	<data_tstamp>	123	456
0	field2	<query_start>	<query_stop>	<data_tstamp>	789	101

Combined columns

You can also use pivot to combine field names with tag values. Building off the above example, suppose you want to see the values of both field1 and field2 for each host in a single row?

We can do that with a slight modification to our previous function call:

|> group()
  |> pivot(rowKey:["_time"], columnKey: ["host", "_field"], valueColumn: "_value")

By moving the _field data from the rowKey to the columnKey, along with the host data, we get a table that combines both into unique columns.

table	_start	_stop	_time	myhost1_field1	myhost1_field2	myhost2_field1	myhost2_field2
0	<query_start>	<query_stop>	<data_tstamp>	123	789	456	101

Navigate to:

Try InfluxDB Cloud

Stop flying blind

How to Pivot Your Data in Flux: Working with Columnar Data

By Michael Hall / Product, Use Cases, Developer
Oct 12, 2021

Navigate to:

RDBMS default table structure

InfluxDB result data

Pivoting between column-like and row-like data layouts

Using fieldsAsCols instead

Other ways to pivot

Tags as columns

Combined columns

Further reading

Try InfluxDB Cloud

Learn more about InfluxDB

Performance Benchmarking: InfluxDB 3.0 vs. InfluxDB Open Source

InfluxDB for Industrial IoT:  
A Live Demonstration

How Time Series Databases and Data Lakes Work Together

Data Warehousing

Network Monitoring

Time Series Data Analysis: Definitions and Best Techniques in 2024

Product & Solutions

Developers

Company

Data Workloads

Navigate to:

Try InfluxDB Cloud

Stop flying blind

Get Updates

How to Pivot Your Data in Flux: Working with Columnar Data

By Michael Hall / Product, Use Cases, Developer Oct 12, 2021

Navigate to:

RDBMS default table structure

InfluxDB result data

Pivoting between column-like and row-like data layouts

Using fieldsAsCols instead

Other ways to pivot

Tags as columns

Combined columns

Further reading

Try InfluxDB Cloud

Learn more about InfluxDB

Performance Benchmarking: InfluxDB 3.0 vs. InfluxDB Open Source

InfluxDB for Industrial IoT: A Live Demonstration

How Time Series Databases and Data Lakes Work Together

Data Warehousing

Network Monitoring

Time Series Data Analysis: Definitions and Best Techniques in 2024

Product & Solutions

Developers

Company

Sign up for the InfluxData newsletter

Follow Us

By Michael Hall / Product, Use Cases, Developer
Oct 12, 2021

InfluxDB for Industrial IoT:  
A Live Demonstration