Recorded: January 2017
In this short video, Regan Kuchan will introduce the basics of InfluxQL, InfluxDB’s SQL-like query language, focusing on the GROUP by clause.
Watch the webinar “InfluxQL Shorts Part 4” by clicking on the download button on the right. This will open the recording.
Here is an unedited transcript of the webinar “InfluxQL Shorts Part 4.” This is provided for those who prefer to read than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
• Regan Kuchan: Technical Writer, InfluxData
Regan Kuchan 00:02.111 Hi and welcome to Part 4 of the InfluxQL Shorts. In this series, we introduce the basics of InfluxQL, InfluxDB SQL-like query language. In this segment, we’re going to talk about using the Group By clause to group query results by tags. Here we have the complete InfluxQL select Statement. In the previous Shorts, we discussed the Select clause, the From clause, and the Where clause. For this segment, we’re going to focus on the group by clause. The Group By clause is optional, but it is a great clause for organizing and sorting query results by specific series and time intervals.
Regan Kuchan 00:41.202 Before getting into the syntax, there are a couple of things we want to highlight about the Group By clause. First, you can use the Group By clause to group query results by specific tag key values. This is great for breaking down results by individual tags in a measurement, and it allows you to dig into your data for more specific information. A second thing to remember is that the Group By clause does not work with fields. This is something to take into account when designing your schema in InfluxDB. If you expect to group query results by specific key value pairs, you probably want to make those data a tag. Finally, the Group By clause allows you to group query results into specific time ranges. I won’t be discussing that functionality in this segment. It’s a much longer topic and deserves a Short of its own.
Regan Kuchan 01:30.503 Like the other Shorts in the series, we’re going to be using sample time series data from the National Oceanic and Atmospheric Administration. We use this data throughout the influxQL documentation on our doc site. You can download the data and write them to your InfluxDB instance with the two commands on this slide. As always, these instructions are available in the docs at this link if you’d like to find them later and play around with the sample data set.
Regan Kuchan 01:58.881 Let’s start out with the hello world equivalent of the Group By clause using our sample data. This query selects values of the water level field from the H2O feet measurement, and it asks InfluxDB to group query results by the location tag. In the results below, you can see that InfluxDB returns results in separate tables that are grouped by the different values of the location tag. That first table returns the values of water level where the tag location equals Coyote Creek. And the second table shows the values of water level where the tag location equals Santa Monica.
Regan Kuchan 02:36.761 Here’s another group by query that’ll probably be a little bit more like the queries that you’ll actually be running. This one uses an InfluxQL function to calculate the maximum value of the water level field from the H2O feet measurement, and it asks InfluxDB to calculate the maximum value for every tag value of the location tag. Again, we end up with two tables of results. The first table reports the maximum water level where the location equals Coyote Creek, and the second reports the maximum water level where the location equals Santa Monica. For more on functions, you can check out this doc which goes into a lot of detail about InfluxQL functions.
Regan Kuchan 03:20.205 Now that you’ve seen how the Group By clause works in general, let’s get into some of the specific syntax for specifying tags with this useful clause. The first option is something you’ve already seen. You can group results by a single tag by specifying that tag after Group By. It’s a good thing to note that the clause order does matter in InfluxQL. So, if you’re going to include a Group By clause in your query, it does need to appear after the select, into, from, and where clauses. In this example, we select the first value of the index field from the H2O quality measurement, and we group query results by the ran tag tag.
Regan Kuchan 04:01.580 The results show the first values of index where ran tag equals one, then two, and then three. The next syntax option is to specify more than one tag in the Group By clause. You can do this by separating the tag keys by a comma. In this example, we select the first value of the index field from the H2O quality measurement, and we group query results by two tags: ran tag and location. The query results show the first value of the index field for every possible combination of the ran tag and location tags. This query returns quite a few results so let’s scroll down to take a look. The first table in the results is the first value of the index field where location equals Coyote Creek and ran tag equals one. You can see from the rest of the results that there are six different combinations of the ran tag and location tags.
Regan Kuchan 05:02.668 Another syntax option is to specify the relevant tag keys with a regular expression. In InfluxQL, regular expressions are surrounded by forward slashes and they use Golang’s regular expression syntax. In this example, the query selects the first value of the index field in the H2O quality measurement, and it groups results by all tags whose tag key includes the letter L. In the results, you can see that location is the only tag key that satisfies that requirement. And as we said before, location has two tag values, Coyote Creek and Santa Monica.
Regan Kuchan 1 05:42.601 The final syntax option is to Group By star. Grouping by star tells InfluxDB to group results by every possible tag combination in the measurement. In this example, we select the first value of the index field in the H2O quality measurement, and we group results by every possible tag combination in the H2O quality measurement. From the results, you can see that there are just two tag keys in the H2O quality measurement. So these results are actually the same as if we’d written out the ran tag tag key and the location tag key in the Group By clause.
Regan Kuchan 06:20.314 In this last slide, I’ll go over one of the common issues that we see with the Group By clause. This issue describes what happens when queries attempt to Group By a field. As I mentioned at the beginning of this Short, the Group By clause doesn’t support grouping query results by fields. And it’s good to note that if you do include a field in your Group By clause, InfluxDB will not return an error. In this example, we attempt to Group By the level description field key. InfluxDB attempts to find a tag key called level description, and it doesn’t find one. So it returns all values of water level when the tag key level description is blank, which is all values of water level because the level description tag key doesn’t exist in our schema. So just know that InfluxDB will not return an error if you attempt to group query results by fields.
Regan Kuchan 07:11.808 And that’s it for this InfluxQL Short. If you’d like to find out more about the Select Statement and the Group By clause, check out the data exploration page in the documentation. It covers what I just talked about and more. In the next Short, we’ll go into detail about using the Group By clause to group query results by time intervals. Happy querying.