Recorded: December 2016
In this short video, Regan Kuchan will introduce the basics of InfluxQL, InfluxDB’s SQL-like query language, focusing on the WHERE clause & filtering data based on timestamps.
Watch the webinar “InfluxQL Shorts Part 3” by clicking on the download button on the right. This will open the recording.
Here is an unedited transcript of the webinar “InfluxQL Shorts Part 3.” This is provided for those who prefer to read than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
• Regan Kuchan: Technical Writer, InfluxData
Regan Kuchan 00:01.188 Hi, and welcome to part three of the InfluxQL Shorts. In this series, we introduce the basics of InfluxQL, InfluxDB SQL-like query language. In this segment, we’re going to talk about using the WHERE clause to filter data based on timestamps. Here we have the complete InfluxQL select statement. In the previous short we discussed filtering tags and fields in the WHERE clause. In this segment, we’ll also be focusing on the WHERE clause. But we’ll go through how to filter data based on timestamps. You’ll remember from the previous short that the WHERE clause is optional, but it is essential to writing efficient InfluxQL queries. Before getting into the syntax there are a couple things you should remember and know about the WHERE clause. First, in the previous InfluxQL Short, we discussed using the WHERE clause to filter data based on specific field key value pairs and tag key value pairs. Filtering data on tags and fields makes your queries more efficient and they’ll return faster. Another option for restricting query data coverage, which is the topic of this short, is to filter data based on timestamps. The default time range for most queries is between a day in 1677 and a day in 2262 UTC. Restricting that time range will do good things for your query response times.
Regan Kuchan 01:26.323 In the next slides, I’m going to be using sample time-series data from the National Oceanic and Atmospheric Administration. We use these data throughout the InfluxQL documentation on our doc site. You can download the data and write them to your InfluxDB instance with the two commands on this slide. These instructions are also available at the docs at this link if you’d like to find them later and play around with the sample data set. Let’s start with a couple simple select statements to give you an idea of what WHERE time clauses can do. On this slide, we have a basic select statement without a WHERE clause. It returns all fields and tags from the H2O feet measurement. The next query includes a WHERE clause that filters data on absolute time, so an existing timestamp. Here we ask for all data in the H2O feet measurement with a timestamp that matches August 18th, 2015 at 12 minutes after midnight. You can see in the query results that two data points meet that criteria. Another way to restrict time ranges in the WHERE clause is to use relative time, that is specified timestamps that occur with respect to the current time, so, now. The query asks for all data in the H2O feet measurement with timestamps that occur within the past 452 days. Notice that this query can return different results depending on when you run it. Now is always changing, so the query results can too. And the final introductory query serves to show that you can use relative time and absolute time in the same WHERE clause. So this query asks for all data from the H2O feet measurement with timestamps that occur within the past 452 days and with timestamps that are less than September 18th, 2015 at 7:54 PM.
Regan Kuchan 03:28.219 Now that you’ve seen how it works in general, let’s get into some of the specific syntax for working with absolute time in the WHERE clause. The first option is to use the RFC3339 formatted timestamps. I’ll call this the RFC timestamp because those numbers aren’t the easiest thing to say over and over. Anyway, the RFC timestamp has a very specific format. The first part is the date. So the four digit year, a dash, the two digit month, a dash, and finally, the two digit day. The date is followed by a T, and then the time. The time is the two digit hour, a colon, the two digit minute, a colon, the two digit second, followed by a period, the nine digit nanosecond, and finally, the letter Z. The nine digit nanosecond is actually optional. So this is a valid timestamp. In this case, InfluxDB assumes a zero nanoseconds. The RFC timestamp must be single quoted. Here’s an example of a WHERE clause that includes an RFC timestamp. We select all values of water level in the H2O feet measurement with timestamps that occur before August 18th, 2015, six minutes after midnight. Another option for specifying absolute time is to use an RFC-like timestamp. This format is almost the same as the format we saw before, but this time, you don’t have the T between the date and the time, and we don’t have the Z at the end of the timestamp. Once again, specifying nanoseconds is optional. If you don’t specify the nanoseconds, InfluxDB assumes zero nanoseconds. Unlike the official RFC timestamp, you do not need to specify a time in the RFC-like timestamp. If you only specify the date, InfluxDB assumes that the time is midnight. Notice that like the RFC timestamp, the RFC-like timestamp must be single quoted. The example on this slide uses the RFC-like timestamp to select all values of water level in the H2O feet measurement with timestamps that occur after September 18th, 2015 at 9:30 PM. In the results, you can see that two data points satisfy the requirement.
Regan Kuchan 05:38.879 The final option for specifying absolute time is to use epoch time. Epoch time is defined as the number that represents the amount of time that has elapsed since midnight UTC, Thursday, the 1st of January, 1970. Here are a couple examples. The first is the number of nanoseconds that have passed, and it includes the ns duration literal. That ns duration literal is optional. InfluxDB will assume that your timestamp is at the nanosecond precision if you do not include a duration literal with your epoch timestamp. That’s what the second example is showing there. The last example is a second precision timestamp. So this one indicates the number of seconds that have passed since January 1st, 1970. It’s important to note that if you forget that s duration literal, InfluxDB will assume that the timestamp is in nanoseconds. That means InfluxDB will query data with very different timestamps. This example selects all water level values with timestamps that occur before that nanosecond timestamp.
Regan Kuchan 06:40.777 Okay. Now, we’ve covered all the formats for absolute time. We’ll be happy to know that there’s just one format for specifying relative time in the WHERE clause. To work with relative time, we’ll use now. Now is the Unix time of the server at the time that you execute the query. In this example, we select the values of water level in the H2O feet measurement with timestamps that occur before now. Another thing you can do with timestamps is perform basic arithmetic. Add and subtract time from timestamps by providing the duration and the relevant duration unit. In the three examples on this slide, we add 6 minutes to an RFC timestamp, subtract 2 hours from a second precision epoch timestamp, and subtract 452 days from now or the current time. The valid duration units include, going through the list, nanoseconds, microseconds, milliseconds, seconds, minutes, hours, days, and weeks. If you don’t include the duration unit, InfluxDB assumes that it is in nanoseconds. And note that that space between the operator and the duration is required.
Regan Kuchan 07:52.626 In this last syntax slide, we’ll walk through all of the valid operators for working with timestamps in the WHERE clause. You can use equals, not equals, another way to say not equals, greater than and less than, greater than and equal to, or less than or equal to. And you can also use and to combine conditions. In these last few slides, we’ll go through some of the common issues that we see with time syntax in the WHERE clause. The first common issue is if you forget to single quote the RFC or RFC-like timestamps. If you double quote them, you’ll get an invalid operation error, and if you forget to quote them altogether, you’ll get a query parsing error. So don’t forget the single quotes. The second common issue is what happens if you attempt to select disparate time ranges. So if you try to select a time range that includes a gap in time. This query attempts to select two separate time ranges. The first in on August 18, 2015, between midnight and six minutes after midnight, and the second is on September 18, 2015, between 9:36 PM and 9:42 PM. If you attempt to do something like this, your query will return no results, and you will not receive an error. Currently, InfluxDB doesn’t support that kind of time range in the WHERE clause.
Regan Kuchan 09:21.351 The final common issue isn’t really with the query language, but it’s a pretty common question that we get. It’s how to configure the timestamp format that you get in your query response. If you’re using InfluxDB’s command line interface, the CLI, the default timestamp format is epoch. You can change that by entering precision RFC3339, and that will make the return timestamps understandable to the human eye. Next, if you’re using the HTTP API, the default timestamp format is RFC3339. You can change that by using the epoch query string parameter and specifying the relevant precision. So here, I asked for timestamps to return in epoch in seconds.
Regan Kuchan 10:06.668 That’s it for this InfluxQL Short. If you’d like to find out more about the select statement and time ranges in the WHERE clause, check out the data exploration page in the documentation. It covers what I just talked about and more. The next Short we’ll go into detail about the incredibly useful GROUP BY clause. Happy querying.