How to Time Your Data Collection with Telegraf Agent Settings


Many Telegraf and InfluxDB users spend a lot of time finding the right balance between getting in the data they want and not writing so much that they end up with unnecessary data in their database. This blog post will give you a better understanding of Telegraf’s data collection settings and help you fine-tune your configuration.

You can use this post as a guide to setting when and how often Telegraf collects data from your device, which also helps when an endpoint limits the number of queries it can handle.

This post will only cover the settings for data received by Telegraf. In another post, I’ll cover how data gets transmitted and sent out from Telegraf.

Set how often you want to gather metrics with interval

Interval is the setting to determine how often Telegraf should gather metrics and is the data collection interval for all inputs. The more often you want to gather your metrics, the lower you should configure your interval.

The interval setting can be set at the agent or plugin level. You would use it at the plugin level if one input should be run less or more often than others.

The default interval setting of 10 seconds collects your data every 10 seconds. If you wanted to collect your data twice as often, you would lower the setting to interval = "5s", and collection would run at the 0, 5, 10, 15, 20 second marks and so on of every minute. The interval setting is rather straightforward but is also one of the most important configurations for the behavior of your data collection.
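As a rough sketch of how this might look in a telegraf.conf, the fragment below sets an agent-wide interval and overrides it for one input. The choice of the cpu and disk plugins and the 60s value are illustrative assumptions, not recommendations.

```toml
[agent]
  # Collect from every input every 5 seconds unless a plugin overrides it.
  interval = "5s"

# Uses the agent-level interval (5s).
[[inputs.cpu]]

# Illustrative override: disk usage changes slowly, so collect it once a minute.
[[inputs.disk]]
  interval = "60s"
```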

set interval

In the above graphic, with interval = "10m" and round_interval = true, Telegraf collects data every 10 minutes on the rounded ten-minute marks of the hour (:00, :10, :20, and so on).

round_interval determines if Telegraf will round collection interval

The round_interval setting determines whether Telegraf rounds the collection interval. It is a boolean (true/false). In the previous example, with round_interval set to true, Telegraf collected on the rounded 10 minutes. If the interval is set to “5m” and round_interval is enabled, Telegraf will collect at :00, :05, :10, :15, :20, etc. Timestamps are especially crucial when working with time series data, and the round_interval setting helps keep timestamps aligned and uniform across all your sources and Telegraf agents.

However, if we change round_interval to false, Telegraf collects as soon as it starts and then continues at whatever interval is set. In the example below, we’ll set round_interval = false and keep the same interval = "10m". If we start Telegraf at the 7th minute of the hour, Telegraf will immediately gather metrics and then continue collecting every 10 minutes, at the 17, 27, 37, 47, and 57 minutes of the hour.
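A minimal configuration sketch for this unrounded case might look like the following. Only the two settings discussed here are shown, and the rest of the agent section is assumed to stay at its defaults.

```toml
[agent]
  interval = "10m"
  # false: collect as soon as Telegraf starts, then every 10 minutes from that moment.
  # true (the default): align collection to :00, :10, :20, and so on.
  round_interval = false
```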

round-interval

Offset collection by a random amount using collection_jitter

If a user has a large number of input metrics configured, adding some randomness to the collection process can help ensure targets do not get overloaded. You can use the collection_jitter for this exact use case to jitter the collection by a random amount. This setting can be used on the agent or input plugin level. If used at the plugin level, each plugin will sleep for a random time within the set jitter before collecting.

The jitter is applied as a random delay on top of the scheduled collection time: each collection is pushed back by a random amount between zero and the jitter value. So if we keep our interval at 10m and set collection_jitter = "2m", each input will fire at a random point within the 2 minutes following each 10-minute mark.
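The fragment below is a sketch of jitter set at both the agent and the plugin level. The http input, its URL, and the 30s value are placeholders chosen for illustration, not a real endpoint or a recommended value.

```toml
[agent]
  interval = "10m"
  round_interval = true
  # Each input sleeps for a random time within 2 minutes before collecting.
  collection_jitter = "2m"

# Illustrative plugin-level override (placeholder URL, not a real endpoint).
[[inputs.http]]
  urls = ["http://localhost:8080/metrics"]
  data_format = "json"
  # This input uses a smaller random delay than the agent-wide setting.
  collection_jitter = "30s"
```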

collection_jitter

Precisely schedule collection with collection_offset

The collection_offset setting is a newer agent setting, added in Telegraf 1.22. It shifts the collection by a given amount of time. This setting allows you to avoid many plugins querying constrained devices at the same time by manually scheduling them in time, whereas collection_jitter offsets them randomly.

If you have remote data that gets updated at the same time in the middle of the night, you can use collection_offset to schedule the collection precisely after it gets uploaded and continue doing so on a daily cadence.

In this example, we will maintain the interval = "10m" setting, return round_interval and collection_jitter to their defaults, and set collection_offset = "6m". No matter what time we start Telegraf, we will collect data at the 6, 16, 26, 36, 46, and 56 minutes of every hour.

Collection_offset

Also, unlike collection_jitter, collection_offset lets you maintain a consistent and exact interval for Telegraf to collect on, giving you controlled data collection with predictable timestamps.
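The offset example could be written roughly as follows. The agent section mirrors the settings described in this post, while the http input, its URL, and the 8m override are hypothetical and only show how a second collection could be staggered against a constrained device.

```toml
[agent]
  interval = "10m"
  round_interval = true
  collection_jitter = "0s"
  # Shift every collection 6 minutes past the rounded interval: :06, :16, :26, ...
  collection_offset = "6m"

# Hypothetical plugin-level override so this input collects at :08, :18, :28, ...
[[inputs.http]]
  urls = ["http://device.example/status"]  # placeholder URL
  data_format = "json"
  collection_offset = "8m"
```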

Set the precision of your timestamp

The precision setting doesn’t control how your data gets collected, but it matters for the granularity of the timestamps on that data. With the precision setting, you set the precision of your timestamps by specifying an integer and a unit (e.g., 1ns, 1us, 1ms, or 1s). Precision can be set anywhere from seconds down to nanoseconds. Note that precision is not applied to service input plugins such as statsd.

Precision    Example timestamp in Unix format
1ns          1631202459121837307
1us          1631202459121837000
1ms          1631202459121000000
1s           1631202459000000000
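In a configuration file, the setting might be written as in the sketch below. The 1ms value is just one of the options listed above, picked for illustration.

```toml
[agent]
  interval = "10s"
  # Store collected timestamps at millisecond granularity.
  # Not applied to service inputs such as statsd.
  precision = "1ms"
```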

Learn more!

Check out this video from Josh Powers where he discusses some of these settings along with more Telegraf agent settings.

If you have any questions on how to use these data collection settings in your Telegraf configuration, feel free to reach out on our community site or Slack. If you want to dive in deeper with Telegraf, InfluxDB University provides a full list of courses covering Telegraf, InfluxDB, Flux and so much more.