<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>InfluxData Blog - Alan Caldera</title>
    <description>Posts by Alan Caldera on the InfluxData Blog</description>
    <link>https://www.influxdata.com/blog/author/alan-caldera/</link>
    <language>en-us</language>
    <lastBuildDate>Sun, 01 Oct 2017 06:00:54 -0700</lastBuildDate>
    <pubDate>Sun, 01 Oct 2017 06:00:54 -0700</pubDate>
    <ttl>1800</ttl>
    <item>
      <title>Enriching Your Data with Kapacitor</title>
      <description>&lt;p&gt;From time to time, we see requests from the community about querying InfluxDB based on a business period, such as the typical business day or individual shift periods. Consider the following request: How do I summarize my data for the entire month of August just for business hours, defined as Monday through Friday between 8:00 AM and 5:00 PM? InfluxQL does not currently provide functions for filtering on components of a timestamp, such as the hour of the day or the day of the week. We are limited to an absolute time range:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;SELECT * FROM "mymeasurement" WHERE time &amp;gt;= '2017-08-01 08:00:00.000000' and time &amp;lt;= '2017-08-31 17:00:00.000000';&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So how do we accomplish this? The provided Telegraf plugins typically send just a timestamp along with the metrics and tags of the configured plugin, plus any static tags you define. The solution is to use Kapacitor as a “pre-processor” to “decorate” or enrich your data with a computed value that represents the time period you want to query.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/legacy-uploads/telegraf-kapacitor-influxdb-1-300x50.png" alt="" width="450" height="75" /&gt;&lt;/p&gt;

&lt;p&gt;For the purposes of this article we are running on &lt;code&gt;localhost&lt;/code&gt; for Telegraf, InfluxDB, and Kapacitor, but in a full-fledged environment these will be running on different hosts.&lt;/p&gt;

&lt;p&gt;The first step is to configure Telegraf to write to Kapacitor instead of directly to InfluxDB. In the &lt;code&gt;[[outputs.influxdb]]&lt;/code&gt; section of your &lt;code&gt;telegraf.conf&lt;/code&gt; file, there are 3 key settings to consider:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;[[outputs.influxdb]]
urls = ["http://localhost:9092"]
database = "kap_telegraf"
retention_policy = "autogen"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;urls&lt;/code&gt; parameter must point to port 9092 (Kapacitor’s default listening port) instead of port 8086 for InfluxDB. The &lt;code&gt;database&lt;/code&gt; parameter should point to a non-existent database (you can ignore Telegraf’s warning about the database not being found). The &lt;code&gt;retention_policy&lt;/code&gt; parameter should be set to “autogen” or to a specific retention policy that you have previously created in your instance.&lt;/p&gt;

&lt;p&gt;CAUTION: Leaving &lt;code&gt;retention_policy&lt;/code&gt; set to “” (the default) is not the same as setting it to “autogen”, which is the name of the default retention policy created when InfluxDB is initialized.&lt;/p&gt;

&lt;p&gt;All other settings in &lt;code&gt;telegraf.conf&lt;/code&gt; can be configured normally for your instance.&lt;/p&gt;

&lt;p&gt;The next step is to create a TICKscript that will process the data coming from Telegraf. In this example, we want to create a tag that records, as a true or false value, whether each data point falls within the business hours described above.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;stream
  |from()
   .database('kap_telegraf')
  |eval(lambda: if(((weekday("time") &amp;gt;= 1 AND weekday("time") &amp;lt;= 5) AND (hour("time") &amp;gt;= 8 AND (hour("time")*100+minute("time")) &amp;lt;= 1700)), 'true', 'false'))
     .as('business_hours')
     .tags('business_hours')
     .keep()
  |delete()
     .field('business_hours')
  |influxDBOut()
    .database('telegraf')
    .retentionPolicy('autogen')
    .tag('kapacitor_augmented','true')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this TICKscript, we are streaming from the non-existent database “kap_telegraf” that we configured above in our &lt;code&gt;telegraf.conf&lt;/code&gt; file. The &lt;code&gt;from()&lt;/code&gt; node chained to &lt;code&gt;stream&lt;/code&gt; only needs the database to match. We then pass control to an &lt;code&gt;eval()&lt;/code&gt; node that evaluates whether the point falls within the window that we have defined. In this case, we use the &lt;code&gt;weekday()&lt;/code&gt;, &lt;code&gt;hour()&lt;/code&gt; and &lt;code&gt;minute()&lt;/code&gt; functions described at &lt;a href="https://docs.influxdata.com/kapacitor/v1.3/tick/expr/#time-functions" target="_blank" rel="noopener"&gt;https://docs.influxdata.com/kapacitor/v1.3/tick/expr/#time-functions&lt;/a&gt; to evaluate the “time” value. The first part of the condition checks whether the day of the week falls between Monday (1) and Friday (5). The second part checks the time of day: the hour must be at least 8, and we want to stop exactly at the “00” mark in hour 17 (5 PM). To do that, we multiply the hour by 100, add the result of the &lt;code&gt;minute()&lt;/code&gt; function, and compare the sum to the ending time of 1700. If the &lt;code&gt;time&lt;/code&gt; value falls within this range, the &lt;code&gt;eval()&lt;/code&gt; node returns the string 'true' as a field called &lt;code&gt;business_hours&lt;/code&gt;; otherwise it returns 'false'. However, since we want to query on this value, it should be a tag, so we chain a &lt;code&gt;.tags()&lt;/code&gt; method to convert the value to a tag called &lt;code&gt;business_hours&lt;/code&gt;.&lt;/p&gt;
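
&lt;p&gt;The &lt;code&gt;hour * 100 + minute&lt;/code&gt; comparison can be made more explicit by computing it once as a named value. The following is a minimal sketch rather than the script used in this article: the intermediate field name &lt;code&gt;hhmm&lt;/code&gt; is hypothetical, and the sketch assumes that later expressions in a single &lt;code&gt;eval()&lt;/code&gt; node can reference the results of earlier ones and that &lt;code&gt;.field()&lt;/code&gt; can be chained more than once on a &lt;code&gt;delete()&lt;/code&gt; node.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;stream
  |from()
    .database('kap_telegraf')
  // First expression: encode the time of day as an integer, e.g. 08:30 becomes 830.
  // Second expression: reuse that value (assumed to be visible as the field "hhmm").
  |eval(lambda: hour("time") * 100 + minute("time"), lambda: if(weekday("time") &amp;gt;= 1 AND weekday("time") &amp;lt;= 5 AND "hhmm" &amp;gt;= 800 AND "hhmm" &amp;lt;= 1700, 'true', 'false'))
    .as('hhmm', 'business_hours')
    .tags('business_hours')
    .keep()
  // Drop the helper field and the field copy of the new tag.
  |delete()
    .field('hhmm')
    .field('business_hours')
  |influxDBOut()
    .database('telegraf')
    .retentionPolicy('autogen')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With this encoding, 08:00 maps to 800 and 17:00 maps to 1700, so the window includes both end points, while a point at 17:01 maps to 1701 and is tagged 'false'.&lt;/p&gt;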

&lt;p&gt;IMPORTANT: By default, the &lt;code&gt;eval()&lt;/code&gt; node keeps only the newly computed fields and drops all other fields from the point, so we specify &lt;code&gt;.keep()&lt;/code&gt; to retain the original field values in the stream.&lt;/p&gt;

&lt;p&gt;At this point, we have both a field and a tag called &lt;code&gt;business_hours&lt;/code&gt; that contain the output of the &lt;code&gt;eval()&lt;/code&gt; node. We should remove the redundant field from the stream by calling a &lt;code&gt;delete()&lt;/code&gt; node that specifies the field to be deleted via the &lt;code&gt;.field()&lt;/code&gt; method.&lt;/p&gt;

&lt;p&gt;Finally, we pass control in the stream to an &lt;code&gt;influxDBOut()&lt;/code&gt; node that specifies the destination database and retention policy to write to. We have added an additional static tag for this example called &lt;code&gt;kapacitor_augmented&lt;/code&gt;. All other data, such as the measurement name, is carried through, so it is only necessary to provide new information. In this case, we write to the &lt;code&gt;telegraf&lt;/code&gt; database and the &lt;code&gt;autogen&lt;/code&gt; retention policy.&lt;/p&gt;

&lt;p&gt;Once you have created the TICKscript (referenced below as &lt;code&gt;business_hours.tick&lt;/code&gt;), you must tell Kapacitor to run it. For this example, we will use the Kapacitor CLI to configure the task.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ kapacitor define business_hours -type stream -dbrp kap_telegraf.autogen -tick business_hours.tick&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This command defines a task called &lt;code&gt;business_hours&lt;/code&gt; as a stream type listening for writes to &lt;code&gt;kap_telegraf.autogen&lt;/code&gt;, which combines the database name and retention policy name that were configured in Telegraf above. Once we have successfully created the task, we need to enable it for processing by Kapacitor.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ kapacitor enable business_hours&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To show the status, we can ask Kapacitor to list the current tasks:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ kapacitor list tasks
ID             Type      Status    Executing Databases and Retention Policies
business_hours stream    enabled   true      ["kap_telegraf"."autogen"]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once data begins to flow through Kapacitor to InfluxDB, you can add the condition &lt;code&gt;AND business_hours='true'&lt;/code&gt; to the first query we specified:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;SELECT * FROM "mymeasurement" WHERE time &amp;gt;= '2017-08-01 08:00:00.000000' and time &amp;lt;= '2017-08-31 17:00:00.000000' AND business_hours='true';&lt;/code&gt;&lt;/pre&gt;
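&lt;p&gt;The same approach extends to the shift periods mentioned at the start of this article. As a minimal sketch, the expression below tags each point with a shift name instead of a true or false value. The shift boundaries (06:00, 14:00 and 22:00), the shift names, and the tag name &lt;code&gt;shift&lt;/code&gt; are all hypothetical, and the script is meant as a variation on the task above rather than something to run alongside it.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;stream
  |from()
    .database('kap_telegraf')
  // Nested if(): 06:00-13:59 is 'day', 14:00-21:59 is 'evening', everything else is 'night'.
  |eval(lambda: if(hour("time") &amp;gt;= 6 AND hour("time") &amp;lt; 14, 'day', if(hour("time") &amp;gt;= 14 AND hour("time") &amp;lt; 22, 'evening', 'night')))
    .as('shift')
    .tags('shift')
    .keep()
  // Remove the field copy so that only the tag is written.
  |delete()
    .field('shift')
  |influxDBOut()
    .database('telegraf')
    .retentionPolicy('autogen')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A query could then filter on a single shift with a condition such as &lt;code&gt;AND shift='day'&lt;/code&gt;.&lt;/p&gt;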
&lt;p&gt;In summary, we have shown a simple example of how to use Kapacitor to enrich your data coming from Telegraf into InfluxDB. This method could be used to add other types of data, potentially obviating the need to create custom Telegraf plugins to meet your business needs.&lt;/p&gt;
</description>
      <pubDate>Sun, 01 Oct 2017 06:00:54 -0700</pubDate>
      <link>https://www.influxdata.com/blog/enriching-your-data-with-kapacitor/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/enriching-your-data-with-kapacitor/</guid>
      <category>Product</category>
      <category>Developer</category>
      <author>Alan Caldera (InfluxData)</author>
    </item>
  </channel>
</rss>
