<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>InfluxData Blog - Margo Schaedel</title>
    <description>Posts by Margo Schaedel on the InfluxData Blog</description>
    <link>https://www.influxdata.com/blog/author/margo-schaedel/</link>
    <language>en-us</language>
    <lastBuildDate>Wed, 03 Oct 2018 09:00:49 -0700</lastBuildDate>
    <pubDate>Wed, 03 Oct 2018 09:00:49 -0700</pubDate>
    <ttl>1800</ttl>
    <item>
      <title>Visualizing Time Series Data with Dygraphs</title>
      <description>&lt;h2&gt;&lt;img class="aligncenter size-large wp-image-219282" src="/images/legacy-uploads/m-b-m-795202-unsplash-1024x683.jpg" alt="graph on ipad" width="1024" height="683" /&gt;&lt;/h2&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;p&gt;This post walks through how to visualize dynamically updating time series data stored in &lt;a href="https://w2.influxdata.com/time-series-platform/influxdb/"&gt;InfluxDB&lt;/a&gt; (a &lt;a href="https://en.wikipedia.org/wiki/Time_series_database"&gt;time series database&lt;/a&gt;) using the JavaScript graphing library &lt;a href="http://dygraphs.com/"&gt;Dygraphs&lt;/a&gt;. If you prefer a different visualization library, check out these other integration posts on &lt;a href="https://w2.influxdata.com/blog/data-visualizations-with-influxdb-integrating-plotly-js/"&gt;plotly.js&lt;/a&gt;, &lt;a href="https://w2.influxdata.com/blog/visualizing-your-time-series-data-from-influxdb-with-rickshaw/"&gt;Rickshaw&lt;/a&gt;, and &lt;a href="https://w2.influxdata.com/blog/visualizing-data-with-highcharts/"&gt;Highcharts&lt;/a&gt;. You can also build out a dashboard in our very own &lt;a href="https://w2.influxdata.com/time-series-platform/chronograf/"&gt;Chronograf&lt;/a&gt;, which is designed exclusively for InfluxDB.&lt;/p&gt;
&lt;h2&gt;Prep and Setup&lt;/h2&gt;
&lt;p&gt;To begin with, we’ll need some sample data to display on screen. For this example, I’ll be using the data generated from a separate &lt;a href="https://w2.influxdata.com/blog/getting-started-streaming-data-into-influxdb/"&gt;tutorial&lt;/a&gt; written by DevRel Anais Dotis-Georgiou on using the Telegraf &lt;a href="https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec"&gt;exec&lt;/a&gt; or &lt;a href="https://github.com/influxdata/telegraf/tree/master/plugins/inputs/tail"&gt;tail&lt;/a&gt; plugins to collect &lt;a href="https://en.wikipedia.org/wiki/Bitcoin"&gt;Bitcoin&lt;/a&gt; price and volume data and see it trend over time. I’ll then query for the data in InfluxDB periodically using the HTTP API on the frontend. Let’s get started!&lt;/p&gt;

&lt;p&gt;Depending on whether you want to pull in Dygraphs as a script file into your index.html file or import the npm module, you can find all the relevant instructions &lt;a href="http://dygraphs.com/download.html"&gt;here&lt;/a&gt;. I added several script tags into my index.html file for ease of reference in this case:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;&amp;lt;!DOCTYPE html&amp;gt;
&amp;lt;html lang="en" dir="ltr"&amp;gt;
  &amp;lt;head&amp;gt;
    &amp;lt;meta charset="utf-8"&amp;gt;
    &amp;lt;title&amp;gt;Dygraphs Sample&amp;lt;/title&amp;gt;
    &amp;lt;link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/dygraph/2.1.0/dygraph.min.css" /&amp;gt;
    &amp;lt;link rel="stylesheet" type="text/css" href="styles.css"&amp;gt;
  &amp;lt;/head&amp;gt;
  &amp;lt;body&amp;gt;
    &amp;lt;div id="div_g"&amp;gt;&amp;lt;/div&amp;gt;
    &amp;lt;!-- Scripts go inside the body so the document stays valid HTML. --&amp;gt;
    &amp;lt;script src="https://ajax.googleapis.com/ajax/libs/jquery/3.1.1/jquery.min.js"&amp;gt;&amp;lt;/script&amp;gt;
    &amp;lt;script src="https://cdnjs.cloudflare.com/ajax/libs/dygraph/2.1.0/dygraph.min.js"&amp;gt;&amp;lt;/script&amp;gt;
    &amp;lt;script type="text/javascript" src="script.js"&amp;gt;&amp;lt;/script&amp;gt;
  &amp;lt;/body&amp;gt;
&amp;lt;/html&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Querying InfluxDB&lt;/h2&gt;
&lt;p&gt;Make sure your local instance of InfluxDB is running (you can set up all the components of the &lt;a href="https://w2.influxdata.com/time-series-platform/"&gt;TICK Stack&lt;/a&gt; locally or spin up the stack the &lt;a href="https://github.com/influxdata/sandbox"&gt;sandbox&lt;/a&gt; way). Then verify that Telegraf is collecting Bitcoin stats by running &lt;code class="language-markup"&gt;SELECT "price" FROM "exec"."autogen"."price" WHERE time &amp;gt; now() - 12h&lt;/code&gt; in the Influx shell, which you can open with the &lt;code class="language-markup"&gt;influx&lt;/code&gt; command. With time series data, you always want to scope your queries, so rather than running an unbounded &lt;code class="language-markup"&gt;SELECT * FROM "price"&lt;/code&gt;, we select only the price field and limit the time range to the last 12 hours.&lt;/p&gt;

&lt;p&gt;You should receive at least one result when running this query, depending on how long your Telegraf instance has been running and collecting stats via one of the plugins from the tutorial. Alternatively, you can navigate to your local &lt;a href="https://docs.influxdata.com/chronograf/v1.6/introduction/getting-started/"&gt;Chronograf&lt;/a&gt; instance and verify that you’re successfully collecting data via the Data Explorer page, which has an automatic query builder.&lt;/p&gt;
&lt;h2&gt;Fetching the Data from InfluxDB&lt;/h2&gt;
&lt;p&gt;In your script file, you’ll want to fetch the data from InfluxDB using the HTTP API, like so:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;const fetchData = () =&amp;gt; {
  // Query the InfluxDB HTTP API for the "price" field in the "exec" database.
  return fetch(`http://localhost:8086/query?db=exec&amp;amp;q=SELECT%20"price"%20FROM%20"price"`)
    .then( response =&amp;gt; {
      if (response.status !== 200) {
        // Stop here so an error response is never parsed as data.
        throw new Error(`InfluxDB query failed with status ${response.status}`);
      }
      return response.json();
    })
    .then( parsedResponse =&amp;gt; {
      // An empty result has no "series" key, so fall back to an empty array.
      const series = parsedResponse.results[0].series;
      if (!series) {
        return [];
      }
      // Each value is [timestamp, price]; Dygraphs expects [Date, number] rows.
      return series[0].values.map( elem =&amp;gt; [new Date(elem[0]), elem[1]] );
    })
    .catch( error =&amp;gt; {
      // Log the failure and return no rows so callers can skip drawing.
      console.log(error);
      return [];
    });
}&lt;/code&gt;&lt;/pre&gt;
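&lt;p&gt;The query in the snippet above is not limited by time, even though scoping queries is recommended earlier in this post. As a minimal sketch, a time-bounded query URL could be assembled as shown here. The &lt;code class="language-markup"&gt;buildQueryUrl&lt;/code&gt; helper and its parameters are hypothetical names introduced for illustration and are not part of the original script; the built-in &lt;code class="language-markup"&gt;URLSearchParams&lt;/code&gt; class takes care of the URL encoding.&lt;/p&gt;

```javascript
// Hypothetical helper (not part of the original script): builds an encoded
// InfluxDB /query URL that only asks for the last `hours` hours of prices.
const buildQueryUrl = (db, hours, host = 'http://localhost:8086') => {
  const q = `SELECT "price" FROM "price" WHERE time > now() - ${hours}h`;
  // URLSearchParams percent-encodes quotes, spaces and other special characters.
  const params = new URLSearchParams({ db, q });
  return `${host}/query?${params.toString()}`;
};

// Example: the last 12 hours from the "exec" database used in this post.
console.log(buildQueryUrl('exec', 12));
```

&lt;p&gt;The returned string could be passed to &lt;code class="language-markup"&gt;fetch&lt;/code&gt; in place of the hard-coded URL.&lt;/p&gt;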
&lt;h2&gt;Constructing the Graph&lt;/h2&gt;
&lt;p&gt;We can construct the graph using the Dygraphs constructor function as follows:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;const drawGraph = () =&amp;gt; {
  let g;
  fetchData()
    .then( data =&amp;gt; {
      // Nothing to draw if the fetch failed or returned no rows.
      if (!data || data.length === 0) {
        return;
      }
      g = new Dygraph(
        document.getElementById("div_g"),
        data,
        {
          drawPoints: true,
          title: 'Bitcoin Pricing',
          titleHeight: 32,
          ylabel: 'Price (USD)',
          xlabel: 'Date',
          strokeWidth: 1.5,
          labels: ['Date', 'Price'],
        });
    });

  // Fetch fresh data every five minutes (300000 ms) and redraw.
  window.setInterval( () =&amp;gt; {
    fetchData()
      .then( data =&amp;gt; {
        // Skip the update if the graph does not exist yet or no rows came back.
        if (!g || !data || data.length === 0) {
          return;
        }
        g.updateOptions( { 'file': data } );
      });
  }, 300000);
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the drawGraph function, after fetching the data from InfluxDB, we create a new Dygraph by passing three arguments: the element in which to render the graph, the data array, and our &lt;a href="http://dygraphs.com/options.html"&gt;options object&lt;/a&gt;. To update the graph dynamically over time, we use &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/setInterval"&gt;setInterval&lt;/a&gt; to fetch new data every five minutes (unfortunately, any calls more often than that require a paid subscription to the Alpha Vantage API for Bitcoin pricing) and call the updateOptions method to bring in the new data.&lt;/p&gt;
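&lt;p&gt;The data array handed to the Dygraph constructor is a list of rows, each holding a Date and a value. The sketch below shows that conversion in isolation, assuming a response shaped like the InfluxDB 1.x query output. The &lt;code class="language-markup"&gt;toDygraphRows&lt;/code&gt; name and the sample prices are made up for illustration.&lt;/p&gt;

```javascript
// Sample payload shaped like an InfluxDB 1.x /query response.
// The timestamps and prices are invented for this example.
const sampleResponse = {
  results: [{
    series: [{
      name: 'price',
      columns: ['time', 'price'],
      values: [
        ['2018-10-03T16:00:00Z', 6500.25],
        ['2018-10-03T16:05:00Z', 6510.75],
      ],
    }],
  }],
};

// Hypothetical helper: turns the response into [Date, number] rows for Dygraphs.
const toDygraphRows = (parsed) => {
  const series = parsed.results[0].series;
  // A query that matches nothing comes back without a "series" key.
  if (!series) {
    return [];
  }
  return series[0].values.map(([time, price]) => [new Date(time), price]);
};

const rows = toDygraphRows(sampleResponse);
console.log(rows.length, rows[0][0].toISOString(), rows[0][1]);
```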

&lt;p&gt;&lt;img class="wp-image-219269 aligncenter" src="/images/legacy-uploads/visualizing-data-with-dygraphs.png" alt="" width="629" height="381" /&gt;&lt;/p&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;If you’ve made it this far, I applaud you. Feel free to check out the &lt;a href="https://github.com/mschae16/dygraphs-sample"&gt;source code&lt;/a&gt; for a little side-by-side comparison. Additionally, Dygraphs has a &lt;a href="http://dygraphs.com/gallery/"&gt;gallery of demos&lt;/a&gt; available if you want to experiment with a myriad of styles. We want to hear all about your creations! Look for us on Twitter: &lt;a href="https://twitter.com/mschae16"&gt;@mschae16&lt;/a&gt; or &lt;a href="https://twitter.com/influxdb"&gt;@influxDB&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Wed, 03 Oct 2018 09:00:49 -0700</pubDate>
      <link>https://www.influxdata.com/blog/visualizing-time-series-data-with-dygraphs/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/visualizing-time-series-data-with-dygraphs/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Metrics to Monitor in Your PostgreSQL Database</title>
      <description>&lt;p&gt;&lt;img class="alignnone wp-image-218219 aligncenter" src="/images/legacy-uploads/potgresql-metrics-to-monitor.jpg" alt="" width="321" height="255" /&gt;&lt;/p&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;p&gt;Last month I wrote a &lt;a href="https://w2.influxdata.com/blog/monitoring-your-postgresql-database-with-telegraf-and-influxdb/"&gt;guide on how to monitor your PostgreSQL database using Telegraf and InfluxDB&lt;/a&gt;, and though I was able to cover a walkthrough of how to monitor PostgreSQL, I didn’t have a chance to cover what exactly you should be looking at when tracking the health of your database. There are several key metrics you’ll definitely want to keep track of when it comes to database performance, and they’re not all database-specific. For example, this &lt;a href="https://w2.influxdata.com/blog/mysql-metrics-that-matter/"&gt;blog post&lt;/a&gt; on MySQL database metrics gives a great introduction and overview to get you started in the monitoring scene.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.postgresql.org/docs/10/static/monitoring-stats.html"&gt;PostgreSQL’s statistics collector&lt;/a&gt; automatically gathers a substantial number of statistics about its own activity. In the previous post we saw that the &lt;a href="https://github.com/influxdata/telegraf/tree/master/plugins/inputs/postgresql"&gt;Telegraf plugin for PostgreSQL&lt;/a&gt; pulls data from two of these built-in views: &lt;code class="language-markup"&gt;pg_stat_database&lt;/code&gt; and &lt;code class="language-markup"&gt;pg_stat_bgwriter&lt;/code&gt;. If you want to pull in data from additional views, you should definitely check out this &lt;a href="https://github.com/influxdata/telegraf/tree/master/plugins/inputs/postgresql_extensible"&gt;extended Telegraf plugin&lt;/a&gt;. In this post, we’ll take a more thorough look at the significance of these stats as an indicator of your PostgreSQL database health.&lt;/p&gt;
&lt;h2&gt;The &lt;em&gt;pg_stat_database&lt;/em&gt; View&lt;/h2&gt;
&lt;p&gt;&lt;img class="alignnone size-full wp-image-218221" src="/images/legacy-uploads/pg_stat_database-view.png" alt="" width="1600" height="607" /&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code class="language-markup"&gt;pg_stat_database&lt;/code&gt; view records information concerning each database within a given cluster, including the database id (&lt;code class="language-markup"&gt;datid&lt;/code&gt;); number of backends actively connected to the database (&lt;code class="language-markup"&gt;numbackends&lt;/code&gt;); commits and rollbacks; disk blocks read and shared buffer cache hits; rows fetched, inserted, updated, and deleted; conflicts and deadlocks; temporary files created; and duration times spent reading and writing data.&lt;/p&gt;
&lt;h2&gt;The &lt;em&gt;pg_stat_bgwriter&lt;/em&gt; View&lt;/h2&gt;
&lt;p&gt;&lt;img class="alignnone size-full wp-image-218222" src="/images/legacy-uploads/pg_stat_bgwriter-view.png" alt="" width="1600" height="397" /&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code class="language-markup"&gt;pg_stat_bgwriter&lt;/code&gt; view supplies information about the checkpoint process, which helps you determine how much load is being placed on the database while it’s updating or replicating files. The variables cover the total number of checkpoints occurring across all databases in the cluster (both scheduled and requested), in addition to the amount of time spent in checkpoint processing. The &lt;code class="language-markup"&gt;buffers_checkpoint&lt;/code&gt;, &lt;code class="language-markup"&gt;buffers_clean&lt;/code&gt;, and &lt;code class="language-markup"&gt;buffers_backend&lt;/code&gt; fields indicate how the buffers were written to disk.&lt;/p&gt;
&lt;h2&gt;The Basics - Resource Utilization&lt;/h2&gt;
&lt;p&gt;In order for anything to be written, updated, and queried within PostgreSQL, the database needs to have adequate resources with which to achieve these tasks successfully. PostgreSQL, like other databases out there, relies heavily on various system resources such as CPU, network bandwidth, disk space/disk utilization, and RAM. Therefore, having insight into these system metrics and others like disk IOPS, swap space, and network errors can generally provide a good indication of the health of your overall database.&lt;/p&gt;

&lt;p&gt;PostgreSQL also collects a few other metrics worth keeping tabs on: connections, shared buffer usage, and disk usage. Tracking &lt;code class="language-markup"&gt;numbackends&lt;/code&gt; in relation to &lt;code class="language-markup"&gt;max_connections&lt;/code&gt; (available in the &lt;code class="language-markup"&gt;pg_settings&lt;/code&gt; view) can draw attention to possible issues with slower queries and with applications creating new connections to carry out requests rather than reusing already active ones. You would rather keep a small pool of connections alive than constantly start up new ones and terminate idle ones.&lt;/p&gt;

&lt;p&gt;Shared buffer usage matters for both reading and updating data. The shared buffer cache is where PostgreSQL checks first when executing a request. If the block is not found there, PostgreSQL has to grab the data from disk, after which the data is cached in the database’s shared buffer cache and possibly the OS cache. This allows subsequent queries to use that data without accessing it on disk. The downside is that some data could end up cached in several places at once. Keep an eye on &lt;code class="language-markup"&gt;blks_hit&lt;/code&gt; and &lt;code class="language-markup"&gt;blks_read&lt;/code&gt;, which represent shared buffer hits and blocks read from disk, but also keep in mind that a block counted as read from disk may actually have been served from the OS cache, which PostgreSQL doesn’t report on.&lt;/p&gt;
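&lt;p&gt;One common way to read these two counters together is as a cache hit ratio. The sketch below assumes you already have the &lt;code class="language-markup"&gt;blks_hit&lt;/code&gt; and &lt;code class="language-markup"&gt;blks_read&lt;/code&gt; values in hand; the &lt;code class="language-markup"&gt;cacheHitRatio&lt;/code&gt; helper and the numbers are invented for illustration. Both counters are cumulative, so in practice you would probably feed in the change between two samples rather than the raw totals.&lt;/p&gt;

```javascript
// Hypothetical helper: fraction of block requests served from the shared
// buffer cache, given the blks_hit and blks_read counters.
const cacheHitRatio = (blksHit, blksRead) => {
  const total = blksHit + blksRead;
  // No block requests yet, so the ratio is undefined.
  return total === 0 ? null : blksHit / total;
};

// Made-up counter values, for illustration only.
console.log(cacheHitRatio(9900, 100));
```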

&lt;p&gt;Lastly, gathering information about the database’s disk usage (see the &lt;code class="language-markup"&gt;pg_table_size&lt;/code&gt; and &lt;code class="language-markup"&gt;pg_indexes_size&lt;/code&gt; functions) can help to illuminate possible problems with query performance. The two are related: as tables and indexes grow, queries tend to take longer and more disk space has to be allocated. A sudden rise in table or index size can also hint at problems with the VACUUM process (the process of cleaning up and removing dead rows; read more on that below).&lt;/p&gt;
&lt;h2&gt;Read/Write Throughput&lt;/h2&gt;
&lt;p&gt;Monitoring read and write query throughput helps you confirm that your applications can both add data to the database and access it. Issues arising in this area can often lead to problems in other parts of the database, especially with regard to replication and reliability. To ensure availability, keep an eye on your reads and writes.&lt;/p&gt;

&lt;p&gt;Take a look at &lt;code class="language-markup"&gt;tup_returned&lt;/code&gt;, the number of rows read or scanned, versus &lt;code class="language-markup"&gt;tup_fetched&lt;/code&gt;, the number of rows fetched that contained data necessary to execute the query successfully. These two variables should consistently stay fairly close in number, which would indicate that the database is carrying out read queries efficiently, since it isn’t scanning through extra rows to satisfy the query requirements. Additionally, you may want to track &lt;code class="language-markup"&gt;temp_files&lt;/code&gt; and &lt;code class="language-markup"&gt;temp_bytes&lt;/code&gt;, since PostgreSQL sometimes has to write data temporarily to disk in order to execute a query (if there is not enough memory available). High numbers in this area indicate a potentially increasing number of resource-draining queries.&lt;/p&gt;

&lt;p&gt;You’ll also want to make sure your write performance is up to par, so keeping tabs on &lt;code class="language-markup"&gt;tup_inserted&lt;/code&gt;, &lt;code class="language-markup"&gt;tup_updated&lt;/code&gt;, and &lt;code class="language-markup"&gt;tup_deleted&lt;/code&gt; is crucial. High rates of updated and deleted rows can lead to a higher number of dead rows (&lt;code class="language-markup"&gt;n_dead_tup&lt;/code&gt; in the &lt;code class="language-markup"&gt;pg_stat_user_tables&lt;/code&gt; view), which is another metric to watch. A huge number of dead rows (rows that have already been deleted and are waiting to be cleaned out) indicates something may be wrong with the clean-up process, which in PostgreSQL is known as the VACUUM process. Essentially, its job is to remove dead rows from tables and indexes in order to make the space available for new row insertions. As a side note, VACUUM should be run on a routine basis to maintain query efficiency and to keep PostgreSQL’s internal statistics up to date. Remember that large numbers of dead rows (essentially wasted space) can slow down your queries in the long term.&lt;/p&gt;

&lt;p&gt;If you encounter high rates of change in both read and write throughput, it makes sense to check whether delays are occurring because of locks (see the &lt;code class="language-markup"&gt;pg_locks&lt;/code&gt; view) on tables or rows that are currently experiencing or awaiting updates. Related to this is the presence of any &lt;code class="language-markup"&gt;deadlocks&lt;/code&gt; in the database, which occur when two or more transactions each hold a lock that another one needs, so none of them can proceed. It’s best to avoid deadlocks altogether if possible by ensuring that locks are acquired in a consistent order each time.&lt;/p&gt;
&lt;h2&gt;Reliability&lt;/h2&gt;
&lt;p&gt;If your data is pretty important to you, then you’re probably keeping multiple copies of it (so you don’t lose it all in the event of a crash) and you want it to be highly available at all times. This is where the &lt;code class="language-markup"&gt;pg_stat_bgwriter&lt;/code&gt; view can make a big difference. It tracks a number of checkpoint metrics.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.postgresql.org/docs/current/static/wal-configuration.html"&gt;Checkpoints&lt;/a&gt; are periodic moments in the transaction process that ensure data files have been updated up to that moment on disk. If that sounds confusing, think of how word processors periodically auto-save the files you’re working on and if your program were to crash, upon reboot you’re brought back to that previous auto-saved version. Checkpoints operate similarly with respect to recorded and updated data files. Generally, the process of flushing the updated data to disk can cause significant I/O load, and as a result, checkpoint activity is spaced out in order to avoid a loss in performance. This means that a single checkpoint must complete before the next one can start.&lt;/p&gt;

&lt;p&gt;Compare the following two variables: &lt;code class="language-markup"&gt;checkpoints_req&lt;/code&gt; and &lt;code class="language-markup"&gt;checkpoints_timed&lt;/code&gt;. The first shows the number of checkpoints requested while the latter represents the number of checkpoints scheduled. It’s preferable to have more checkpoints scheduled than requested; the other way around could point to your checkpoints not being able to keep up with the rate of data updates and indicate heavy load on the database.&lt;/p&gt;
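&lt;p&gt;That comparison can be reduced to a single number: the share of checkpoints that were requested rather than scheduled. The following is only a sketch; the &lt;code class="language-markup"&gt;requestedCheckpointShare&lt;/code&gt; helper and the sample counts are hypothetical, and because both counters are cumulative you would likely compute the share from the difference between two samples.&lt;/p&gt;

```javascript
// Hypothetical helper: share of checkpoints that were requested (checkpoints_req)
// out of all checkpoints (checkpoints_req + checkpoints_timed).
const requestedCheckpointShare = (checkpointsReq, checkpointsTimed) => {
  const total = checkpointsReq + checkpointsTimed;
  // No checkpoints recorded yet, so there is nothing to compare.
  return total === 0 ? null : checkpointsReq / total;
};

// Made-up counts: 5 requested and 95 scheduled checkpoints.
console.log(requestedCheckpointShare(5, 95));
```

&lt;p&gt;A share that keeps climbing toward 1 would match the situation described above, where requested checkpoints outnumber scheduled ones.&lt;/p&gt;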

&lt;p&gt;The &lt;code class="language-markup"&gt;pg_stat_bgwriter&lt;/code&gt; view also shows metrics on how PostgreSQL flushes data in memory (buffers) to disk. It can do this in three different ways:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;&lt;code class="language-markup"&gt;buffers_backend&lt;/code&gt; - via backends&lt;/li&gt;
 	&lt;li&gt;&lt;code class="language-markup"&gt;buffers_clean&lt;/code&gt; - via the background writer&lt;/li&gt;
 	&lt;li&gt;&lt;code class="language-markup"&gt;buffers_checkpoint&lt;/code&gt; - via the checkpoint process&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Ideally you want most of the flushes happening via the checkpoint process, but sometimes the background writer steps in to help lighten the I/O load that often occurs in the checkpoint process. An increase in buffers written directly by backends could mean a write-intensive load that is creating buffers so fast the checkpoint process can’t keep up. Ultimately, it’s in your best interest to keep an eye on these three.&lt;/p&gt;
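&lt;p&gt;To watch all three at once, it can help to express each counter as a share of the total buffers written. This is a minimal sketch under the assumption that the three values have already been read from &lt;code class="language-markup"&gt;pg_stat_bgwriter&lt;/code&gt;; the &lt;code class="language-markup"&gt;bufferWriteShares&lt;/code&gt; helper and the sample numbers are invented for illustration.&lt;/p&gt;

```javascript
// Hypothetical helper: share of buffers written by each of the three paths.
const bufferWriteShares = ({ buffers_checkpoint, buffers_clean, buffers_backend }) => {
  const total = buffers_checkpoint + buffers_clean + buffers_backend;
  // Nothing has been written yet, so no shares can be computed.
  if (total === 0) {
    return null;
  }
  return {
    checkpoint: buffers_checkpoint / total,
    backgroundWriter: buffers_clean / total,
    backend: buffers_backend / total,
  };
};

// Made-up counter values, for illustration only.
const shares = bufferWriteShares({
  buffers_checkpoint: 800,
  buffers_clean: 150,
  buffers_backend: 50,
});
console.log(shares);
```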
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;Hopefully all this information can be combined with the previous tutorial to make it super easy for you to monitor your PostgreSQL databases using Telegraf and InfluxDB. Feel free to reach out to us on Twitter &lt;a href="https://twitter.com/influxdb"&gt;@InfluxDB&lt;/a&gt; and &lt;a href="https://twitter.com/mschae16"&gt;@mschae16&lt;/a&gt; with any questions or comments or you can check out our &lt;a href="https://community.influxdata.com/"&gt;community forum&lt;/a&gt; to see what other InfluxData users are building.&lt;/p&gt;
</description>
      <pubDate>Thu, 16 Aug 2018 14:45:41 -0700</pubDate>
      <link>https://www.influxdata.com/blog/metrics-to-monitor-in-your-postgresql-database/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/metrics-to-monitor-in-your-postgresql-database/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Monitoring Your PostgreSQL Database with Telegraf and InfluxDB</title>
      <description>&lt;div style="padding:56.25% 0 0 0;position:relative;"&gt;&lt;iframe src="https://player.vimeo.com/video/676028102?h=ac9ecbdd4c&amp;amp;badge=0&amp;amp;autopause=0&amp;amp;player_id=0&amp;amp;app_id=58479" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen="" style="position:absolute;top:0;left:0;width:100%;height:100%;" title="Getting Data into Telegraf"&gt;&lt;/iframe&gt;&lt;/div&gt;
&lt;script src="https://player.vimeo.com/api/player.js"&gt;&lt;/script&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h2&gt;Overview&lt;/h2&gt;
&lt;p&gt;&lt;img class="wp-image-216894 alignleft" src="/images/legacy-uploads/postgresql.png" alt="" width="269" height="278" /&gt;&lt;/p&gt;

&lt;p&gt;This tutorial covers the process of setting up &lt;a href="https://w2.influxdata.com/time-series-platform/telegraf/"&gt;Telegraf&lt;/a&gt; and &lt;a href="https://w2.influxdata.com/time-series-platform/influxdb/"&gt;InfluxDB&lt;/a&gt; to monitor PostgreSQL. For any newcomers to the scene, &lt;a href="https://www.postgresql.org/"&gt;PostgreSQL&lt;/a&gt; (or just Postgres for short) is a popular open source, object-relational database system that was originally spearheaded by developers at UC Berkeley back in 1986. It has important features like multi-version concurrency control and write-ahead logging that help to ensure data reliability. If you’re not too familiar with PostgreSQL, I’d recommend starting with this &lt;a href="http://www.postgresqltutorial.com/"&gt;beginner’s tutorial&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Recognizing the importance of tracking and monitoring performance and throughput of databases, the makers of PostgreSQL added a &lt;a href="https://www.postgresql.org/docs/current/static/monitoring-stats.html"&gt;statistics collector&lt;/a&gt; that automatically amasses information about its own database activity. You essentially have all these great metrics right out of the box. So let’s capitalize on that, expose all those metrics to Telegraf and send them on over to InfluxDB.&lt;/p&gt;

&lt;p&gt;&lt;img class="wp-image-216895 alignright" src="/images/legacy-uploads/influxdb-iwi.png" alt="" width="263" height="194" /&gt;&lt;/p&gt;

&lt;h2&gt;What You'll Need&lt;/h2&gt;
&lt;p&gt;I’m using a local installation of &lt;a href="https://docs.influxdata.com/influxdb/v1.6/introduction/getting-started/"&gt;InfluxDB&lt;/a&gt;, &lt;a href="https://docs.influxdata.com/telegraf/v1.7/introduction/getting-started/"&gt;Telegraf&lt;/a&gt;, and &lt;a href="https://docs.influxdata.com/chronograf/v1.6/introduction/getting-started/"&gt;Chronograf&lt;/a&gt; for this tutorial; the “Getting Started” guides for each of those projects are great and easy to walk through. You’ll also need &lt;a href="https://www.postgresql.org/download/"&gt;PostgreSQL&lt;/a&gt; on your machine. If you don’t happen to have any sample applications and databases lying around, you can fork or clone &lt;a href="https://github.com/mschae16/palette-picker"&gt;this repo&lt;/a&gt; to follow along. It’s just a small Node/Express app that stores color palettes in PostgreSQL; be sure to follow the &lt;a href="https://github.com/mschae16/palette-picker/blob/master/README.md"&gt;README&lt;/a&gt; on how to get the app working.&lt;/p&gt;
&lt;h2&gt;Editing Your Telegraf Config&lt;/h2&gt;
&lt;p&gt;To start with, the &lt;a href="https://github.com/influxdata/telegraf/tree/master/plugins"&gt;Telegraf GitHub page&lt;/a&gt; offers a number of input and output plugins to suit a variety of use cases, one of which is the &lt;a href="https://github.com/influxdata/telegraf/tree/master/plugins/inputs/postgresql"&gt;PostgreSQL input plugin&lt;/a&gt;. If we configure this plugin correctly in our Telegraf configuration file, we should automatically start seeing metrics being sent over to the default &lt;code class="language-markup"&gt;telegraf&lt;/code&gt; database (under its &lt;code class="language-markup"&gt;autogen&lt;/code&gt; retention policy) within InfluxDB.&lt;/p&gt;

&lt;p&gt;Let’s try it out.&lt;/p&gt;

&lt;p&gt;Navigate to your Telegraf config file and find the &lt;code class="language-markup"&gt;[[inputs.postgresql]]&lt;/code&gt; section. If you’re on macOS and used &lt;a href="https://brew.sh/"&gt;Homebrew&lt;/a&gt; to install InfluxDB and Telegraf, the default config file should be at &lt;code class="language-markup"&gt;/usr/local/etc/telegraf.conf&lt;/code&gt;. Otherwise, refer to the &lt;a href="https://docs.influxdata.com/telegraf/v1.7/"&gt;Telegraf docs&lt;/a&gt;.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;# # Read metrics from one or many postgresql servers
# [[inputs.postgresql]]
#   ## specify address via a url matching:
#   ##   postgres://[pqgotest[:password]]@localhost[/dbname]\
#   ##       ?sslmode=[disable|verify-ca|verify-full]
#   ## or a simple string:
#   ##   host=localhost user=pqotest password=... sslmode=... dbname=app_production
#   ##
#   ## All connection parameters are optional.
#   ##
#   ## Without the dbname parameter, the driver will default to a database
#   ## with the same name as the user. This dbname is just for instantiating a
#   ## connection with the server and doesn't restrict the databases we are trying
#   ## to grab metrics for.
#   ##
#   address = "host=localhost user=postgres sslmode=disable"
#   ## A custom name for the database that will be used as the "server" tag in the
#   ## measurement output. If not specified, a default one generated from
#   ## the connection address is used.
#   # outputaddress = "db01"
#
#   ## connection configuration.
#   ## maxlifetime - specify the maximum lifetime of a connection.
#   ## default is forever (0s)
#   max_lifetime = "0s"
#
#   ## A  list of databases to explicitly ignore.  If not specified, metrics for all
#   ## databases are gathered.  Do NOT use with the 'databases' option.
#   # ignored_databases = ["postgres", "template0", "template1"]
#
#   ## A list of databases to pull metrics about. If not specified, metrics for all
#   ## databases are gathered.  Do NOT use with the 'ignored_databases' option.
#   # databases = ["app_production", "testing"]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is what the config file looks like out of the box. As you can see, the instructions are fairly simple. You need to specify the address so Telegraf can talk to your PostgreSQL server. Within that address you can optionally set other connection parameters such as &lt;code class="language-markup"&gt;user&lt;/code&gt;, &lt;code class="language-markup"&gt;password&lt;/code&gt;, and &lt;code class="language-markup"&gt;sslmode&lt;/code&gt;, and use &lt;code class="language-markup"&gt;dbname&lt;/code&gt; to connect to a specific database if you wish.&lt;/p&gt;

&lt;p&gt;If you want a custom name for the server tag in your InfluxDB database, you can specify it in &lt;code class="language-markup"&gt;outputaddress&lt;/code&gt;. The &lt;code class="language-markup"&gt;max_lifetime&lt;/code&gt; setting controls how long a connection is allowed to remain open. Finally, you can list the databases to ignore or the databases to collect metrics for; you can use one of these options or the other, but not both.&lt;/p&gt;

&lt;p&gt;This plugin makes it easy to pull metrics from the built-in &lt;em&gt;&lt;code class="language-markup"&gt;pg_stat_database&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code class="language-markup"&gt;pg_stat_bgwriter&lt;/code&gt;&lt;/em&gt; views within PostgreSQL. Check out the &lt;a href="https://www.postgresql.org/docs/9.2/static/monitoring-stats.html#PG-STAT-DATABASE-VIEW"&gt;docs&lt;/a&gt; to see exactly what metrics are pulled. Uncomment the &lt;code class="language-markup"&gt;[[inputs.postgresql]]&lt;/code&gt; section and change the address value to a string listing our host as localhost, like so:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;address = "host=localhost"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The only other thing to ensure is that your data output will be sent to InfluxDB. If you scroll down to the &lt;code class="language-markup"&gt;outputs.influxdb&lt;/code&gt; section, you can edit the url to include InfluxDB’s default port 8086:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
  ## The full HTTP or UDP URL for your InfluxDB instance.
  ##
  ## Multiple urls can be specified as part of the same cluster,
  ## this means that only ONE of the urls will be written to each interval.
  # urls = ["udp://localhost:8089"] # UDP endpoint example
  urls = ["http://localhost:8086"] # required
  ## The target database for metrics (telegraf will create it if not exists).
  database = "telegraf" # required

  ## Name of existing retention policy to write to.  Empty string writes to
  ## the default retention policy.
  retention_policy = ""
  ## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
  write_consistency = "any"

  ## Write timeout (for the InfluxDB client), formatted as a string.
  ## If not provided, will default to 5s. 0s means no timeout (not recommended).
  timeout = "5s"
  # username = "telegraf"
  # password = "metricsmetricsmetricsmetrics"
  ## Set the user agent for HTTP POSTs (can be useful for log differentiation)
  # user_agent = "telegraf"
  ## Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
  # udp_payload = 512

  ## Optional SSL Config
  # ssl_ca = "/etc/telegraf/ca.pem"
  # ssl_cert = "/etc/telegraf/cert.pem"
  # ssl_key = "/etc/telegraf/key.pem"
  ## Use SSL but skip chain &amp;amp; host verification
  # insecure_skip_verify = false

  ## HTTP Proxy Config
  # http_proxy = "http://corporate.proxy:3128"

  ## Optional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## Compress each HTTP request payload using GZIP.
  # content_encoding = "gzip"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Restart Telegraf and Chronograf, navigate to Chronograf’s default port (8888) and in the Data Explorer section of the menu, you should see a measurement called &lt;code class="language-markup"&gt;postgresql&lt;/code&gt; under the default &lt;code class="language-markup"&gt;telegraf.autogen&lt;/code&gt; database. You should also see a plethora of metrics in the field column, including &lt;code class="language-markup"&gt;blk_read_time&lt;/code&gt;, &lt;code class="language-markup"&gt;blk_write_time&lt;/code&gt;, &lt;code class="language-markup"&gt;buffers_clean&lt;/code&gt;, &lt;code class="language-markup"&gt;datid&lt;/code&gt;, &lt;code class="language-markup"&gt;deadlocks&lt;/code&gt;, &lt;code class="language-markup"&gt;tup_inserted&lt;/code&gt;, and &lt;code class="language-markup"&gt;tup_deleted&lt;/code&gt;, just to name a few. To read up on what each of those fields means exactly, check out &lt;a href="https://www.postgresql.org/docs/9.2/static/monitoring-stats.html#PG-STAT-DATABASE-VIEW"&gt;this reference page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img class="alignnone size-full wp-image-216897" src="/images/legacy-uploads/monitoring-postgresql-with-telegraf-influxdb.png" alt="" width="1600" height="453" /&gt;&lt;/p&gt;

&lt;p&gt;Alternatively, you can query the data from InfluxDB using the CLI. In your terminal, type &lt;code class="language-markup"&gt;influx&lt;/code&gt; to access the Influx shell. The command &lt;code class="language-markup"&gt;SHOW DATABASES&lt;/code&gt; will list your databases. Run &lt;code class="language-markup"&gt;USE [databasename]&lt;/code&gt; to select one, and then &lt;code class="language-markup"&gt;SHOW MEASUREMENTS&lt;/code&gt; will list the measurement names associated with that particular database. Then you can run various query statements such as&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;SELECT mean("xact_commit") AS "mean_xact_commit" FROM "telegraf"."autogen"."postgresql" WHERE time &amp;gt; now() - 5m AND "db"='palette_picker'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;SELECT * FROM "telegraf"."autogen"."postgresql" WHERE time &amp;gt; now() - 1m AND "db"='palette_picker'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Try it out and see for yourself! If you get too query-happy and need to kill a query at any time, just run &lt;code class="language-markup"&gt;KILL QUERY [qid]&lt;/code&gt;, where the query ID comes from the output of the &lt;code class="language-markup"&gt;SHOW QUERIES&lt;/code&gt; command.&lt;/p&gt;
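&lt;p&gt;As a rough sketch, that clean-up looks like this in the Influx shell. The query ID (&lt;code class="language-markup"&gt;36&lt;/code&gt;) is a made-up value for illustration; substitute the one shown in the &lt;code class="language-markup"&gt;qid&lt;/code&gt; column of your own &lt;code class="language-markup"&gt;SHOW QUERIES&lt;/code&gt; output:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;-- List the currently running queries along with their query IDs (qid)
SHOW QUERIES

-- Stop one of them; 36 is a placeholder qid, not a real value
KILL QUERY 36&lt;/code&gt;&lt;/pre&gt;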
&lt;h2&gt;Monitoring PostgreSQL in Production&lt;/h2&gt;
&lt;p&gt;If you want to keep tabs on your PostgreSQL databases while in production, it’s easy-peasy. Just update the Telegraf config file with the correct address information. I’ve updated the address in my Telegraf config file below to monitor PostgreSQL from my Heroku instance of this same sample app (Palette Picker). I was able to find all these credentials on my Heroku dashboard page. Check it out:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;address = "host=ec2-204-236-239-225.compute-1.amazonaws.com user=username password=password dbname=databasename"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(The username, password, and dbname have been changed here for security purposes)&lt;/p&gt;

&lt;p&gt;Pretty simple, right?&lt;/p&gt;
&lt;h2&gt;Next Steps&lt;/h2&gt;
&lt;p&gt;Hopefully this guide has helped show just how easy it is to monitor your PostgreSQL databases using Telegraf and InfluxDB. Next post, we’ll talk about some of the key metrics to keep an eye on when evaluating the health of your Postgres database. Feel free to reach out to us on Twitter &lt;a href="https://twitter.com/influxdb"&gt;@influxDB&lt;/a&gt; and &lt;a href="https://twitter.com/mschae16"&gt;@mschae16&lt;/a&gt; with any questions or comments!&lt;/p&gt;
</description>
      <pubDate>Fri, 20 Jul 2018 10:40:24 -0700</pubDate>
      <link>https://www.influxdata.com/blog/monitoring-your-postgresql-database-with-telegraf-and-influxdb/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/monitoring-your-postgresql-database-with-telegraf-and-influxdb/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Simplifying InfluxDB: Retention Policy Best Practices</title>
      <description>&lt;p&gt;Retention policies can often be tricky even at the best of times but when you’re dealing with time series data, setting up the appropriate retention policy to automatically expire (delete) unnecessary data can save you loads of time in the long run. This post will walk through some general guidelines on creating the best retention policy for your use case with InfluxDB.&lt;/p&gt;

&lt;h3&gt;Wait...What's a Retention Policy?&lt;/h3&gt;

&lt;p&gt;&lt;img src="/images/legacy-uploads/simplifying-influxdb-retention-policies-1.png" alt="influxdb retention policies" width="401" height="268" /&gt;&lt;/p&gt;
&lt;figcaption&gt; Data doesn't remain useful forever.&lt;/figcaption&gt;

&lt;p&gt;Before we start talking about best practices around retention policies, it’s important to understand just what they are. Although its name is somewhat explanatory, an InfluxDB retention policy is defined in the &lt;a href="https://docs.influxdata.com/influxdb/v1.5/concepts/glossary/#retention-policy-rp"&gt;documentation&lt;/a&gt; as:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;The part of InfluxDB’s data structure that describes for how long InfluxDB keeps data (duration), how many copies of those data are stored in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and along with the measurement and tag set define a series.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;When you create a database, InfluxDB automatically creates a retention policy called autogen with an infinite duration, a replication factor set to one, and a shard group duration set to seven days.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A retention policy dictates for how long data will be kept and stored. Because &lt;a href="https://www.influxdata.com/blog/what-is-time-series-data-and-why-should-you-care/"&gt;time series data&lt;/a&gt; tends to pile up really quickly, you’re definitely going to want to discard or downsample data from InfluxDB once it’s no longer as useful. If you need further convincing, just check out these blog posts:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;&lt;a href="https://www.influxdata.com/blog/optimizing-data-queries-for-time-series-applications/"&gt;Optimizing Data Queries for Time Series Applications&lt;/a&gt;&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://www.influxdata.com/blog/influxdb-shards-retention-policies/"&gt;Simplifying InfluxDB: Shards and Retention Policies&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;General Guidelines&lt;/h3&gt;

&lt;p&gt;There are a few key things to consider when you’re setting up your database’s retention policy. First and foremost, you’ll need to consider how long your use case requires that you retain the data. Do you need it for a week? A month? A year? This decision determines the duration you set on your retention policy and isn’t really negotiable.&lt;/p&gt;
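
&lt;p&gt;For instance, if a month of history is enough for your use case, a policy along these lines would cover it. This is only a minimal sketch: the policy name &lt;code class="language-markup"&gt;one_month&lt;/code&gt; is hypothetical, and &lt;code class="language-markup"&gt;telegraf&lt;/code&gt; stands in for whatever your database is called.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;-- Keep data for 30 days, then let InfluxDB expire it automatically.
-- "one_month" and "telegraf" are placeholder names.
CREATE RETENTION POLICY "one_month" ON "telegraf" DURATION 30d REPLICATION 1&lt;/code&gt;&lt;/pre&gt;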

&lt;p&gt;But wait - you’re not done yet. Another integral part of setting up a retention policy involves designating the shard group duration for all data that will follow this retention policy. This is where things get tricky. Since shards really represent the core physical part of the database, tuning the shard group duration to just the right setting can really maximize performance and so, it’s important to get it right.&lt;/p&gt;

&lt;p&gt;Setting the duration on the higher side will result in larger collections of data within each shard. This could cause problems when querying the database. For example, if you’re querying the database for a shorter time window than the shard group time span, the database may need to decode longer blocks of data in order to read a subset of the time range of the shard and that process will require greater effort and time.&lt;/p&gt;

&lt;p&gt;On the other hand, if you set the shard group duration on the shorter side, the result is a greater number of shard groups. Due to &lt;a href="https://docs.influxdata.com/influxdb/v1.5/concepts/tsi-details/"&gt;Time Series Indexing&lt;/a&gt;, each shard will have some extra overhead in the form of this index and metadata, so having thousands of shards with little data on each is by no means efficient.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/legacy-uploads/simplifying-influxdb-retention-policies-2.png" alt="retention policies" width="383" height="256" /&gt;&lt;/p&gt;
&lt;figcaption&gt; It can sometimes be difficult to determine the right setting for your shard group duration.&lt;/figcaption&gt;

&lt;p&gt;My recommendation is to be like Goldilocks and try them all out until you hit the perfect spot!&lt;/p&gt;

&lt;p&gt;Okay, all joking aside - we at InfluxData recommend setting the shard group duration as follows:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;The shard group duration should be twice your longest typical query's time range - yep, that means you'll need to think about what kinds of queries you'll be running on InfluxDB.&lt;/li&gt;
 	&lt;li&gt;The shard group duration should be set so that each shard group ends up with at least 100,000 points per group - you want more data per shard, but not too much data.&lt;/li&gt;
 	&lt;li&gt;The shard group duration should be set so that each shard group has at least 1,000 points per series.&lt;/li&gt;
&lt;/ul&gt;
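
&lt;p&gt;To make those guidelines concrete, here is a worked example with made-up numbers. Suppose your longest typical query covers 12 hours, and you write 20 series at one point every 10 seconds. Doubling the query range suggests a shard group duration of 1 day. One day holds 86,400 seconds, so each series collects 8,640 points per shard group (above the 1,000 mark) and the group as a whole holds 20 x 8,640 = 172,800 points (above the 100,000 mark). Under those assumptions, a 1 day shard group duration fits all three guidelines, and you could apply it to a hypothetical policy called &lt;code class="language-markup"&gt;one_month&lt;/code&gt; like so:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;-- Change only the shard group duration of an existing policy.
-- "one_month" and "telegraf" are placeholder names.
ALTER RETENTION POLICY "one_month" ON "telegraf" SHARD DURATION 1d&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keep in mind that a new shard group duration applies to shard groups created from that point on. Existing shard groups keep the time span they were created with.&lt;/p&gt;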

&lt;h3&gt;Summary&lt;/h3&gt;

&lt;p&gt;If you’re new to using InfluxDB, setting up your database schema and retention policies can sometimes feel like a daunting task, especially in more exceptional cases like working with very large clusters (InfluxDB Enterprise) or with very long or short retention periods. You’ll definitely want to spend some time tweaking retention duration and shard group duration until you find the right fit. After all, it took Goldilocks three tries, right? Once you find that setting that’s just right, tweet us @InfluxDB and @mschae16 and tell us all about it!&lt;/p&gt;
</description>
      <pubDate>Wed, 20 Jun 2018 09:00:42 -0700</pubDate>
      <link>https://www.influxdata.com/blog/simplifying-influxdb-retention-policy-best-practices/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/simplifying-influxdb-retention-policy-best-practices/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <category>Company</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Simplifying InfluxDB: Shards and Retention Policies</title>
      <description>&lt;p&gt;&lt;img src="/images/legacy-uploads/marcus-lofvenberg-451617-unsplash-1024x683.jpg" alt="floating ice chunks" width="407" height="271" /&gt;&lt;/p&gt;

&lt;p&gt;I recently did a webinar on an Introduction to &lt;a href="https://www.influxdata.com/time-series-platform/influxdb/"&gt;InfluxDB&lt;/a&gt; and &lt;a href="https://www.influxdata.com/time-series-platform/telegraf/"&gt;Telegraf&lt;/a&gt; and in preparing for it, I came to the woeful realization that there are still a number of concepts about InfluxDB which remain quite mysterious to me. Now, if you’re anything like me, databases and data storage don’t come naturally. (If they do, it never hurts to have a bit of review). I thought I had a pretty thorough understanding of InfluxDB as a time series data store, but now I see that there’s a lot more to it than meets the eye. Coincidentally, the inner workings of InfluxDB are pretty mysterious to some of our community as well and thus this blog post. In this guide (of sorts), we’ll try to make sense of some of the more enigmatic concepts around InfluxDB - specifically with regards to retention policies, shard groups, and shards. We’ll look at what they are and how they’re related to one another.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/legacy-uploads/aron-visuals-322314-unsplash-1024x683.jpg" alt="hourglass" width="407" height="271" /&gt;&lt;/p&gt;
&lt;figcaption&gt; It's about time we paid attention to TSDBs.&lt;/figcaption&gt;

&lt;h4&gt;Before We Jump In&lt;/h4&gt;

&lt;p&gt;If you’re new to the Time Series Database world, or if this is the first time you’re reading about InfluxDB, you may want to do a little light reading and gain some contextual knowledge. Here are some helpful resources to get you up to speed:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;&lt;a href="https://www.influxdata.com/blog/what-is-time-series-data-and-why-should-you-care/"&gt;What is Time Series Data and Why Should You Care?&lt;/a&gt;&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://www.influxdata.com/blog/why-should-i-use-a-time-series-database/"&gt;Why Should I Use a Time Series Database?&lt;/a&gt;&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://www.influxdata.com/blog/optimizing-data-queries-for-time-series-applications/"&gt;Optimizing Data Queries for Time Series Applications&lt;/a&gt;&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://docs.influxdata.com/influxdb/v1.5/introduction/getting-started/"&gt;Getting Started with InfluxDB&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Retention Policies&lt;/h4&gt;

&lt;p&gt;Let’s tackle retention policies first. Time series data by nature begins to pile up pretty quickly and it can be helpful to discard old data after it’s no longer useful. Retention policies offer a simple and effective way to achieve this. It amounts to what is essentially an expiration date on your data. Once the data is “expired” it will automatically be dropped from the database, an action commonly referred to as &lt;em&gt;retention policy enforcement&lt;/em&gt;. When it comes time to drop that data however, InfluxDB doesn’t just drop one data point at a time; it drops an entire &lt;em&gt;shard group&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src="/images/legacy-uploads/ankush-minda-545239-unsplash-1024x683.jpg" alt="Balloons floating into the wind" width="420" height="280" /&gt;&lt;/p&gt;
&lt;figcaption&gt; Retention policies drop entire groups of data, not just a single data point.&lt;/figcaption&gt;

&lt;h4&gt;Shard Groups&lt;/h4&gt;

&lt;p&gt;A shard group is a container for &lt;em&gt;&lt;strong&gt;shards&lt;/strong&gt;&lt;/em&gt;, which in turn contain the actual time series data (but more on that in a minute). Every shard group has a corresponding retention policy and any shards within a single shard group adhere to the same retention policy. Additionally, every shard group has a &lt;em&gt;&lt;strong&gt;shard group duration&lt;/strong&gt;&lt;/em&gt;, which dictates the window of time each shard group spans (the time interval). The time interval can be specified when configuring the retention policy. If nothing is specified, InfluxDB picks a default based on the retention policy’s duration: 1 hour for policies shorter than 2 days, 1 day for policies between 2 days and 6 months, and 7 days for anything longer (including the infinite autogen policy).&lt;/p&gt;
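
&lt;p&gt;To see the shard groups InfluxDB has created, along with the retention policy each belongs to and the time window it covers, you can run the statement below in the Influx shell. The exact columns may differ between versions, so treat this as a starting point for exploring rather than a fixed report.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;-- Lists every shard group with its database, retention policy,
-- and the start and end of the time interval it spans
SHOW SHARD GROUPS&lt;/code&gt;&lt;/pre&gt;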

&lt;h4&gt;Shards&lt;/h4&gt;

&lt;p&gt;When we think about a typical Time Series Database, the sheer volume of time series data that is stored and queried within merits an alternative approach to the categorization of that data. This is where shards come in. Shards are ideal containers for time series data. Sharding the data within InfluxDB allows for a highly scalable approach for boosting throughput and overall performance, especially considering that the data in a Time Series Database will in all likelihood grow over time.&lt;/p&gt;

&lt;p&gt;Shards contain temporal blocks of data and are mapped to an underlying storage engine database. The InfluxDB storage engine is called TSM, or Time-Structured Merge Tree, and is remarkably similar to an LSM Tree. The TSM files contain the encoded and compressed time series data, organized within shards.&lt;/p&gt;

&lt;p&gt;All shards belong to a single shard group, and their time intervals fall within the shard group’s time interval. It’s quite possible to have a single shard per shard group, as we see in the open-source version of InfluxDB, or multiple shards per shard group as often occurs in a multi-node cluster.&lt;/p&gt;

&lt;h4&gt;Looping Back to RPs&lt;/h4&gt;

&lt;p&gt;Looping back to retention policies for a moment, let’s take a closer look at how things fit together. When you create a database in InfluxDB, you automatically create a default retention policy for that database called autogen. If you choose not to modify the default policy, its duration is set to infinite. In this case, the shard group duration will default to 7 days. This means that your data will be stored in 1 week time windows. With an infinite duration, the data never expires: InfluxDB keeps adding new 7 day shard groups, and retention policy enforcement never drops any of them. On the other hand, the minimum time you can set your retention policy to is one hour.&lt;/p&gt;
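
&lt;p&gt;You can check these defaults yourself from the Influx shell. The sketch below assumes a database named &lt;code class="language-markup"&gt;telegraf&lt;/code&gt; that still has only its default policy; on such a database the output typically looks like the commented lines, where a duration of &lt;code class="language-markup"&gt;0s&lt;/code&gt; is how InfluxDB displays infinite and &lt;code class="language-markup"&gt;168h0m0s&lt;/code&gt; is the 7 day shard group duration.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;SHOW RETENTION POLICIES ON "telegraf"

-- Typical output for an unmodified database:
-- name    duration shardGroupDuration replicaN default
-- ----    -------- ------------------ -------- -------
-- autogen 0s       168h0m0s           1        true&lt;/code&gt;&lt;/pre&gt;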

&lt;p&gt;Another way to think about it is that a retention policy is like a bucket for shard groups to live in. Once data ages past the retention policy’s duration, you throw out the shard group whose time interval has fully expired. So even as time passes, you’ll still have the same amount of data available to you; it will just shift in time. For example, if I set my retention policy to one year, I’ll always have a year’s worth of data available to me (once I hit that first year mark).&lt;/p&gt;

&lt;p&gt;As you can see, shard groups (and by association, shards) are closely related to retention policies; if a retention policy has data, it will have at least one shard group. Every data point, which is a measurement consisting of any number of values and tags associated with a particular point in time, must be associated with a database and a retention policy. It’s important to remember here that a database can have more than one retention policy and that all retention policies are unique per database.&lt;/p&gt;
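
&lt;p&gt;In query terms, this is why a measurement can be addressed as database, retention policy, and measurement together. The following sketch assumes a &lt;code class="language-markup"&gt;telegraf&lt;/code&gt; database with a &lt;code class="language-markup"&gt;postgresql&lt;/code&gt; measurement, plus a hypothetical second policy named &lt;code class="language-markup"&gt;one_week&lt;/code&gt; that you would have created and written to beforehand:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;USE telegraf

-- No policy named: reads from the database's default retention policy
SELECT * FROM "postgresql" LIMIT 5

-- Fully qualified: reads the same measurement from the hypothetical "one_week" policy
SELECT * FROM "telegraf"."one_week"."postgresql" LIMIT 5&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The two queries can return different points, because each retention policy stores its own copy of the series written to it.&lt;/p&gt;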

&lt;h4&gt;OSS vs. Enterprise&lt;/h4&gt;

&lt;p&gt;Things get a little hairy when we start looking at shards, shard groups, and retention policies in InfluxDB Enterprise as compared to the open-source version of InfluxDB. If we’re using the open-source version, we’ve only got a single node instance of InfluxDB, and this means we don’t need to worry about replicating our data because that feature isn’t available. So the shard group ends up having only one shard within it, effectively making them the same thing (another way to think about it is that the shard becomes redundant). This is because you don’t need to spread the data evenly across multiple nodes—you’ve only got one node! When the retention policy kicks in, you drop the whole shard group.&lt;/p&gt;

&lt;p&gt;With InfluxDB Enterprise, on the other hand, you can have multiple node instances of InfluxDB. If you want to know more about this clustering capability, I recommend reading this &lt;a href="https://www.influxdata.com/blog/understanding-influxenterprise-what-is-a-cluster/"&gt;blog&lt;/a&gt;, which covers the basics. Having more than one node in a cluster is the reason shard groups exist. We needed a way to spread the data evenly across multiple nodes, while still belonging to the appropriate database, retention policy, and time interval. In Enterprise, a shard group can have (and usually does have) a set of shards within it that all share the same time span. Each shard in the shard group would contain a different subset of time series.&lt;/p&gt;

&lt;p&gt;We also see &lt;strong&gt;&lt;em&gt;replication factor&lt;/em&gt;&lt;/strong&gt; come into play with the Enterprise version of InfluxDB. The replication factor represents the number of copies you want to make of the data. You can specify the replication factor in the database retention policy. Two copies of the same data should not end up on the same node. Each copy lives on a separate node, so if one node goes down, you still have a backup on another node.&lt;/p&gt;
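
&lt;p&gt;As an illustration, a retention policy that asks for two copies of every point might be declared as follows. The policy and database names are placeholders, and the setting only has a practical effect on a multi-node Enterprise cluster; a single-node instance has nowhere to put a second copy.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;-- REPLICATION 2 requests two copies of the data, stored on different nodes.
-- "two_copies" and "telegraf" are placeholder names.
CREATE RETENTION POLICY "two_copies" ON "telegraf" DURATION 30d REPLICATION 2&lt;/code&gt;&lt;/pre&gt;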

&lt;h4&gt;Seeing It in Action&lt;/h4&gt;

&lt;p&gt;To help this sink in, let’s consider all of this with a few examples:&lt;/p&gt;

&lt;p&gt;For the open-source version, remember we’ve only got one node instance, so a shard group would have only one shard within, like so:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;Data Points
---------------
series_a t0
series_a t4

series_b t2
series_b t6

series_c t3
series_c t8

series_d t7
series_d t9

Shard Group Z (t0 - t10)
-------------
Shard 1 (series_a, series_b, series_c, series_d)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;From the simplified example above, you see we have a shard group (Z) with a time span from t0 to t10 and several series subsets (a, b, c, and d). Because we don’t have to worry about distribution here (spreading the data evenly across various nodes), all the series are contained within one shard (Shard 1).&lt;/p&gt;

&lt;p&gt;For the Enterprise version, we can have more than one node, so things get a little more complicated. If we had a two-node cluster, for example, with a replication factor of 1:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;Data Points
---------------
series_a t0
series_a t4

series_b t2
series_b t6

series_c t3
series_c t8

series_d t7
series_d t9

Shard Group Z (t0 - t10)
-------------
Shard 1 (series_a, series_c) (Node A)
Shard 2 (series_b, series_d) (Node B)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can see we still have shard group Z with a time span from t0 to t10, but this shard group contains two shards. Because the replication factor is only 1 (i.e. only 1 copy of the data), distribution takes priority and so half the data is stored on Node A and the other half is stored on Node B. This evenly spreads the data across the two nodes and lessens the possibility of performance issues. However, if we increase the replication factor to 2, replication takes precedence over distribution and the outcome looks quite similar to the open-source example. See below:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;Data Points
---------------
series_a t0
series_a t4

series_b t2
series_b t6

series_c t3
series_c t8

series_d t7
series_d t9

Shard Group Z (t0 - t10)
-------------
Shard 1 (series_a, series_b, series_c, series_d) (Node A, Node B)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we’re back to one shard within one shard group (Z), but it exists on both nodes, due to the replication factor. Let’s add retention policy to the mix now.&lt;/p&gt;

&lt;p&gt;Let’s say we’ve got our database all set up with a retention policy of 1 day (24hrs) and our shard group duration set to the recommended 1 hour time interval. If this is the OSS version of InfluxDB, the shard group will contain one shard. That shard will house all series for the 1 hour time span similar to what we saw in our first example:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;Shard Group Z (t0 - t60)
-------------
Shard 1 (series_a, series_b, series_c, series_d)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of course, for every hour in the day, a new shard group will be created spanning 60 minutes, and the number of shard groups will continue to increase until we hit the 25th hour (after 1 full day passes). When the retention policy is enforced, we will see that the initial shard group has passed the expiration point, and so the entire shard group will be dropped. This will continue on the hour, every hour. So at any given time, we will have roughly 1 day’s worth of data: never less than the last 24 hours, plus the part of the oldest shard group that has not been dropped yet.&lt;/p&gt;
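
&lt;p&gt;The setup described in this example could be created with a statement like the one below. It is a sketch under assumed names: &lt;code class="language-markup"&gt;one_day&lt;/code&gt; and &lt;code class="language-markup"&gt;mydb&lt;/code&gt; are placeholders, and the &lt;code class="language-markup"&gt;DEFAULT&lt;/code&gt; keyword is optional and simply makes this the policy that unqualified writes and queries use.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;-- 24 hours of retention, split into 1 hour shard groups.
-- "one_day" and "mydb" are placeholder names.
CREATE RETENTION POLICY "one_day" ON "mydb" DURATION 1d REPLICATION 1 SHARD DURATION 1h DEFAULT&lt;/code&gt;&lt;/pre&gt;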

&lt;p&gt;&lt;img src="/images/legacy-uploads/diego-ph-249471-unsplash-819x1024.jpg" alt="Hand holding a lightbulb" width="330" height="413" /&gt;&lt;/p&gt;

&lt;h4&gt;Making Sense of It All&lt;/h4&gt;

&lt;p&gt;To summarize:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;An InfluxDB instance can have 1 or more databases.&lt;/li&gt;
 	&lt;li&gt;Each of those databases can have 1 or more retention policies.&lt;/li&gt;
 	&lt;li&gt;You can specify the retention interval, shard group duration, and replication factor in your retention policy.&lt;/li&gt;
 	&lt;li&gt;Each retention policy can have 1 or more shard groups (as long as there's data).&lt;/li&gt;
 	&lt;li&gt;Each shard group can have 1 or more shards (always 1 shard for the OSS version).&lt;/li&gt;
 	&lt;li&gt;Shards contain the actual data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I hope this post has helped to clear things up a little, but if you’re still feeling confused (trust me, I know the feeling well), please reach out to us on Twitter @influxDB and @mschae16 and we can try to answer all your questions. Or check out our awesome community site where everyone comes together to help each other out with debugging and making sense of the magical and oft-times mysterious InfluxData platform.&lt;/p&gt;
</description>
      <pubDate>Tue, 05 Jun 2018 08:00:10 -0700</pubDate>
      <link>https://www.influxdata.com/blog/influxdb-shards-retention-policies/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/influxdb-shards-retention-policies/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Visualizing Your Time Series Data with the Highcharts Library</title>
      <description>&lt;p&gt;&lt;img class="aligncenter size-full wp-image-215072" src="/images/legacy-uploads/hc-logo.jpg" alt="Image of Highcharts logo" width="720" height="240" /&gt;&lt;/p&gt;

&lt;p&gt;There have been a couple of posts in the past on visualizing your time series data using different charting libraries such as this &lt;a href="https://w2.influxdata.com/blog/data-visualizations-with-influxdb-integrating-plotly-js/"&gt;integration with plotly.js&lt;/a&gt; or this &lt;a href="https://w2.influxdata.com/blog/visualizing-your-time-series-data-from-influxdb-with-rickshaw/"&gt;one on the Rickshaw library&lt;/a&gt;. Today we’re going to take a look at the charting library, &lt;a href="https://www.highcharts.com/"&gt;Highcharts&lt;/a&gt;—another great tool for your data visualization needs. Of course, if you don’t want to pull in external graphing libraries, you can always check out Grafana or Chronograf. Grafana easily integrates with InfluxDB, and Chronograf was built out specifically &lt;em&gt;to be used&lt;/em&gt; with InfluxDB.&lt;/p&gt;

&lt;p&gt;&lt;img class="size-full wp-image-215073" src="/images/legacy-uploads/mascot-influxdb.png" alt="Image of InfluxDB I'iwi" width="400" height="295" /&gt;&lt;/p&gt;
&lt;figcaption&gt; Our famed InfluxDB I’iwi&lt;/figcaption&gt;

&lt;p&gt;Before we start throwing those graphs on the page though, you’ll need to ensure you have an instance of InfluxDB up and running. You can get all the components of the &lt;a href="https://w2.influxdata.com/time-series-platform/"&gt;TICK Stack&lt;/a&gt; set up locally or spin up the stack in our handy &lt;a href="https://github.com/influxdata/sandbox"&gt;sandbox&lt;/a&gt; mode.&lt;/p&gt;

&lt;p&gt;&lt;img class="aligncenter size-full wp-image-215074" src="/images/legacy-uploads/node-influx-logo.png" alt="node-influx logo" width="300" height="80" /&gt;&lt;/p&gt;

&lt;p&gt;I recently published a &lt;a href="https://w2.influxdata.com/blog/getting-started-with-node-influx/"&gt;beginner’s guide&lt;/a&gt; on the &lt;a href="https://node-influx.github.io/"&gt;Node-influx client library&lt;/a&gt; as an option for integrating with InfluxDB without necessarily having to use &lt;a href="https://w2.influxdata.com/time-series-platform/#telegraf"&gt;Telegraf&lt;/a&gt; to collect your data. This visualization is built out using the same ocean tide data from that post. You can clone the repo down &lt;a href="https://github.com/mschae16/node-influx-sample"&gt;here&lt;/a&gt; if you want to check out the end product.&lt;/p&gt;
&lt;h2&gt;First Steps&lt;/h2&gt;
&lt;p&gt;Pulling in the library is our first step. I added the following &lt;code class="language-markup"&gt;script&lt;/code&gt; tag to the &lt;code class="language-markup"&gt;head&lt;/code&gt; section of the index.html file.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;&amp;lt;script src="https://code.highcharts.com/highcharts.js"&amp;gt;&amp;lt;/script&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To the &lt;code class="language-markup"&gt;body&lt;/code&gt; of the index.html file, you’ll need a container &lt;code class="language-markup"&gt;div&lt;/code&gt; with an &lt;code class="language-markup"&gt;id&lt;/code&gt; of ‘container’ so we can later target that in the script file, like so:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;&amp;lt;div id="container"&amp;gt;&amp;lt;/div&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Highcharts graph will be rendered within this container.&lt;/p&gt;

&lt;p&gt;In our server file we’ve already set up an endpoint to query the data from our ocean tides database (see below) so we’ll need to fetch the data in our script file and set it into our Highcharts constructor function.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;app.get('/api/v1/tide/:place', (request, response) =&amp;gt; {
  const { place } = request.params;
  influx.query(`
    select * from tide
    where location =~ /(?i)(${place})/
  `)
  .then( result =&amp;gt; response.status(200).json(result) )
  .catch( error =&amp;gt; response.status(500).json({ error }) );
});&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the script file, I wrote a simple fetch function that retrieves the data based on the location name passed in.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;const fetchData = (place) =&amp;gt; {
  return fetch(`/api/v1/tide/${place}`)
    .then(res =&amp;gt; {
      if (res.status !== 200) {
        console.log(res);
      }
      return res;
    })
    .then(res =&amp;gt; res.json())
    .catch(error =&amp;gt; console.log(error));
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To fetch all the data for the four different locations, I used &lt;code class="language-markup"&gt;Promise.all()&lt;/code&gt; and then mutated the results to fit into the required format referenced in the &lt;a href="https://api.highcharts.com/highcharts/data" target="_blank" rel="noopener"&gt;Highcharts documentation&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;return Promise.all([
            fetchData('hilo'),
            fetchData('hanalei'),
            fetchData('honolulu'),
            fetchData('kahului')
         ])
        .then(parsedRes =&amp;gt; {
          const mutatedArray = parsedRes.map( arr =&amp;gt; {
            return Object.assign({}, {
              name: arr[0].location,
              data: arr.map( obj =&amp;gt; Object.assign({}, {
                x: (moment(obj.time).unix())*1000,
                y:obj.height
              }))
            });
          });
        })
        .catch(error =&amp;gt; console.log(error));&lt;/code&gt;&lt;/pre&gt;
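&lt;p&gt;As a rough sketch of that reshaping step on its own: the helper below takes the rows returned for one location and builds a single Highcharts series object. The name &lt;code class="language-markup"&gt;toSeries&lt;/code&gt; and the sample rows are hypothetical and not part of the original script. It assumes each row carries &lt;code class="language-markup"&gt;time&lt;/code&gt;, &lt;code class="language-markup"&gt;height&lt;/code&gt;, and &lt;code class="language-markup"&gt;location&lt;/code&gt; properties, as in the code above, and it uses the built-in &lt;code class="language-markup"&gt;Date&lt;/code&gt; in place of moment to get epoch milliseconds.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;// Hypothetical helper: reshape the rows for one location into a Highcharts series.
// Assumes every row has `time`, `height`, and `location` properties.
function toSeries(rows) {
  if (rows.length === 0) {
    return { name: 'unknown', data: [] };
  }
  return {
    name: rows[0].location,
    data: rows.map(function (row) {
      return {
        x: new Date(row.time).getTime(), // epoch milliseconds for a datetime axis
        y: row.height
      };
    })
  };
}

// Made-up sample rows for illustration
const sample = [
  { time: '2018-04-18T00:00:00Z', height: 1.2, location: 'Hilo' },
  { time: '2018-04-18T01:00:00Z', height: 1.5, location: 'Hilo' }
];
console.log(toSeries(sample).name); // 'Hilo'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping the transformation in a small pure function like this makes it easy to check the output shape before handing it to the chart.&lt;/p&gt;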
&lt;p&gt;Now that we have our data ready to go, we can construct our graph.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;Highcharts.chart('container', {
            colors: ['#508991', '#175456', '#09BC8A', '#78CAD2'],
            chart: {
              backgroundColor: {
                  linearGradient: [0, 600, 0, 0],
                  stops: [
                    [0, 'rgb(255, 255, 255)'],
                    [1, 'rgb(161, 210, 206)']
                  ]
              },
              type: 'spline'
            },
            title: {
              text: 'Hawaii Ocean Tides',
              style: {
                'color': '#175456',
              }
            },
            xAxis: {
              type: 'datetime'
            },
            yAxis: {
              title: {
                text: 'Height (ft)'
              }
            },
            plotOptions: {
              series: {
                turboThreshold: 2000,
              }
            },
            series: mutatedArray
          });&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There’s definitely a lot going on here. The Highcharts library comes with the method &lt;a href="https://api.highcharts.com/highcharts/"&gt;chart()&lt;/a&gt; which accepts two arguments: the target element within which to render the chart and an options object within which you can specify various properties such as style, title, legend, series, type, plotOptions, and so on. Let’s go through each of the options one by one.&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;&lt;a href="https://api.highcharts.com/highcharts/colors"&gt;colors: [array]&lt;/a&gt; - The colors property accepts an array of hex codes which will represent the default color scheme for the chart. If all colors are used up, any new colors needed will result in the array being looped through again.&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://api.highcharts.com/highcharts/chart"&gt;chart: {object}&lt;/a&gt; - The chart property accepts an object with various additional properties including type, zoomtype, animation, events, description and a number of style properties. In this instance, I've given the background a linear gradient and designated the type as &lt;a href="https://api.highcharts.com/highcharts/plotOptions.spline"&gt;spline&lt;/a&gt;.&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://api.highcharts.com/highcharts/title"&gt;title: {object}&lt;/a&gt; - This represents the chart's main title and can be additionally given a style object to jazz things up a bit.&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://api.highcharts.com/highcharts/xAxis"&gt;xAxis: {object}&lt;/a&gt; - In this scenario, because I'm using time series data, I know the x-axis will always be time so I can designate the type as 'datetime' and the scale will automatically adjust to the appropriate time unit. However, there are numerous other options here including styling, labels, custom tick placement, and logarithm or linear type.&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://api.highcharts.com/highcharts/yAxis"&gt;yAxis: {object}&lt;/a&gt; - Similar to the xAxis property, the y-axis takes an object and has access to a number of options to customize the design and style of the chart's y-axis. I've only specified y-axis title in this case, and deferred to Highcharts automatic tick placement.&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://api.highcharts.com/highcharts/plotOptions"&gt;plotOptions: {object}&lt;/a&gt; - The plotOptions property is a wrapper object for config objects for each series type. The config objects for each series can also be overridden for an individual series item as given in the series array. Here I've used the plotOptions.series property to override the default turboThreshold of 1000 and change it to 2000. This allows for charting a greater number of data points (over the default of 1000). According to the docs, conf options for the series are accessed at three different levels. If you want to target all series in a chart, you would use the &lt;a href="https://api.highcharts.com/highcharts/plotOptions.series"&gt;plotOptions.series&lt;/a&gt; object. For series of a specific type, you would access the plotOptions of that type. For instance, to target the plotOptions for a chart type of 'line' you would access the &lt;a href="https://api.highcharts.com/highcharts/plotOptions.line"&gt;plotOptions.line&lt;/a&gt; object. Lastly, options for a specific series are given in the series property (see next bullet point).&lt;/li&gt;
 	&lt;li&gt;&lt;a href="https://api.highcharts.com/highcharts/series"&gt;series: [array] or {object}&lt;/a&gt; - This is where you'll pass in your data. You can additionally define the type for the data to be passed in, give it a name, and define additional plotOptions for it.&lt;/li&gt;
&lt;/ul&gt;
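&lt;p&gt;To make the three levels of series options a little more concrete, here is a minimal sketch of an options object that sets something at each level. The values and series names are arbitrary and only for illustration; in a real page the object would be passed as the second argument to &lt;code class="language-markup"&gt;Highcharts.chart()&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;// Sketch only: options set at three levels, with arbitrary values.
// A more specific level overrides a more general one.
const options = {
  plotOptions: {
    series: { turboThreshold: 2000 }, // applies to every series in the chart
    spline: { lineWidth: 3 }          // applies to spline series only
  },
  series: [
    { name: 'Hilo', data: [], lineWidth: 1 }, // applies to this one series
    { name: 'Hanalei', data: [] }
  ]
};

console.log(options.plotOptions.series.turboThreshold); // 2000&lt;/code&gt;&lt;/pre&gt;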
&lt;p&gt;Check out the result!&lt;/p&gt;

&lt;p&gt;&lt;img class="size-large wp-image-215068" src="/images/legacy-uploads/Screen-Shot-2018-04-18-at-5.23.13-PM-1024x615.png" alt="Screenshot of Highcharts ocean tides graph" width="1024" height="615" /&gt;&amp;lt;figcaption&amp;gt; How wavy! (Get it? - You know, because of the ocean… and tides.)&amp;lt;/figcaption&amp;gt;&lt;/p&gt;

&lt;p&gt;This information really just covers the tip of the iceberg. The possibilities seem endless in terms of what you can create using the Highcharts graphing library. Why not take a look at their &lt;a href="https://api.highcharts.com/highcharts/"&gt;documentation&lt;/a&gt; or &lt;a href="https://www.highcharts.com/demo"&gt;demos&lt;/a&gt; and let us know all about your new creations with InfluxDB and Highcharts? Questions and comments? You can always reach out to us on Twitter: &lt;a href="https://twitter.com/mschae16"&gt;@mschae16&lt;/a&gt; or &lt;a href="https://twitter.com/influxDB"&gt;@influxDB&lt;/a&gt;.  Happy charting!&lt;/p&gt;
</description>
      <pubDate>Thu, 19 Apr 2018 13:00:12 -0700</pubDate>
      <link>https://www.influxdata.com/blog/visualizing-data-with-highcharts/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/visualizing-data-with-highcharts/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Getting Started with the Node-Influx Client Library</title>
      <description>&lt;p&gt;&lt;img src="/images/legacy-uploads/long-road-ahead.jpg" alt="image of long windy road" width="960" height="638" /&gt;&lt;/p&gt;
&lt;figcaption&gt; Embark on a new journey with node-influx!&lt;/figcaption&gt;

&lt;p&gt;When in doubt, start at the beginning: an adage that applies to any learning journey, including getting started with the node-influx client library. Let’s take a look at one of the &lt;a href="https://docs.influxdata.com/influxdb/v1.5/tools/api_client_libraries/"&gt;InfluxDB client libraries&lt;/a&gt; in particular: &lt;a href="https://github.com/node-influx/node-influx"&gt;node-influx&lt;/a&gt;, an InfluxDB client for JavaScript users. This client library features a simple API for most InfluxDB operations and is fully supported in Node and the browser, all without needing any extra dependencies.&lt;/p&gt;

&lt;p&gt;There’s a great &lt;a href="https://node-influx.github.io/manual/tutorial.html"&gt;tutorial&lt;/a&gt; for the node-influx library available online as well as some handy &lt;a href="https://node-influx.github.io/"&gt;documentation&lt;/a&gt;, which I recommend reading through beforehand. Here, we will just cover a few of the basics.&lt;/p&gt;

&lt;h2&gt;What You'll Need&lt;/h2&gt;

&lt;p&gt;For this tutorial, I’ll be running a local installation of InfluxDB; you can learn how to get that up and running &lt;a href="https://docs.influxdata.com/influxdb/v1.5/introduction/getting-started/"&gt;here&lt;/a&gt;. You’ll also need &lt;a href="https://nodejs.org/en/"&gt;Node&lt;/a&gt; installed. If Node.js is not your cup of tea, there are plenty of other &lt;a href="https://docs.influxdata.com/influxdb/v1.5/tools/api_client_libraries/"&gt;client libraries&lt;/a&gt; to work with and several guides on using InfluxDB with other languages available, such as these posts on &lt;a href="https://www.influxdata.com/blog/getting-started-python-influxdb/"&gt;Python&lt;/a&gt; and &lt;a href="https://www.influxdata.com/blog/getting-started-with-the-influxdb-ruby-client/"&gt;Ruby&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Set the Scene&lt;/h2&gt;

&lt;p&gt;&lt;img src="/images/legacy-uploads/surfers-683x1024.jpg" alt="Image of two surfers walking into the ocean" width="683" height="1024" /&gt;&lt;/p&gt;
&lt;figcaption&gt; How to get slotted&lt;/figcaption&gt;

&lt;p&gt;Let us imagine for a minute you have an inexplicable love for surfing. You find yourself in Hawaii on a journey following in &lt;a href="https://en.wikipedia.org/wiki/Duke_Kahanamoku"&gt;Duke’s&lt;/a&gt; footsteps and you’re trying to find the best surf spot. And the best &lt;strong&gt;&lt;i&gt;time&lt;/i&gt;&lt;/strong&gt; at which to surf said amazing spot. Makes sense to take a look at the tides, right? Well, according to our trusty friend &lt;a href="https://en.wikipedia.org/wiki/Time_series"&gt;Wikipedia&lt;/a&gt;, ocean tides are a great example of time series data. They ebb and flow &lt;i&gt;over time&lt;/i&gt; (yes, I know I’m laying it on rather thick here). So let’s practice putting some sample tide data into InfluxDB using the node-influx library and see what happens.&lt;/p&gt;

&lt;p&gt;First things first, we need to install the node-influx library in the application folder where it will be used.&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;$ npm install --save influx&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This adds the node-influx library to our node_modules; we also need to require the library in our server file, like so:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;const Influx = require('influx');&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We’ll use the following &lt;a href="https://node-influx.github.io/class/src/index.js~InfluxDB.html#instance-constructor-constructor"&gt;constructor function&lt;/a&gt; to connect to a single InfluxDB instance and specify our connection options.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;const influx = new Influx.InfluxDB({
  host: 'localhost',
  database: 'ocean_tides',
  schema: [
    {
      measurement: 'tide',
      fields: { height: Influx.FieldType.FLOAT },
      tags: ['unit', 'location']
    }
  ]
});&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are a few different options available here:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;You could connect to a single host by passing the DSN as a string into the constructor argument, like so:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;const influx = new Influx.InfluxDB('http://user:password@host:8086/database')&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
 	&lt;li&gt;You could also pass in a full set of config details and specify properties such as username, password, database, host, port, and schema - that's what we did above.&lt;/li&gt;
 	&lt;li&gt;If you have multiple Influx nodes to connect to, you can pass in a cluster config. For example:
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;const client = new InfluxDB({
  database: 'my_database',
  username: 'duke_kahanamoku',
  password: 'aloha',
  hosts: [
    { host: 'db1.example.com' },
    { host: 'db2.example.com' },
  ]
  schema: [
    {
      measurement: 'tide',
      fields: { height: Influx.FieldType.FLOAT },
      tags: ['unit', 'location']
    }
  ]
})&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It’s worth noting here that within your schema design, you will need to designate the FieldType for your field values using &lt;a href="https://node-influx.github.io/typedef/index.html#static-typedef-FieldType"&gt;Influx.FieldType&lt;/a&gt; - they can be strings, integers, floats, or booleans.&lt;/p&gt;

&lt;h2&gt;Checking The Database&lt;/h2&gt;

&lt;p&gt;We can use &lt;a href="https://node-influx.github.io/class/src/index.js~InfluxDB.html#instance-method-getDatabaseNames"&gt;influx.getDatabaseNames()&lt;/a&gt; to first check if our database already exists. If it doesn’t, we can then use &lt;a href="https://node-influx.github.io/class/src/index.js~InfluxDB.html#instance-method-createDatabase"&gt;influx.createDatabase()&lt;/a&gt; to create our database. See below:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;influx.getDatabaseNames()
  .then(names =&amp;gt; {
    if (!names.includes('ocean_tides')) {
      return influx.createDatabase('ocean_tides');
    }
  })
  .then(() =&amp;gt; {
    app.listen(app.get('port'), () =&amp;gt; {
      console.log(`Listening on ${app.get('port')}.`);
    });
    writeDataToInflux(hanalei);
    writeDataToInflux(hilo);
    writeDataToInflux(honolulu);
    writeDataToInflux(kahului);
  })
  .catch(error =&amp;gt; console.log({ error }));&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We first grab all the database names from our connected Influx instance and check whether the returned array includes ‘ocean_tides’. If it doesn’t, we create a new database with that name. The next &lt;code class="language-markup"&gt;.then()&lt;/code&gt; callback starts the server and writes our data into the database.&lt;/p&gt;

&lt;h2&gt;Writing Data to InfluxDB&lt;/h2&gt;

&lt;p&gt;Using &lt;a href="https://node-influx.github.io/class/src/index.js~InfluxDB.html#instance-method-writePoints"&gt;influx.writePoints()&lt;/a&gt;, we can write our data points into the database.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;influx.writePoints([
      {
        measurement: 'tide',
        tags: {
          unit: locationObj.rawtide.tideInfo[0].units,
          location: locationObj.rawtide.tideInfo[0].tideSite,
        },
        fields: { height: tidePoint.height },
        timestamp: tidePoint.epoch,
      }
    ], {
      database: 'ocean_tides',
      precision: 's',
    })
    .catch(error =&amp;gt; {
      console.error(`Error saving data to InfluxDB! ${error.stack}`)
    });&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To keep things simple, I pulled in a few sample data files, looped through them by location, and wrote each data point to InfluxDB under the measurement name &lt;code class="language-markup"&gt;tide&lt;/code&gt; with &lt;code class="language-markup"&gt;location&lt;/code&gt; and &lt;code class="language-markup"&gt;unit&lt;/code&gt; tags (both are strings). There is only one field here, &lt;code class="language-markup"&gt;height&lt;/code&gt;, and I send in a &lt;code class="language-markup"&gt;timestamp&lt;/code&gt; as well. The timestamp is not strictly required: if you omit it, InfluxDB assigns the server’s time when the point is written, so supplying your own is more accurate. You can specify additional options such as the database to write to, the time precision, and the retention policy.&lt;/p&gt;
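&lt;p&gt;The looping step is not shown above, so here is a minimal sketch of how the points for one location might be assembled before writing. The helper name &lt;code class="language-markup"&gt;buildPoints&lt;/code&gt;, the &lt;code class="language-markup"&gt;tidePoints&lt;/code&gt; parameter, and the sample values are all assumptions for illustration; the actual sample files may be shaped differently.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;// Hypothetical helper: build the array passed to influx.writePoints().
// `tidePoints` is assumed to be an array of { height, epoch } objects
// (epoch in seconds); the real sample files may be shaped differently.
function buildPoints(locationObj, tidePoints) {
  const info = locationObj.rawtide.tideInfo[0];
  return tidePoints.map(function (tidePoint) {
    return {
      measurement: 'tide',
      tags: { unit: info.units, location: info.tideSite },
      fields: { height: tidePoint.height },
      timestamp: tidePoint.epoch
    };
  });
}

// Made-up sample data for illustration
const hilo = { rawtide: { tideInfo: [{ units: 'feet', tideSite: 'Hilo' }] } };
const points = buildPoints(hilo, [
  { height: 1.2, epoch: 1524009600 },
  { height: 1.5, epoch: 1524013200 }
]);
console.log(points.length); // 2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since &lt;code class="language-markup"&gt;influx.writePoints()&lt;/code&gt; takes an array, the whole result can be passed in one call along with the same &lt;code class="language-markup"&gt;database&lt;/code&gt; and &lt;code class="language-markup"&gt;precision&lt;/code&gt; options used above, rather than writing one point per request.&lt;/p&gt;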

&lt;h2&gt;Querying the Database&lt;/h2&gt;

&lt;p&gt;We’ve learned how to write data into the database; now we need to know how to query for that data. It’s simple - we can use &lt;a href="https://node-influx.github.io/class/src/index.js~InfluxDB.html#instance-method-query"&gt;influx.query()&lt;/a&gt; and pass in our InfluxQL statement to retrieve the data we want.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;influx.query(`
    select * from tide
    where location =~ /(?i)(${place})/
  `)
  .then( result =&amp;gt; response.status(200).json(result) )
  .catch( error =&amp;gt; response.status(500).json({ error }) );&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here we are querying the database for any data from measurement &lt;code class="language-markup"&gt;tide&lt;/code&gt; where location contains the place name passed in (using a regular expression). If you’ve stored a lot of data, it’s a good idea to also limit your query to a certain time span. You can additionally pass in an &lt;a href="https://node-influx.github.io/typedef/index.html#static-typedef-IQueryOptions"&gt;options object&lt;/a&gt; (database, retention policy, and time precision) to the &lt;code class="language-markup"&gt;influx.query()&lt;/code&gt; method.&lt;/p&gt;
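&lt;p&gt;As a sketch of what a time-bounded version of this query could look like, the helper below builds the InfluxQL string with an added &lt;code class="language-markup"&gt;time&lt;/code&gt; clause. The function name &lt;code class="language-markup"&gt;buildTideQuery&lt;/code&gt; and the &lt;code class="language-markup"&gt;days&lt;/code&gt; parameter are hypothetical. Because the place name is interpolated into a regular expression inside the query, the sketch also rejects anything other than letters, spaces, and hyphens; that check is my own assumption about what a valid place name looks like.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;// Hypothetical helper: build a time-bounded InfluxQL query for one location.
// Only letters, spaces, and hyphens are allowed in `place`, since the value
// ends up inside a regular expression in the query string.
function buildTideQuery(place, days) {
  if (!/^[A-Za-z -]+$/.test(place)) {
    throw new Error('Invalid place name: ' + place);
  }
  if (!Number.isInteger(days) || Math.sign(days) !== 1) {
    throw new Error('days must be a positive integer');
  }
  return `select * from tide where location =~ /(?i)(${place})/ and time &amp;gt; now() - ${days}d`;
}

console.log(buildTideQuery('hilo', 7));&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The resulting string can be passed to &lt;code class="language-markup"&gt;influx.query()&lt;/code&gt; exactly as in the example above.&lt;/p&gt;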

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;That covers all the basics for the node-influx client library. Have a scan over the &lt;a href="https://node-influx.github.io/"&gt;docs&lt;/a&gt; and let us know if there are other use cases you’d like to hear about! I’ve also posted all this code in a &lt;a href="https://github.com/mschae16/node-influx-sample"&gt;repository on GitHub&lt;/a&gt; if you want to try it out for yourself. Questions and comments? Reach out to us on Twitter: &lt;a href="https://twitter.com/mschae16"&gt;@mschae16&lt;/a&gt; or &lt;a href="https://twitter.com/influxDB"&gt;@influxDB&lt;/a&gt;. Now go forth and find that monster wave. Surf’s up!&lt;/p&gt;
</description>
      <pubDate>Tue, 17 Apr 2018 13:20:39 -0700</pubDate>
      <link>https://www.influxdata.com/blog/getting-started-with-node-influx/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/getting-started-with-node-influx/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <category>Getting Started</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Batch Processing vs. Stream Processing: What's the Difference?</title>
      <description>&lt;p&gt;If you’ve read DevRel &lt;a href="https://twitter.com/TheKaterTot" target="_blank" rel="noopener"&gt;Katy Farmer&lt;/a&gt;’s stellar post, &lt;a href="https://w2.influxdata.com/blog/kapacitor-cqs/" target="_blank" rel="noopener"&gt;Kapacitor and Continuous Queries: How To Decide Which Tool You Need&lt;/a&gt;, then you know that when our community talks, we listen. So, in alignment with that view and in honor of our very own Kapacitor Koala, let’s tackle another common community issue that has come to our attention: when should we use batch processing versus stream processing in our Kapacitor tasks?&lt;/p&gt;

&lt;p&gt;&lt;img class="wp-image-214447 size-full" src="/images/legacy-uploads/kapacitor-koala.png" alt="Image of InfluxData's Kapacitor Koala mascot" width="308" height="366" /&gt;&amp;lt;figcaption&amp;gt; Our famous Kapacitor Koala&amp;lt;/figcaption&amp;gt;&lt;/p&gt;

&lt;p&gt;Now, if you have only a vague idea of what Kapacitor is, I recommend doing a little light reading on it &lt;a href="https://w2.influxdata.com/time-series-platform/kapacitor/" target="_blank" rel="noopener"&gt;here&lt;/a&gt; and &lt;a href="https://docs.influxdata.com/kapacitor/v1.4/" target="_blank" rel="noopener"&gt;here&lt;/a&gt; just to get you up to speed. Kapacitor, the final component of our TICK Stack, offers several capabilities such as data transformation, downsampling, and alerting. Kapacitor uses its own DSL, called &lt;a href="https://docs.influxdata.com/kapacitor/v1.4/tick/" target="_blank" rel="noopener"&gt;TICKscript&lt;/a&gt;, which allows you to define tasks that are then executed on your data. Essentially, it’s processing your data for you.&lt;/p&gt;

&lt;p&gt;Here’s where it gets tricky though: how do you choose whether to process your data as a batch task or streaming task?&lt;/p&gt;
&lt;h2&gt;Batch Tasks&lt;/h2&gt;
&lt;p&gt;Let’s discuss batch tasks first. A batch is a collection of data points that have been grouped together within a specific time interval. Another term often used for this is a window of data. When running a batch task, Kapacitor queries InfluxDB periodically, thereby avoiding having to buffer much of your data in RAM. There are several cases where batch processing is the way to go:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;Performing aggregate functions such as finding the mean, maximum, or minimum of a set interval of data.&lt;/li&gt;
 	&lt;li&gt;Cases where alerting doesn't need to run on every single data point (since state changes will probably not happen that often). You don't want to be inundated with alerts!&lt;/li&gt;
 	&lt;li&gt;Downsampling, which takes a large collection of data points and retains only the most significant data (so you can still view overall trends in the data).&lt;/li&gt;
 	&lt;li&gt;Cases where a little extra latency won't severely impact your operation.&lt;/li&gt;
 	&lt;li&gt;Cases with a super-high throughput InfluxDB instance since Kapacitor cannot process data as quickly as it can be written to InfluxDB (this occurs more frequently with InfluxDB Enterprise clusters).&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Stream Tasks&lt;/h2&gt;
&lt;p&gt;On the other side, we have stream tasks. Stream tasks create subscriptions to InfluxDB so that every data point written to InfluxDB is also written to Kapacitor. One should note though that stream tasks use a high percentage of available memory, so memory availability is a key factor to take into consideration. Here’s where stream processing is most ideal:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;If you want to transform each individual data point in real time (technically, this could also be run with a batch process but there's latency to consider).&lt;/li&gt;
 	&lt;li&gt;Cases where lowest possible latency is paramount to the operation. If alerts need to be triggered immediately, for example, running a stream task will ensure the least possible delay.&lt;/li&gt;
 	&lt;li&gt;Cases in which InfluxDB is handling high volume query load and you may want to alleviate some of the query pressure from InfluxDB.&lt;/li&gt;
 	&lt;li&gt;Stream tasks understand time by the data's timestamps; there are no race conditions for when exactly a given point will make it into a window or not. With batch tasks, on the other hand, it is possible for a data point to arrive late and be left out of its relevant window.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Another advantage some might see with writing stream tasks is the ease of use in having to define the task using only Kapacitor’s TICKscript, without having to delve into writing queries for InfluxDB. If you are comfortable with writing both, however, it’s probably going to be in your best interest to go with batch processing most of the time since it uses a lot less memory. An additional factor to consider is that Kapacitor is not limited to use only with InfluxDB. For example, if you want to send data straight from Telegraf over to Kapacitor, that will have to be done as a streaming task.&lt;/p&gt;
&lt;h2&gt;Key Takeaways&lt;/h2&gt;
&lt;ul&gt;
 	&lt;li&gt;Batch tasks query InfluxDB periodically, use limited memory, but can place additional query load on InfluxDB.&lt;/li&gt;
 	&lt;li&gt;Batch tasks are best used for performing aggregate functions on your data, downsampling, and processing large temporal windows of data.&lt;/li&gt;
 	&lt;li&gt;Stream tasks subscribe to writes from InfluxDB, placing additional write load on Kapacitor, but can reduce query load on InfluxDB.&lt;/li&gt;
 	&lt;li&gt;Stream tasks are best used for cases where low latency is integral to the operation.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;When our community talks, we listen.&lt;/blockquote&gt;
&lt;p&gt;We’d love to hear how your batch and stream tasks are going! Send us your comments, questions, issues, and blog ideas on our &lt;a href="https://community.influxdata.com/" target="_blank" rel="noopener"&gt;community site&lt;/a&gt; and feel free to reach out to us on Twitter:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/InfluxDB" target="_blank" rel="noopener"&gt;@InfluxDB&lt;/a&gt;
&lt;a href="https://twitter.com/mschae16" target="_blank" rel="noopener"&gt;@mschae16&lt;/a&gt;&lt;/p&gt;
</description>
      <pubDate>Thu, 29 Mar 2018 09:30:05 -0700</pubDate>
      <link>https://www.influxdata.com/blog/batch-processing-vs-stream-processing/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/batch-processing-vs-stream-processing/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
    <item>
      <title>Instrumenting Your Node/Express Application: Viewing Your Data</title>
      <description>&lt;p&gt;This post is the follow-up to &lt;a href="https://w2.influxdata.com/blog/instrumenting-your-node-express-application/" target="_blank" rel="noopener noreferrer"&gt;Instrumenting Your Node/Express Application&lt;/a&gt;. Here we will begin to explore some of the data that is being stored in InfluxDB and build out a dashboard in Chronograf. If you haven’t had a chance yet to begin instrumenting your Node.js applications, I recommend taking a look at my previous post to provide some context.&lt;/p&gt;

&lt;p&gt;When I last left off, we had some data being collected and stored in InfluxDB, as we could see from querying the database:&lt;/p&gt;

&lt;p&gt;&lt;img class="size-large wp-image-213699" src="/images/legacy-uploads/Screen-Shot-2018-03-06-at-5.46.05-PM-865x1024.png" alt="Terminal output from querying influxDB" width="865" height="1024" /&gt;&amp;lt;figcaption&amp;gt; Just some data…&amp;lt;/figcaption&amp;gt;&lt;/p&gt;

&lt;p&gt;Of course, it’s not entirely helpful just to see rows upon rows of numbers. It would be more sensible to view the data in a graph or table so we can more easily see trends in the data, and better yet to build out a full dashboard, so we can view all relevant data simultaneously. Using Chronograf to visualize our data offers just that. If you haven’t installed Chronograf yet, here’s a nifty &lt;a href="https://docs.influxdata.com/chronograf/v1.4/introduction/getting-started/"&gt;guide&lt;/a&gt; that will get you up and running with all the different components of the &lt;a href="https://w2.influxdata.com/time-series-platform/"&gt;TICK Stack&lt;/a&gt;—Telegraf, InfluxDB, Kapacitor, and Chronograf.&lt;/p&gt;

&lt;p&gt;For this section, I’ll be using the instrumented version of the Node.js application, AmazonBay, which you can clone down from GitHub &lt;a href="https://github.com/mschae16/AmazonBay-Instrumented"&gt;here&lt;/a&gt;. It’s using this &lt;a href="https://github.com/RuntimeTools/appmetrics" target="_blank" rel="noopener noreferrer"&gt;Node Metrics library&lt;/a&gt; to send data via Telegraf into InfluxDB. Once you’ve cloned it down and set everything up, ensure your server is running with &lt;code class="language-markup"&gt;node server.js&lt;/code&gt; so that Telegraf can start collecting metrics.&lt;/p&gt;

&lt;p&gt;Let’s start Chronograf and navigate to our dashboards section, where we can start building out a proper dashboard to gain some insight into the data we’re collecting. You’ll need to create a dashboard first and name it; mine is “Instrumented-AmazonBay” for the sake of convenience.&lt;/p&gt;

&lt;p&gt;&lt;img class="size-large wp-image-213875" src="/images/legacy-uploads/Screen-Shot-2018-03-13-at-9.13.08-PM-1024x367.png" alt="Screenshot of Chronograf Dashboard Display" width="1024" height="367" /&gt;&amp;lt;figcaption&amp;gt; Let’s create a new dashboard&amp;lt;/figcaption&amp;gt;&lt;/p&gt;

&lt;p&gt;As we start visualizing, let’s take a moment to consider what metrics we’re collecting and why.&lt;/p&gt;
&lt;h3&gt;CPU Usage&lt;/h3&gt;
&lt;p&gt;It’s generally a good idea to keep track of an application’s CPU usage over time. Although Node.js apps typically consume a minimal amount of CPU, having this data on-hand affords visibility into the health of your application, by highlighting instances that deviate from the norm. Having the capability to ascertain what, if any, operations are causing high CPU usage is certainly a step towards understanding the performance of your application.&lt;/p&gt;

&lt;p&gt;In the case of AmazonBay, we are monitoring the CPU percentage (a value between 0 and 1) of our process (the percentage of CPU used by the application) and our system (the percentage of CPU used by the system as a whole). We can chart both as seen below:&lt;/p&gt;

&lt;p&gt;&lt;img class="size-large wp-image-213877" src="/images/legacy-uploads/Screen-Shot-2018-03-13-at-9.34.13-PM-1024x732.png" alt="Screenshot of Chronograf dashboard displaying Mean CPU Usage for System and Process" width="1024" height="732" /&gt;&amp;lt;figcaption&amp;gt; CPU Usage of both Process and System&amp;lt;/figcaption&amp;gt;&lt;/p&gt;

&lt;p&gt;I built the query through the Chronograf UI, but edited it to scale the percentage to a value between 0 and 100, like so:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;SELECT (mean("process")*100) AS "mean_process" FROM "telegraf"."autogen"."cpu_percentage" WHERE time &amp;gt; :dashboardTime: GROUP BY :interval: FILL(null)&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Event Loop Latency&lt;/h3&gt;
&lt;p&gt;Thanks to its nonblocking, single-threaded design, Node.js can handle a multitude of events quickly and asynchronously. The event loop is responsible for this, so it pays to recognize and pinpoint any latency within the event loop that could be degrading application performance. Latency that persists compounds with each cycle of the event loop and can eventually slow the app to a crawl. If the server sees an increase in load, for example, this can lead to more tasks per event loop cycle, which means longer response times for the end user. Collecting data on these latencies can help you decide whether to scale up the number of processes running the application and bring performance back to normal levels.&lt;/p&gt;

&lt;p&gt;For our measurement of event loop latency, we have access to the min, max, and average latency times in milliseconds. There are several visualization options available in Chronograf, including line and stacked graphs, step-plots, bar graphs, and gauges, all available when you switch from the &lt;code class="language-markup"&gt;Queries&lt;/code&gt; section to the &lt;code class="language-markup"&gt;Visualizations&lt;/code&gt; section.&lt;/p&gt;
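&lt;p&gt;To give a feel for what the min, max, and average figures represent, here is a rough sketch that times how late a zero-delay timer fires and summarizes a handful of samples. This is only an illustration and is not how the metrics library used by AmazonBay measures latency; the function names are hypothetical, and the numbers include the timer’s own minimum delay.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;// Illustration only (not how the metrics library measures latency):
// time how late a zero-delay timer fires, then summarize the samples.
function summarize(samples) {
  const total = samples.reduce(function (sum, value) { return sum + value; }, 0);
  return {
    min: Math.min.apply(null, samples),
    max: Math.max.apply(null, samples),
    avg: total / samples.length
  };
}

function sampleLatency(count, done) {
  const samples = [];
  function tick() {
    const start = process.hrtime();
    setTimeout(function () {
      const diff = process.hrtime(start);           // [seconds, nanoseconds]
      samples.push(diff[0] * 1000 + diff[1] / 1e6); // delay in milliseconds
      if (samples.length === count) {
        done(summarize(samples));
      } else {
        tick();
      }
    }, 0);
  }
  tick();
}

sampleLatency(5, function (stats) {
  console.log(stats); // { min, max, avg } in milliseconds
});&lt;/code&gt;&lt;/pre&gt;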

&lt;p&gt;&lt;img class="size-large wp-image-213919" src="/images/legacy-uploads/Screen-Shot-2018-03-14-at-9.18.35-AM-1024x311.png" alt="Screenshot of Chronograf Visualizations Options" width="1024" height="311" /&gt;&amp;lt;figcaption&amp;gt; Various Visualization Types are available in Chronograf&amp;lt;/figcaption&amp;gt;&lt;/p&gt;

&lt;p&gt;Below you can see event loop latency depicted in various visualizations:&lt;/p&gt;

&lt;p&gt;&lt;img class="size-large wp-image-213879" src="/images/legacy-uploads/Screen-Shot-2018-03-13-at-9.56.34-PM-1024x839.png" alt="" width="1024" height="839" /&gt;&amp;lt;figcaption&amp;gt; Minimum, Average, and Maximum Sampled Event Loop Latency (in milliseconds)&amp;lt;/figcaption&amp;gt;&lt;/p&gt;
&lt;h3&gt;Garbage Collection, Heap Usage, and Memory Leaks&lt;/h3&gt;
&lt;p&gt;Memory leaks are an oft-cited complaint among Node.js developers, as it is usually tricky to determine their cause. They occur when objects are referenced for too long, that is, when variables are kept past their point of use. Recognizing their existence early on is integral to monitoring the health of your application, and can be achieved by tracking the app’s heap usage (the heap being a segment of memory allocated for storing objects, strings and closures) and/or its garbage collection rates (garbage collection being the process of freeing up unused memory). For instance, steadily growing heap usage will eventually hit the 1.5GB default limit imposed by Node.js and crash the process, forcing a restart. Similarly, you can look for patterns within garbage collection rates: as extraneous objects accumulate in memory, the time spent in garbage collection likewise increases. Of course, once you’ve found yourself with a memory leak, pinpointing the root cause is a rather tedious process, usually involving comparing heap snapshots of your application taken at different times to see what has changed between them.&lt;/p&gt;

&lt;p&gt;We will monitor both the heap usage and the garbage collection rates in this instance. See below:&lt;/p&gt;

&lt;figure&gt;&lt;img class="size-large wp-image-213881" src="/images/legacy-uploads/Screen-Shot-2018-03-13-at-10.14.34-PM-1024x247.png" alt="Screenshot of Chronograf Dashboard depicting GC Cycle Duration and Heap Usage (MB)" width="1024" height="247" /&gt;&lt;figcaption&gt;GC Cycle Duration and Heap Usage (MB)&lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;For heap usage in particular, I altered the query to display in megabytes rather than the default (bytes):&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;SELECT ("used"/1000000) FROM "telegraf"."autogen"."gc" WHERE time &amp;gt; :dashboardTime:&lt;/code&gt;&lt;/pre&gt;
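&lt;p&gt;To make steady heap growth easier to spot, one option is to chart the rate of change rather than the raw value. The query below is only a sketch and not the one behind the dashboard above: it reuses the &lt;code class="language-markup"&gt;used&lt;/code&gt; field from the &lt;code class="language-markup"&gt;gc&lt;/code&gt; measurement, while the one-minute derivative unit and the &lt;code class="language-markup"&gt;heap_growth_mb_per_min&lt;/code&gt; alias are assumptions chosen for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;-- Sketch: average heap growth in MB per minute (the alias and the 1m unit are illustrative)
SELECT derivative(mean("used"), 1m) / 1000000 AS "heap_growth_mb_per_min" FROM "telegraf"."autogen"."gc" WHERE time &amp;gt; :dashboardTime: GROUP BY time(:interval:)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A line that stays above zero across many intervals suggests that memory is being allocated faster than the garbage collector can release it.&lt;/p&gt;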
&lt;h3&gt;HTTP Requests&lt;/h3&gt;
&lt;p&gt;The duration of HTTP requests is an important metric, especially because it most directly affects the end user. With users more impatient than ever, slow response times can seriously hurt the success of an application. Monitoring the duration of these requests tells you whether users are able to interact with the application quickly and efficiently. The faster things are, the higher user satisfaction will be, plain and simple.&lt;/p&gt;

&lt;p&gt;Here, you’ll see I built out a stacked graph visualizing mean HTTP request/response duration in milliseconds, grouped by different URLs:&lt;/p&gt;

&lt;figure&gt;&lt;img class="size-large wp-image-213882" src="/images/legacy-uploads/Screen-Shot-2018-03-13-at-10.23.50-PM-1024x725.png" alt="Screenshot of Chronograf Dashboard depicting HTTP Request/Response Duration (ms)" width="1024" height="725" /&gt;&lt;figcaption&gt;HTTP Request/Response Duration (ms)&lt;/figcaption&gt;&lt;/figure&gt;
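&lt;p&gt;A query along the following lines could drive a graph like the one above. Treat it as a sketch under stated assumptions: the measurement name &lt;code class="language-markup"&gt;http_requests&lt;/code&gt;, the field &lt;code class="language-markup"&gt;duration&lt;/code&gt;, and the tag &lt;code class="language-markup"&gt;url&lt;/code&gt; are hypothetical placeholders, so substitute whatever names your own instrumentation writes. Grouping by URL only works if the URL is stored as a tag rather than a field.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;-- Sketch: "http_requests", "duration", and "url" are hypothetical names
SELECT mean("duration") AS "mean_duration_ms" FROM "telegraf"."autogen"."http_requests" WHERE time &amp;gt; :dashboardTime: GROUP BY time(:interval:), "url"&lt;/code&gt;&lt;/pre&gt;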
&lt;h3&gt;Database Queries&lt;/h3&gt;
&lt;p&gt;In this particular application, the inventory and order history are stored in a PostgreSQL relational database, which the application has to query at various points. The database is an example of an external dependency, meaning any system with which your application interacts. There are others besides databases (third-party APIs, web services, legacy systems), and although we cannot necessarily change the code running within these services directly, these dependencies are nevertheless important to the success of the application and therefore worth tracking, if only to be able to differentiate between problems arising within the application and problems arising outside it. However your application communicates with these dependencies, internal or external, the time spent waiting for a response can impact the performance of your application and your customers’ experience. Measuring and optimizing these response times can help resolve such bottlenecks.&lt;/p&gt;

&lt;p&gt;We’ve tracked the duration of our queries to the Postgres database and are depicting them in a line/stat graph like so:&lt;/p&gt;

&lt;figure&gt;&lt;img class="size-large wp-image-213883" src="/images/legacy-uploads/Screen-Shot-2018-03-13-at-10.27.25-PM-1024x489.png" alt="Screenshot of Chronograf dashboard depicting Postgres Query Duration (ms)" width="1024" height="489" /&gt;&lt;figcaption&gt;Postgres Query Duration (ms)&lt;/figcaption&gt;&lt;/figure&gt;
&lt;h3&gt;Summary&lt;/h3&gt;
&lt;p&gt;Once you pull everything together, you have a full dashboard at your disposal monitoring the health of your Node.js application:&lt;/p&gt;

&lt;figure&gt;&lt;img class="size-large wp-image-213885" src="/images/legacy-uploads/Screen-Shot-2018-03-13-at-10.34.31-PM-1024x546.png" alt="Screenshot of fully finished Chronograf Dashboard" width="1024" height="546" /&gt;&lt;figcaption&gt;Success!&lt;/figcaption&gt;&lt;/figure&gt;

&lt;p&gt;That just about sums it up for this post. I’d love to hear how you’re instrumenting your Node.js applications, and how you’re visualizing your metrics and events! Thanks for coming along on this journey and feel free to reach out to me via &lt;a href="mailto:margo@influxdata.com"&gt;margo@influxdata.com&lt;/a&gt;, or on &lt;a href="https://twitter.com/mschae16" target="_blank" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; with any questions and/or comments. Happy dashboarding!&lt;/p&gt;
</description>
      <pubDate>Wed, 14 Mar 2018 07:30:58 -0700</pubDate>
      <link>https://www.influxdata.com/blog/instrumenting-your-node-express-application-viewing-your-data/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/instrumenting-your-node-express-application-viewing-your-data/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Margo Schaedel (InfluxData)</author>
    </item>
  </channel>
</rss>
