<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>InfluxData Blog - Stuart Carnie</title>
    <description>Posts by Stuart Carnie on the InfluxData Blog</description>
    <link>https://www.influxdata.com/blog/author/stuart-carnie/</link>
    <language>en-us</language>
    <lastBuildDate>Thu, 31 May 2018 11:50:29 -0700</lastBuildDate>
    <pubDate>Thu, 31 May 2018 11:50:29 -0700</pubDate>
    <ttl>1800</ttl>
    <item>
      <title>Schema Queries in Flux (formerly IFQL)</title>
      <description>&lt;p&gt;InfluxQL facilitates schema exploration via a number of meta queries, which include &lt;code class="language-markup"&gt;SHOW MEASUREMENTS&lt;/code&gt;, &lt;code class="language-markup"&gt;SHOW TAG KEYS&lt;/code&gt;, &lt;code class="language-markup"&gt;SHOW TAG VALUES&lt;/code&gt; and &lt;code class="language-markup"&gt;SHOW FIELD KEYS&lt;/code&gt;. Flux (formerly IFQL) has unified these concepts, such that a schema is made up of tag keys and values. This unification gives users greater flexibility to explore a schema, as we will show in the remainder of this post.&lt;/p&gt;
&lt;h2&gt;InfluxQL → Flux (formerly IFQL)&lt;/h2&gt;
&lt;p&gt;This section demonstrates translations of InfluxQL meta queries to their Flux equivalents.&lt;/p&gt;
&lt;h3&gt;&lt;code class="language-markup"&gt;SHOW MEASUREMENTS&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Measurement names are aggregated within the &lt;code class="language-markup"&gt;_measurement&lt;/code&gt; tag key. Therefore, we want to ask Flux to give us the distinct values for &lt;code class="language-markup"&gt;_measurement&lt;/code&gt;.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;from(db:"foo")
  |&amp;gt; range(start:-24h)
  |&amp;gt; group(by:["_measurement"])
  |&amp;gt; distinct(column:"_measurement")
  |&amp;gt; group(none:true)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Typing this all the time may become tedious, so we can write a helper function that queries a specified database for the last 24 hours as follows:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;showMeasurements = (db) =&amp;gt; from(db:db) 
  |&amp;gt; range(start:-24h)
  |&amp;gt; group(by:["_measurement"])
  |&amp;gt; distinct(column:"_measurement")
  |&amp;gt; group(none:true)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Resulting in a greatly simplified query to show measurements:&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;showMeasurements(db:"foo")&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We plan to formalize a number of helper functions for querying metadata in the near future.&lt;/p&gt;

&lt;p&gt;InfluxQL meta queries cannot be restricted by time, so it is worth calling out the use of &lt;code class="language-markup"&gt;range&lt;/code&gt; to restrict the results to only those measurements with data in the last 24 hours.&lt;/p&gt;
&lt;h3 class="line-numbers"&gt;&lt;code class="language-markup"&gt;SHOW TAG KEYS FROM cpu&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;List all the tag keys for the measurement &lt;code class="language-markup"&gt;cpu&lt;/code&gt;.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;from(db:"foo")
    |&amp;gt; range(start:-24h)
    |&amp;gt; filter(fn:(r) =&amp;gt; r._measurement == "cpu")
    |&amp;gt; keys()&lt;/code&gt;&lt;/pre&gt;
&lt;h3 class="line-numbers"&gt;&lt;code class="language-markup"&gt;SHOW TAG VALUES FROM cpu WITH KEY = "host"&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;List all the distinct values for a specific tag (&lt;code class="language-markup"&gt;host&lt;/code&gt;) in measurement &lt;code class="language-markup"&gt;cpu&lt;/code&gt;.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;from(db:"foo")
    |&amp;gt; range(start:-24h)
    |&amp;gt; filter(fn:(r) =&amp;gt; r._measurement == "cpu")
    |&amp;gt; group(by:["host"])
    |&amp;gt; distinct(column:"host")
    |&amp;gt; group(none:true)&lt;/code&gt;&lt;/pre&gt;
&lt;h3 class="line-numbers"&gt;&lt;code class="language-markup"&gt;SHOW FIELD KEYS FROM cpu&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;List all the fields for measurement &lt;code class="language-markup"&gt;cpu&lt;/code&gt;, which are aggregated under the &lt;code class="language-markup"&gt;_field&lt;/code&gt; tag key.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;from(db:"foo")
    |&amp;gt; range(start:-24h)
    |&amp;gt; filter(fn:(r) =&amp;gt; r._measurement == "cpu")
    |&amp;gt; group(by:["_field"])
    |&amp;gt; distinct(column:"_field")
    |&amp;gt; group(none:true)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The astute reader will notice this is the same query as the &lt;code class="language-markup"&gt;SHOW TAG VALUES&lt;/code&gt; example, with the exception of using the &lt;code class="language-markup"&gt;_field&lt;/code&gt; tag key.&lt;/p&gt;
&lt;h2&gt;Exploring Schema&lt;/h2&gt;
&lt;p&gt;In this section, we walk through a series of queries a user might perform as they explore their schema.&lt;/p&gt;
&lt;h3&gt;1. Show the available tag keys&lt;/h3&gt;
&lt;p&gt;This query uses the &lt;code class="language-markup"&gt;keys&lt;/code&gt; function to list the distinct tag keys.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;from(db:"foo")
    |&amp;gt; range(start:-1h)
    |&amp;gt; group(none:true)
    |&amp;gt; keys(except:["_time","_value","_start","_stop"])&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Example Output:&lt;/strong&gt;&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;_value:string
-------------
_field
_measurement
bank
dir
host
id
interface
tag0
tag1
tag2
tag3&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;2. Expand the &lt;code class="language-markup"&gt;host&lt;/code&gt; tag to see available values&lt;/h3&gt;
&lt;p&gt;This query groups the data by the &lt;code class="language-markup"&gt;host&lt;/code&gt; tag and outputs the distinct values for the host column.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;from(db:"foo")
    |&amp;gt; range(start:-1h)
    |&amp;gt; group(by:["host"])
    |&amp;gt; distinct(column:"host")
    |&amp;gt; group(none:true)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Example Output:&lt;/strong&gt;&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;_value:string
-------------
host2
host1&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;3. Expand &lt;code class="language-markup"&gt;host1&lt;/code&gt; tag to see available keys&lt;/h3&gt;
&lt;p&gt;This query filters by &lt;code class="language-markup"&gt;host == "host1"&lt;/code&gt; and shows the subset of available keys.&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;from(db:"foo")
    |&amp;gt; range(start:-1h)
    |&amp;gt; filter(fn:(r) =&amp;gt; r.host == "host1")
    |&amp;gt; group(none:true)
    |&amp;gt; keys(except:["_time","_value","_start","_stop", "host"]) // &amp;lt;- note host is added here, since we're already filtering on it&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Example Output:&lt;/strong&gt;&lt;/p&gt;
&lt;pre class="line-numbers"&gt;&lt;code class="language-markup"&gt;_value:string
-------------
_field
_measurement
bank
dir
id
interface&lt;/code&gt;&lt;/pre&gt;
</description>
      <pubDate>Thu, 31 May 2018 11:50:29 -0700</pubDate>
      <link>https://www.influxdata.com/blog/schema-queries-in-ifql</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/schema-queries-in-ifql</guid>
      <category>Use Cases</category>
      <category>Developer</category>
      <category>Product</category>
      <author>Stuart Carnie (InfluxData)</author>
    </item>
    <item>
      <title>InfluxData is Building a Fast Implementation of Apache Arrow in Go Using c2goasm and SIMD</title>
      <description>&lt;p&gt;InfluxData is pleased to announce our contribution to the &lt;a href="https://arrow.apache.org/" rel="nofollow"&gt;Apache Arrow&lt;/a&gt; project. Essentially, we are contributing work that we already started: the development of a Go implementation of Apache Arrow. We believe in open source and are committed to participating in and contributing to the open source community in meaningful ways. We developed an interest in Apache Arrow for a number of reasons which we describe in more detail below, and contributing our initial efforts to the Apache Software Foundation ensures that the community maintains the focus within that repository.&lt;/p&gt;

&lt;p&gt;Apache Arrow specifies a standardized, language-independent, columnar memory format for flat and hierarchical data that is organized for efficient, analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and inter-process communication. &lt;a href="https://www.influxdata.com/blog/ifql-and-the-future-of-influxdata/"&gt;As we have been working on developing a new query processing engine&lt;/a&gt; and language for InfluxDB, currently known as &lt;a href="https://github.com/influxdata/flux" target="_blank" rel="noopener"&gt;Flux f.k.a. IFQL&lt;/a&gt;, Arrow provides a superior way to exchange data between the database and the query processing engine, while also giving InfluxData an additional means to participate in a broader ecosystem of data processing and analysis tools.&lt;/p&gt;
&lt;h2&gt;Why Arrow?&lt;/h2&gt;
&lt;p&gt;One of many goals for Flux f.k.a. IFQL is to enable new ways to efficiently query and analyze your data using industry-standard tools. One such example is &lt;em&gt;&lt;a href="https://pandas.pydata.org/" rel="nofollow"&gt;pandas&lt;/a&gt;&lt;/em&gt;, an open source library that provides advanced features for data analytics and visualization. Another is &lt;a href="https://spark.apache.org/" rel="nofollow"&gt;Apache Spark&lt;/a&gt;, a scalable data processing engine. We discovered that these and many other open source projects, as well as commercial software offerings, are adopting Apache Arrow to address the challenge of sharing columnar data efficiently. The Apache Arrow mission statement defines a number of goals that resonated with the team at InfluxData:&lt;/p&gt;
&lt;blockquote&gt;Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.&lt;/blockquote&gt;
&lt;p&gt;Specifically:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;&lt;strong&gt;Standardized&lt;/strong&gt;: Many projects in the data science and analytics space are adopting Arrow as it addresses a common set of design problems including how to efficiently exchange large data sets. Examples of early adopters include &lt;em&gt;pandas&lt;/em&gt; and Spark, and the list &lt;a href="http://arrow.apache.org/powered_by/" rel="nofollow"&gt;continues to grow&lt;/a&gt;.&lt;/li&gt;
 	&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;: The specification is clear that performance is the &lt;em&gt;raison d'être&lt;/em&gt;. Arrow data structures are designed to work efficiently on modern processors, enabling the use of features like single-instruction, multiple-data (SIMD).&lt;/li&gt;
 	&lt;li&gt;&lt;strong&gt;Language-Independent&lt;/strong&gt;: Mature libraries exist for C/C++, Python, Java and JavaScript, with libraries for Ruby and Go in active development. More libraries mean more ways to work with your data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We also recognize Apache Arrow as an opportunity to participate and contribute to a community that will face similar challenges. &lt;em&gt;A problem shared is a problem halved.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;&lt;a id="user-content-apache-arrow-at-influxdata" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#apache-arrow-at-influxdata"&gt;&lt;/a&gt;Apache Arrow at InfluxData&lt;/h2&gt;
&lt;p&gt;We have identified a few areas where InfluxDB will benefit from Apache Arrow:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;represent in-memory TSM columnar data,&lt;/li&gt;
 	&lt;li&gt;perform aggregations using SIMD math kernels and&lt;/li&gt;
 	&lt;li&gt;the data communication protocol between InfluxDB and Flux f.k.a. IFQL.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For Flux f.k.a. IFQL:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;represent block data structures,&lt;/li&gt;
 	&lt;li&gt;perform aggregations using SIMD math kernels and&lt;/li&gt;
 	&lt;li&gt;the primary communication protocol to clients.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the future, we expect that a user could create a &lt;a href="http://jupyter.org/" rel="nofollow"&gt;Jupyter Notebook&lt;/a&gt;, execute a Flux f.k.a. IFQL query in Python and manipulate the data efficiently in &lt;em&gt;pandas&lt;/em&gt;, with little overhead.&lt;/p&gt;
&lt;h2&gt;&lt;a id="user-content-apache-arrow-in-go" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#apache-arrow-in-go"&gt;&lt;/a&gt;Apache Arrow in Go&lt;/h2&gt;
&lt;p&gt;At the time of writing, the Go implementation has support for the following features:&lt;/p&gt;
&lt;h3&gt;&lt;a id="user-content-memory-management" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#memory-management"&gt;&lt;/a&gt;Memory Management&lt;/h3&gt;
&lt;ul&gt;
 	&lt;li&gt;Allocations are 64-byte aligned and padded to 8 bytes&lt;/li&gt;
&lt;/ul&gt;
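As a rough illustration of that alignment rule, the padding arithmetic looks like the following sketch (the helper name is ours, not the library's):

```go
package main

import "fmt"

// roundUp pads n up to the next multiple of m, where m is a power of two.
// An Arrow-style allocator uses this kind of arithmetic to keep buffer
// lengths padded (e.g. to 8 bytes) and starting addresses aligned (e.g. to 64).
func roundUp(n, m int) int {
	return (n + m - 1) &^ (m - 1)
}

func main() {
	fmt.Println(roundUp(100, 8))  // 104: length padded to 8 bytes
	fmt.Println(roundUp(100, 64)) // 128: padded to a 64-byte boundary
}
```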
&lt;h3&gt;&lt;a id="user-content-array-and-builder-support" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#array-and-builder-support"&gt;&lt;/a&gt;Array and Builder Support&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive Types&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;Signed and unsigned 8, 16, 32 and 64 bit integers&lt;/li&gt;
 	&lt;li&gt;32 and 64 bit floats&lt;/li&gt;
 	&lt;li&gt;Packed LSB booleans&lt;/li&gt;
 	&lt;li&gt;Variable-length binary arrays&lt;/li&gt;
&lt;/ul&gt;
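The "packed LSB booleans" entry refers to the Arrow bitmap layout, where each boolean occupies a single bit, least-significant bit first. A minimal sketch of the indexing (the helper names are ours, not the library's):

```go
package main

import "fmt"

// setBit and getBit address value i at bit (i % 8) of byte (i / 8),
// matching an LSB-first packed boolean layout.
func setBit(buf []byte, i int)      { buf[i/8] |= 1 << uint(i%8) }
func getBit(buf []byte, i int) bool { return buf[i/8]&(1<<uint(i%8)) != 0 }

func main() {
	buf := make([]byte, 2) // room for 16 booleans
	setBit(buf, 0)
	setBit(buf, 9)
	fmt.Println(getBit(buf, 0), getBit(buf, 1), getBit(buf, 9)) // true false true
}
```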
&lt;p&gt;&lt;strong&gt;Parametric Types&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;Timestamp&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;a id="user-content-type-metadata" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#type-metadata"&gt;&lt;/a&gt;Type Metadata&lt;/h3&gt;
&lt;ul&gt;
 	&lt;li&gt;Data types&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;a id="user-content-simd-math-kernels" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#simd-math-kernels"&gt;&lt;/a&gt;SIMD Math Kernels&lt;/h3&gt;
&lt;ul&gt;
 	&lt;li&gt;SIMD optimized &lt;code class="language-markup"&gt;Sum&lt;/code&gt; operations for 64-bit float, int and unsigned int arrays&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;a id="user-content-simd-your-go-with-no-assembly-required-using-this-one-weird-trick" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#simd-your-go-with-no-assembly-required-using-this-one-weird-trick"&gt;&lt;/a&gt;SIMD Your Go with No Assembly Required, Using This One Weird Trick!&lt;/h3&gt;
&lt;p&gt;Before we share the magic, let’s delve a little deeper into why SIMD, or single-instruction, multiple-data, is relevant. It is no accident that most data structures in Apache Arrow occupy contiguous blocks of memory as arrays or vectors. Using special instructions, many of today’s CPUs can process tightly packed data like this in parallel, improving the performance of specific algorithms and operations. Even better, compilers are built with a host of advanced optimizations, such as &lt;a href="https://en.wikipedia.org/wiki/Automatic_vectorization" rel="nofollow"&gt;auto vectorization&lt;/a&gt;, to take advantage of these features without the developer having to write any assembly. During compilation, the compiler may identify loops that process arrays as candidates for auto vectorization, and generate more efficient machine code utilizing SIMD instructions. Alas, the Go compiler lacks these optimizations, leaving us to fend for ourselves. We could write these routines in assembly, but that is hard enough without having to use Go’s esoteric Plan 9 syntax. To make matters worse, in order to write optimal code in assembly for a specific architecture, you must be familiar with other issues like instruction scheduling, data dependencies, &lt;a href="https://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties" rel="nofollow"&gt;AVX-SSE transition penalties&lt;/a&gt; and more.&lt;/p&gt;
&lt;h3&gt;&lt;a id="user-content-clang--c2goasm--?" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#clang--c2goasm--%EF%B8%8F"&gt;&lt;/a&gt;clang + c2goasm = ??&lt;/h3&gt;
&lt;p&gt;&lt;code class="language-markup"&gt;c2goasm&lt;/code&gt;, developed by minio, is an awesome command-line tool that transforms the assembly output of functions written in C/C++ into something the Go Plan 9 assembler will understand. These are &lt;em&gt;not&lt;/em&gt; the same as CGO and just as efficient to call as any other Go function. A caveat of routines written in Go assembly is they cannot be inlined, so it is important they do enough work to negate the overhead of the function call. The examples in the release announcement make use of intrinsics, which are compiler extensions that provide access to processor-specific features like wide data types (&lt;code class="language-markup"&gt;__m256&lt;/code&gt;) and functions that map to processor instructions (&lt;code class="language-markup"&gt;_mm256_load_ps&lt;/code&gt;). Using intrinsics vs. writing pure assembly allows the developer to mix high-level C code with low-level processor features, whilst still allowing the compiler to perform a limited set of optimizations.&lt;/p&gt;

&lt;p&gt;Our &lt;a href="https://github.com/stuartcarnie/go-simd"&gt;first experiment&lt;/a&gt; was to take a Go function that summed a slice of 64-bit floats and determine if we could improve it with c2goasm. We benchmarked 1,000 element slices, as they match the maximum size of a TSM block in InfluxDB. The benchmarks were collected on an early 2017 MacBook Pro running at 2.9 GHz.&lt;/p&gt;

&lt;p&gt;The reference implementation in Go ran at 1200 ns/op or 6,664 MB/s:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;func SumFloat64(buf []float64) float64 {
	acc := float64(0)
	for i := range buf {
		acc += buf[i]
	}
	return acc
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Following a similar approach to the c2goasm post, we used AVX2 intrinsics to produce this abridged implementation:&lt;/p&gt;
&lt;div class="highlight highlight-source-c"&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;void sum_float64_avx_intrinsics(double buf[], size_t len, double *res) {
    __m256d acc = _mm256_set1_pd(0);
    for (int i = 0; i &amp;lt; len; i += 4) {
        __m256d v = _mm256_load_pd(&amp;amp;buf[i]);
        acc = _mm256_add_pd(acc, v);
    }

    acc = _mm256_hadd_pd(acc, acc); // a[0] = a[0] + a[1], a[2] = a[2] + a[3]
    *res = _mm256_cvtsd_f64(acc) + _mm_cvtsd_f64(_mm256_extractf128_pd(acc, 1));
}&lt;/code&gt;&lt;/pre&gt;
This version summed the 1,000 64-bit floats (&lt;code class="language-markup"&gt;double&lt;/code&gt; in C) at 255 ns/op, or a rate of 31,369 MB/s &amp;ndash; a handy 4.7× improvement. Intel x86 AVX2 intrinsics are a specific set of extensions for working with 256 bits of data, or 4×64-bit float values, using single instructions. There is quite a bit going on here, so let's summarize what the code does:

&lt;/div&gt;
&lt;div class="highlight highlight-source-c"&gt;
&lt;ul&gt;
 	&lt;li&gt;initialize the accumulator &lt;code class="language-markup"&gt;acc&lt;/code&gt;, a data type representing 4×64-bit float elements, to &lt;code class="language-markup"&gt;0&lt;/code&gt;&lt;/li&gt;
 	&lt;li&gt;for each iteration:
&lt;ul&gt;
 	&lt;li&gt;load the next 4 64-bit float elements from &lt;code class="language-markup"&gt;buf&lt;/code&gt; into &lt;code class="language-markup"&gt;v&lt;/code&gt;&lt;/li&gt;
 	&lt;li&gt;add the corresponding elements of &lt;code class="language-markup"&gt;v&lt;/code&gt; to &lt;code class="language-markup"&gt;acc&lt;/code&gt;, i.e. &lt;code class="language-markup"&gt;acc[0] += v[0]&lt;/code&gt;, &lt;code class="language-markup"&gt;acc[1] += v[1]&lt;/code&gt;, &lt;code class="language-markup"&gt;acc[2] += v[2]&lt;/code&gt; and &lt;code class="language-markup"&gt;acc[3] += v[3]&lt;/code&gt;&lt;/li&gt;
 	&lt;li&gt;if more elements remain in &lt;code class="language-markup"&gt;buf&lt;/code&gt;, increment by 4 and restart the loop&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
 	&lt;li&gt;sum &lt;code class="language-markup"&gt;acc[0]+acc[1]+acc[2]+acc[3]&lt;/code&gt;&lt;/li&gt;
 	&lt;li&gt;convert to a double and return the value&lt;/li&gt;
&lt;/ul&gt;
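The same data flow can be mirrored in scalar Go, which may make the lane-wise structure easier to follow (a sketch for illustration only, not the actual kernel):

```go
package main

import "fmt"

// sum4Lanes mimics the AVX2 kernel in scalar Go: four running partial sums
// (one per "lane"), followed by a horizontal reduction at the end.
// Like the intrinsics version, it assumes len(buf) is a multiple of 4.
func sum4Lanes(buf []float64) float64 {
	var acc [4]float64
	for i := 0; i < len(buf); i += 4 {
		acc[0] += buf[i]
		acc[1] += buf[i+1]
		acc[2] += buf[i+2]
		acc[3] += buf[i+3]
	}
	// Horizontal reduction, standing in for hadd/extract.
	return acc[0] + acc[1] + acc[2] + acc[3]
}

func main() {
	fmt.Println(sum4Lanes([]float64{1, 2, 3, 4, 5, 6, 7, 8})) // 36
}
```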
It is worth noting that there are a couple of cons to using intrinsics:
&lt;ul&gt;
 	&lt;li&gt;the cognitive load required to understand this function is much higher than the Go or plain C version&lt;/li&gt;
 	&lt;li&gt;we'll need a separate implementation using SSE4 intrinsics, as calling this function on a machine that does not support AVX2 extensions will crash with an illegal-instruction fault&lt;/li&gt;
&lt;/ul&gt;
There are situations where using intrinsics or writing assembly is the best option, but for a simple loop like this, we decided to explore an alternative. Earlier, we mentioned auto-vectorization, so let's see what an optimizing compiler can do with a plain C version, similar to the Go version:
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;void sum_float64(double buf[], int len, double *res) {
    double acc = 0.0;
    for(int i = 0; i &amp;lt; len; i++) {
        acc += buf[i];
    }
    *res = acc;
}&lt;/code&gt;&lt;/pre&gt;
1,000 floats summed in just 58 ns, or a rate of 137 GB/s&lt;sup&gt;1&lt;/sup&gt;. Not too shabby, considering all we did was specify a few compiler flags to enable optimizations such as loop vectorization and loop unrolling, and to generate AVX2 instructions. By writing a portable C/C++ version, we can generate an SSE4 version or target a completely different architecture like ARM64 with only minor alterations to the compiler flags; a benefit that cannot be overstated.

&lt;sup&gt;1 &lt;/sup&gt;According to the specs for the &lt;a href="https://ark.intel.com/products/88972/Intel-Core-i7-6920HQ-Processor-8M-Cache-up-to-3_80-GHz" rel="nofollow"&gt;Intel Core i7 6920HQ&lt;/a&gt;, it has a maximum memory bandwidth of 34.1 GB/s. 137 GB/s is well above this number, so what is going on? &lt;em&gt;Caching&lt;/em&gt;. We can attribute the blazing speed to the data residing in one of the processor caches. Therefore, &lt;em&gt;the sooner you operate on data read from main memory, the more likely you will benefit from caching&lt;/em&gt;.
&lt;h3&gt;&lt;a id="user-content-automating-the-code-generation" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#automating-the-code-generation"&gt;&lt;/a&gt;Automating the Code Generation&lt;/h3&gt;
There are a few steps required to go from the C source to the final Go assembly:
&lt;ol&gt;
 	&lt;li&gt;execute &lt;code class="language-markup"&gt;clang&lt;/code&gt; with the correct compiler flags to generate the base assembly, producing &lt;code class="language-markup"&gt;foo_ARCH.s&lt;/code&gt;&lt;/li&gt;
 	&lt;li&gt;execute &lt;code class="language-markup"&gt;c2goasm&lt;/code&gt; to transform &lt;code class="language-markup"&gt;foo_ARCH.s&lt;/code&gt; into Go assembly&lt;/li&gt;
 	&lt;li&gt;repeat 1 and 2 for each target architecture (e.g. SSE4, AVX2 or ARM64)&lt;/li&gt;
&lt;/ol&gt;
If "A" changes, build "B"; if "B" changes, build "C". Sounds like a job for &lt;code class="language-markup"&gt;make&lt;/code&gt; and that is exactly what we did. Any time we update the C source, we simply run &lt;code class="language-markup"&gt;make generate&lt;/code&gt; to update the dependent files. We also check the generated assembly files in to the repository to ensure the Go package is &lt;em&gt;go gettable&lt;/em&gt;.
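A sketch of what such a Makefile might look like (the file names and compiler flags here are illustrative, not the project's actual build rules):

```makefile
# Illustrative only: regenerate the Go assembly whenever the C source changes.
_lib/sum_float64_avx2.s: _lib/sum_float64.c
	clang -O3 -mavx2 -masm=intel -S $< -o $@

sum_float64_avx2_amd64.s: _lib/sum_float64_avx2.s
	c2goasm -a $< $@

generate: sum_float64_avx2_amd64.s
.PHONY: generate
```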
&lt;h3&gt;&lt;a id="user-content-using-these-optimizations-in-go" class="anchor" href="https://gist.github.com/stuartcarnie/8e2b6ee117d320c4e5045deb947ba824#using-these-optimizations-in-go"&gt;&lt;/a&gt;Using These Optimizations in Go&lt;/h3&gt;
If the AVX2 version of the function is called on a processor that does not support these extensions, your program will crash, which isn't ideal. The solution is to determine which processor features are available at runtime and call the appropriate function, falling back to the pure Go version if necessary. The Go runtime does this in a number of places using the &lt;a href="https://github.com/golang/go/tree/master/src/internal/cpu"&gt;internal/cpu package&lt;/a&gt;, and we took a similar approach with some improvements. At startup, the most efficient functions are selected based on available processor features; however, if an environment variable named &lt;code class="language-markup"&gt;INTEL_DISABLE_EXT&lt;/code&gt; is present, the specified optimizations are disabled. If this is of interest to you, we've documented the feature in the repository. For example, to disable AVX2 and use the next best set of features for a hypothetical application, &lt;code class="language-markup"&gt;myapp&lt;/code&gt;:
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;$ INTEL_DISABLE_EXT=AVX2 myapp&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;There is still plenty of work to be done to reach feature parity with the C++ implementation of Apache Arrow and we look forward to sharing our future contributions.&lt;/p&gt;
</description>
      <pubDate>Thu, 22 Mar 2018 03:36:31 -0700</pubDate>
      <link>https://www.influxdata.com/blog/influxdata-apache-arrow-go-implementation</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/influxdata-apache-arrow-go-implementation</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <category>Company</category>
      <author>Stuart Carnie (InfluxData)</author>
    </item>
    <item>
      <title>Logging Improvements for InfluxDB 1.5.0</title>
      <description>&lt;p&gt;When it comes to troubleshooting issues, log files are a high-value asset. If things go wrong, you’ll almost always be asked to “send the logs”. InfluxDB 1.5 comes with a number of improvements to logging in an effort to simplify the task of analyzing this data.&lt;/p&gt;
&lt;h2&gt;Logging&lt;/h2&gt;
&lt;p&gt;InfluxDB generates a lot of log output that chronicles many aspects of its internal operation. This latest update has revamped how the database logs key information and the format of the log output. The primary goal of these changes is to enable tooling which can efficiently parse and analyze the log data and reduce the time required to diagnose issues.&lt;/p&gt;
&lt;h2&gt;Structured Logging&lt;/h2&gt;
&lt;p&gt;InfluxDB 1.5 can generate structured logs in either &lt;a href="https://brandur.org/logfmt"&gt;logfmt&lt;/a&gt; or &lt;a href="http://jsonlines.org/"&gt;JSON lines&lt;/a&gt;. This feature is found in a new section of the configuration file titled &lt;code class="language-markup"&gt;[logging]&lt;/code&gt;. The default log format is &lt;code class="language-markup"&gt;auto&lt;/code&gt;, which determines the output based on whether &lt;code class="language-markup"&gt;stderr&lt;/code&gt; refers to a terminal (TTY) or not. If &lt;code class="language-markup"&gt;stderr&lt;/code&gt; is a TTY, a less verbose “console” format is selected; otherwise, the output will be &lt;code class="language-markup"&gt;logfmt&lt;/code&gt;. If you would prefer consistent output, we would encourage you to explicitly set the format to &lt;code class="language-markup"&gt;logfmt&lt;/code&gt;.&lt;/p&gt;
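For example, a minimal `[logging]` section pinning the format might look like this (a sketch; consult the sample configuration file for the full set of options):

```toml
[logging]
  # Always emit machine-parseable logfmt, even when stderr is a TTY.
  format = "logfmt"
  level = "info"
```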

&lt;p&gt;The value of structured logging is greatly improved when specific elements of a log event can be extracted easily. With that in mind, the existing logging code was reviewed to ensure notable data was moved to separate keys. The following example shows a few startup events related to opening files in version 1.3. Note some entries include the path and duration embedded in the message:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;[I] 2018-03-03T00:46:48Z /Users/stuartcarnie/.influxdb/data/db/autogen/510/000000001-000000001.tsm (#0) opened in 280.401827ms engine=tsm1 service=filestore
[I] 2018-03-03T00:46:48Z reading file /Users/stuartcarnie/.influxdb/wal/db/autogen/510/_00001.wal, size 15276 engine=tsm1 service=cacheloader
[I] 2018-03-03T00:46:54Z /Users/stuartcarnie/.influxdb/data/db/autogen/510 opened in 5.709152422s service=store&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As of 1.5, the same events using &lt;code class="language-markup"&gt;logfmt&lt;/code&gt; now look like:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;ts=2018-03-03T00:48:57.246371Z lvl=info msg="Opened file" log_id=06bLAgTG000 engine=tsm1 service=filestore path=/Users/stuartcarnie/.influxdb/data/db/autogen/510/000000001-000000001.tsm id=0 duration=61.736ms
ts=2018-03-03T00:48:57.246590Z lvl=info msg="Reading file" log_id=06bLAgTG000 engine=tsm1 service=cacheloader path=/Users/stuartcarnie/.influxdb/wal/db/autogen/510/_00001.wal size=15276
ts=2018-03-03T00:49:01.023313Z lvl=info msg="Opened shard" log_id=06bLAgTG000 service=store trace_id=06bLAgv0000 op_name=tsdb_open path=/Users/stuartcarnie/.influxdb/data/db/autogen/510 duration=3841.275ms&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Take note that &lt;code class="language-markup"&gt;path&lt;/code&gt; and &lt;code class="language-markup"&gt;duration&lt;/code&gt; are now separate keys. Using a tool like &lt;a href="https://blog.heroku.com/hutils-explore-your-structured-data-logs"&gt;lcut&lt;/a&gt;, we can select specific keys (&lt;code class="language-markup"&gt;ts&lt;/code&gt;, &lt;code class="language-markup"&gt;msg&lt;/code&gt;, &lt;code class="language-markup"&gt;path&lt;/code&gt; and &lt;code class="language-markup"&gt;duration&lt;/code&gt;) to reformat the output:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;2018-03-03T00:48:57.246371Z  Opened   file   /Users/stuartcarnie/.influxdb/data/db/autogen/510/000000001-000000001.tsm  61.736ms
2018-03-03T00:48:57.246590Z  Reading  file   /Users/stuartcarnie/.influxdb/wal/db/autogen/510/_00001.wal
2018-03-03T00:49:01.023313Z  Opened   shard  /Users/stuartcarnie/.influxdb/data/db/autogen/510                          3841.275ms&lt;/code&gt;&lt;/pre&gt;
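Extracting keys like this is also straightforward to script yourself; below is a simplified logfmt field parser in Go (a sketch that handles quoted values but not escape sequences):

```go
package main

import (
	"fmt"
	"strings"
)

// logfmtFields does a simplified logfmt parse: space-separated key=value
// pairs, where double-quoted values may contain spaces. A real parser
// would also handle escape sequences.
func logfmtFields(line string) map[string]string {
	fields := map[string]string{}
	for len(line) > 0 {
		line = strings.TrimLeft(line, " ")
		eq := strings.IndexByte(line, '=')
		if eq < 0 {
			break
		}
		key := line[:eq]
		rest := line[eq+1:]
		var val string
		switch {
		case strings.HasPrefix(rest, `"`):
			if end := strings.IndexByte(rest[1:], '"'); end >= 0 {
				val, rest = rest[1:1+end], rest[end+2:]
			} else {
				val, rest = rest[1:], ""
			}
		default:
			if sp := strings.IndexByte(rest, ' '); sp >= 0 {
				val, rest = rest[:sp], rest[sp:]
			} else {
				val, rest = rest, ""
			}
		}
		fields[key] = val
		line = rest
	}
	return fields
}

func main() {
	line := `ts=2018-03-03T00:48:57.246371Z lvl=info msg="Opened file" path=/a/b.tsm duration=61.736ms`
	f := logfmtFields(line)
	fmt.Println(f["ts"], f["msg"], f["path"], f["duration"])
}
```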
&lt;h2&gt;Collating Log Events&lt;/h2&gt;
&lt;p&gt;During the lifetime of an InfluxDB process, operations such as compactions run continuously and generate multiple events as they advance. To further complicate matters, when these operations run concurrently, the events from each will be interleaved together. Determining the outcome of a specific compaction is practically impossible, as there is no way to attribute interleaved log events to the operation that produced them. To address this issue, InfluxDB 1.5 adds the keys listed in the following table:&lt;/p&gt;
&lt;table width="683"&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;key&lt;/strong&gt;&lt;/td&gt;
&lt;td width="581"&gt;&lt;strong&gt;comment&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code class="language-markup"&gt;trace_id&lt;/code&gt;&lt;/td&gt;
&lt;td width="581"&gt;A unique value associated with each log event for a single run of an operation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code class="language-markup"&gt;op_name&lt;/code&gt;&lt;/td&gt;
&lt;td width="581"&gt;A searchable identifier, such as  &lt;code class="language-markup"&gt;tsm1.compact_group&lt;/code&gt; assigned to each operation type&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code class="language-markup"&gt;op_event&lt;/code&gt;&lt;/td&gt;
&lt;td width="581"&gt;Given a value of &lt;code class="language-markup"&gt;start&lt;/code&gt; or &lt;code class="language-markup"&gt;end&lt;/code&gt; to indicate whether this is the first or last event of the operation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code class="language-markup"&gt;op_elapsed&lt;/code&gt;&lt;/td&gt;
&lt;td width="581"&gt;The time an operation took to complete, in milliseconds. This key is always included with the &lt;code class="language-markup"&gt;op_event=end&lt;/code&gt; log event&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;To demonstrate how these keys might be used, we’ll use the following (abridged) file named “influxd.log”, which includes at least two compactions running concurrently.&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;msg="TSM compaction (start)" trace_id=06avQESl000 op_name=tsm1_compact_group op_event=start
msg="Beginning compaction" trace_id=06avQESl000 op_name=tsm1_compact_group tsm1_files_n=2
msg="Compacting file" trace_id=06avQESl000 op_name=tsm1_compact_group tsm1_index=0 tsm1_file=/influxdb/data/db1/rp/10/000000859-000000002.tsm
msg="Compacting file" trace_id=06avQESl000 op_name=tsm1_compact_group tsm1_index=1 tsm1_file=/influxdb/data/db1/rp/10/000000861-000000001.tsm
msg="TSM compaction (start)" trace_id=06avQEZW000 op_name=tsm1_compact_group op_event=start
msg="Beginning compaction" trace_id=06avQEZW000 op_name=tsm1_compact_group tsm1_files_n=2
msg="invalid subscription token" service=subscriber
msg="Post http://kapacitor-rw:9092/write?consistency=&amp;amp;db=_internal&amp;amp;precision=ns&amp;amp;rp=monitor: dial tcp: lookup kapacitor-rw on 10.0.0.1: server misbehaving" service=subscriber
msg="Compacting file" trace_id=06avQEZW000 op_name=tsm1_compact_group tsm1_index=0 tsm1_file=/influxdb/data/db2/rp/12/000000027-000000002.tsm
msg="Compacting file" trace_id=06avQEZW000 op_name=tsm1_compact_group tsm1_index=1 tsm1_file=/influxdb/data/db2/rp/12/000000029-000000001.tsm
msg="Compacted file" trace_id=06avQEZW000 op_name=tsm1_compact_group tsm1_index=0 tsm1_file=/influxdb/data/db2/rp/12/000000029-000000002.tsm.tmp
msg="Finished compacting files" trace_id=06avQEZW000 op_name=tsm1_compact_group tsm1_files_n=1
msg="TSM compaction (end)" trace_id=06avQEZW000 op_name=tsm1_compact_group op_event=end op_elapsed=56.907ms
msg="Compacted file" trace_id=06avQESl000 op_name=tsm1_compact_group tsm1_index=0 tsm1_file=/influxdb/data/db1/rp/10/000000861-000000002.tsm.tmp
msg="Finished compacting files" trace_id=06avQESl000 op_name=tsm1_compact_group tsm1_files_n=1
msg="TSM compaction (end)" trace_id=06avQESl000 op_name=tsm1_compact_group op_event=end op_elapsed=157.739ms&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Compactions are identified by &lt;code class="language-markup"&gt;op_name=tsm1_compact_group&lt;/code&gt;, so to summarize them, we might use the following command to output the trace id and elapsed time:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;$ fgrep 'tsm1_compact_group' influxd.log | fgrep 'op_event=end' | lcut trace_id op_elapsed&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;which can be read as:&lt;/p&gt;
&lt;blockquote&gt;&lt;em&gt;Find all the lines containing the text&lt;/em&gt; &lt;em&gt;&lt;code class="language-markup"&gt;tsm1_compact_group&lt;/code&gt; &lt;/em&gt;and&lt;em&gt; &lt;code class="language-markup"&gt;op_event=end&lt;/code&gt; and display the &lt;code class="language-markup"&gt;trace_id&lt;/code&gt; and &lt;code class="language-markup"&gt;op_elapsed&lt;/code&gt; keys&lt;/em&gt;&lt;/blockquote&gt;
&lt;p&gt;and would produce the following output:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;06avQEZW000	56.907ms
06avQESl000	157.739ms&lt;/code&gt;&lt;/pre&gt;
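&lt;p&gt;The same summary can be produced programmatically. The following Python sketch mirrors the &lt;code class="language-markup"&gt;fgrep | fgrep | lcut&lt;/code&gt; pipeline above; the &lt;code class="language-markup"&gt;summarize&lt;/code&gt; helper is hypothetical, and it assumes (as in our sample) that the keys of interest appear as unquoted &lt;code class="language-markup"&gt;key=value&lt;/code&gt; tokens:&lt;/p&gt;

```python
import re

# Bare key=value tokens are enough for the keys used here; quoted values
# (such as msg) are not needed for this summary.
PAIR = re.compile(r'(\w+)=(\S+)')

def summarize(lines, op_name):
    """List (trace_id, op_elapsed) for every completed run of op_name.

    Equivalent in spirit to:
        fgrep OP_NAME influxd.log | fgrep 'op_event=end' | lcut trace_id op_elapsed
    """
    runs = []
    for line in lines:
        pairs = dict(PAIR.findall(line))
        if pairs.get("op_name") == op_name and pairs.get("op_event") == "end":
            runs.append((pairs["trace_id"], pairs["op_elapsed"]))
    return runs
```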
&lt;p&gt;From here it is easy to filter the logs for trace &lt;code class="language-markup"&gt;06avQESl000&lt;/code&gt; using the following:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;$ fgrep '06avQESl000' influxd.log | lcut msg tsm1_file
TSM compaction (start)
Beginning compaction
Compacting file	/influxdb/data/db1/rp/10/000000859-000000002.tsm
Compacting file	/influxdb/data/db1/rp/10/000000861-000000001.tsm
Compacted file	/influxdb/data/db1/rp/10/000000861-000000002.tsm.tmp
Finished compacting files
TSM compaction (end)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can ask more complex questions, such as:&lt;/p&gt;
&lt;blockquote&gt;&lt;em&gt;What are the top 10 slowest continuous queries?&lt;/em&gt;&lt;/blockquote&gt;
&lt;p&gt;using a command like:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;$ fgrep 'continuous_querier_execute' influxd.log | fgrep 'op_event=end' | lcut trace_id op_elapsed | sort -r -h -k2 | head -10
06eXrSJG000	15007.940ms
06d7Ow3W000	15007.646ms
06axkRVG000	15007.222ms
06ay9170000	15007.118ms
06c9tbwG000	15006.701ms
06dUcXhG000	15006.533ms
06ekMi40000	15006.158ms
06c5FH7l000	15006.145ms
06bDHhkG000	15006.012ms
06a~ioYG000	15005.988ms&lt;/code&gt;&lt;/pre&gt;
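&lt;p&gt;The ranking step can likewise be sketched in Python. This illustrative &lt;code class="language-markup"&gt;slowest&lt;/code&gt; function (our own name, not an InfluxDB utility) plays the role of &lt;code class="language-markup"&gt;sort -r -h -k2 | head&lt;/code&gt;, assuming each &lt;code class="language-markup"&gt;op_event=end&lt;/code&gt; event carries an &lt;code class="language-markup"&gt;op_elapsed&lt;/code&gt; value in milliseconds:&lt;/p&gt;

```python
import re

END = re.compile(r'op_event=end\b')
TRACE = re.compile(r'trace_id=(\S+)')
ELAPSED = re.compile(r'op_elapsed=([0-9.]+)ms')

def slowest(lines, n=10):
    """Rank completed operations by op_elapsed, slowest first."""
    runs = []
    for line in lines:
        elapsed = ELAPSED.search(line)
        if END.search(line) and elapsed:
            trace = TRACE.search(line)
            runs.append((float(elapsed.group(1)),
                         trace.group(1) if trace else ""))
    runs.sort(reverse=True)  # largest elapsed time first
    return [(trace_id, "%.3fms" % ms) for ms, trace_id in runs[:n]]
```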
&lt;p&gt;In summary, structured logging enables log data to be analyzed far more efficiently with off-the-shelf tooling.&lt;/p&gt;
&lt;h2&gt;HTTP Access Logs&lt;/h2&gt;
&lt;p&gt;InfluxDB has long supported the ability to output HTTP request traffic in &lt;a href="https://en.wikipedia.org/wiki/Common_Log_Format"&gt;Common Log Format&lt;/a&gt;. This feature is enabled by setting the &lt;code class="language-markup"&gt;log-enabled&lt;/code&gt; option to &lt;code class="language-markup"&gt;true&lt;/code&gt; in the &lt;code class="language-markup"&gt;[http]&lt;/code&gt; section of the InfluxDB configuration.&lt;/p&gt;

&lt;p&gt;Prior to 1.5, all log output was sent to &lt;code class="language-markup"&gt;stderr&lt;/code&gt; and looked something like the following:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;[I] 2018-03-02T19:59:58Z compacted level 1 4 files into 1 files in 130.832391ms engine=tsm1
[I] 2018-03-02T20:00:09Z SELECT count(v0) FROM db.autogen.m0 WHERE time &amp;gt;= '2018-02-27T01:00:00Z' AND time &amp;lt; '2018-02-27T02:00:00Z' GROUP BY * service=query
[httpd] ::1 - - [02/Mar/2018:13:00:09 -0700] "GET /query?db=db&amp;amp;q=select+count%28v0%29+from+m0+where+time+%3E%3D+%272018-02-27T01%3A00%3A00Z%27+and+time+%3C+%272018-02-27T02%3A00%3A00Z%27+group+by+%2A HTTP/1.1" 200 0 "-" "curl/7.54.0" 4f39378e-1e54-11e8-8001-000000000000 726
[I] 2018-03-02T20:00:57Z retention policy shard deletion check commencing service=retention&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because these log streams were intermingled, additional work was required to separate them before any analysis could begin. The latest release adds a configuration option to write the access log to a separate file. For example, the following configuration writes the access log to a file located at &lt;code class="language-markup"&gt;/var/log/influxd/access.log&lt;/code&gt;:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code class="language-markup"&gt;[http]
  # ...
  log-enabled = true
  access-log-path = "/var/log/influxd/access.log"
  # ...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the access log is written to a file, the &lt;code class="language-markup"&gt;[http]&lt;/code&gt; prefix is stripped so that the file can be parsed by standard HTTP log analysis and monitoring tools without further processing. For example, using &lt;a href="http://lnav.org/"&gt;lnav&lt;/a&gt;, an admin can open an active log file and display the data using a number of different visualizations, including error rates and histograms, as demonstrated in the following &lt;a href="https://asciinema.org/a/thlhpQmHbEOdy4QOGL3zapgLa" target="_blank" rel="noopener noreferrer"&gt;asciinema&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Tue, 06 Mar 2018 18:18:27 -0700</pubDate>
      <link>https://www.influxdata.com/blog/logging-improvements-for-influxdb-1-5-0</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/logging-improvements-for-influxdb-1-5-0</guid>
      <category>Product</category>
      <category>Developer</category>
      <category>Company</category>
      <author>Stuart Carnie (InfluxData)</author>
    </item>
  </channel>
</rss>
