Why We're Building Flux, a New Data Scripting and Query Language

Last month I gave a talk at InfluxDays London about Flux (#fluxlang), the new query and scripting language we’re building for InfluxDB 2.0. One of the more common questions I get when I talk about Flux is why? Why would you go to the trouble of creating a new language? Why not just use SQL? Or if you need an actual scripting language, why not use something that already exists and is embeddable like Lua or JavaScript? All of these questions, to me, are vaguely reminiscent of when we were creating InfluxDB in the first place. Why would you build a database for time series rather than just building on top of some general purpose database? These are all fair questions, and in this post, I’ll attempt to clarify our motivation for creating a new language.

When we first created InfluxDB, we started with a language, called InfluxQL, that looked like SQL. This served as an easy onramp for new users and looked somewhat familiar, so it made things comfortable at first. However, it isn’t quite SQL, and it differs in some key ways. Over time we found that this could be very frustrating for users who were SQL experts and expected it to behave the same way and have all the functionality of SQL. Users who wanted more advanced query functionality would eventually run up against limitations in the language. With InfluxDB 2.0 we want to address all of these feature requests and give users something that is easy to learn and more powerful. This led us to the creation of Flux.

Before I get into the discussion of SQL vs. Flux, I want to lay out the design goals we have with the language. We’ll revisit these as I talk about some of the other alternatives we could have pursued. The Flux language should be:

  • Usable
  • Readable
  • Composable
  • Testable
  • Contributable
  • Shareable

When we say we want Flux to be “usable”, we mean that we want to optimize for programmer happiness and productivity. We want it to be easy to learn and use, highly productive, and even fun. Arguments for programmer happiness win over language “purity”. It’s our first and highest priority, followed closely by “readability.” This also means that a REPL, a powerful CLI for testing scripts, and a nice web-based UI for point-and-click building of scripts are first-class citizens to be included from the beginning.

Programmers read far more code than they ever write. Readability is also important for code that you write yourself, not just other people’s code. If you’ve ever been in a project and done a “git blame” on some line of code that you thought was stupid or inscrutable, only to find out that you were the one who wrote it a year ago, you know what I mean.

It should be “composable,” meaning that users should be able to build on top of the language towards their specific use cases and needs. You should be able to define functions and create entire libraries of them. The more you work with the language, the more you should be able to mold it around your problem domain.
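
As a sketch of what that molding might look like, here are a couple of hypothetical helper functions a user could define once and then chain together (the names are made up for illustration; the syntax is previewed here and explained alongside the examples later in this post):

// Hypothetical helpers a user might define once and reuse:
lastHour = (table=<-) => table |> range(start:-1h)
cpuOnly = (table=<-) => table |> filter(fn: (r) => r._measurement == "cpu")

// Queries then read in terms of the problem domain:
from(db:"telegraf")
  |> lastHour()
  |> cpuOnly()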

It should be “testable.” Queries are code, and they should be tested and checked into source control if they’re going to be part of a long-lived application. Further, individual parts of queries should be testable in isolation. It should be possible for a user to build up a complicated query but test each part of it separately.

Finally, we want Flux to be easy to contribute to, and we want it to be easy to share functions and libraries of Flux code with other developers. We want to continue to add new functions to the language and to give users more functionality with less code to write. We also want to encourage community contributions of new functions, and we have intentionally structured things so that new contributors can engage without knowing everything about the internals of the engine itself. We had great success with the design of our plugin system in Telegraf, and we’d like to duplicate that for new functions in the Flux engine. The design of Flux as a query language should be such that everyone can make these additions without having to wrestle with or change its semantics.

It’s horribly inefficient for developers to create the same queries over and over again. We want common queries and use cases represented and shared so that we stop re-inventing every individual query in InfluxDB. Because we have a common data collector (Telegraf) and a common schema, it should be possible to have reusable queries and functions built by the community. These could be definitions for monitoring and alerting, common ETL tasks, or integrations with third-party systems to share data or send notifications and alerts. Our users shouldn’t have to re-invent monitoring queries for common third-party services and systems.

Now that the high-level design principles are laid out, let’s talk about why we’re not just moving InfluxQL to conform to the SQL standard. There are clear advantages to that approach. SQL is known by many developers, and there’s a large ecosystem of tools and libraries compatible with it. It’s the safer bet, since it integrates with so many things and has an experienced developer base. However, I don’t think SQL is the best language for working with time series data. SQL is designed around relational algebra and working with sets. The semantics of the language don’t line up with the flow-based model I have in mind when working with time series data. A time series is a continuous stream to which functions, calculations, and transformations are applied. One function does something, then sends its output to the next, which does something, and so on. This functional style also means that new functions can be introduced without changing the semantics of the language as a whole.

SQL is a great and powerful tool, but it shouldn’t be the only tool. And to be honest, I don’t want to live in a world where the best language humans could think of for working with data was invented in the 70’s. I refuse to let that be my reality. But making the leap into the 21st century means creating something new, which also means it won’t have the inertia of 40 years of development, education, and standardization. Let’s dig into some of the reasons, and an example, of why we think Flux is worth the effort of building up a community around it over time.

For starters, we want to provide functionality that doesn’t currently exist in the SQL standard. Sure, there are extensions to SQL that handle time series, but they’re not part of the standard. We actually want to create the new standard language for working with time series data (or really any kind of data). That’s why we decided to license the Flux language and engine under the liberal MIT license. We want it to be pervasive, ubiquitous, and used by many projects.

We want a language that will enable developers and data scientists to push more workloads into the data platform layer. Developers shouldn’t have to write Python scripts to further process, shape, and refine their data before gaining insight from it. For example, we could be calculating similarities between series, doing string manipulations and modifications, creating rules for alerting, and even pulling data from other sources. Shoehorning that kind of functionality into SQL would get ugly quickly. Sure, we could create a Turing-complete SQL variant like Oracle’s PL/SQL or Microsoft’s T-SQL, but then it would be something that isn’t quite SQL anymore. The new functionality looks like something that was bolted on after the fact (which it was), and the once-pure SQL language starts to look like a shanty town built up over time.
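
As one small sketch of the kind of work we’d like to push down into the platform, here’s what an inline string manipulation could look like (the strings import and toUpper function are assumptions for illustration, not settled API; the general syntax is explained with the examples below):

import "strings"

// Normalize a tag value in the data layer instead of in a
// separate post-processing script:
from(db:"telegraf")
  |> range(start:-1h)
  |> map(fn: (r) => ({r with host: strings.toUpper(v: r.host)}))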

Ideally, we’d have a language that has been designed from the ground up for not just declarative queries, but also for data processing and working with time series data specifically. Here’s a short example of what computing an exponential moving average for many time series looks like in Flux:

from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> exponentialMovingAverage(size: 10s)

This example breaks the query up over multiple lines, but it could be written on a single line. A few things jump out from this code. First, the language is functional. The pipe-forward operator (|>) indicates that we’re sending the output of the function on the left-hand side to the function on the right-hand side. We can see that the functions take named parameters (to aid readability). In the filter function, we see that anonymous functions can also be passed as parameters. The style should look very similar to JavaScript, which is intentional. We wanted to create a language that looked and felt somewhat familiar.

Let’s step through what the query is doing. First, we identify the database from which the data is extracted, then filter the set down to only the last hour and only time series with the measurement “foo”. Finally, we send each of those series to the exponentialMovingAverage function, which computes the average over 10-second intervals (much like the Graphite function of the same name).

Writing a SQL equivalent of that query is, at this point, beyond my SQL capabilities. I’ve forgotten many of SQL’s intricacies over the years as object-relational mappers and NoSQL APIs have taken up more and more of my development time against databases. After a search, I found this example of computing a rolling average in SQL (note that I can’t credit the original source because there are so many variants of this question online that I couldn’t find my way back to it):

select id,
       temp,
       avg(temp) over (partition by group_nr order by time_read) as rolling_avg
from (
  select id,
         temp,
         time_read,
         interval_group,
         id - row_number() over (partition by interval_group order by time_read) as group_nr
  from (
    select id,
           time_read,
           'epoch'::timestamp + '900 seconds'::interval * (extract(epoch from time_read)::int4 / 900) as interval_group,
           temp
    from readings
  ) t1
) t2
order by time_read;

This computes a rolling average for a single time series; the Flux example did it for many series at once. Rather than something like Flux’s pipe-forward, which lets you read the query sequentially as a flow, the SQL query must use nested select statements to put the dataset together. This is kind of like the worst part of Lisp (nested function calls), but even less readable. Not only is the Flux example more terse, it is more readable and understandable.

Coming back to the idea of composability, a user should be able to define functions and import code and functions created by other developers. SQL doesn’t have this functionality, so we’d be resorting to tacking it on or implementing T-SQL, stored procedures, or something like that. Defining a function in Flux is very simple. For example, say we wanted to take a bunch of time series and have each value in the series be the square of itself:

square = (table=<-) =>
  table |> map(fn: (r) => ({r with _value: r._value * r._value}))

Here we’ve defined a function called “square” that takes a table (either pipe-forwarded to the function or passed as the “table” argument). It sends that table to the map function, which visits every row and returns a copy with the value replaced by its square.
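
For illustration, here’s how square might then be dropped into a pipeline (reusing the “telegraf” database from the earlier example):

from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> square()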

The functional style of the language makes it much easier to test parts of a query. From a user’s perspective, a function takes inputs, performs some calculation or transformation, and produces outputs. This makes it trivial to write extensive unit tests for any function defined in the language.
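
As a sketch of what such a unit test could look like, here’s one possibility. It assumes a csv.from function for building a fixed input table from annotated CSV and a testing package with an assertEquals function; neither is described in this post, so treat the specifics as illustrative rather than settled API:

import "csv"
import "testing"

// The function under test, as defined above.
square = (table=<-) =>
  table |> map(fn: (r) => ({r with _value: r._value * r._value}))

// A small fixed input table, expressed as annotated CSV.
inData = "
#datatype,string,long,dateTime:RFC3339,double
#group,false,false,false,false
#default,_result,,,
,result,table,_time,_value
,,0,2018-06-01T00:00:00Z,2.0
,,0,2018-06-01T00:00:10Z,3.0
"

// The output we expect after squaring each value.
outData = "
#datatype,string,long,dateTime:RFC3339,double
#group,false,false,false,false
#default,_result,,,
,result,table,_time,_value
,,0,2018-06-01T00:00:00Z,4.0
,,0,2018-06-01T00:00:10Z,9.0
"

// Run the function under test against the fixed input and
// assert that the result matches the expected output.
testing.assertEquals(
  name: "square squares each value",
  got: csv.from(csv: inData) |> square(),
  want: csv.from(csv: outData)
)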

Ok, so we’ve decided to create a scripting language for InfluxDB 2.0 rather than implement SQL (note that we’ll continue to support InfluxQL and TICKscript). Why not use Lua or JavaScript instead of creating something new? For starters, Lua isn’t in wide enough use to make the learning curve any easier for our user base. JavaScript, while widely known and used, carries a great deal of historical baggage, like some sort of prehensile tail yet to be shed. For our use case, we don’t need the entirety of these languages. Even though Flux will be Turing complete, we will be aggressive about limiting the syntax, always aiming for readability and clarity. For instance, Flux only supports named parameters to functions. This makes code that calls a function more readable, whereas with positional parameters you end up having to look up a function definition to understand what is being passed in.
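
A quick sketch of the difference (limit is used here purely for illustration):

from(db:"telegraf")
  |> range(start:-1h)
  |> limit(n:10, offset:5)

// With positional parameters the last call would read as limit(10, 5),
// and you'd have to check the function definition to know which number
// is the count and which is the offset.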

I’ll be writing more over the coming weeks and months about specific features and design considerations in the language, highlighting examples where Flux gives users far more power than InfluxQL and where its style gives greater clarity to developers reading queries. I’ll also try to highlight some of the things Flux will enable that are either impossible in the SQL standard or require significant SQL gymnastics to achieve. In the meantime, you can ask questions in the Flux section of our community site, log issues with the area/query tag in the InfluxData Platform repo, or take a look at the Flux language spec and the Flux engine code (which is under the MIT license).

For more on the language, why, and the principles guiding our design, here’s the video of the talk I gave last month in London: