Flux (formerly IFQL) and the Future of InfluxData
With the big news coming out today about our Series C financing, I thought I should take some time to talk about what the future holds for the InfluxData platform and InfluxDB. For our CEO Evan Kaplan’s perspective, see here.
To understand the future direction, I should talk a little bit about the last five years of work and what we’ve learned in the process. In the beginning of the company I built a “time series API” with web services written in Scala using Cassandra as a backing data store and Redis as a caching and indexing layer. The initial version of this API was RESTful. About six months later, I implemented this as a single binary in Go using LevelDB as the underlying storage engine, but kept the same RESTful API with some key additions.
While that original application didn’t live very long, we did see a pattern. Many organizations and startups were repeating the same effort to create a time series API for their applications and monitoring needs. Seeing this repeated effort drove us, a year later in 2013, to release this infrastructure as a new open source project: InfluxDB. We also used the fresh start to change the API to one built around a query language that looks much like SQL. Its key advantage at the time was that it looked familiar to developers, which made the learning curve easy.
The Importance of the Platform
In the early days we were just building the database, but we saw users adopting the technology and repeating effort to solve a common set of problems. Given the database’s success in solving one common set of user problems, we developed the vision to create an entire platform for working with time series data. We’d not only help developers store and query it, but we’d also help them collect it, process and monitor it, and visualize it. Our view then and now was that if we built the entire platform around time series data, we’d enable developers to build their solutions much faster (what I call Time to Awesome). Developers don’t want to be shaving infrastructure and architecture yaks. They want to build their apps and focus on the business problems they’re trying to solve with code.
So in 2014 I raised the Series A round with the intention of hiring a team to build out the other components of the platform. First we built Telegraf (the data collector), then we built Kapacitor (the processing and monitoring agent), and finally we built out Chronograf (the UI and visualization engine). We named it the TICK Stack as an homage to individual points in a financial time series, each referred to as a tick. But the TICK Stack is an artifact of how this company was built over time. The overall vision is not to build four separate applications, but to create a platform that helps developers solve problems related to time series data.
The Importance of the Data Language
The SQL-like language that InfluxDB currently offers is also an artifact of the evolution of database development over the last 10 years. As developers, we went from SQL to NoSQL to NewSQL, and now it seems we’re back to SQL.
However, I think there are opportunities to create new ways of working with time series data that aren’t tied to SQL. While SQL is a great tool, it’s not the ONLY tool. Time series data aren’t sets and mostly aren’t relational. They’re more frequently accessed and processed like matrices, which could be better queried, processed and analyzed in a language other than SQL. So we’ve started building a new query language, called IFQL [later renamed Flux], designed specifically for working with time series data. As a functional language, it allows users to define complex queries through a set of functional transformations on data. While we’re calling it a query language, it’s quickly becoming much more than that, allowing you to do complex analytics within the language itself. It also lets users recompose parts of queries with user-defined functions, letting them create shortcuts for common functionality. Users will be able to build Flux up around their problem domain. For more details on Flux, check out the slides from my talk at InfluxDays NYC today.
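To make the idea of "functional transformations" and user-defined shortcuts a little more concrete, here is a minimal sketch in Python. It is not Flux syntax and does not use any InfluxData API; every name in it (`time_range`, `window_mean`, `pipeline`, `recent_mean`) is hypothetical and chosen only for illustration. Each step is a function from a list of points to a list of points, and a user-defined function simply packages several steps under one name.

```python
from functools import reduce
from typing import Callable, Dict, List, Tuple

# Illustrative only: a point is (timestamp in seconds, value).
Point = Tuple[int, float]
Transform = Callable[[List[Point]], List[Point]]


def time_range(start: int, stop: int) -> Transform:
    """Keep points whose timestamp satisfies start <= t < stop."""
    return lambda pts: [(t, v) for t, v in pts if start <= t < stop]


def filter_values(pred: Callable[[float], bool]) -> Transform:
    """Keep points whose value satisfies the predicate."""
    return lambda pts: [(t, v) for t, v in pts if pred(v)]


def window_mean(every: int) -> Transform:
    """Average values in fixed windows of `every` seconds, keyed by window start."""
    def apply(pts: List[Point]) -> List[Point]:
        buckets: Dict[int, List[float]] = {}
        for t, v in pts:
            buckets.setdefault(t - t % every, []).append(v)
        return [(start, sum(vs) / len(vs)) for start, vs in sorted(buckets.items())]
    return apply


def pipeline(*steps: Transform) -> Transform:
    """Chain transformations left to right into a single transformation."""
    return lambda pts: reduce(lambda acc, step: step(acc), steps, pts)


def recent_mean(start: int, stop: int, every: int) -> Transform:
    """A user-defined shortcut: range selection followed by windowed averaging."""
    return pipeline(time_range(start, stop), window_mean(every))


points: List[Point] = [(0, 1.0), (5, 3.0), (10, 5.0), (15, 7.0), (20, 9.0)]
result = recent_mean(0, 20, 10)(points)
high_only = pipeline(time_range(0, 25), filter_values(lambda v: v > 4.0))(points)
```

The design point is that the shortcut `recent_mean` is itself a transformation, so it can be dropped into a larger pipeline like any built-in step. That is the sense in which a functional language lets users build vocabulary around their own problem domain.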
We also wanted to create a language that would make it easier for UI developers to create interesting builders and visualization tools. We expect that most users won’t even need to work with the query language to begin with, but will instead use Chronograf, Grafana or some other user interface for interacting with their data. We won’t be locking down the Flux language design until we’ve shipped a query builder for it inside Chronograf. Building it will validate that the language is easy for UI designers to work with.
An API Driven Platform
We are going to unify the entire stack behind a common API. It will be a single entry point for developers, exposed through gRPC, REST, and possibly GraphQL. It will be designed to be multi-tenant from day one, with organizations, users, API tokens, and access rules. It will give the user the ability not only to write and query data, but also to define rules for collection, monitoring and alerting, background processing, metadata associated with series, and user data like dashboards and annotations.
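As a rough illustration of what "multi-tenant with organizations, users, API tokens, and access rules" could mean in practice, here is a small Python sketch. It is purely hypothetical: the class names, fields, and the `is_allowed` check are assumptions made for this example and do not describe the actual InfluxData API, which had not been finalized at the time of writing.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical rule: permission to perform an action on a resource type."""
    action: str    # e.g. "read" or "write"
    resource: str  # e.g. "data" or "dashboards"


@dataclass
class ApiToken:
    """Hypothetical token, scoped to one organization and one user."""
    value: str
    org: str
    user: str
    rules: Set[AccessRule] = field(default_factory=set)


class TokenRegistry:
    """In-memory stand-in for whatever would store tokens in a real system."""

    def __init__(self) -> None:
        self._tokens: Dict[str, ApiToken] = {}

    def issue(self, token: ApiToken) -> None:
        self._tokens[token.value] = token

    def is_allowed(self, token_value: str, org: str, action: str, resource: str) -> bool:
        """Allow a request only if the token exists, belongs to the requested
        organization, and carries a rule matching the action and resource."""
        token = self._tokens.get(token_value)
        if token is None or token.org != org:
            return False
        return AccessRule(action, resource) in token.rules


registry = TokenRegistry()
registry.issue(ApiToken(
    value="example-token",
    org="acme",
    user="alice",
    rules={AccessRule("read", "data"), AccessRule("write", "data")},
))

can_write = registry.is_allowed("example-token", "acme", "write", "data")
cross_org = registry.is_allowed("example-token", "other-org", "read", "data")
```

The sketch scopes every token to a single organization, so a request against another organization is rejected before any rule is consulted. Other designs are possible, such as tokens that span several organizations.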
The language of the stack for working with data will be Flux while the API will unify the rest of the stack in one place. Some of this we’re already shipping as open source (like Flux). Other parts will be built and open sourced through the course of this year.
The future of InfluxData isn’t just as a time series database. We’re creating a platform for working with time series data, and our goal is to make developers far more productive when creating applications around metrics and events. Whether you’re visualizing the data, collecting and scraping, monitoring and alerting, or transforming to send to other sources, my goal is that InfluxData will provide everything out of the box. Look for us to invest heavily to deliver on that goal and to continue to contribute great open source software to the community.