Webinar Highlight: Introducing InfluxDB’s New Time Series Database Engine
By Caitlin Croft / Mar 01, 2023 / InfluxDB IOx
As part of the launch of InfluxDB Cloud, powered by IOx, Paul Dix and Balaji Palani provided an InfluxDB Cloud overview and demo. In case you missed it, this blog post is a quick 5-minute read summarizing the webinar. We have shared the recording and the slides from the presentation for you to watch and review at your leisure.
- InfluxDB Cloud, powered by IOx is currently available in two AWS regions: Frankfurt, Germany and Virginia, USA
- It is a columnar database written in Rust using the Apache Arrow ecosystem that supports SQL natively
- Includes native InfluxQL support & enables unlimited cardinality
- 0:36 | Paul Dix provides an overview and roadmap update for InfluxDB Cloud, powered by IOx
- 22:47 | Balaji Palani demos InfluxDB Cloud and Apache Superset visualizations
- 38:40 | Q&A time
InfluxDB Cloud, powered by IOx is a columnar database built on Apache Arrow with cloud environments in mind, where object storage and the scalable compute layer are separate. The database uses object storage for persistence, with Apache Parquet as the persistence format. Developers can use InfluxDB Cloud to query data in real time before it lands in object storage. It supports SQL natively by including a SQL query parser, planner, and execution engine, and it uses Apache DataFusion as the query engine. Paul dives deeper into the differences between the two storage engines (TSM and IOx), the benefits of using the Apache ecosystem, and schema design tips and tricks.
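To make the "columnar" idea concrete, here is a toy sketch in plain Python. It is not how IOx, Arrow, or Parquet are implemented; the sample data and the helper names `to_columns` and `mean_cpu` are hypothetical. It only shows why storing each field as its own contiguous column lets an aggregate read a single column instead of every full row.

```python
# Toy illustration only: plain Python lists stand in for Arrow/Parquet columns.
# The sample rows and field names below are made up for this sketch.
rows = [
    {"time": 1, "host": "a", "cpu": 10.0},
    {"time": 2, "host": "b", "cpu": 30.0},
    {"time": 3, "host": "a", "cpu": 20.0},
]


def to_columns(records):
    """Pivot row dicts into one list per field (assumes every row has the same fields)."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def mean_cpu(cols):
    """Aggregate by scanning only the 'cpu' column; 'time' and 'host' are never touched."""
    values = cols["cpu"]
    return sum(values) / len(values)


columns = to_columns(rows)
print(columns["cpu"])
print(mean_cpu(columns))
```

In a real columnar engine the columns are typed, compressed arrays rather than Python lists, but the access pattern is the same: a query that needs one field reads one column.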
“InfluxDB Cloud, powered by IOx is the cloud columnar database that is optimized for time series, including workloads and queries.” - Paul Dix | Founder and CTO, InfluxData
Paul Dix discussed the cardinality problem, which occurs when a high number of unique time series degrades system performance and increases the cost of data ingestion. Queries slow down, especially when they run computations across millions of individual time series. The underlying design and architecture of InfluxDB Cloud, powered by IOx removes this limitation.
IOx’s capabilities will enable our team to develop a range of features in the coming years. Community members will be able to scale the compute layer up and down dynamically without much manual work. It will be faster for developers to upload historical data in bulk. We also addressed the needs of those collecting high-precision data at the edge who don’t need their data in a centralized store for real-time querying, but do need it for historical analysis:
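As a rough illustration of how series counts grow, the sketch below treats a series as a unique combination of measurement and tag set, and counts them in plain Python. The function names and sample tags are hypothetical, and this is only a back-of-the-envelope model, not a description of how any InfluxDB engine tracks series.

```python
def max_series(tag_values):
    """Upper bound on series for one measurement: product of distinct values per tag."""
    total = 1
    for values in tag_values.values():
        total *= len(set(values))
    return total


def observed_series(points):
    """Count distinct (measurement, tag set) pairs; tag order does not matter."""
    return len({(measurement, tuple(sorted(tags.items()))) for measurement, tags in points})


# Hypothetical tags: 3 hosts x 2 regions gives at most 6 series.
tags = {"host": ["a", "b", "c"], "region": ["us", "eu"]}
print(max_series(tags))

points = [
    ("cpu", {"host": "a", "region": "us"}),
    ("cpu", {"region": "us", "host": "a"}),  # same series as the first point
    ("cpu", {"host": "b", "region": "us"}),
]
print(observed_series(points))
```

Adding one high-variety tag, such as a container ID or user ID, multiplies the upper bound by the number of distinct values, which is how series counts reach the millions.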
“Our goal is to provide a place for developers to send all of the data, metrics, events, traces and log data….InfluxDB Cloud, powered by IOx is the ideal place to ingest data in real time and build automation systems on top of it, and be able to do historical analysis on it.” - Paul Dix | Founder and CTO, InfluxData
Paul Dix believes Rust is essential for the future of system software. The Rust compiler safeguards against data races and other potential bugs. InfluxData’s developers are looking forward to taking advantage of how easily Rust can be embedded in other systems and languages.
“It gives you fine-grain control over memory, the safety of a higher-level language, and has a great model for concurrent applications.” - Paul Dix | Founder and CTO, InfluxData
The team sees Apache Arrow Flight SQL as the new standard for database systems to transfer data between clients and servers. Andrew Lamb is part of our engineering team and is the Chair of Apache Arrow’s Project Management Committee (PMC). Apache Arrow has been quickly adopted by data scientists and by data warehousing and big data engineers. Paul dives into the importance of using Apache Parquet as the persistence format and Apache Arrow Flight SQL as the transfer protocol. We even built a Flight SQL plugin for Grafana that enables users to build reports and dashboards common in traditional BI tools. There are many more integrations coming soon to InfluxDB Cloud!
“Our goal is to make Flight SQL the standard for larger database ecosystem vendors and make it easy for third-party tool developers to adopt.” - Paul Dix | Founder and CTO, InfluxData
Watch Balaji Palani, VP of Product Marketing, provide a demo of InfluxDB Cloud, powered by IOx. Balaji demos the new Data Explorer, where developers can write SQL queries and review schemas. He shows how to visualize your time-stamped data in Apache Superset and how you can combine metrics and traces to create dashboards. Developers can now collect metrics, events, and traces in a single data store with InfluxDB. Check out the InfluxDB Observability repo.
Attendees had lots of questions. Here are some of the most asked questions and their answers.
Question: What are the benefits to using InfluxQL over SQL?
Paul: Whatever is in SQL will be a superset of what’s available in InfluxQL, as InfluxQL uses the same underlying engine. We’ve heard from people over the years that for some basic time series queries, they find InfluxQL easier to use. The primary benefit for existing users is that InfluxDB Cloud, powered by IOx will support InfluxQL natively; there will be a translation layer that exposes the InfluxDB v1 query API. You’ll be able to submit a query as though it were going to InfluxDB v1 with InfluxQL. It will execute and return your results in the same format that InfluxDB v1 did, which means that if you’re using third-party tools like Grafana, you’ll be able to interact with it as though it’s an InfluxDB v1 database without having to rewrite your dashboards.
Our goal is to make it as close to the original InfluxQL as possible, while also taking advantage of the database’s new performance benefits. Many queries will be orders of magnitude faster on IOx than they are on the traditional InfluxQL engine. And obviously, requests for better performance have been ongoing for the entirety of the project.
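For readers unfamiliar with the InfluxDB v1 query API mentioned above, the sketch below builds a v1-style request URL, where the `/query` endpoint takes the database name in `db` and the InfluxQL statement in `q`. The host, database name, and query are placeholders, nothing is sent over the network, and authentication is omitted. Whether the IOx compatibility layer accepts exactly this shape is an assumption based on Paul's description, not something confirmed here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_v1_query_url(base_url, database, influxql):
    """Build an InfluxDB v1-style /query URL (db and q parameters); does not send it."""
    params = urlencode({"db": database, "q": influxql})
    return f"{base_url.rstrip('/')}/query?{params}"


# Placeholder host, database, and query, chosen only for illustration.
query = 'SELECT mean("usage") FROM "cpu" WHERE time > now() - 1h GROUP BY time(5m)'
url = build_v1_query_url("https://example.invalid/", "mydb", query)
print(url)

# Round-trip check: the encoded parameters decode back to the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["db"], decoded["q"])
```

A tool such as Grafana that already speaks the v1 API issues requests of this form, which is why existing dashboards would not need to be rewritten.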
Question: Is Flux supported in InfluxDB Cloud, powered by IOx?
Paul: Flux is enabled in InfluxDB Cloud’s API, but it is not in the user interface. People who want to use Flux can do so through the API. We’re currently pushing SQL in the UI. Once native InfluxQL support is added, we will likely enable it in the UI, too. Flux isn’t just a query language; it’s an entire scripting language, and we currently don’t have the bandwidth to add it into a Rust-native implementation. Unfortunately, it means a lot of the query optimizations that happen within InfluxDB Cloud, powered by IOx are not available in Flux. Flux is essentially acting as a scripting client that pulls back a large amount of data and processes it on its own side. For the best performance, we’re encouraging people to use SQL or, when it comes out, InfluxQL.
Question: Now that it’s written in Rust, are there any changes to bucket creation or data ingestion?
Paul: The API for creating buckets is the same. If you signed up for InfluxDB Cloud prior to the launch, you will be using TSM, the previous storage system. Within your account, you can create a new organization, which will be IOx-based. In the coming months, we will be migrating existing InfluxDB Cloud accounts from TSM to IOx. We aim to have all users upgraded to InfluxDB Cloud, powered by IOx by the end of the year.
Question: Will it be possible to write InfluxDB tasks in SQL?
Question: What is the product timeline for InfluxDB OSS and InfluxDB Enterprise?
Paul: The team is primarily focused on building more InfluxDB Cloud features. We’re planning to have cloud-dedicated clusters available in late April. By August, we aim to update InfluxDB OSS and InfluxDB Enterprise. Stay tuned for more updates from the team.
To check out the full webinar and listen to the rest of the Q&A, click here!