InfluxData Blog - Chunchun Ye

System Tables Part 2: How We Made It Faster

Chunchun Ye (InfluxData) — Thu, 31 Oct 2024 08:00:00 +0000

In the first post, we introduced system tables and how to use them to inspect your cluster. In this follow-up, we’ll explain the techniques we used to improve the speed of system table queries.

7. The problem

Before August 2024, querying system tables in Cloud Dedicated, particularly system.tables, system.partitions, and system.compactor, often took a long time to run, even with filters applied. In some cases, queries timed out without returning any result.

8. Why it was slow

To understand the causes of slow queries, let’s first look at how system table data is generated.

Below is a simplified overview of the InfluxDB 3 architecture (taken from the blog post InfluxDB 3: System Architecture). The central component, the Catalog¹, stores metadata about databases, tables, columns, and file details such as file size, location, and creation time. On the right side is the Querier, where the system tables reside and queries are executed.

Figure 1: InfluxDB 3 Architecture

The metadata in the Catalog is organized according to the simplified data model shown in Figure 2.

Figure 2: Simplified Catalog Metadata Data Model

As the InfluxDB Catalog serves as the central cluster coordinator, it provides a restricted interface that does not permit analytic-style queries, for performance and stability reasons. Because of this, the Querier must make multiple calls to the Catalog to gather the information required to fill the system tables. Figure 3 illustrates what a typical data flow looks like when querying a system table like system.partitions.

The system.partitions table has a column that shows the total size of a partition in megabytes. To compute this, the Querier first gets all the tables in the database, then retrieves all partitions for each table, and looks up every Parquet file associated with each partition. After that, it adds up the file sizes and converts the total to megabytes. Each of these steps requires its own requests between the Querier and the Catalog, as shown in Figure 3.

Figure 3: Data Flow Between Querier and Catalog
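The lookup chain described above can be sketched in a few lines of Python. This is only an illustration: FakeCatalog, its three list_* methods, and the 1 MB = 1024 * 1024 bytes conversion are assumptions made for the example, not the real Catalog interface (which is served over gRPC).

```python
from collections import namedtuple

# Hypothetical record type; the real Catalog stores more file details.
ParquetFile = namedtuple("ParquetFile", ["partition_id", "size_bytes"])


class FakeCatalog:
    """In-memory stand-in for the Catalog that counts round trips."""

    def __init__(self, tables, partitions, files):
        self._tables = tables          # list of table names
        self._partitions = partitions  # table name -> list of partition ids
        self._files = files            # partition id -> list of ParquetFile
        self.calls = 0                 # number of requests served

    def list_tables(self):
        self.calls += 1
        return list(self._tables)

    def list_partitions(self, table):
        self.calls += 1
        return list(self._partitions.get(table, []))

    def list_files(self, partition_id):
        self.calls += 1
        return list(self._files.get(partition_id, []))


def partition_sizes_mb(catalog):
    """Return {(table, partition_id): size in MB} using one request per step."""
    sizes = {}
    for table in catalog.list_tables():
        for pid in catalog.list_partitions(table):
            total_bytes = sum(f.size_bytes for f in catalog.list_files(pid))
            # Assumption for this sketch: 1 MB = 1024 * 1024 bytes.
            sizes[(table, pid)] = total_bytes / (1024 * 1024)
    return sizes


catalog = FakeCatalog(
    tables=["foo", "bar"],
    partitions={"foo": [1, 2], "bar": [3]},
    files={
        1: [ParquetFile(1, 1048576), ParquetFile(1, 2097152)],
        2: [ParquetFile(2, 524288)],
        3: [],
    },
)
sizes = partition_sizes_mb(catalog)
print(sizes)
print("catalog requests:", catalog.calls)
```

Even in this tiny example, two tables and three partitions cost six requests, and the count grows with every table and partition in the database.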

8.1 The Performance Issue

Previously, when querying system tables like system.tables, system.partitions, or system.compactor, the Catalog would scan all the metadata and send it to the Querier in gRPC format. The Querier would then convert the responses to Arrow record batches, and run the query using DataFusion, which would apply the filters to discard irrelevant data.

This meant that the following two queries did the same amount of work in the Catalog and the Querier, even though the second one has a filter:

SELECT * FROM system.partitions;
SELECT * FROM system.partitions WHERE table_name = 'foo';

The system scanned everything and applied the filter only at the Querier level, leading to unnecessary overhead and slow performance.

9. The solution

9.1 Predicate Pushdown

A predicate is a condition used to filter data in a query, such as table_name = 'foo' or age > 20. We implemented predicate pushdown, a common database optimization technique that moves (or “pushes down”) certain filters (predicates) as close as possible to the data source (in this case, the Catalog), as shown in Figure 4. This change reduced the amount of data fetched and transmitted to the Querier, which in turn reduced the workload on the Querier.

Figure 4: Predicate Pushdown

Before we implemented predicate pushdown, a query like SELECT * FROM system.partitions WHERE table_name = 'foo' fetched and formatted partition information for all tables in the Catalog, and the query engine promptly threw away everything except for foo.

After predicate pushdown, the Querier avoids fetching and formatting partition information that it determines will be filtered out during query execution.
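The before/after difference can be illustrated with a small Python sketch. The PARTITIONS rows and the fetch_partitions function are hypothetical stand-ins for Catalog metadata and a Catalog request; the point is only to compare how many rows are fetched in each case.

```python
# Hypothetical rows standing in for partition metadata held by the Catalog.
PARTITIONS = [
    {"table_name": "foo", "partition_key": "2024-10", "total_size_mb": 2},
    {"table_name": "bar", "partition_key": "2024-10", "total_size_mb": 5},
    {"table_name": "baz", "partition_key": "2024-09", "total_size_mb": 23},
]


def fetch_partitions(table_name=None):
    """Stand-in for a Catalog request; table_name, if given, is applied at the source."""
    if table_name is None:
        return list(PARTITIONS)
    return [row for row in PARTITIONS if row["table_name"] == table_name]


def query_without_pushdown(table_name):
    fetched = fetch_partitions()  # every row is fetched and transferred
    result = [row for row in fetched if row["table_name"] == table_name]
    return result, len(fetched)


def query_with_pushdown(table_name):
    fetched = fetch_partitions(table_name)  # the filter travels with the request
    return fetched, len(fetched)


rows_before, fetched_before = query_without_pushdown("foo")
rows_after, fetched_after = query_with_pushdown("foo")
print("rows fetched without pushdown:", fetched_before)
print("rows fetched with pushdown:", fetched_after)
```

Both functions return the same result rows; only the amount of data that leaves the source changes.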

9.2 Multiple Predicates Pushdown

In addition to supporting single predicate pushdown, we extended support to handle multiple filters. The simple examples in Section 9.1 are likely obvious, but in a real system, users can provide arbitrary predicates connected by AND, OR, IN, etc., and it is non-trivial to determine which predicates to push down. For example, consider a query like:

SELECT * 
FROM system.partitions 
WHERE (table_name = 'foo' OR table_name = 'bar')
  AND (partition_key = '2024-10|device-101' OR partition_key = '2024-09|device-101')

We used DataFusion’s LiteralGuarantee::analyze to parse and simplify query predicates before pushing them down to the Catalog. This method has saved us a lot of engineering time and effort. A big shout-out to the DataFusion community for making this so simple to use!

The current implementation supports multiple predicates pushdown for filters such as table_name combined with partition_key or table_name combined with partition_id, further reducing the amount of data processed by the Querier.
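To give a feel for what this kind of analysis produces, here is a toy Python analogue. It is not DataFusion's API: the tuple-based expression encoding and the literal_guarantees function are invented for this sketch, and it handles only =, AND, and OR. For each column, it derives the set of literal values a row must match in order to pass the filter.

```python
def eq(column, value):
    return ("eq", column, value)


def and_(left, right):
    return ("and", left, right)


def or_(left, right):
    return ("or", left, right)


def literal_guarantees(expr):
    """Map each constrained column to the set of values a matching row must have.

    An empty dict means nothing can be guaranteed, so nothing can be pushed down.
    """
    kind = expr[0]
    if kind == "eq":
        _, column, value = expr
        return {column: {value}}
    left = literal_guarantees(expr[1])
    right = literal_guarantees(expr[2])
    if kind == "and":
        # Both sides must hold: keep all columns, intersect shared ones.
        merged = dict(left)
        for column, values in right.items():
            merged[column] = merged[column] & values if column in merged else values
        return merged
    if kind == "or":
        # Either side may hold: only columns constrained on both sides survive.
        return {c: left[c] | right[c] for c in left if c in right}
    raise ValueError(f"unsupported expression kind: {kind}")


predicate = and_(
    or_(eq("table_name", "foo"), eq("table_name", "bar")),
    or_(
        eq("partition_key", "2024-10|device-101"),
        eq("partition_key", "2024-09|device-101"),
    ),
)
guarantees = literal_guarantees(predicate)
print(guarantees)
```

For the query above, the sketch yields two table names and two partition keys that could be sent to the Catalog. The guarantees are necessary conditions, not the full filter, so the query engine still applies the original predicate to whatever comes back.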

9.3 Concurrent Data Fetching

Previously, the Querier made sequential API calls to the Catalog, where each request had to wait for the previous one to complete before proceeding. This added significant latency, especially when querying large datasets.

We improved this by enabling concurrent API requests, allowing the Querier to make multiple requests simultaneously. This greatly reduced the time needed to gather all necessary metadata.
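The effect of issuing requests concurrently can be sketched with Python's standard library. This is an analogy rather than the Querier's actual code: fetch_partitions is a hypothetical Catalog call, and the 50 ms sleep stands in for network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_partitions(table):
    """Hypothetical Catalog request; the sleep simulates network latency."""
    time.sleep(0.05)
    return [f"{table}-p{i}" for i in range(2)]


def fetch_sequential(tables):
    # Each request waits for the previous one to finish.
    return {table: fetch_partitions(table) for table in tables}


def fetch_concurrent(tables, max_workers=8):
    # Requests are in flight at the same time; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(tables, pool.map(fetch_partitions, tables)))


tables = ["foo", "bar", "baz", "qux"]

start = time.perf_counter()
sequential = fetch_sequential(tables)
sequential_s = time.perf_counter() - start

start = time.perf_counter()
concurrent = fetch_concurrent(tables)
concurrent_s = time.perf_counter() - start

print(f"sequential: {sequential_s:.3f}s, concurrent: {concurrent_s:.3f}s")
```

With sequential calls, total latency grows with the number of requests; with concurrent calls, it stays close to the latency of the slowest single request, as long as the worker limit is not exceeded.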

10. Performance improvements

Here is how much faster system table queries became when using the filter WHERE table_name = '...':

system.tables: 17% faster
system.partitions: 65% faster
system.compactor: 60% faster

These improvements are based on a database with 100 tables, over 200 partitions, and more than 3,000 Parquet files. If your database has more tables, partitions, or Parquet files, you’ll see even more significant performance gains with filtered queries.

Additionally, the average query latency with multiple filters is around 20 ms for queries like:

WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...

11. Wrapping up

In this post, we explained how we improved the performance of system table queries by implementing predicate pushdown, optimizing the use of multiple filters, and enabling concurrent data fetching. These changes dramatically reduced query times, especially for databases with large amounts of metadata.

With these optimizations, you can retrieve relevant system data more efficiently and eliminate long debugging waits.

References:

¹ Note that in our actual deployments, there are several layers of caching that are not reflected in Figure 1.

System Tables Part 1: Introduction and Best Practices

Chunchun Ye (InfluxData) — Tue, 29 Oct 2024 08:00:00 +0000

As an InfluxDB Cloud Dedicated or Clustered user, you may want to inspect your cluster to gain a better understanding of the size of your databases, tables, partitions, and compaction status. InfluxDB stores this essential metadata in system tables (described in Section 1), which help inform decisions about cluster performance and maintenance.

1. What are system tables?

System tables are “virtual” tables that present metadata for a specific database and provide insights into database storage. Each system table is scoped to a particular database and is read-only, meaning it cannot be modified.

System tables are hidden by default, as high-frequency access to these tables can interfere with the ongoing operations of the database. Thus, querying system tables requires a special debug header with the request. Once the debug header is added (described in Section 2), you can query system tables using SQL, similar to any other table in InfluxDB.

Here are the system tables that InfluxDB provides:

+---------------+--------------------+-------------+------------+
| table_catalog | table_schema       | table_name  | table_type |
+---------------+--------------------+-------------+------------+
| ...           | ...                | ...         | ...        |
| public        | system             | compactor   | BASE TABLE |
| public        | system             | partitions  | BASE TABLE |
| public        | system             | queries     | BASE TABLE |
| public        | system             | tables      | BASE TABLE |
| ...           | ...                | ...         | ...        |
+---------------+--------------------+-------------+------------+

In this blog, we will focus on three tables:

system.tables
system.partitions
system.compactor

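+-------------------+-------------------------------------------------------------------------------------------------------+--------+
| Table             | Description                                                                                           | Schema |
+-------------------+-------------------------------------------------------------------------------------------------------+--------+
| system.tables     | Contains information about tables, such as table name and partition template in the specific database. | Link   |
| system.partitions | Contains information about partitions, partition sizes, file count, etc.                               | Link   |
| system.compactor  | Contains detailed information about compacted partitions at different compaction levels.               | Link   |
+-------------------+-------------------------------------------------------------------------------------------------------+--------+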

Warning: System tables are not part of InfluxDB’s stable API. They are subject to change, and compatibility is not guaranteed.

Warning: Querying system tables may impact write and query performance. Use them only for debugging purposes and use filters to optimize queries and minimize their impact on your cluster.

2. Accessing system tables

To access system tables, you must provide a debug header with the request. The specific commands to add this header vary depending on the client you are using.

influxctl CLI

For influxctl, pass the --enable-system-tables flag:

influxctl query \
  --enable-system-tables \
  --database DATABASE_NAME \
  --token DATABASE_TOKEN \
  "SQL_QUERY"

Arrow Flight SQL or other client libraries

For Arrow Flight SQL or other client libraries, such as Go and Python, set the iox-debug header to true.

3. Querying system tables: Examples

1. View the partition template of a specific table

SELECT * FROM system.tables WHERE table_name = 'TABLE_NAME'

Example Result:

+-----------------+--------------------------------------------------------+
| table_name      | partition_template                                     |
+-----------------+--------------------------------------------------------+
| your_table_name | {"parts":[{"timeFormat":"%Y-%m"},{"tagValue":"col1"}]} |
+-----------------+--------------------------------------------------------+

If a table doesn’t include a partition template in the output of this command, the table uses the default (1 day) partition strategy and doesn’t partition by tags.
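As a rough aid for reading a partition template, the Python sketch below parses the JSON from the example result and builds an example partition key from it. The function name, the sample timestamp, and the tag value are hypothetical, and joining parts with "|" is an assumption based on keys such as 2024-10|device-101; the sketch also ignores any escaping the real system may apply.

```python
import json
from datetime import datetime, timezone


def example_partition_key(template_json, timestamp, tags):
    """Build an illustrative partition key from a template, a timestamp, and tag values."""
    parts = []
    for part in json.loads(template_json)["parts"]:
        if "timeFormat" in part:
            # Time part: format the row's timestamp with the strftime pattern.
            parts.append(timestamp.strftime(part["timeFormat"]))
        elif "tagValue" in part:
            # Tag part: use the value of the named tag column.
            parts.append(tags[part["tagValue"]])
        else:
            raise ValueError(f"unrecognized template part: {part}")
    # Assumption: parts are joined with "|", as in keys like 2024-10|device-101.
    return "|".join(parts)


template = '{"parts":[{"timeFormat":"%Y-%m"},{"tagValue":"col1"}]}'
key = example_partition_key(
    template,
    datetime(2024, 10, 15, tzinfo=timezone.utc),
    {"col1": "device-101"},
)
print(key)
```

Reading the template this way shows that the example table is partitioned by month and by the value of the col1 tag.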

2. View the number of partitions and total size per table

SELECT
  table_name,
  COUNT(*) AS partition_count,
  SUM(total_size_mb) AS total_size_mb
FROM system.partitions
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name

Example Result:

+------------+-----------------+---------------+
| table_name | partition_count | total_size_mb |
+------------+-----------------+---------------+
| foo        | 1               | 2             |
| bar        | 4               | 5             |
| baz        | 10              | 23            |
+------------+-----------------+---------------+

3. View the size for different levels of compacted files*

SELECT
  table_name,
  SUM(total_l0_files) AS l0_files,
  SUM(total_l1_files) AS l1_files,
  SUM(total_l2_files) AS l2_files,
  SUM(total_l0_bytes) AS l0_bytes,
  SUM(total_l1_bytes) AS l1_bytes,
  SUM(total_l2_bytes) AS l2_bytes
FROM system.compactor
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name

*Compacted files are compressed Parquet files processed by the Compactor to optimize storage. These files have different compaction levels: L0, L1, and L2. L0, or “Level 0”, represents newly ingested, uncompacted small files, while L2, or “Level 2”, represents compacted, non-overlapping files.

Example Result:

+------------+----------+----------+----------+----------+----------+----------+
| table_name | l0_files | l1_files | l2_files | l0_bytes | l1_bytes | l2_bytes |
+------------+----------+----------+----------+----------+----------+----------+
| foo        | 0        | 1        | 0        | 0        | 20659    | 0        |
| bar        | 0        | 1        | 0        | 0        | 7215     | 0        |
| baz        | 0        | 1        | 0        | 0        | 10784    | 0        |
+------------+----------+----------+----------+----------+----------+----------+

4. Optimize queries to reduce cluster impact

Querying system tables can degrade the performance of other common queries, especially if you are trying to view every detail in clusters with hundreds of tables, hundreds of thousands of partitions, and millions of Parquet files.

To reduce the performance impact, we suggest selecting information for a specific table or a particular partition by adding filters as follows:

WHERE table_name = '...'
WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...
WHERE partition_id = ...

See the documentation for how to obtain partition_key and partition_id.

5. Use the most efficient filters

Among the above filters, the following filters are specially optimized and significantly reduce query latency to about 20 ms, even on our largest clusters:

WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...

6. Bringing it home

In this first post, we introduced system tables, explained how to access them, and discussed how to optimize your queries with filters. In the next post, we will explain how we improved the performance of system tables.


Querying InfluxDB 3.0 Using JDBC Driver for Tableau

Chunchun Ye (InfluxData) — Fri, 23 Jun 2023 07:35:00 +0000

InfluxDB 3.0 now supports connecting Tableau to query data for visualization using the Apache Arrow Flight SQL JDBC driver (Flight SQL driver). In this blog post, we explore the capabilities and benefits of this integration and provide instructions on how to set up the connection.

Background

Tableau offers two main ways to connect to third-party data sources: file and server. A separate article describes how to connect Tableau to Parquet files. By contrast, this post describes how to connect Tableau to an InfluxDB 3.0 database through the Flight SQL driver.

What is Apache Arrow Flight SQL JDBC Driver?

JDBC (Java Database Connectivity) is a standard way to connect to and interact with a database. The Flight SQL driver is a JDBC driver implementation that uses the underlying Flight SQL protocol, allowing any program that connects via JDBC to connect to and interact with databases that support Flight SQL. Because InfluxDB 3.0 supports Flight SQL, this driver acts as a bridge, enabling Tableau to establish a connection with InfluxDB 3.0, execute queries, and retrieve time series data for visualization.

Benefits of connecting Tableau to InfluxDB 3.0 using the JDBC Driver

Enhanced performance: The Apache Arrow Flight SQL JDBC driver leverages the high-performance capabilities of Apache Arrow and the efficient data transfer protocol of Arrow Flight. This results in faster data transfers and improved query execution times, enabling Tableau users to retrieve and visualize data from InfluxDB 3.0 with exceptional speed and efficiency.
Real-time visualizations: By connecting Tableau to InfluxDB 3.0 using the JDBC driver, users can create interactive visualizations in real time, as soon as InfluxDB ingests their data.
Seamless integration: The Apache Arrow Flight SQL JDBC driver provides a standardized interface for connecting Tableau to InfluxDB 3.0. This means that Tableau users can leverage their existing knowledge and skills to connect to InfluxDB 3.0 without the need for complex data transformations or additional tools.

Who can use this feature

InfluxDB 3.0 users, both InfluxDB Cloud Serverless and InfluxDB Cloud Dedicated, have access to this feature on Tableau Desktop as of publication. In the future, additional InfluxDB 3.0 products will also support the JDBC driver.

How to connect Tableau to InfluxDB 3.0

You can find instructions for connecting Tableau to InfluxDB 3.0 Cloud Dedicated or Cloud Serverless in our docs.

Unlocking real-time insights

The Flight SQL JDBC driver integration between InfluxDB 3.0 and Tableau presents exciting opportunities for visualizing and analyzing time series data in real time with enhanced performance. By connecting these two tools, users can unlock real-time insights, streamline workflows, and create compelling visualizations and reports. The combination of InfluxDB 3.0’s time series capabilities and Tableau’s analytical prowess empowers organizations to make data-driven decisions efficiently and effectively.