<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>InfluxData Blog - Chunchun Ye</title>
    <description>Posts by Chunchun Ye on the InfluxData Blog</description>
    <link>https://www.influxdata.com/blog/author/chunchun-ye/</link>
    <language>en-us</language>
    <lastBuildDate>Thu, 31 Oct 2024 08:00:00 +0000</lastBuildDate>
    <pubDate>Thu, 31 Oct 2024 08:00:00 +0000</pubDate>
    <ttl>1800</ttl>
    <item>
      <title>System Tables Part 2: How We Made It Faster</title>
      <description>&lt;p&gt;In the &lt;a href="https://www.influxdata.com/blog/system-tables-intro-part-one-influxdb/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_part_two_influxdb&amp;amp;utm_content=blog"&gt;first post&lt;/a&gt;, we introduced system tables and how to use them to inspect your cluster. In this follow-up, we’ll explain the techniques we used to speed up system table queries.&lt;/p&gt;

&lt;h2 id="the-problem"&gt;7. The problem&lt;/h2&gt;

&lt;p&gt;Before August 2024, querying system tables in Cloud Dedicated, particularly &lt;code class="language-markup"&gt;system.tables&lt;/code&gt;, &lt;code class="language-markup"&gt;system.partitions&lt;/code&gt;, and &lt;code class="language-markup"&gt;system.compactor&lt;/code&gt;, often took a long time, even with filters applied. In some cases, queries timed out without returning any results.&lt;/p&gt;

&lt;h2 id="why-it-was-slow"&gt;8. Why it was slow&lt;/h2&gt;

&lt;p&gt;To understand the causes of slow queries, let’s first look at how system table data is generated.&lt;/p&gt;

&lt;p&gt;Below is a simplified overview of the InfluxDB 3 architecture (see the blog post &lt;a href="https://www.influxdata.com/blog/influxdb-3-0-system-architecture/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_part_two_influxdb&amp;amp;utm_content=blog"&gt;InfluxDB 3: System Architecture&lt;/a&gt; for details). The central component, &lt;strong&gt;Catalog&lt;sup&gt;1&lt;/sup&gt;&lt;/strong&gt;, stores metadata about databases, tables, columns, and files (such as file size, location, and creation time). On the right side is the &lt;strong&gt;Querier&lt;/strong&gt;, where the system tables reside and queries are executed.
&lt;img src="//images.ctfassets.net/o7xu9whrs0u9/277428e2975345dcaeda6ebc585e4294/9f06b603502c5c14f6cca25f785a4148/unnamed.png" alt="" /&gt;
Figure 1: InfluxDB 3 Architecture
  &lt;br /&gt;&lt;/p&gt;

&lt;p&gt;The metadata in the Catalog is organized following the simplified data model shown below (Figure 2):
&lt;img src="//images.ctfassets.net/o7xu9whrs0u9/9abbf368fbfe41a382833cc9f56b6847/f6b193c1e005e4952cdef746bd508d2c/unnamed.png" alt="" /&gt;
Figure 2: Simplified Catalog Metadata Data Model&lt;/p&gt;

&lt;p&gt;As the InfluxDB Catalog serves as the central cluster coordinator, it provides a restricted interface that does not permit analytic-style queries, for performance and stability reasons. Because of this, the Querier must make multiple calls to the Catalog to gather the information required to fill the system tables. Figure 3 illustrates what a typical data flow looks like when querying a system table like &lt;code class="language-markup"&gt;system.partitions&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code class="language-markup"&gt;system.partitions&lt;/code&gt; table has a column that shows the total size of each partition in megabytes. To compute it, the Querier first gets all the tables in the database, then retrieves all partitions for each table, and then looks up every Parquet file associated with each partition. Finally, it adds up the file sizes and converts the total to megabytes. Each of these steps requires its own requests from the Querier to the Catalog, as shown in Figure 3.
&lt;img src="//images.ctfassets.net/o7xu9whrs0u9/93b75a105bc94dea96f5f9ecb667706d/a2dfc6cfcce8839981a04d046f8c6320/unnamed.png" alt="" /&gt;
Figure 3: Data Flow Between Querier and Catalog&lt;/p&gt;
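
&lt;p&gt;The lookup chain described above can be sketched in a few lines of Python. This is only an illustration: &lt;code class="language-markup"&gt;FakeCatalog&lt;/code&gt;, its method names, and the file sizes are hypothetical stand-ins rather than the real Catalog interface, and a megabyte is assumed to be 1,000,000 bytes. The counter shows how the number of Catalog calls grows with the number of tables and partitions.&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-python"&gt;from collections import Counter


class FakeCatalog:
    """Toy stand-in for the Catalog that counts every call made to it."""

    def __init__(self):
        self.calls = Counter()
        self._partitions = {
            "foo": ["2024-09|device-101", "2024-10|device-101"],
            "bar": ["2024-10|device-202"],
        }
        # Invented file sizes in bytes, keyed by (table, partition_key).
        self._files = {
            ("foo", "2024-09|device-101"): [1_000_000, 3_000_000],
            ("foo", "2024-10|device-101"): [2_000_000],
            ("bar", "2024-10|device-202"): [500_000, 500_000, 1_000_000],
        }

    def list_tables(self):
        self.calls["list_tables"] += 1
        return list(self._partitions)

    def list_partitions(self, table):
        self.calls["list_partitions"] += 1
        return list(self._partitions[table])

    def list_file_sizes(self, table, partition_key):
        self.calls["list_file_sizes"] += 1
        return list(self._files[(table, partition_key)])


def partition_sizes_mb(catalog):
    """Return {(table, partition_key): size in MB}, one Catalog call per lookup."""
    sizes = {}
    for table in catalog.list_tables():
        for key in catalog.list_partitions(table):
            total_bytes = sum(catalog.list_file_sizes(table, key))
            sizes[(table, key)] = total_bytes / 1_000_000  # assumed MB definition
    return sizes


catalog = FakeCatalog()
sizes = partition_sizes_mb(catalog)
print(sizes[("foo", "2024-09|device-101")])  # 4.0
print(sum(catalog.calls.values()))           # 6&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Even this tiny example needs six calls (one for the tables, two for the partitions, and three for the files), and the count keeps growing with every additional table and partition.&lt;/p&gt;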

&lt;h3 id="the-performance-issue"&gt;8.1 The Performance Issue&lt;/h3&gt;

&lt;p&gt;Previously, when querying system tables like &lt;code class="language-markup"&gt;system.tables&lt;/code&gt;, &lt;code class="language-markup"&gt;system.partitions&lt;/code&gt;, or &lt;code class="language-markup"&gt;system.compactor&lt;/code&gt;, the Catalog would scan all the metadata and send it to the Querier via gRPC. The Querier would then convert the responses to &lt;a href="https://arrow.apache.org/docs/dev/r/reference/RecordBatch.html"&gt;Arrow record batches&lt;/a&gt; and run the query using &lt;a href="https://datafusion.apache.org/"&gt;DataFusion&lt;/a&gt;, which would apply the filters to discard irrelevant data.&lt;/p&gt;

&lt;p&gt;This meant that queries like these:&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;SELECT * FROM system.partitions;
SELECT * FROM system.partitions WHERE table_name = 'foo';&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Both did the &lt;strong&gt;same amount of work&lt;/strong&gt; in the Catalog and Querier, even though the second query has a filter. The system scanned everything, applying the filter at the Querier level, leading to unnecessary overhead and slow performance.&lt;/p&gt;

&lt;h2 id="the-solution"&gt;9. The solution&lt;/h2&gt;

&lt;h3 id="predicate-pushdown"&gt;9.1 Predicate Pushdown&lt;/h3&gt;

&lt;p&gt;A predicate is a condition used to filter data in a query, such as &lt;code class="language-markup"&gt;table_name = 'foo'&lt;/code&gt; or &lt;code class="language-markup"&gt;age &amp;gt; 20&lt;/code&gt;. We implemented &lt;strong&gt;predicate pushdown&lt;/strong&gt;, a common database optimization technique that moves (or “pushes down”) certain filters (predicates) as close as possible to the data source (in our case, the Catalog), as shown in Figure 4. This change reduced the amount of data fetched and transmitted to the Querier, which in turn reduced the workload on the Querier.
&lt;img src="//images.ctfassets.net/o7xu9whrs0u9/d05d570e798a4c4d8479934ba9db209d/4ed99fc34f6a614d20324e0befe3f88b/unnamed.png" alt="" /&gt;
Figure 4: Predicate Pushdown&lt;/p&gt;

&lt;p&gt;Before we implemented predicate pushdown, a query like &lt;code class="language-markup"&gt;SELECT * FROM system.partitions WHERE table_name = 'foo'&lt;/code&gt; fetched and formatted partition information for all tables in the Catalog, and the query engine promptly threw away everything except for &lt;code class="language-markup"&gt;foo&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;After predicate pushdown, the Querier avoids fetching and formatting partition information that it determines will be filtered out during query execution.&lt;/p&gt;
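
&lt;p&gt;To make the difference concrete, here is a minimal Python sketch of the two strategies. The partition rows and the function names (such as &lt;code class="language-markup"&gt;catalog_scan&lt;/code&gt;) are invented for illustration and do not reflect the actual Catalog API; the point is only how many rows travel from the Catalog to the Querier in each case.&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-python"&gt;# Invented metadata rows standing in for what the Catalog stores.
PARTITIONS = [
    {"table_name": "foo", "partition_key": "2024-10|device-101"},
    {"table_name": "foo", "partition_key": "2024-09|device-101"},
    {"table_name": "bar", "partition_key": "2024-10|device-101"},
    {"table_name": "baz", "partition_key": "2024-10|device-202"},
]


def catalog_scan(table_names=None):
    """Toy Catalog call: return rows, optionally restricted to table_names."""
    if table_names is None:
        return list(PARTITIONS)
    return [row for row in PARTITIONS if row["table_name"] in table_names]


def query_without_pushdown(table_name):
    fetched = catalog_scan()  # every row is sent to the Querier
    result = [row for row in fetched if row["table_name"] == table_name]
    return result, len(fetched)


def query_with_pushdown(table_name):
    fetched = catalog_scan(table_names={table_name})  # filtered at the source
    # The query engine can still re-apply the filter; it now has little to discard.
    result = [row for row in fetched if row["table_name"] == table_name]
    return result, len(fetched)


before, fetched_before = query_without_pushdown("foo")
after, fetched_after = query_with_pushdown("foo")
print(before == after)               # True: same answer either way
print(fetched_before, fetched_after) # 4 2&lt;/code&gt;&lt;/pre&gt;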

&lt;h3 id="multiple-predicates-pushdown"&gt;9.2 Multiple Predicates Pushdown&lt;/h3&gt;

&lt;p&gt;In addition to supporting single predicate pushdown, we extended support to handle &lt;strong&gt;multiple filters&lt;/strong&gt;. The simple examples in Section 9.1 are likely obvious, but in a real system, users can provide arbitrary predicates connected by &lt;code class="language-markup"&gt;AND&lt;/code&gt;, &lt;code class="language-markup"&gt;OR&lt;/code&gt;, &lt;code class="language-markup"&gt;IN&lt;/code&gt;, etc., and it is non-trivial to determine which predicates to push down. For example, consider a query like:&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;SELECT * 
FROM system.partitions 
WHERE (table_name = 'foo' OR table_name = 'bar')
  AND (partition_key = '2024-10|device-101' OR partition_key = '2024-09|device-101')&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We used DataFusion’s &lt;a href="https://docs.rs/datafusion/latest/datafusion/physical_expr/utils/struct.LiteralGuarantee.html#method.analyze"&gt;LiteralGuarantee::analyze&lt;/a&gt; to parse and simplify query predicates before pushing them down to the Catalog. This method has saved us a lot of engineering time and effort. A big shout-out to the DataFusion community for making this so simple to use!&lt;/p&gt;

&lt;p&gt;The current implementation supports multiple predicates pushdown for filters such as &lt;code class="language-markup"&gt;table_name&lt;/code&gt; combined with &lt;code class="language-markup"&gt;partition_key&lt;/code&gt; or &lt;code class="language-markup"&gt;table_name&lt;/code&gt; combined with &lt;code class="language-markup"&gt;partition_id&lt;/code&gt;, further reducing the amount of data processed by the Querier.&lt;/p&gt;
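
&lt;p&gt;As a rough illustration of what this kind of analysis produces, the following Python sketch derives, for each column, the set of literal values it must match for a predicate to be true. It is a toy analog, not DataFusion’s API: the tuple-based expression format and the &lt;code class="language-markup"&gt;literal_guarantees&lt;/code&gt; function are hypothetical, and only &lt;code class="language-markup"&gt;=&lt;/code&gt;, &lt;code class="language-markup"&gt;AND&lt;/code&gt;, and &lt;code class="language-markup"&gt;OR&lt;/code&gt; are handled.&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-python"&gt;def literal_guarantees(expr):
    """Map each column to the set of literals it must equal for expr to hold.

    expr is a nested tuple: ("eq", column, literal), ("and", left, right),
    or ("or", left, right). This format is invented for the sketch.
    """
    op = expr[0]
    if op == "eq":
        _, column, literal = expr
        return {column: {literal}}

    left = literal_guarantees(expr[1])
    right = literal_guarantees(expr[2])

    if op == "and":
        # Both sides must hold, so a column seen on both sides is intersected.
        merged = dict(left)
        for column, values in right.items():
            if column in merged:
                merged[column] = merged[column].intersection(values)
            else:
                merged[column] = values
        return merged

    if op == "or":
        # Only columns constrained on both sides keep a guarantee.
        shared = set(left).intersection(right)
        return {column: left[column].union(right[column]) for column in shared}

    raise ValueError("unsupported operator: " + op)


query = (
    "and",
    ("or", ("eq", "table_name", "foo"), ("eq", "table_name", "bar")),
    ("or",
     ("eq", "partition_key", "2024-10|device-101"),
     ("eq", "partition_key", "2024-09|device-101")),
)
guarantees = literal_guarantees(query)
print(sorted(guarantees["table_name"]))     # ['bar', 'foo']
print(sorted(guarantees["partition_key"]))  # ['2024-09|device-101', '2024-10|device-101']

mixed = ("or", ("eq", "table_name", "foo"), ("eq", "partition_id", 7))
print(literal_guarantees(mixed))            # {}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A column that is constrained on only one side of an &lt;code class="language-markup"&gt;OR&lt;/code&gt; ends up with no guarantee, so in this sketch nothing would be pushed down for it.&lt;/p&gt;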

&lt;h3 id="concurrent-data-fetching"&gt;9.3 Concurrent Data Fetching&lt;/h3&gt;

&lt;p&gt;Previously, the Querier made sequential API calls to the Catalog, where each request had to wait for the previous one to complete before proceeding. This added significant latency, especially when querying large datasets.&lt;/p&gt;

&lt;p&gt;We improved this by enabling &lt;strong&gt;concurrent API requests&lt;/strong&gt;, allowing the Querier to make multiple requests simultaneously. This greatly reduced the time needed to gather all necessary metadata.&lt;/p&gt;
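
&lt;p&gt;The effect can be illustrated with a small Python &lt;code class="language-markup"&gt;asyncio&lt;/code&gt; sketch. It is not the Querier’s implementation: &lt;code class="language-markup"&gt;fetch_partitions&lt;/code&gt; is a hypothetical stand-in for a Catalog request, and the sleep stands in for network latency. The sketch records how many requests are in flight at once under each strategy.&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-python"&gt;import asyncio

# Tracks how many simulated Catalog requests overlap at any moment.
stats = {"in_flight": 0, "peak": 0}


async def fetch_partitions(table):
    """Hypothetical Catalog request returning (table, partition names)."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # stand-in for network latency
    stats["in_flight"] -= 1
    return table, [table + "-p0", table + "-p1"]


async def fetch_sequentially(tables):
    # Each request waits for the previous one to finish.
    return [await fetch_partitions(table) for table in tables]


async def fetch_concurrently(tables):
    # All requests are issued up front and awaited together.
    return await asyncio.gather(*(fetch_partitions(table) for table in tables))


tables = ["foo", "bar", "baz"]

sequential = asyncio.run(fetch_sequentially(tables))
peak_sequential = stats["peak"]

stats["peak"] = 0
concurrent = asyncio.run(fetch_concurrently(tables))
peak_concurrent = stats["peak"]

print(sequential == concurrent)          # True: same data either way
print(peak_sequential, peak_concurrent)  # 1 3&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With sequential calls the total wait is roughly the sum of the individual latencies; with concurrent calls it is closer to the slowest single request.&lt;/p&gt;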

&lt;h2 id="performance-improvements"&gt;10. Performance improvements&lt;/h2&gt;

&lt;p&gt;Here is how much faster querying system tables became when using the filter &lt;code class="language-markup"&gt;WHERE table_name&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;system.tables: &lt;strong&gt;17% faster&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;system.partitions: &lt;strong&gt;65% faster&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;system.compactor: &lt;strong&gt;60% faster&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These improvements are based on a database with 100 tables, over 200 partitions, and more than 3,000 Parquet files. If your database has more tables, partitions, or Parquet files, you’ll see even more significant performance gains with filtered queries.&lt;/p&gt;

&lt;p&gt;Additionally, the average query latency with multiple filters is around 20 ms for queries like:&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="wrapping-up"&gt;11. Wrapping up&lt;/h2&gt;

&lt;p&gt;In this post, we explained how we improved the performance of system table queries by implementing predicate pushdown, optimizing the use of multiple filters, and enabling concurrent data fetching. These changes dramatically reduced query times, especially for databases with large amounts of metadata.&lt;/p&gt;

&lt;p&gt;With these optimizations, you can retrieve relevant system data more efficiently and eliminate long debugging waits.&lt;/p&gt;

&lt;p&gt;References:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/query-system-data/"&gt;Query system table in Cloud Dedicated&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://docs.influxdata.com/influxdb/clustered/admin/query-system-data/"&gt;Query system table in Clustered&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;ol&gt;
  &lt;li&gt;Note that in our actual deployments, there are several layers of caching that are not reflected in Figure 1.&lt;/li&gt;
&lt;/ol&gt;
</description>
      <pubDate>Thu, 31 Oct 2024 08:00:00 +0000</pubDate>
      <link>https://www.influxdata.com/blog/system-tables-part-two-influxdb/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/system-tables-part-two-influxdb/</guid>
      <category>Developer</category>
      <author>Chunchun Ye (InfluxData)</author>
    </item>
    <item>
      <title>System Tables Part 1: Introduction and Best Practices</title>
      <description>&lt;p&gt;As an InfluxDB &lt;a href="https://www.influxdata.com/products/influxdb-cloud/dedicated/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_intro_part_one_influxdb&amp;amp;utm_content=blog"&gt;Cloud Dedicated&lt;/a&gt; or &lt;a href="https://www.influxdata.com/products/influxdb-clustered/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_intro_part_one_influxdb&amp;amp;utm_content=blog"&gt;Clustered&lt;/a&gt; user, you may want to inspect your cluster to gain a better understanding of the size of your databases, tables, partitions, and compaction status. InfluxDB stores this essential metadata in &lt;em&gt;system tables&lt;/em&gt; (described in Section 1), which help inform decisions about cluster performance and maintenance.&lt;/p&gt;

&lt;h2 id="what-are-system-tableshttpswwwgooglecomurlqhttpsdocsinfluxdatacominfluxdbcloud-dedicatedadminquery-system-dataampsadampsourcedocsampust1728593664930076ampusgaovvaw0j2nrffoeiyz9-9xzz6gum"&gt;1. What are &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/query-system-data/"&gt;system tables&lt;/a&gt;?&lt;/h2&gt;

&lt;p&gt;System tables are “virtual” tables that present metadata for a specific database and provide insights into database storage. Each system table is scoped to a particular database and is read-only, meaning it cannot be modified.&lt;/p&gt;

&lt;p&gt;System tables are hidden by default, as high-frequency access to these tables can interfere with the ongoing operations of the database. Thus, querying system tables requires a special debug header with the request. Once the debug header is added (described in Section 2), you can query system tables using SQL, similar to any other table in InfluxDB.&lt;/p&gt;

&lt;p&gt;Here are the system tables that InfluxDB provides:&lt;/p&gt;

&lt;pre class=""&gt;&lt;code class="language-bash"&gt;+---------------+--------------------+-------------+------------+
| table_catalog | table_schema       | table_name  | table_type |
+---------------+--------------------+-------------+------------+
| ...           | ...                | ...         | ...        |
| public        | system             | compactor   | BASE TABLE |
| public        | system             | partitions  | BASE TABLE |
| public        | system             | queries     | BASE TABLE |
| public        | system             | tables      | BASE TABLE |
| ...           | ...                | ...         | ...        |
+---------------+--------------------+-------------+------------+&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In this blog, we will focus on three tables:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class="language-markup"&gt;system.tables&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class="language-markup"&gt;system.partitions&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class="language-markup"&gt;system.compactor&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Table&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Schema&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class="language-markup"&gt;system.tables&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Contains information about tables, such as table name and &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/custom-partitions/partition-templates/"&gt;partition template&lt;/a&gt; in the specific database.&lt;/td&gt;
      &lt;td&gt;&lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/query-system-data/#view-systemtables-schema"&gt;Link&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class="language-markup"&gt;system.partitions&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Contains information about &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/custom-partitions/"&gt;partitions&lt;/a&gt;, partition sizes, file count, etc.&lt;/td&gt;
      &lt;td&gt;&lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/query-system-data/#view-systempartitions-schema"&gt;Link&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class="language-markup"&gt;system.compactor&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Contains detailed information about compacted partitions at different compaction levels.&lt;/td&gt;
      &lt;td&gt;&lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/query-system-data/#view-systemcompactor-schema"&gt;Link&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; System tables are not part of InfluxDB’s stable API. They are subject to change, and compatibility is not guaranteed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; Querying system tables may impact write and query performance. Use them only for debugging purposes and use filters to optimize queries and minimize their impact on your cluster.&lt;/p&gt;

&lt;h2 id="accessing-system-tables"&gt;2. Accessing system tables&lt;/h2&gt;

&lt;p&gt;To access system tables, you must provide a debug header with the request. The specific commands to add this header vary depending on the client you are using.&lt;/p&gt;

&lt;h4 id="influxctlhttpsdocsinfluxdatacominfluxdbcloud-dedicatedreferencecliinfluxctlquery-cli"&gt;&lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/reference/cli/influxctl/query/"&gt;influxctl&lt;/a&gt; CLI&lt;/h4&gt;

&lt;p&gt;For &lt;code class="language-markup"&gt;influxctl&lt;/code&gt;, pass the &lt;code class="language-markup"&gt;--enable-system-tables&lt;/code&gt; flag:&lt;/p&gt;

&lt;pre class=""&gt;&lt;code class="language-bash"&gt;influxctl query \
  --enable-system-tables \
  --database DATABASE_NAME \
  --token DATABASE_TOKEN \
  "SQL_QUERY"&lt;/code&gt;&lt;/pre&gt;

&lt;h4 id="arrow-flight-sql-or-other-client-libraries"&gt;Arrow Flight SQL or other client libraries&lt;/h4&gt;

&lt;p&gt;For &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/reference/internals/arrow-flightsql/"&gt;Arrow Flight SQL&lt;/a&gt; or &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/query-data/execute-queries/client-libraries/"&gt;other client libraries&lt;/a&gt;, such as Go and Python, set the &lt;code class="language-markup"&gt;iox-debug&lt;/code&gt; header to &lt;code class="language-markup"&gt;true&lt;/code&gt;.&lt;/p&gt;
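
&lt;p&gt;In Python clients built on Arrow Flight, request headers are typically passed as a list of (name, value) byte pairs, for example through &lt;code class="language-markup"&gt;pyarrow.flight.FlightCallOptions(headers=...)&lt;/code&gt;. The sketch below only builds such a list; the helper name and the &lt;code class="language-markup"&gt;Bearer&lt;/code&gt; authorization header are assumptions, so check your client library’s documentation for the exact parameter that accepts it.&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-python"&gt;def system_table_headers(token):
    """Build per-call headers as (name, value) byte pairs (hypothetical helper)."""
    return [
        # Assumed authorization scheme; adjust to what your client expects.
        (b"authorization", b"Bearer " + token.encode("utf-8")),
        # The debug header that makes system tables visible.
        (b"iox-debug", b"true"),
    ]


headers = system_table_headers("DATABASE_TOKEN")
print(headers[1])  # (b'iox-debug', b'true')&lt;/code&gt;&lt;/pre&gt;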

&lt;h2 id="querying-system-tables-examples"&gt;3. Querying system tables: Examples&lt;/h2&gt;

&lt;h4 id="view-the-partition-template-of-a-specific-table"&gt;1. View the partition template of a specific table&lt;/h4&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;SELECT * FROM system.tables WHERE table_name = 'TABLE_NAME'&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Example Result:&lt;/p&gt;

&lt;pre class=""&gt;&lt;code class="language-bash"&gt;+-----------------+--------------------------------------------------------+
| table_name      | partition_template                                     |
+-----------------+--------------------------------------------------------+
| your_table_name | {"parts":[{"timeFormat":"%Y-%m"},{"tagValue":"col1"}]} |
+-----------------+--------------------------------------------------------+&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If a table doesn’t include a &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/custom-partitions/partition-templates/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_intro_part_one_influxdb&amp;amp;utm_content=blog"&gt;partition template&lt;/a&gt; in the output of this command, the table uses the default (1 day) partition strategy and doesn’t partition by tags.&lt;/p&gt;
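
&lt;p&gt;The example template above has a time part and a tag part. As a rough sketch of how such a template could map a row to a partition key, the Python snippet below renders one segment per part and joins them. It assumes segments are joined with &lt;code class="language-markup"&gt;|&lt;/code&gt; and handles only &lt;code class="language-markup"&gt;timeFormat&lt;/code&gt; and &lt;code class="language-markup"&gt;tagValue&lt;/code&gt; parts; the &lt;code class="language-markup"&gt;render_partition_key&lt;/code&gt; function and the sample row are hypothetical.&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-python"&gt;import json
from datetime import datetime, timezone

# The partition_template value from the example result above.
template = json.loads('{"parts":[{"timeFormat":"%Y-%m"},{"tagValue":"col1"}]}')


def render_partition_key(template, timestamp, tags, delimiter="|"):
    """Render one key segment per template part and join them (illustrative only)."""
    segments = []
    for part in template["parts"]:
        if "timeFormat" in part:
            segments.append(timestamp.strftime(part["timeFormat"]))
        elif "tagValue" in part:
            segments.append(tags[part["tagValue"]])
        else:
            raise ValueError("unsupported template part: " + repr(part))
    return delimiter.join(segments)


key = render_partition_key(
    template,
    datetime(2024, 10, 31, 8, 0, tzinfo=timezone.utc),  # sample row timestamp
    {"col1": "device-101"},                              # sample row tags
)
print(key)  # 2024-10|device-101&lt;/code&gt;&lt;/pre&gt;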

&lt;h4 id="view-the-number-of-partitions-and-total-size-per-table"&gt;2. View the number of partitions and total size per table&lt;/h4&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;SELECT
  table_name,
  COUNT(*) AS partition_count,
  SUM(total_size_mb) AS total_size_mb
FROM system.partitions
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Example Result:&lt;/p&gt;

&lt;pre class=""&gt;&lt;code class="language-bash"&gt;+------------+-----------------+---------------+
| table_name | partition_count | total_size_mb |
+------------+-----------------+---------------+
| foo        | 1               | 2             |
| bar        | 4               | 5             |
| baz        | 10              | 23            |
+------------+-----------------+---------------+&lt;/code&gt;&lt;/pre&gt;

&lt;h4 id="view-the-size-for-different-levels-of-compacted-files"&gt;3. View the size for different levels of compacted files*&lt;/h4&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;SELECT
  table_name,
  SUM(total_l0_files) AS l0_files,
  SUM(total_l1_files) AS l1_files,
  SUM(total_l2_files) AS l2_files,
  SUM(total_l0_bytes) AS l0_bytes,
  SUM(total_l1_bytes) AS l1_bytes,
  SUM(total_l2_bytes) AS l2_bytes
FROM system.compactor
WHERE table_name IN ('foo', 'bar', 'baz')
GROUP BY table_name&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;*Compacted files are compressed Parquet files processed by the &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/reference/internals/storage-engine/#compactor"&gt;Compactor&lt;/a&gt; to optimize storage. These files have different &lt;a href="https://www.infoworld.com/article/2337820/compactor-a-hidden-engine-of-database-performance.html#compaction-levels"&gt;compaction levels&lt;/a&gt;: L0, L1, and L2. L0, or “Level 0”, represents newly ingested, uncompacted small files, while L2, or “Level 2”, represents compacted, non-overlapping files.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Example Result:&lt;/p&gt;

&lt;pre class=""&gt;&lt;code class="language-bash"&gt;+------------+----------+----------+----------+----------+----------+----------+
| table_name | l0_files | l1_files | l2_files | l0_bytes | l1_bytes | l2_bytes |
+------------+----------+----------+----------+----------+----------+----------+
| foo        | 0        | 1        | 0        | 0        | 20659    | 0        |
| bar        | 0        | 1        | 0        | 0        | 7215     | 0        |
| baz        | 0        | 1        | 0        | 0        | 10784    | 0        |
+------------+----------+----------+----------+----------+----------+----------+&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="optimize-queries-to-reduce-cluster-impact"&gt;4. Optimize queries to reduce cluster impact&lt;/h2&gt;

&lt;p&gt;Querying system tables can degrade the performance of other common queries, especially if you are trying to view every detail in clusters with hundreds of tables, hundreds of thousands of partitions, and millions of &lt;a href="https://www.influxdata.com/glossary/apache-parquet/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_intro_part_one_influxdb&amp;amp;utm_content=blog"&gt;Parquet files&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To reduce the performance impact, we suggest selecting information for a specific table or a particular partition by adding filters as follows:&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;WHERE table_name = '...'
WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...
WHERE partition_id = ...&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;See the documentation on how to obtain a &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/custom-partitions/#partition-keys"&gt;partition_key&lt;/a&gt; and a &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/query-system-data/#retrieve-a-partition-id"&gt;partition_id&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id="use-the-most-efficient-filters"&gt;5. Use the most efficient filters&lt;/h2&gt;

&lt;p&gt;Among the above filters, the following filters are specially optimized and significantly reduce query latency to about 20 ms, even on our largest clusters:&lt;/p&gt;

&lt;pre class="line-numbers"&gt;&lt;code class="language-sql"&gt;WHERE table_name = '...' AND partition_key = '...'
WHERE table_name = '...' AND partition_id = ...&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="bringing-it-home"&gt;6. Bringing it home&lt;/h2&gt;

&lt;p&gt;In this first post, we introduced system tables, explained how to access them, and discussed how to optimize your queries with filters. In the next post, we will explain how we improved the performance of system tables.&lt;/p&gt;

&lt;p&gt;References:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/admin/query-system-data/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_intro_part_one_influxdb&amp;amp;utm_content=blog"&gt;Query system table in Cloud Dedicated&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://docs.influxdata.com/influxdb/clustered/admin/query-system-data/?utm_source=website&amp;amp;utm_medium=direct&amp;amp;utm_campaign=system_tables_intro_part_one_influxdb&amp;amp;utm_content=blog"&gt;Query system table in Clustered&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
      <pubDate>Tue, 29 Oct 2024 08:00:00 +0000</pubDate>
      <link>https://www.influxdata.com/blog/system-tables-intro-part-one-influxdb/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/system-tables-intro-part-one-influxdb/</guid>
      <category>Developer</category>
      <category>Getting Started</category>
      <author>Chunchun Ye (InfluxData)</author>
    </item>
    <item>
      <title>Querying InfluxDB 3.0 Using JDBC Driver for Tableau</title>
      <description>&lt;p&gt;&lt;a href="https://www.influxdata.com/products/influxdb-overview/"&gt;InfluxDB 3.0&lt;/a&gt; now offers support for connecting Tableau to InfluxDB 3.0 to query data for visualization using the &lt;a href="https://arrow.apache.org/docs/java/flight_sql_jdbc_driver.html"&gt;Apache Arrow Flight SQL JDBC&lt;/a&gt; driver (Flight SQL driver). In this blog post, we will explore the capabilities and benefits of this integration and provide some instructions on how to connect them.&lt;/p&gt;

&lt;h2 id="background"&gt;Background&lt;/h2&gt;

&lt;p&gt;Tableau offers two main ways to connect to third-party data sources: file and server. &lt;a href="https://www.influxdata.com/blog/forecasting-visualizing-time-series-tableau-influxdb-cloud/"&gt;This article&lt;/a&gt; describes how to connect Tableau to &lt;a href="https://www.influxdata.com/glossary/apache-parquet/"&gt;Parquet&lt;/a&gt; files. By contrast, this blog describes how to connect Tableau to an InfluxDB 3.0 database through the Flight SQL driver.&lt;/p&gt;

&lt;div style="padding:56.25% 0 0 0;position:relative;"&gt;&lt;iframe src="https://player.vimeo.com/video/850608638?h=c19a12a5ae&amp;amp;badge=0&amp;amp;autopause=0&amp;amp;player_id=0&amp;amp;app_id=58479" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen="" style="position:absolute;top:0;left:0;width:100%;height:100%;" title="Query InfluxDB 3.0 Using JDBC Driver for Tableau"&gt;&lt;/iframe&gt;&lt;/div&gt;
&lt;script src="https://player.vimeo.com/api/player.js"&gt;&lt;/script&gt;

&lt;h2 id="what-is-apache-arrow-flight-sql-jdbc-driver"&gt;What is Apache Arrow Flight SQL JDBC Driver?&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://docs.oracle.com/javase/tutorial/jdbc/basics/index.html"&gt;JDBC&lt;/a&gt; (Java Database Connectivity) is a standard way to connect to and interact with a database. The Flight SQL driver is a JDBC driver implementation that uses the underlying &lt;a href="https://arrow.apache.org/docs/format/FlightSql.html"&gt;Flight SQL&lt;/a&gt; protocol, allowing any program that connects via JDBC to connect to and interact with databases that support Flight SQL. Because InfluxDB 3.0 supports Flight SQL, this driver acts as a bridge, enabling Tableau to establish a connection with InfluxDB 3.0, execute queries, and retrieve time series data for visualization.&lt;/p&gt;

&lt;h2 id="benefits-of-connecting-tableau-to-influxdb-30-using-the-jdbc-driver"&gt;Benefits of connecting Tableau to InfluxDB 3.0 using the JDBC Driver&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Enhanced performance:&lt;/strong&gt; The Apache Arrow Flight SQL JDBC driver leverages the high-performance capabilities of &lt;a href="https://www.influxdata.com/glossary/apache-arrow/"&gt;Apache Arrow&lt;/a&gt; and the efficient data transfer protocol of &lt;a href="https://www.influxdata.com/glossary/apache-arrow-flight-sql/"&gt;Arrow Flight&lt;/a&gt;. This results in faster data transfers and improved query execution times, enabling Tableau users to retrieve and visualize data from InfluxDB 3.0 with exceptional speed and efficiency.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Real-time visualizations:&lt;/strong&gt; By connecting Tableau to InfluxDB 3.0 using the JDBC driver, users can create interactive visualizations in real time, as soon as InfluxDB ingests their data.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Seamless integration:&lt;/strong&gt; The Apache Arrow Flight SQL JDBC driver provides a standardized interface for connecting Tableau to InfluxDB 3.0. This means that Tableau users can leverage their existing knowledge and skills to connect to InfluxDB 3.0 without the need for complex data transformations or additional tools.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id="who-can-use-this-feature"&gt;Who can use this feature&lt;/h2&gt;

&lt;p&gt;InfluxDB 3.0 users, both &lt;a href="https://www.influxdata.com/products/influxdb-cloud/serverless/"&gt;InfluxDB Cloud Serverless&lt;/a&gt; and &lt;a href="https://www.influxdata.com/products/influxdb-cloud/dedicated/"&gt;InfluxDB Cloud Dedicated&lt;/a&gt;, have access to this feature on Tableau Desktop as of publication. In the future, additional InfluxDB 3.0 products will also support the JDBC driver.&lt;/p&gt;

&lt;h2 id="how-to-connect-tableau-to-influxdb-30"&gt;How to connect Tableau to InfluxDB 3.0&lt;/h2&gt;

&lt;p&gt;You can find instructions for connecting Tableau to InfluxDB 3.0 &lt;a href="https://docs.influxdata.com/influxdb/cloud-dedicated/query-data/sql/execute-queries/tableau/"&gt;Cloud Dedicated&lt;/a&gt; or &lt;a href="https://docs.influxdata.com/influxdb/cloud-serverless/query-data/sql/execute-queries/tableau/"&gt;Cloud Serverless&lt;/a&gt; in our docs.&lt;/p&gt;

&lt;h2 id="unlocking-real-time-insights"&gt;Unlocking real-time insights&lt;/h2&gt;

&lt;p&gt;The Flight SQL JDBC driver integration between InfluxDB 3.0 and Tableau presents exciting opportunities for visualizing and analyzing time series data in real time with enhanced performance. By connecting these two powerful tools, users can unlock real-time insights, streamline workflows, and create compelling visualizations and reports. The combination of InfluxDB 3.0’s time series capabilities and Tableau’s analytical prowess empowers organizations to make data-driven decisions efficiently and effectively.&lt;/p&gt;
</description>
      <pubDate>Fri, 23 Jun 2023 07:35:00 +0000</pubDate>
      <link>https://www.influxdata.com/blog/querying-influxdb-3-0-using-jdbc-driver-tableau/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/querying-influxdb-3-0-using-jdbc-driver-tableau/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <author>Chunchun Ye (InfluxData)</author>
    </item>
  </channel>
</rss>
