A New Way to Debug Query Performance in Cloud Dedicated
By Reid Kaufmann, Developer · Jan 20, 2026
I’d like to share a new influxctl ease-of-use feature in v2.12.0 that makes it easier to optimize important queries or debug slow ones. influxctl has been able to send queries and display the results in JSON or tabular formats for some time. (Note: this CLI utility is specific to Cloud Dedicated and Clustered, as are many of the specifics in this post.) In Clustered, you can monitor querier pods’ logs, and in both Dedicated and Clustered, metrics on individual queries’ performance can be found in the system tables. Both of those options offer a lot of data, enough that it can be hard to digest quickly. Additionally, associating a single execution of a query with its log entry is tedious. A new feature, the --perf-debug flag for the influxctl query command (release notes), accelerates the experimentation cycle by providing real-time feedback, allowing you to stay in the context of your shell as you tweak your query.
Sample output
The new flag, --perf-debug, will execute a query, collect and discard the results, and emit execution metrics instead. When --format is omitted, output defaults to a tabular format with units dynamically chosen for human readability. In the second execution below, --format json is specified to emit a data format appropriate for programmatic consumption: in a nod to the querier log, it uses keys with shorter variable names, delimits words with underscores, and sports consistent units (bytes, seconds as a float).
In the tabular format, you can also see a demarcation between client and server metrics.
$ influxctl query --perf-debug --token REDACTED --database reidtest3 --language influxql "SELECT SUM(i), non_negative_difference(SUM(i)) as diff_i FROM data WHERE time > '2025-11-07T01:20:00Z' AND time < '2025-11-07T03:00:00Z' AND runid = '540cd752bb6411f0a23e30894adea878' GROUP BY time(5m)"
+--------------------------+----------+
| Metric | Value |
+--------------------------+----------+
| Client Duration | 1.222 s |
| Output Rows | 20 |
| Output Size | 647 B |
+--------------------------+----------+
| Compute Duration | 37.2 ms |
| Execution Duration | 243.8 ms |
| Ingester Latency Data | 0 |
| Ingester Latency Plan | 0 |
| Ingester Partition Count | 0 |
| Ingester Response | 0 B |
| Ingester Response Rows | 0 |
| Max Memory | 70 KiB |
| Parquet Files | 1 |
| Partitions | 1 |
| Planning Duration | 9.6 ms |
| Queue Duration | 286.6 µs |
+--------------------------+----------+
$ influxctl query --perf-debug --format json --token REDACTED --database reidtest3 --language influxql "SELECT SUM(i), non_negative_difference(SUM(i)) as diff_i FROM data WHERE time > '2025-11-07T01:20:00Z' AND time < '2025-11-07T03:00:00Z' AND runid = '540cd752bb6411f0a23e30894adea878' GROUP BY time(5m)"
{
"client_duration_secs": 1.101,
"compute_duration_secs": 0.037,
"execution_duration_secs": 0.247,
"ingester_latency_data": 0,
"ingester_latency_plan": 0,
"ingester_partition_count": 0,
"ingester_response_bytes": 0,
"ingester_response_rows": 0,
"max_memory_bytes": 71744,
"output_bytes": 647,
"output_rows": 20,
"parquet_files": 1,
"partitions": 1,
"planning_duration_secs": 0.009,
"queue_duration_secs": 0
}
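Because the JSON output is a single flat object, it’s straightforward to consume in scripts. As a minimal sketch (assuming jq is installed and that influxctl writes the JSON to stdout; perf_runs.csv is just an example log file), you could append one CSV row of server and client durations per run:
$ influxctl query --perf-debug --format json --token REDACTED --database reidtest3 --language influxql \
    "SELECT SUM(i) FROM data WHERE time > '2025-11-07T01:20:00Z' AND time < '2025-11-07T03:00:00Z' GROUP BY time(5m)" \
  | jq -r '[.planning_duration_secs, .queue_duration_secs, .execution_duration_secs, .client_duration_secs] | @csv' \
  >> perf_runs.csv
Collecting a handful of rows like this makes it easy to spot drift as you tweak the query.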
Notes
Client duration includes the time to open the connection to the server. In the example, you can see a big delta between that and the server’s total duration: roughly 1.2 s on the client versus about 0.25 s of planning, queueing, and execution on the server. When I ran this command, my client and database server were not colocated. Additionally, influxctl may not be tuned for optimal connection latency. Your native client probably caches connections and might not suffer this latency. When tuning your query, it’s more important to look at the durations recorded by the server.
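If you’d rather compute that gap than eyeball it, the JSON output makes it a one-liner. A minimal sketch, assuming jq is available and treating planning + queue + execution as the server-side total (the query is abbreviated from the sample above):
$ influxctl query --perf-debug --format json --token REDACTED --database reidtest3 --language influxql \
    "SELECT SUM(i) FROM data WHERE time > '2025-11-07T01:20:00Z' AND time < '2025-11-07T03:00:00Z' GROUP BY time(5m)" \
  | jq '.client_duration_secs - (.planning_duration_secs + .queue_duration_secs + .execution_duration_secs)'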
Output size is the in-memory size of the results in Arrow format, after gzip inflation (if client and server agree on compression), so this metric does not report the bytes actually transferred over the network. Reporting network bytes transferred might be more useful, so that’s a potential future enhancement. Even so, the current value is still a useful relative measure for comparing different queries.
Ingester metrics are zeroed out if the ingester has no partitions with unpersisted data matching the query. In Serverless, Dedicated, and Clustered, queries always consult ingesters, so a 0 in ingester latency can be misleading: it doesn’t necessarily mean the ingesters weren’t involved.
Parquet files indicate how many files were traversed for the query. However, if the query was optimized by a ProgressiveEvalExec plan (typically simple sorted LIMIT queries without aggregations; verify with EXPLAIN ANALYZE), this value may not be useful: it is calculated at planning time, so it reflects the potential number of files to be accessed based on the time range, rather than the number actually read before the LIMIT was reached. For most queries, this metric is a handy indicator, but it’s worth noting that the query log also contains a related metric, deduplicated_parquet_files, which tells us how many of the files had overlapping time ranges, requiring the querier to merge/sort/deduplicate data. It’s normal to have a few such files at the leading edge, but this operation becomes a serious bottleneck if too much data needs to be deduplicated (managing this problem is the compactor’s main responsibility).
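If you want to check whether ProgressiveEvalExec is in play before reading too much into the Parquet Files count, one rough sketch (the query here is only a placeholder; adjust the table, token, and database to your setup) is to run the statement under EXPLAIN ANALYZE and search the plan text for the operator:
$ influxctl query --token REDACTED --database reidtest3 --language sql \
    "EXPLAIN ANALYZE SELECT i FROM data WHERE time > now() - interval '1 hour' ORDER BY time DESC LIMIT 1" \
  | grep -i ProgressiveEvalExec
If the operator shows up in the plan, treat the Parquet Files value with a grain of salt.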
Query durations vary, and a query can be executed several times (and at different times of day) to get a sense of the variation.
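A small shell loop makes that convenient. A minimal sketch, assuming jq is available and reusing (in abbreviated form) the query from the sample output above:
$ for n in 1 2 3 4 5; do
    influxctl query --perf-debug --format json --token REDACTED --database reidtest3 --language influxql \
      "SELECT SUM(i) FROM data WHERE time > '2025-11-07T01:20:00Z' AND time < '2025-11-07T03:00:00Z' AND runid = '540cd752bb6411f0a23e30894adea878' GROUP BY time(5m)" \
      | jq -c --argjson run "$n" '{run: $run, execution_duration_secs, max_memory_bytes, client_duration_secs}'
  done
Expect the first iteration or two to run slower if the relevant parquet files aren’t cached yet; the next section covers why.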
Potential sources of latency or variability
Cache warmup (shows up in Execution Duration): The first few times a query over a particular time frame for a table is executed, the duration may be significantly higher due to parquet cache misses. Queries are spread across multiple queriers in a round-robin fashion, and each querier has an independent parquet cache, so expect a few cache misses while each querier incurs the delay of retrieving parquet files from the object store. Because there are multiple load balancer pods and other clients executing queries, it’s indeterminate how many executions it will take to warm every querier’s cache. If the queries are on the “leading” edge of the data, be aware that persistence of new data or compaction may also periodically cause cache misses. Large L2 file compactions cause a greater disruption, while the latency from typical small incremental persists may be imperceptible.
Corollary: cache eviction. Other queries executing at the same time may cause cache eviction to make room for their data. Given a high rate of queries covering a lot of data (many series and/or a wide time frame), it’s possible to thrash the cache. In this case, influxctl can’t provide much context about other queries running concurrently (exception: a non-zero Queue Duration does indicate that maximum execution concurrency was reached). You may still need to review the query log or observability dashboards. Some query loads are cyclical, and so is the work of the compactor, depending on ingest and partitioning rates; you may therefore get better performance in the afternoon than in the morning. When the CPU is maxed out, all recorded server latencies tend to increase.
Variation in data density or volume will affect all queries to some degree, but will impact computationally intensive queries the most; this shows up in Execution Duration. Monitor the Parquet Files or Output Rows metrics as possible proxies for it. Be aware that changing a tag value in the WHERE clause or the time constraints may affect latency, depending on the underlying data, and not all writers necessarily write at the same frequency. When tuning aggregate queries, you may occasionally want to add a COUNT() field and drop the --perf-debug flag to see how many records are contributing (see the example below). For some queries (SELECT DISTINCT, for example), tag cardinality and time range can greatly impact performance.
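As a quick sketch of that COUNT() trick, using the field and table from the sample query above (a plain results run, so no --perf-debug):
$ influxctl query --token REDACTED --database reidtest3 --language influxql \
    "SELECT SUM(i), COUNT(i) FROM data WHERE time > '2025-11-07T01:20:00Z' AND time < '2025-11-07T03:00:00Z' AND runid = '540cd752bb6411f0a23e30894adea878' GROUP BY time(5m)"
The per-bucket counts show how many records feed each aggregated value.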
See the documentation for more general information on query optimization.
Other things to try
- Check if Planning Duration is substantially higher than Execution Duration. This can be caused by high numbers of tables or partitions, which may be excessive or intended. Custom partitioning can help reduce execution latency, but can increase planning latency—find the right balance for your workload.
- Check if Ingester Latency or Response is abnormally high/large. It may indicate a need for, or a problem with, custom partitioning, resulting in excessive delay in persisting partitions.
- If Parquet Files is abnormally large, check that the query has a time constraint and that it’s reasonable. Also, check observability dashboards to see if the compactor is not keeping up, or look for skipped partitions. If custom partitioning on a tag is in use, make sure the query specifies a value for that tag in the WHERE clause (also note that regexes that don’t equate to simple equality checks on that tag will prevent partition pruning).
- How much does increasing or decreasing the time range of the query change the execution metrics?
- Compare similar queries against different tables, schemas, or partitioning schemes.
- Compare different means of achieving the same result (SQL ORDER BY time DESC LIMIT 1 vs. InfluxQL LAST()); see the sketch after this list.
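As a concrete sketch of that last comparison, using the sample table and field from earlier (the time ranges and token are placeholders; adjust to your data):
$ influxctl query --perf-debug --token REDACTED --database reidtest3 --language sql \
    "SELECT i FROM data WHERE time > now() - interval '1 day' ORDER BY time DESC LIMIT 1"
$ influxctl query --perf-debug --token REDACTED --database reidtest3 --language influxql \
    "SELECT LAST(i) FROM data WHERE time > now() - 1d"
Comparing the server-side durations and the Parquet Files count from the two runs shows which formulation is cheaper for your data.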
You can learn a lot through experimentation and finding correlations beyond those suggested here. We hope this minor feature makes it a little easier!