InfluxDB to Iceberg Data Transfer

The InfluxDB to Iceberg plugin moves time series data from InfluxDB 3 into Apache Iceberg for long-term retention, lakehouse analytics, and downstream data workflows without custom export pipelines. It supports scheduled transfers and HTTP-triggered backfills, so telemetry is easier to preserve, share, and use across open, scalable analytics environments.

Configuration

Plugin parameters may be specified as key-value pairs in the --trigger-arguments flag (CLI) or in the trigger_arguments field (API) when creating a trigger. Some plugins support TOML configuration files, which can be specified using the plugin’s config_file_path parameter.
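
As a rough illustration of the key-value format, the sketch below joins a dictionary of parameters into a single string suitable for --trigger-arguments. The helper name build_trigger_arguments is hypothetical, and the comma check rests on the assumption that commas only separate pairs.

```python
def build_trigger_arguments(params: dict) -> str:
    """Join parameters into the key=value,key=value form passed to --trigger-arguments."""
    parts = []
    for key, value in params.items():
        text = str(value)
        # Assumption: commas separate pairs, so a comma inside a value would be ambiguous.
        if "," in text:
            raise ValueError(f"value for {key!r} contains a comma: {text!r}")
        parts.append(f"{key}={text}")
    return ",".join(parts)


args = build_trigger_arguments({"measurement": "cpu", "window": "24h", "namespace": "default"})
print(args)  # measurement=cpu,window=24h,namespace=default
```

The resulting string can be pasted into the --trigger-arguments flag or sent as the trigger_arguments field.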

If a plugin supports multiple trigger specifications, some parameters may depend on the trigger specification that you use.

Plugin metadata

This plugin includes a JSON metadata schema in its docstring that defines supported trigger types and configuration parameters. This metadata enables the InfluxDB 3 Explorer UI to display and configure the plugin.

Scheduler trigger parameters

Required parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| measurement | string | required | Source measurement containing data to transfer |
| window | string | required | Time window for data transfer. Format: <number><unit> (e.g., “1h”, “30d”) |
| catalog_configs | string | required | Base64-encoded JSON string containing Iceberg catalog configuration |
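
Because catalog_configs must be a Base64-encoded JSON string, a small helper can produce it from a dictionary. This is a minimal sketch; the function name encode_catalog_config is hypothetical, and the sample catalog uses the same URI as Example 1 below.

```python
import base64
import json


def encode_catalog_config(config: dict) -> str:
    """Serialize the catalog configuration to JSON, then Base64-encode it on one line."""
    raw = json.dumps(config).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


encoded = encode_catalog_config({"uri": "http://nessie:9000"})
print(encoded)  # eyJ1cmkiOiAiaHR0cDovL25lc3NpZTo5MDAwIn0=
```

base64.b64encode never inserts line breaks, so the result can be used directly as a trigger argument value.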

Optional parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| included_fields | string | all fields/tags | Dot-separated list of fields and tags to include (e.g., “usage_user.host”) |
| excluded_fields | string | none | Dot-separated list of fields and tags to exclude |
| namespace | string | “default” | Iceberg namespace for the target table |
| table_name | string | measurement name | Iceberg table name |
| auto_update_schema | string | false | Automatically update Iceberg table schema when data doesn’t match existing schema |
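
To make the dot-separated format concrete, here is a sketch of how such a value could be split and applied to a list of column names. It is not the plugin's own code: the helper names are hypothetical, and two behaviours are assumptions made for illustration (the time column is kept when an include list is given, and exclusions are applied after inclusions).

```python
def parse_field_list(value):
    """Split a dot-separated parameter such as "usage_user.host" into names."""
    if not value:
        return []
    return [name for name in value.split(".") if name]


def select_columns(columns, included=None, excluded=None):
    """Return the columns that survive the include and exclude filters."""
    include = parse_field_list(included)
    exclude = set(parse_field_list(excluded))
    # Assumption: an empty include list means "all columns", and "time" is always kept.
    chosen = [c for c in columns if not include or c in include or c == "time"]
    # Assumption: exclusions are applied after inclusions.
    return [c for c in chosen if c not in exclude]


columns = ["time", "host", "usage_user", "usage_system"]
print(select_columns(columns, included="usage_user.host"))
# ['time', 'host', 'usage_user']
print(select_columns(columns, excluded="usage_system"))
# ['time', 'host', 'usage_user']
```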

TOML configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| config_file_path | string | none | TOML config file path relative to PLUGIN_DIR (required for TOML configuration) |

To use a TOML configuration file, set the PLUGIN_DIR environment variable and pass config_file_path in the trigger arguments. Setting PLUGIN_DIR is required in addition to passing the --plugin-dir flag when you start InfluxDB 3.

Example TOML configuration

influxdb_to_iceberg_config_scheduler.toml

For more information on using TOML configuration files, see the Using TOML Configuration Files section in the influxdb3_plugins/README.md.

HTTP trigger parameters

Request body structure

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| measurement | string | Yes | Source measurement containing data to transfer |
| catalog_configs | object | Yes | Iceberg catalog configuration dictionary. See PyIceberg catalog documentation |
| included_fields | array | No | List of field and tag names to include in replication |
| excluded_fields | array | No | List of field and tag names to exclude from replication |
| namespace | string | No | Target Iceberg namespace (default: “default”) |
| table_name | string | No | Target Iceberg table name (default: measurement name) |
| batch_size | string | No | Batch size duration for processing (default: “1d”). Format: <number><unit> |
| backfill_start | string | No | ISO 8601 datetime with timezone for backfill start |
| backfill_end | string | No | ISO 8601 datetime with timezone for backfill end |
| auto_update_schema | boolean | No | Automatically update Iceberg table schema when data doesn’t match existing schema (default: false) |
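
A request body with this structure can also be assembled from Python. The sketch below only builds the request object and does not send it. The helper name build_backfill_request is hypothetical, the endpoint path assumes a trigger created with the request:replicate specification as in Example 2, and the Content-Type header is an addition made here for illustration.

```python
import json
import urllib.request

REQUIRED_KEYS = ("measurement", "catalog_configs")


def build_backfill_request(base_url: str, token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the HTTP trigger."""
    missing = [key for key in REQUIRED_KEYS if key not in body]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return urllib.request.Request(
        url=f"{base_url}/api/v3/engine/replicate",  # assumes the request:replicate trigger spec
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_backfill_request(
    "http://localhost:8181",
    "YOUR_TOKEN",
    {
        "measurement": "temperature",
        "catalog_configs": {"type": "sql", "uri": "sqlite:///path/to/catalog.db"},
        "batch_size": "12h",
    },
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```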

Examples

Example 1: Basic scheduled transfer

Transfer CPU metrics to Iceberg every hour:

# Create trigger with base64-encoded catalog config
# Original JSON: {"uri": "http://nessie:9000"}
# Base64: eyJ1cmkiOiAiaHR0cDovL25lc3NpZTo5MDAwIn0=
influxdb3 create trigger \
  --database metrics \
  --path "gh:influxdata/influxdb_to_iceberg/influxdb_to_iceberg.py" \
  --trigger-spec "every:1h" \
  --trigger-arguments 'measurement=cpu,window=24h,catalog_configs="eyJ1cmkiOiAiaHR0cDovL25lc3NpZTo5MDAwIn0="' \
  cpu_to_iceberg

# Write test data
influxdb3 write \
  --database metrics \
  "cpu,host=server1 usage_user=45.2,usage_system=12.1"

# After trigger runs, data is available in Iceberg table "default.cpu"

Expected output

  • Creates Iceberg table default.cpu with schema matching the measurement
  • Transfers all CPU data from the last 24 hours
  • Appends new data on each hourly run

Example 2: HTTP backfill with field filtering

Backfill specific fields from historical data:

# Create and enable HTTP trigger
influxdb3 create trigger \
  --database metrics \
  --path "gh:influxdata/influxdb_to_iceberg/influxdb_to_iceberg.py" \
  --trigger-spec "request:replicate" \
  iceberg_backfill

influxdb3 enable trigger --database metrics iceberg_backfill

# Request backfill via HTTP
curl -X POST http://localhost:8181/api/v3/engine/replicate \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "measurement": "temperature",
    "catalog_configs": {
      "type": "sql",
      "uri": "sqlite:///path/to/catalog.db"
    },
    "included_fields": ["temp_celsius", "humidity", "sensor_id"],
    "namespace": "weather",
    "table_name": "temperature_history",
    "batch_size": "12h",
    "backfill_start": "2024-01-01T00:00:00+00:00",
    "backfill_end": "2024-01-07T00:00:00+00:00"
  }'

Expected output

  • Creates Iceberg table weather.temperature_history
  • Transfers only the columns listed in included_fields (temp_celsius, humidity, and sensor_id)
  • Processes data in 12-hour batches across the specified date range (2024-01-01 to 2024-01-07)
  • Returns status of the backfill operation
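
To show what batching over a backfill range means, the sketch below parses a <number><unit> duration and splits the range from this example into consecutive windows. It illustrates the idea rather than the plugin's implementation: the function names are hypothetical, and the set of unit suffixes beyond h and d is an assumption.

```python
import re
from datetime import datetime, timedelta

# Assumption: only these unit suffixes are handled in this sketch.
UNITS = {"s": "seconds", "min": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_duration(text: str) -> timedelta:
    """Convert a <number><unit> string such as "12h" or "30d" into a timedelta."""
    match = re.fullmatch(r"(\d+)(s|min|h|d|w)", text)
    if not match:
        raise ValueError(f"invalid duration: {text!r}")
    number, unit = match.groups()
    return timedelta(**{UNITS[unit]: int(number)})


def split_batches(start: str, end: str, batch_size: str):
    """Split [start, end) into consecutive windows no longer than batch_size."""
    step = parse_duration(batch_size)
    if step <= timedelta(0):
        raise ValueError("batch_size must be greater than zero")
    begin = datetime.fromisoformat(start)
    finish = datetime.fromisoformat(end)
    batches = []
    while begin < finish:
        stop = min(begin + step, finish)  # the last window may be shorter
        batches.append((begin, stop))
        begin = stop
    return batches


batches = split_batches("2024-01-01T00:00:00+00:00", "2024-01-07T00:00:00+00:00", "12h")
print(len(batches))  # 12
```

Six days split into 12-hour windows gives 12 batches, each of which could be queried and appended independently.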

Example 3: S3-backed Iceberg catalog

Transfer data to Iceberg tables stored in S3:

# Create catalog config JSON (the quoted 'EOF' keeps the shell from expanding the body)
cat > catalog_config.json << 'EOF'
{
  "type": "sql",
  "uri": "sqlite:///iceberg/catalog.db",
  "warehouse": "s3://my-bucket/iceberg-warehouse/",
  "s3.endpoint": "http://minio:9000",
  "s3.access-key-id": "minioadmin",
  "s3.secret-access-key": "minioadmin",
  "s3.path-style-access": true
}
EOF

# Encode to base64 on a single line (GNU base64 wraps output at 76 characters by default)
CATALOG_CONFIG=$(base64 < catalog_config.json | tr -d '\n')

# Create trigger
influxdb3 create trigger \
  --database metrics \
  --path "gh:influxdata/influxdb_to_iceberg/influxdb_to_iceberg.py" \
  --trigger-spec "every:30m" \
  --trigger-arguments "measurement=sensor_data,window=1h,catalog_configs=\"$CATALOG_CONFIG\",namespace=iot,table_name=sensors" \
  s3_iceberg_transfer
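
Before creating the trigger, it can help to confirm that the encoded value is a single line and decodes back to the intended JSON. The check below is a sketch with a hypothetical function name, and the sample configuration is a shortened version of the one above.

```python
import base64
import json


def check_catalog_argument(encoded: str) -> dict:
    """Decode a catalog_configs value and fail early if it is not clean single-line Base64."""
    if any(ch.isspace() for ch in encoded):
        raise ValueError("encoded value contains whitespace; remove line wraps first")
    raw = base64.b64decode(encoded, validate=True)
    config = json.loads(raw)
    if not isinstance(config, dict):
        raise ValueError("catalog configuration must be a JSON object")
    return config


config = {
    "type": "sql",
    "uri": "sqlite:///iceberg/catalog.db",
    "warehouse": "s3://my-bucket/iceberg-warehouse/",
    "s3.path-style-access": True,
}
raw_json = json.dumps(config).encode("utf-8")
single_line = base64.b64encode(raw_json).decode("ascii")
wrapped = base64.encodebytes(raw_json).decode("ascii")  # wraps lines, like GNU base64

print(check_catalog_argument(single_line) == config)  # True
print("\n" in wrapped)  # True
```

Passing the wrapped variant to check_catalog_argument raises ValueError, which is the failure mode that line-wrapped Base64 output would otherwise cause later in the trigger.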
