InfluxDB to Iceberg Data Transfer
The InfluxDB to Iceberg plugin moves time series data from InfluxDB 3 into Apache Iceberg for long-term retention, lakehouse analytics, and downstream data workflows, without custom export pipelines. It supports scheduled transfers and HTTP-triggered backfills, so telemetry can be preserved, shared, and queried in open analytics environments.
Configuration
Plugin parameters may be specified as key-value pairs in the --trigger-arguments flag (CLI) or in the trigger_arguments field (API) when creating a trigger. Some plugins support TOML configuration files, which can be specified using the plugin’s config_file_path parameter.
If a plugin supports multiple trigger specifications, some parameters may depend on the trigger specification that you use.
Plugin metadata
This plugin includes a JSON metadata schema in its docstring that defines supported trigger types and configuration parameters. This metadata enables the InfluxDB 3 Explorer UI to display and configure the plugin.
Scheduler trigger parameters
Required parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `measurement` | string | required | Source measurement containing data to transfer |
| `window` | string | required | Time window for data transfer. Format: `<number><unit>` (e.g., `1h`, `30d`) |
| `catalog_configs` | string | required | Base64-encoded JSON string containing Iceberg catalog configuration |
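
Because `catalog_configs` must be a Base64-encoded JSON string, it can help to generate the value programmatically rather than by hand. The following is a minimal sketch using only the Python standard library; the helper names `encode_catalog_configs` and `decode_catalog_configs` are hypothetical and not part of the plugin.

```python
import base64
import json


def encode_catalog_configs(config: dict) -> str:
    """Serialize a catalog config dict to JSON, then Base64-encode it."""
    raw = json.dumps(config).encode("utf-8")
    # b64encode returns a single line with no wrapping.
    return base64.b64encode(raw).decode("ascii")


def decode_catalog_configs(encoded: str) -> dict:
    """Reverse the encoding, e.g. to check a value before creating a trigger."""
    return json.loads(base64.b64decode(encoded))


encoded = encode_catalog_configs({"uri": "http://nessie:9000"})
print(encoded)  # eyJ1cmkiOiAiaHR0cDovL25lc3NpZTo5MDAwIn0=
```

The printed value is the same string used for `catalog_configs` in Example 1.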
Optional parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `included_fields` | string | all fields/tags | Dot-separated list of fields and tags to include (e.g., `usage_user.host`) |
| `excluded_fields` | string | none | Dot-separated list of fields and tags to exclude |
| `namespace` | string | `default` | Iceberg namespace for the target table |
| `table_name` | string | measurement name | Iceberg table name |
| `auto_update_schema` | string | false | Automatically update Iceberg table schema when data doesn’t match existing schema |
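
The scheduler parameters are passed as one comma-separated `key=value` string, with field lists joined by dots. The sketch below shows one way to assemble that string; `build_trigger_arguments` is a hypothetical helper, and quoting the `catalog_configs` value simply mirrors Example 1 rather than a documented requirement.

```python
def build_trigger_arguments(measurement, window, catalog_configs_b64,
                            included_fields=None, excluded_fields=None,
                            **extra):
    """Return a comma-separated key=value string for --trigger-arguments."""
    args = {
        "measurement": measurement,
        "window": window,
        # Quoted to mirror the form used in Example 1.
        "catalog_configs": f'"{catalog_configs_b64}"',
    }
    for key, names in (("included_fields", included_fields),
                       ("excluded_fields", excluded_fields)):
        if names:
            # "." is the list separator, so a name containing it would be ambiguous.
            bad = [n for n in names if "." in n]
            if bad:
                raise ValueError(f"{key} names must not contain '.': {bad}")
            args[key] = ".".join(names)
    for key, value in extra.items():
        # Render booleans as lowercase strings, e.g. auto_update_schema=true.
        args[key] = str(value).lower() if isinstance(value, bool) else str(value)
    return ",".join(f"{k}={v}" for k, v in args.items())


print(build_trigger_arguments(
    "cpu", "24h", "eyJ1cmkiOiAiaHR0cDovL25lc3NpZTo5MDAwIn0=",
    included_fields=["usage_user", "host"], namespace="default",
))
```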
TOML configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| `config_file_path` | string | none | TOML config file path relative to `PLUGIN_DIR` (required for TOML configuration) |
To use a TOML configuration file, set the PLUGIN_DIR environment variable and pass config_file_path in the trigger arguments. PLUGIN_DIR must be set in addition to the --plugin-dir flag used when starting InfluxDB 3.
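
Since `config_file_path` is interpreted relative to `PLUGIN_DIR`, a quick check of the resulting path can catch a misplaced file before the trigger runs. This is a sketch of that resolution rule only; `resolve_config_path` is a hypothetical helper, and the plugin's internal lookup may differ in detail.

```python
import os
from pathlib import Path


def resolve_config_path(config_file_path, plugin_dir=None):
    """Join config_file_path onto PLUGIN_DIR (or an explicit plugin_dir)."""
    base = plugin_dir or os.environ.get("PLUGIN_DIR")
    if not base:
        raise RuntimeError("PLUGIN_DIR is not set")
    return Path(base) / config_file_path


# The directory below is an assumed example location, not a required one.
path = resolve_config_path(
    "influxdb_to_iceberg_config_scheduler.toml",
    plugin_dir="/opt/influxdb3/plugins",
)
print(path, "exists" if path.is_file() else "not found")
```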
Example TOML configuration
influxdb_to_iceberg_config_scheduler.toml
For more information on using TOML configuration files, see the Using TOML Configuration Files section in the influxdb3_plugins/README.md.
HTTP trigger parameters
Request body structure
| Parameter | Type | Required | Description |
|---|---|---|---|
| `measurement` | string | Yes | Source measurement containing data to transfer |
| `catalog_configs` | object | Yes | Iceberg catalog configuration dictionary. See PyIceberg catalog documentation |
| `included_fields` | array | No | List of field and tag names to include in replication |
| `excluded_fields` | array | No | List of field and tag names to exclude from replication |
| `namespace` | string | No | Target Iceberg namespace (default: `default`) |
| `table_name` | string | No | Target Iceberg table name (default: measurement name) |
| `batch_size` | string | No | Batch size duration for processing (default: `1d`). Format: `<number><unit>` |
| `backfill_start` | string | No | ISO 8601 datetime with timezone for backfill start |
| `backfill_end` | string | No | ISO 8601 datetime with timezone for backfill end |
| `auto_update_schema` | boolean | No | Automatically update Iceberg table schema when data doesn’t match existing schema (default: false) |
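
A request body can be checked against the table above before it is sent. The following client-side sketch validates required keys, basic types, and that the backfill datetimes carry a timezone; `validate_backfill_request` is a hypothetical helper, and the plugin's own validation may be stricter or report errors differently.

```python
from datetime import datetime

REQUIRED = ("measurement", "catalog_configs")


def validate_backfill_request(body):
    """Return a list of problems found in an HTTP trigger request body."""
    errors = []
    for key in REQUIRED:
        if key not in body:
            errors.append(f"missing required parameter: {key}")
    if "catalog_configs" in body and not isinstance(body["catalog_configs"], dict):
        errors.append("catalog_configs must be an object")
    for key in ("included_fields", "excluded_fields"):
        if key in body and not isinstance(body[key], list):
            errors.append(f"{key} must be an array")
    for key in ("backfill_start", "backfill_end"):
        if key not in body:
            continue
        try:
            parsed = datetime.fromisoformat(body[key])
        except (TypeError, ValueError):
            errors.append(f"{key} is not an ISO 8601 datetime")
            continue
        if parsed.tzinfo is None:
            errors.append(f"{key} must include a timezone offset")
    return errors


body = {
    "measurement": "temperature",
    "catalog_configs": {"type": "sql", "uri": "sqlite:///path/to/catalog.db"},
    "backfill_start": "2024-01-01T00:00:00+00:00",
    "backfill_end": "2024-01-07T00:00:00+00:00",
}
print(validate_backfill_request(body))  # []
```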
Examples
Example 1: Basic scheduled transfer
Transfer CPU metrics to Iceberg every hour:
# Create trigger with base64-encoded catalog config
# Original JSON: {"uri": "http://nessie:9000"}
# Base64: eyJ1cmkiOiAiaHR0cDovL25lc3NpZTo5MDAwIn0=
influxdb3 create trigger \
--database metrics \
--path "gh:influxdata/influxdb_to_iceberg/influxdb_to_iceberg.py" \
--trigger-spec "every:1h" \
--trigger-arguments 'measurement=cpu,window=24h,catalog_configs="eyJ1cmkiOiAiaHR0cDovL25lc3NpZTo5MDAwIn0="' \
cpu_to_iceberg
# Write test data
influxdb3 write \
--database metrics \
"cpu,host=server1 usage_user=45.2,usage_system=12.1"
# After trigger runs, data is available in Iceberg table "default.cpu"
Expected output
- Creates Iceberg table `default.cpu` with schema matching the measurement
- Transfers all CPU data from the last 24 hours
- Appends new data on each hourly run
Example 2: HTTP backfill with field filtering
Backfill specific fields from historical data:
# Create and enable HTTP trigger
influxdb3 create trigger \
--database metrics \
--path "gh:influxdata/influxdb_to_iceberg/influxdb_to_iceberg.py" \
--trigger-spec "request:replicate" \
iceberg_backfill
influxdb3 enable trigger --database metrics iceberg_backfill
# Request backfill via HTTP
curl -X POST http://localhost:8181/api/v3/engine/replicate \
-H "Authorization: Bearer YOUR_TOKEN" \
-d '{
"measurement": "temperature",
"catalog_configs": {
"type": "sql",
"uri": "sqlite:///path/to/catalog.db"
},
"included_fields": ["temp_celsius", "humidity", "sensor_id"],
"namespace": "weather",
"table_name": "temperature_history",
"batch_size": "12h",
"backfill_start": "2024-01-01T00:00:00+00:00",
"backfill_end": "2024-01-07T00:00:00+00:00"
}'
Expected output
- Creates Iceberg table `weather.temperature_history`
- Transfers only `temp_celsius` and `humidity` fields
- Processes data in 12-hour batches for the specified date range
- Returns status of the backfill operation
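
To illustrate what batching by `batch_size` means for a backfill range, the sketch below splits the range from Example 2 into consecutive 12-hour windows. It is an illustration, not the plugin's implementation: `parse_duration` and `split_into_batches` are hypothetical names, half-open windows are an assumption, and only the `h` and `d` units that appear in this document are handled (the plugin may accept others).

```python
import re
from datetime import datetime, timedelta

# Only the units shown in this document; other units are not assumed here.
_UNITS = {"h": "hours", "d": "days"}


def parse_duration(text):
    """Parse a <number><unit> string such as "12h" or "1d" into a timedelta."""
    match = re.fullmatch(r"(\d+)(h|d)", text)
    if not match or int(match.group(1)) == 0:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def split_into_batches(start, end, batch_size):
    """Split [start, end) into consecutive windows of at most batch_size."""
    step = parse_duration(batch_size)
    batches = []
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)  # the last window may be shorter
        batches.append((cursor, upper))
        cursor = upper
    return batches


start = datetime.fromisoformat("2024-01-01T00:00:00+00:00")
end = datetime.fromisoformat("2024-01-07T00:00:00+00:00")
batches = split_into_batches(start, end, "12h")
print(len(batches))  # 12
```

The range in Example 2 spans six days, so a 12-hour batch size yields 12 windows.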
Example 3: S3-backed Iceberg catalog
Transfer data to Iceberg tables stored in S3:
# Create catalog config JSON
cat > catalog_config.json << EOF
{
"type": "sql",
"uri": "sqlite:///iceberg/catalog.db",
"warehouse": "s3://my-bucket/iceberg-warehouse/",
"s3.endpoint": "http://minio:9000",
"s3.access-key-id": "minioadmin",
"s3.secret-access-key": "minioadmin",
"s3.path-style-access": true
}
EOF
# Encode to base64 as a single line (GNU base64 wraps output at 76 characters by default)
CATALOG_CONFIG=$(base64 < catalog_config.json | tr -d '\n')
# Create trigger
influxdb3 create trigger \
--database metrics \
--path "gh:influxdata/influxdb_to_iceberg/influxdb_to_iceberg.py" \
--trigger-spec "every:30m" \
--trigger-arguments "measurement=sensor_data,window=1h,catalog_configs=\"$CATALOG_CONFIG\",namespace=iot,table_name=sensors" \
s3_iceberg_transfer