Threshold Deadman Checks
The Threshold Deadman Checks plugin detects both out-of-bounds metrics and missing telemetry in one place, so teams know when values cross critical limits or when a heartbeat, sensor, or service stops reporting. It helps catch failures earlier, reduce blind spots, and simplify alerting for infrastructure, uptime, application performance, and IoT monitoring.
Configuration
Plugin parameters may be specified as key-value pairs in the --trigger-arguments flag (CLI) or in the trigger_arguments field (API) when creating a trigger.
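As a rough illustration of the key-value form, the sketch below joins a Python dictionary into a string shaped like the `--trigger-arguments` values used in the examples on this page. The helper name `build_trigger_arguments` is hypothetical and not part of the plugin or the CLI, and the sketch does not handle values that themselves contain commas.

```python
def build_trigger_arguments(params: dict) -> str:
    """Join parameters into a key=value,key=value string (hypothetical helper)."""
    parts = []
    for key, value in params.items():
        # Booleans are written in lowercase to match the examples (deadman_check=true).
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(f"{key}={value}")
    return ",".join(parts)


args = build_trigger_arguments({
    "measurement": "heartbeat",
    "senders": "slack",
    "window": "5m",
    "deadman_check": True,
})
print(args)
```

The resulting string can be passed to `--trigger-arguments` when creating a trigger.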
Plugin metadata
This plugin includes a JSON metadata schema in its docstring that defines supported trigger types and configuration parameters. This metadata enables the InfluxDB 3 Explorer UI to display and configure the plugin.
Required parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| measurement | string | required | Measurement to monitor for deadman alerts and aggregation-based conditions |
| senders | string | required | Dot-separated notification channels with multi-channel notification integration |
| window | string | required | Time window for periodic data presence checking |
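To make the role of `window` concrete, here is a minimal sketch of a data presence check. It assumes duration strings such as `5m` or `1min` (the forms used in the examples on this page); the full set of units the plugin accepts is not documented here, so the unit table is an assumption, and `parse_window` and `is_deadman` are hypothetical names.

```python
import re
from datetime import datetime, timedelta, timezone

# Unit suffixes seen in this page's examples ("5m", "1min") plus a few common ones.
# The units the plugin actually accepts are an assumption.
_UNITS = {"s": "seconds", "m": "minutes", "min": "minutes", "h": "hours", "d": "days"}


def parse_window(text: str) -> timedelta:
    """Convert a string such as '5m' or '1min' into a timedelta."""
    match = re.fullmatch(r"(\d+)(s|min|m|h|d)", text.strip())
    if match is None:
        raise ValueError(f"unsupported window: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def is_deadman(last_seen, now, window):
    """Return True when no point arrived between now - window and now."""
    return last_seen is None or last_seen < now - window


now = datetime(2025, 6, 1, 10, 5, tzinfo=timezone.utc)
window = parse_window("5m")
recent = datetime(2025, 6, 1, 10, 3, tzinfo=timezone.utc)
stale = datetime(2025, 6, 1, 9, 58, tzinfo=timezone.utc)
print(is_deadman(recent, now, window), is_deadman(stale, now, window))
```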
Data write trigger parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| measurement | string | required | Measurement to monitor for real-time threshold violations in dual monitoring mode |
| field_conditions | string | required | Real-time threshold conditions with multi-level alerting (INFO, WARN, ERROR, CRITICAL severity levels) |
| senders | string | required | Dot-separated notification channels with multi-channel notification integration |
Threshold check parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| field_aggregation_values | string | none | Multi-level aggregation conditions with aggregation support for avg, min, max, count, sum, median, stddev, first_value, last_value, var, and approx_median values |
| deadman_check | boolean | false | Enable deadman detection to monitor for data absence and missing data streams |
| interval | string | "5min" | Configurable aggregation time interval for batch processing with performance optimization |
| trigger_count | number | 1 | Configurable triggers requiring multiple consecutive failures before alerting |
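The `trigger_count` behavior can be pictured as a counter that resets whenever a check passes. The class below is a sketch under that assumption, not the plugin's implementation. In particular, whether the plugin keeps alerting on every failed check after the count is reached, as this sketch does, is not stated in this document.

```python
class ConsecutiveFailureGate:
    """Signal an alert only after trigger_count consecutive failed checks."""

    def __init__(self, trigger_count: int = 1):
        self.trigger_count = trigger_count
        self.failures = 0

    def update(self, failed: bool) -> bool:
        # A passing check resets the streak; a failing check extends it.
        if failed:
            self.failures += 1
        else:
            self.failures = 0
        return self.failures >= self.trigger_count


gate = ConsecutiveFailureGate(trigger_count=3)
checks = [True, True, False, True, True, True]
results = [gate.update(failed) for failed in checks]
print(results)
```

With `trigger_count=3`, the two failures before the passing check do not alert, and only the third failure in the later streak does.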
Notification parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| influxdb3_auth_token | string | env var | InfluxDB 3 API token with environment variable support |
| notification_deadman_text | string | template | Customizable deadman alert template message with dynamic variables |
| notification_threshold_text | string | template | Customizable threshold alert template message with dynamic variables |
| notification_text | string | template | Customizable notification template message for data write triggers with dynamic variables |
| notification_path | string | "notify" | Notification endpoint path with retry logic and exponential backoff |
| port_override | number | 8181 | InfluxDB port override for notification delivery |
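The table mentions retry logic with exponential backoff for notification delivery. The following is a generic sketch of that pattern, not the plugin's code: `send_with_retry`, its default retry count, and its base delay are all assumptions chosen for illustration, and `flaky` stands in for a real HTTP call.

```python
import time


def send_with_retry(send, payload, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send(payload); after each failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return send(payload)
        except Exception:
            # Give up and re-raise once the retry budget is exhausted.
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))


# Simulated sender that fails twice before succeeding.
calls = {"n": 0}


def flaky(payload):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = send_with_retry(flaky, {"msg": "test"}, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays.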
TOML configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| config_file_path | string | none | TOML config file path relative to PLUGIN_DIR (required for TOML configuration) |
To use a TOML configuration file, set the PLUGIN_DIR environment variable and specify the config_file_path in the trigger arguments. This is in addition to the --plugin-dir flag when starting InfluxDB 3.
Example TOML configuration files provided:
- threshold_deadman_config_scheduler.toml - for scheduled triggers
- threshold_deadman_config_data_writes.toml - for data write triggers
For more information on using TOML configuration files, see the Using TOML Configuration Files section in the influxdb3_plugins/README.md.
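Since `config_file_path` is interpreted relative to `PLUGIN_DIR`, the lookup can be sketched as a simple path join. The function name `resolve_config_path` and the directory `/opt/plugins` are hypothetical; how the plugin itself locates and loads the file may differ.

```python
import os
from pathlib import Path


def resolve_config_path(config_file_path: str, env=os.environ) -> Path:
    """Resolve config_file_path against the PLUGIN_DIR environment variable."""
    plugin_dir = env.get("PLUGIN_DIR")
    if not plugin_dir:
        raise RuntimeError("PLUGIN_DIR must be set to use config_file_path")
    return Path(plugin_dir) / config_file_path


# /opt/plugins is a placeholder directory used only for this illustration.
path = resolve_config_path(
    "threshold_deadman_config_scheduler.toml",
    env={"PLUGIN_DIR": "/opt/plugins"},
)
print(path)
```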
Channel-specific configuration
Notification channels require additional parameters based on the sender type (same as the influxdata/notifier plugin).
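One way to catch a missing channel parameter before creating a trigger is to check the arguments against the channels named in `senders`. The mapping below is assembled only from the examples on this page; the notifier plugin may require additional keys (such as credentials) that are not shown here, and `missing_channel_keys` is a hypothetical helper.

```python
# Channel-specific keys taken only from the examples in this document.
# The notifier plugin may require more than these.
REQUIRED_KEYS = {
    "slack": ["slack_webhook_url"],
    "discord": ["discord_webhook_url"],
    "http": ["http_webhook_url"],
    "sms": ["twilio_from_number", "twilio_to_number"],
    "whatsapp": ["twilio_from_number", "twilio_to_number"],
}


def missing_channel_keys(args: dict) -> dict:
    """Map each sender in the dot-separated list to the keys it still lacks."""
    missing = {}
    for sender in args.get("senders", "").split("."):
        if not sender:
            continue
        absent = [key for key in REQUIRED_KEYS.get(sender, []) if key not in args]
        if absent:
            missing[sender] = absent
    return missing


problems = missing_channel_keys({
    "senders": "slack.discord",
    "slack_webhook_url": "https://example.invalid/hook",
})
print(problems)
```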
Examples
Example 1: Basic threshold monitoring
Write test data, then create a scheduled trigger that checks the heartbeat measurement every 5 minutes and alerts when no data arrives:
# Write test data
influxdb3 write \
--database sensors \
"heartbeat,host=server1 status=1"
influxdb3 write \
--database sensors \
"heartbeat,host=server1 status=0"
# Create and enable the trigger
influxdb3 create trigger \
--database sensors \
--path "gh:influxdata/threshold_deadman_checks/threshold_deadman_checks_plugin.py" \
--trigger-spec "every:5m" \
--trigger-arguments "measurement=heartbeat,senders=slack,window=5m,deadman_check=true,slack_webhook_url=$SLACK_WEBHOOK_URL" \
heartbeat_monitor
influxdb3 enable trigger --database sensors heartbeat_monitor
# Query to verify data
influxdb3 query \
--database sensors \
"SELECT * FROM heartbeat ORDER BY time DESC LIMIT 5"
Set SLACK_WEBHOOK_URL to your Slack incoming webhook URL.
Expected output
When no data is received within the window, a deadman alert is sent: “CRITICAL: No heartbeat data from heartbeat between 2025-06-01T10:00:00Z and 2025-06-01T10:05:00Z”
Example 2: Deadman monitoring
Monitor for data absence and alert when no data is received:
influxdb3 create trigger \
--database sensors \
--path "gh:influxdata/threshold_deadman_checks/threshold_deadman_checks_plugin.py" \
--trigger-spec "every:15m" \
--trigger-arguments "measurement=heartbeat,senders=sms,window=10m,deadman_check=true,trigger_count=2,twilio_from_number=+1234567890,twilio_to_number=+0987654321,notification_deadman_text=CRITICAL: No heartbeat data from \$table between \$time_from and \$time_to" \
heartbeat_monitor
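In the command above, the backslash before each `$` stops the shell from expanding `$table`, `$time_from`, and `$time_to` inside the double-quoted argument, so the placeholders reach the plugin intact. The variable syntax happens to match Python's `string.Template`, so the substitution can be sketched as follows. Whether the plugin uses `string.Template` internally is an assumption.

```python
from string import Template

# Template text as it arrives after the shell has removed the backslashes.
template = Template(
    "CRITICAL: No heartbeat data from $table between $time_from and $time_to"
)

# safe_substitute leaves unknown placeholders untouched instead of raising.
message = template.safe_substitute(
    table="heartbeat",
    time_from="2025-06-01T10:00:00Z",
    time_to="2025-06-01T10:05:00Z",
)
print(message)
```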
Multi-level threshold monitoring
Monitor aggregated values with different severity levels:
influxdb3 create trigger \
--database monitoring \
--path "gh:influxdata/threshold_deadman_checks/threshold_deadman_checks_plugin.py" \
--trigger-spec "every:5m" \
--trigger-arguments "measurement=system_metrics,senders=slack.discord,field_aggregation_values='cpu_usage:avg@>=80-WARN cpu_usage:avg@>=95-ERROR memory_usage:max@>=90-WARN',window=5m,interval=1min,trigger_count=3,slack_webhook_url=$SLACK_WEBHOOK_URL,discord_webhook_url=$DISCORD_WEBHOOK_URL" \
system_threshold_monitor
Set SLACK_WEBHOOK_URL and DISCORD_WEBHOOK_URL to your webhook URLs.
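The `field_aggregation_values` string in this example appears to follow the pattern `field:aggregation@operator value-LEVEL`, with rules separated by spaces. The parser below is a sketch inferred from that one example; the operators and value types the plugin really accepts are assumptions, and `AggregationRule` and `parse_aggregation_rules` are hypothetical names.

```python
import re
from typing import NamedTuple


class AggregationRule(NamedTuple):
    field: str
    aggregation: str
    operator: str
    threshold: float
    level: str


# Pattern inferred from the example: field:agg@<op><number>-LEVEL
_RULE = re.compile(r"(\w+):(\w+)@(>=|<=|==|!=|>|<)(-?\d+(?:\.\d+)?)-([A-Z]+)")


def parse_aggregation_rules(text: str) -> list:
    """Split a space-separated rule string into structured rules."""
    rules = []
    for token in text.split():
        match = _RULE.fullmatch(token)
        if match is None:
            raise ValueError(f"cannot parse rule: {token!r}")
        field, aggregation, op, value, level = match.groups()
        rules.append(AggregationRule(field, aggregation, op, float(value), level))
    return rules


rules = parse_aggregation_rules(
    "cpu_usage:avg@>=80-WARN cpu_usage:avg@>=95-ERROR memory_usage:max@>=90-WARN"
)
for rule in rules:
    print(rule)
```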
Real-time field condition monitoring
Monitor data writes for immediate threshold violations:
influxdb3 create trigger \
--database applications \
--path "gh:influxdata/threshold_deadman_checks/threshold_deadman_checks_plugin.py" \
--trigger-spec "all_tables" \
--trigger-arguments "measurement=response_times,field_conditions=latency>500-WARN:latency>1000-ERROR:error_rate>0.05-CRITICAL,senders=http,trigger_count=1,http_webhook_url=$HTTP_WEBHOOK_URL,notification_text=[\$level] Application alert: \$field \$op_sym \$compare_val (actual: \$actual)" \
app_performance_monitor
Set HTTP_WEBHOOK_URL to your HTTP webhook endpoint.
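The `field_conditions` value in this example looks like a colon-separated list of `field operator value-LEVEL` entries. As a sketch of how such conditions could be evaluated against a written row, assuming numeric values and the comparison operators listed below (neither is confirmed by this document), with hypothetical function names:

```python
import operator
import re

_OPS = {
    ">": operator.gt, "<": operator.lt, ">=": operator.ge,
    "<=": operator.le, "==": operator.eq, "!=": operator.ne,
}
# Pattern inferred from the example: field<op><number>-LEVEL
_COND = re.compile(r"(\w+)(>=|<=|==|!=|>|<)(-?\d+(?:\.\d+)?)-([A-Z]+)")


def parse_field_conditions(text: str) -> list:
    """Split a colon-separated condition string into (field, op, value, level)."""
    conditions = []
    for token in text.split(":"):
        match = _COND.fullmatch(token)
        if match is None:
            raise ValueError(f"cannot parse condition: {token!r}")
        field, op, value, level = match.groups()
        conditions.append((field, op, float(value), level))
    return conditions


def violations(row: dict, conditions: list) -> list:
    """Return (level, field, op, threshold, actual) for each condition the row breaks."""
    hits = []
    for field, op, threshold, level in conditions:
        if field in row and _OPS[op](row[field], threshold):
            hits.append((level, field, op, threshold, row[field]))
    return hits


conditions = parse_field_conditions(
    "latency>500-WARN:latency>1000-ERROR:error_rate>0.05-CRITICAL"
)
hits = violations({"latency": 750, "error_rate": 0.01}, conditions)
print(hits)
```

A row with `latency=750` breaks only the WARN condition, because it exceeds 500 but not 1000.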
Combined monitoring
Monitor both aggregation thresholds and deadman conditions:
influxdb3 create trigger \
--database comprehensive \
--path "gh:influxdata/threshold_deadman_checks/threshold_deadman_checks_plugin.py" \
--trigger-spec "every:10m" \
--trigger-arguments "measurement=temperature_sensors,senders=whatsapp,field_aggregation_values='temperature:avg@>=35-WARN temperature:max@>=40-ERROR',window=15m,deadman_check=true,trigger_count=2,twilio_from_number=+1234567890,twilio_to_number=+0987654321" \
comprehensive_sensor_monitor