Apache Mesos Monitoring

Powerful performance with an easy integration, powered by Telegraf, the open source data connector built by InfluxData.

Powerful Performance, Limitless Scale

Collect, organize, and act on massive volumes of high-velocity data with InfluxDB, the #1 time series platform built to scale with Telegraf. Any data is more valuable when you think of it as time series data.

Apache Mesos is an open-source project to manage computer clusters. It abstracts CPU, memory, storage and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be built and run effectively.

Why use the Apache Mesos Telegraf Plugin?

The Apache Mesos Telegraf Plugin allows you to collect observability metrics provided by the Mesos master and agent nodes and insert them into your InfluxDB instance. The plugin can collect a set of metrics that enable cluster operators to monitor resource usage and detect issues before they become a problem.

How to monitor Apache Mesos using the Telegraf plugin

The Apache Mesos Telegraf Plugin collects metrics from Apache Mesos and inserts them into InfluxDB. Because a cluster can be deployed in numerous ways, the plugin is not configured to gather metrics from Mesos by default. You need to specify the master and slave (agent) nodes that the plugin should gather metrics from.
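
As a rough illustration of that configuration step, the Python sketch below renders an `[[inputs.mesos]]` fragment for a Telegraf configuration file from lists of master and agent URLs. The option names (`timeout`, `masters`, `slaves`, `master_collections`, `slave_collections`) are assumed to match the plugin's sample configuration and should be checked against the plugin README for your Telegraf version. The host names and the helper `render_mesos_input` are hypothetical; ports 5050 and 5051 are the usual Mesos defaults for masters and agents.

```python
import json


def render_mesos_input(masters, slaves, master_collections=None,
                       slave_collections=None, timeout_ms=100):
    """Return an [[inputs.mesos]] TOML fragment as a string.

    Option names are assumptions; verify them against the plugin README.
    """
    def toml_list(values):
        # A JSON array of plain strings is also a valid TOML array.
        return json.dumps(list(values))

    lines = [
        "[[inputs.mesos]]",
        f"  timeout = {int(timeout_ms)}",
        f"  masters = {toml_list(masters)}",
        f"  slaves = {toml_list(slaves)}",
    ]
    # Collections are optional; leaving them out keeps the plugin defaults.
    if master_collections is not None:
        lines.append(f"  master_collections = {toml_list(master_collections)}")
    if slave_collections is not None:
        lines.append(f"  slave_collections = {toml_list(slave_collections)}")
    return "\n".join(lines) + "\n"


# Hypothetical hosts, for illustration only.
fragment = render_mesos_input(
    masters=["http://mesos-master.example:5050"],
    slaves=["http://mesos-agent.example:5051"],
    master_collections=["resources", "master", "tasks"],
    slave_collections=["resources", "agent", "tasks"],
)
print(fragment)
```

Generating the fragment from a host list is only one option; for a small, static cluster, writing the same few lines by hand in the Telegraf configuration file is just as reasonable.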

Key Apache Mesos metrics to use for monitoring

Some of the important Apache Mesos metrics that you should proactively monitor include:

resources

master/cpus_percent Percentage of allocated CPUs
master/cpus_used Number of allocated CPUs
master/cpus_total Number of CPUs
master/cpus_revocable_percent Percentage of allocated revocable CPUs
master/cpus_revocable_total Number of revocable CPUs
master/cpus_revocable_used Number of allocated revocable CPUs
master/disk_percent Percentage of allocated disk space
master/disk_used Allocated disk space in MB
master/disk_total Disk space in MB
master/disk_revocable_percent Percentage of allocated revocable disk space
master/disk_revocable_total Revocable disk space in MB
master/disk_revocable_used Allocated revocable disk space in MB
master/gpus_percent Percentage of allocated GPUs
master/gpus_used Number of allocated GPUs
master/gpus_total Number of GPUs
master/gpus_revocable_percent Percentage of allocated revocable GPUs
master/gpus_revocable_total Number of revocable GPUs
master/gpus_revocable_used Number of allocated revocable GPUs
master/mem_percent Percentage of allocated memory
master/mem_used Allocated memory in MB
master/mem_total Memory in MB
master/mem_revocable_percent Percentage of allocated revocable memory
master/mem_revocable_total Revocable memory in MB
master/mem_revocable_used Allocated revocable memory in MB
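
The `_used` and `_total` pairs in this group can be combined into a utilisation ratio. The sketch below assumes the metrics are available as a flat mapping of metric name to numeric value, which is the shape of the JSON object Mesos serves at its `/metrics/snapshot` endpoint. The helper name `utilisation` and the sample values are made up for illustration.

```python
def utilisation(snapshot, resource, prefix="master"):
    """Return used/total for one resource, or None if it cannot be computed."""
    used = snapshot.get(f"{prefix}/{resource}_used")
    total = snapshot.get(f"{prefix}/{resource}_total")
    # A missing metric or a total of zero (for example, no GPUs) has no ratio.
    if used is None or not total:
        return None
    return used / total


# Illustrative values only, not taken from a real cluster.
sample = {
    "master/cpus_used": 6.0,
    "master/cpus_total": 8.0,
    "master/mem_used": 2048.0,
    "master/mem_total": 16384.0,
    "master/gpus_used": 0.0,
    "master/gpus_total": 0.0,
}

for name in ("cpus", "mem", "gpus", "disk"):
    print(name, utilisation(sample, name))
```

Passing `prefix="slave"` applies the same calculation to the per-agent resource metrics.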

master

master/elected Whether this is the elected master
master/uptime_secs Uptime in seconds

system

system/cpus_total Number of CPUs available in this master node
system/load_15min Load average for the past 15 minutes
system/load_5min Load average for the past 5 minutes
system/load_1min Load average for the past minute
system/mem_free_bytes Free memory in bytes
system/mem_total_bytes Total memory in bytes

slaves

master/slave_registrations
master/slave_removals
master/slave_reregistrations
master/slave_shutdowns_scheduled
master/slave_shutdowns_canceled
master/slave_shutdowns_completed
master/slaves_active
master/slaves_connected
master/slaves_disconnected
master/slaves_inactive
master/slave_unreachable_canceled
master/slave_unreachable_completed
master/slave_unreachable_scheduled
master/slaves_unreachable

frameworks

master/frameworks_active
master/frameworks_connected
master/frameworks_disconnected
master/frameworks_inactive
master/outstanding_offers

framework offers

master/frameworks/subscribed
master/frameworks/calls_total
master/frameworks/calls
master/frameworks/events_total
master/frameworks/events
master/frameworks/operations_total
master/frameworks/operations
master/frameworks/tasks/active
master/frameworks/tasks/terminal
master/frameworks/offers/sent
master/frameworks/offers/accepted
master/frameworks/offers/declined
master/frameworks/offers/rescinded
master/frameworks/roles/suppressed

tasks

master/tasks_error
master/tasks_failed
master/tasks_finished
master/tasks_killed
master/tasks_lost
master/tasks_running
master/tasks_staging
master/tasks_starting
master/tasks_dropped
master/tasks_gone
master/tasks_gone_by_operator
master/tasks_killing
master/tasks_unreachable
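
One way to use the task metrics is to compare two snapshots taken some time apart. The sketch below assumes that the terminal-state metrics (`tasks_error`, `tasks_failed`, `tasks_finished`, `tasks_killed`, `tasks_lost`) are cumulative counters that restart from zero when the master restarts, while metrics such as `tasks_running` are point-in-time gauges. The helper names and sample numbers are hypothetical.

```python
TERMINAL = ("tasks_error", "tasks_failed", "tasks_finished",
            "tasks_killed", "tasks_lost")


def task_deltas(previous, current, names=TERMINAL, prefix="master"):
    """Return how much each counter grew between two snapshots."""
    deltas = {}
    for name in names:
        key = f"{prefix}/{name}"
        before = previous.get(key, 0.0)
        now = current.get(key, 0.0)
        # If the counter went backwards, assume a restart and count from zero.
        deltas[name] = now - before if now >= before else now
    return deltas


def failure_ratio(deltas):
    """Share of completed tasks that failed, were lost, or errored."""
    bad = (deltas.get("tasks_failed", 0.0)
           + deltas.get("tasks_lost", 0.0)
           + deltas.get("tasks_error", 0.0))
    done = bad + deltas.get("tasks_finished", 0.0)
    return bad / done if done else None


# Illustrative values only.
previous = {"master/tasks_failed": 2.0, "master/tasks_finished": 10.0,
            "master/tasks_lost": 1.0}
current = {"master/tasks_failed": 5.0, "master/tasks_finished": 19.0,
           "master/tasks_lost": 1.0}

deltas = task_deltas(previous, current)
print(deltas, failure_ratio(deltas))
```

Killed tasks are left out of the ratio here because they are often operator-initiated; whether that suits a given cluster is a judgment call.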

messages

master/invalid_executor_to_framework_messages
master/invalid_framework_to_executor_messages
master/invalid_status_update_acknowledgements
master/invalid_status_updates
master/dropped_messages
master/messages_authenticate
master/messages_deactivate_framework
master/messages_decline_offers
master/messages_executor_to_framework
master/messages_exited_executor
master/messages_framework_to_executor
master/messages_kill_task
master/messages_launch_tasks
master/messages_reconcile_tasks
master/messages_register_framework
master/messages_register_slave
master/messages_reregister_framework
master/messages_reregister_slave
master/messages_resource_request
master/messages_revive_offers
master/messages_status_update
master/messages_status_update_acknowledgement
master/messages_unregister_framework
master/messages_unregister_slave
master/messages_update_slave
master/recovery_slave_removals
master/slave_removals/reason_registered
master/slave_removals/reason_unhealthy
master/slave_removals/reason_unregistered
master/valid_framework_to_executor_messages
master/valid_status_update_acknowledgements
master/valid_status_updates
master/task_lost/source_master/reason_invalid_offers
master/task_lost/source_master/reason_slave_removed
master/task_lost/source_slave/reason_executor_terminated
master/valid_executor_to_framework_messages
master/invalid_operation_status_update_acknowledgements
master/messages_operation_status_update_acknowledgement
master/messages_reconcile_operations
master/messages_suppress_offers
master/valid_operation_status_update_acknowledgements

evqueue

master/event_queue_dispatches
master/event_queue_http_requests
master/event_queue_messages
master/operator_event_stream_subscribers

registrar

registrar/state_fetch_ms
registrar/state_store_ms
registrar/state_store_ms/max
registrar/state_store_ms/min
registrar/state_store_ms/p50
registrar/state_store_ms/p90
registrar/state_store_ms/p95
registrar/state_store_ms/p99
registrar/state_store_ms/p999
registrar/state_store_ms/p9999
registrar/state_store_ms/count
registrar/log/ensemble_size
registrar/log/recovered
registrar/queued_operations
registrar/registry_size_bytes

allocator

allocator/allocation_run_ms
allocator/allocation_run_ms/count
allocator/allocation_run_ms/max
allocator/allocation_run_ms/min
allocator/allocation_run_ms/p50
allocator/allocation_run_ms/p90
allocator/allocation_run_ms/p95
allocator/allocation_run_ms/p99
allocator/allocation_run_ms/p999
allocator/allocation_run_ms/p9999
allocator/allocation_runs
allocator/allocation_run_latency_ms
allocator/allocation_run_latency_ms/count
allocator/allocation_run_latency_ms/max
allocator/allocation_run_latency_ms/min
allocator/allocation_run_latency_ms/p50
allocator/allocation_run_latency_ms/p90
allocator/allocation_run_latency_ms/p95
allocator/allocation_run_latency_ms/p99
allocator/allocation_run_latency_ms/p999
allocator/allocation_run_latency_ms/p9999
allocator/roles/shares/dominant
allocator/event_queue_dispatches
allocator/offer_filters/roles/active
allocator/quota/roles/resources/offered_or_allocated
allocator/quota/roles/resources/guarantee
allocator/resources/cpus/offered_or_allocated
allocator/resources/cpus/total
allocator/resources/disk/offered_or_allocated
allocator/resources/disk/total
allocator/resources/mem/offered_or_allocated
allocator/resources/mem/total
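
The registrar and allocator groups expose timers with percentile suffixes such as `/p50` and `/p99`. A minimal sketch of pulling those values out of a flat metric mapping and flagging slow timers follows; the helper names, thresholds, and sample values are assumptions chosen for illustration, not recommended limits.

```python
def timer_stats(snapshot, timer):
    """Collect the suffixed statistics (p50, p99, max, ...) of one timer."""
    prefix = timer + "/"
    return {key[len(prefix):]: value
            for key, value in snapshot.items()
            if key.startswith(prefix)}


def slow_timers(snapshot, thresholds_ms, percentile="p99"):
    """Return timers whose chosen percentile exceeds its threshold."""
    slow = {}
    for timer, limit in thresholds_ms.items():
        value = timer_stats(snapshot, timer).get(percentile)
        if value is not None and value > limit:
            slow[timer] = value
    return slow


# Illustrative values only.
sample = {
    "allocator/allocation_run_ms": 4.0,
    "allocator/allocation_run_ms/p50": 3.5,
    "allocator/allocation_run_ms/p99": 120.0,
    "allocator/allocation_run_latency_ms/p99": 15.0,
    "registrar/state_store_ms/p99": 40.0,
}
thresholds = {
    "allocator/allocation_run_ms": 100.0,
    "registrar/state_store_ms": 50.0,
}
print(slow_timers(sample, thresholds))
```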

Mesos slave metric groups

resources
- slave/cpus_percent
- slave/cpus_used
- slave/cpus_total
- slave/cpus_revocable_percent
- slave/cpus_revocable_total
- slave/cpus_revocable_used
- slave/disk_percent
- slave/disk_used
- slave/disk_total
- slave/disk_revocable_percent
- slave/disk_revocable_total
- slave/disk_revocable_used
- slave/gpus_percent
- slave/gpus_used
- slave/gpus_total
- slave/gpus_revocable_percent
- slave/gpus_revocable_total
- slave/gpus_revocable_used
- slave/mem_percent
- slave/mem_used
- slave/mem_total
- slave/mem_revocable_percent
- slave/mem_revocable_total
- slave/mem_revocable_used
agent
- slave/registered
- slave/uptime_secs
system
- system/cpus_total
- system/load_15min
- system/load_5min
- system/load_1min
- system/mem_free_bytes
- system/mem_total_bytes
executors
- containerizer/mesos/container_destroy_errors
- slave/container_launch_errors
- slave/executors_preempted
- slave/frameworks_active
- slave/executor_directory_max_allowed_age_secs
- slave/executors_registering
- slave/executors_running
- slave/executors_terminated
- slave/executors_terminating
- slave/recovery_errors
tasks
- slave/tasks_failed
- slave/tasks_finished
- slave/tasks_killed
- slave/tasks_lost
- slave/tasks_running
- slave/tasks_staging
- slave/tasks_starting
messages
- slave/invalid_framework_messages
- slave/invalid_status_updates
- slave/valid_framework_messages
- slave/valid_status_updates
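
To look at these raw values before wiring up Telegraf, the metrics can be read directly from a master or agent. The sketch below assumes the node serves its metrics as a flat JSON object at `/metrics/snapshot`; the host name is hypothetical, and the demo swaps in a canned response (`fake_opener`, with made-up values) so that it runs without a cluster.

```python
import io
import json
from urllib.request import urlopen


def fetch_snapshot(base_url, timeout=5.0, opener=urlopen):
    """Fetch and decode the flat metrics object from a master or agent."""
    url = base_url.rstrip("/") + "/metrics/snapshot"
    with opener(url, timeout=timeout) as response:
        return json.load(response)


def select_group(snapshot, prefix):
    """Keep only metrics whose name starts with '<prefix>/'."""
    return {key: value for key, value in snapshot.items()
            if key.startswith(prefix + "/")}


def fake_opener(url, timeout):
    # Canned response so the example runs offline; values are made up.
    body = {"slave/registered": 1.0, "slave/tasks_running": 3.0,
            "system/load_1min": 0.42}
    return io.BytesIO(json.dumps(body).encode("utf-8"))


snapshot = fetch_snapshot("http://mesos-agent.example:5051/", opener=fake_opener)
print(select_group(snapshot, "slave"))
```

Dropping the `opener` argument makes `fetch_snapshot` use `urllib.request.urlopen` against the given URL instead of the canned data.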

You can learn more about Apache Mesos metrics on the Mesos documentation page.

For more information, please check out the documentation.
