Apache Mesos Telegraf Input Plugin


Apache Mesos is an open-source project to manage computer clusters. It abstracts CPU, memory, storage and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be built and run effectively.

Originally developed at the University of California, Berkeley, Mesos was built to provide efficient resource isolation and sharing across distributed applications, or frameworks. It sits between the application layer and the operating system, which makes it easier to deploy and manage applications in large clustered environments, and it can run many different applications on a dynamically shared pool of nodes. Prominent users of Apache Mesos include Airbnb, Xogito and Twitter.

Why use the Apache Mesos Telegraf Plugin?

The Apache Mesos Telegraf Plugin allows you to collect observability metrics provided by the Mesos master and agent nodes and insert them into your InfluxDB instance. The plugin can collect a set of metrics that enable cluster operators to monitor resource usage and detect issues before they become a problem.

Much as an operating system manages the resources of a desktop computer, Apache Mesos makes sure that applications have access to the resources they need in a cluster. Rather than setting up separate server clusters for different parts of an application, you can use Mesos to share one pool of servers that run those different parts without interfering with one another.

Monitoring such an environment is critical: an issue that goes undetected could bring key functionality of your application to a halt. Mesos also eliminates many of the manual steps formerly required to deploy applications, and it can shift workloads automatically to improve fault tolerance and keep utilization high, which is another reason to monitor it continuously.

How to monitor Apache Mesos using the Telegraf plugin

Configuring the Apache Mesos Telegraf plugin is straightforward: list the master and slave (agent) nodes to collect metrics from, choose the metric groups to gather, set a timeout, and optionally configure TLS. Once this is set up, you can start sending metrics to your InfluxDB instance.

Configuring the Apache Mesos Telegraf plugin

The Apache Mesos Telegraf Plugin collects metrics from Apache Mesos and inserts them into InfluxDB. By default, this plugin is not configured to gather metrics from Mesos, since a cluster can be deployed in numerous ways. You need to specify the master and slave (agent) nodes for this plugin to gather metrics from.

To configure the Apache Mesos Telegraf plugin in your own environment, add the following section to your Telegraf configuration file and fill in the values relevant to your project:

# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  ## Timeout, in ms.
  timeout = 100

  ## A list of Mesos masters.
  masters = ["http://localhost:5050"]

  ## Master metrics groups to be collected, by default, all enabled.
  master_collections = [
    "resources",
    "master",
    "system",
    "agents",
    "frameworks",
    "framework_offers",
    "tasks",
    "messages",
    "evqueue",
    "registrar",
    "allocator",
  ]

  ## A list of Mesos slaves, default is []
  # slaves = []

  ## Slave metrics groups to be collected, by default, all enabled.
  # slave_collections = [
  #   "resources",
  #   "agent",
  #   "system",
  #   "executors",
  #   "tasks",
  #   "messages",
  # ]

  ## Optional TLS Config
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false
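
The sample above polls a single local master and leaves slave collection commented out. As a rough sketch of a multi-node setup, the variant below enables both masters and slaves and narrows the metric groups. The hostnames are placeholders and the longer timeout is an arbitrary choice, not values from the plugin documentation; only option names that appear in the sample above are used.

```toml
# Sketch only: hostnames are hypothetical placeholders, adjust to your cluster.
[[inputs.mesos]]
  ## Timeout, in ms. Raised from the sample value as an assumption for remote nodes.
  timeout = 500

  ## Master to poll (5050 is the default Mesos master port).
  masters = ["http://mesos-master-1.example.com:5050"]

  ## Collect only a subset of the master metric groups.
  master_collections = ["resources", "master", "system", "tasks"]

  ## Slaves (agents) to poll (5051 is the default Mesos agent port).
  slaves = [
    "http://mesos-agent-1.example.com:5051",
    "http://mesos-agent-2.example.com:5051",
  ]

  ## Collect only a subset of the slave metric groups.
  slave_collections = ["resources", "agent", "system", "tasks"]
```

One way to check such a configuration before writing to InfluxDB is to run `telegraf --config telegraf.conf --test`, which gathers the inputs once and prints the resulting metrics to stdout.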

Key Apache Mesos metrics to use for monitoring

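The plugin reads metrics from the /metrics/snapshot endpoint of each configured master and slave (agent) node. Master metrics are organized into the groups resources, master, system, agents, frameworks, framework_offers, tasks, messages, evqueue, registrar and allocator; slave metrics are organized into resources, agent, system, executors, tasks and messages.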

For more information, please check out the documentation.
