Template built by
Telegraf Plugins used:
- 2 Labels:
- 1 Telegraf Configuration
- 1 Dashboard:
- 1 Variable:
If you have your InfluxDB credentials configured in the CLI, you can install this template with:
Ceph cluster monitoring dashboard
Ceph is a free-software storage platform that implements object storage on a single distributed computer cluster and provides interfaces for object-, block-, and file-level storage. Ceph aims primarily for completely distributed operation without a single point of failure, scalability to the exabyte level, and free availability.
Why monitor your Ceph Cluster system?
Monitoring your Ceph Storage infrastructure is as important as monitoring the containers that your applications run in. Ceph uniquely delivers object, block, and file storage in one unified system. Ceph has become popular for being open source and free to use, and is favored by Kubernetes users for being highly reliable and easy to manage. Ceph delivers extraordinary scalability:
- A Ceph Node leverages commodity hardware and intelligent daemons.
- A Ceph Storage Cluster accommodates large numbers of nodes that communicate with each other to replicate and redistribute data dynamically.
The Ceph Storage Cluster receives data from Ceph Clients – whether it comes through a Ceph Block Device, Ceph Object Storage, the Ceph Filesystem, or a custom implementation you create using librados – and stores the data as objects.
How to use the Ceph Cluster Monitoring Template
Once your InfluxDB credentials have been properly configured in the CLI, you can install the Ceph Cluster Monitoring template using the Quick Install command. Once installed, the dashboard is populated by the included Telegraf configuration, which uses the Telegraf Prometheus Input Plugin. You might need to customize the input configuration to suit your needs, for example by specifying a new input value. This depends on how your organization currently runs Ceph.
To find out more about environment variables within the Telegraf configuration, consult the following link.
The Telegraf Prometheus Input Plugin needs to scrape the Ceph MGR(s) Prometheus Metrics Endpoint(s).
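Before pointing Telegraf at an endpoint, it can help to confirm that the endpoint actually serves Ceph metrics. The following is a minimal sketch, assuming a POSIX shell with curl and grep available. The function names are hypothetical, and the check assumes the Ceph MGR metric names start with ceph_, which is worth verifying against your own endpoint.

```shell
# Count the sample lines whose metric name starts with "ceph_" in
# Prometheus text-format input read from stdin. Comment lines (# HELP,
# # TYPE) are not counted because they start with "#".
count_ceph_samples() {
  # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
  # keeps the function's exit status at zero in that case.
  grep -c '^ceph_' || true
}

# Fetch one metrics endpoint and print how many ceph_ samples it exposes.
# $1 is the endpoint URL. A failed request also prints 0.
check_ceph_endpoint() {
  curl --silent --fail --max-time 5 "$1" | count_ceph_samples
}
```

Calling check_ceph_endpoint with the URL you plan to put in CEPH_MGR_SVC_URLS should print a number greater than zero if the Ceph MGR Prometheus module is reachable from the host where Telegraf runs.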
Telegraf configuration requires the following environment variables:
- INFLUX_TOKEN - The token with permissions to read Telegraf configs and write data to the telegraf bucket. You can use your operator token to get started.
- INFLUX_ORG - The name of your Organization.
- CEPH_MGR_SVC_URLS - URLs to Ceph Manager metrics endpoint service(s) e.g.
Any configuration changes reflecting your specific Kubernetes or Ceph installation can be set in the Telegraf configuration manually.
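To illustrate how these variables could be referenced if you edit the configuration by hand, here is a minimal sketch that writes a trimmed-down Telegraf configuration to a temporary file. It is not the configuration shipped with the template. The InfluxDB URL is an assumed local address, and the input section assumes CEPH_MGR_SVC_URLS holds a single endpoint URL.

```shell
# Write a trimmed-down Telegraf configuration to a temporary file.
# The quoted 'EOF' stops the shell from expanding ${...}; Telegraf
# substitutes the environment variables itself when it loads the file.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[[outputs.influxdb_v2]]
  # Assumed InfluxDB address; replace with the URL of your instance.
  urls = ["http://localhost:8086"]
  token = "${INFLUX_TOKEN}"
  organization = "${INFLUX_ORG}"
  bucket = "telegraf"

[[inputs.prometheus]]
  # Assumes CEPH_MGR_SVC_URLS contains exactly one endpoint URL.
  urls = ["${CEPH_MGR_SVC_URLS}"]
EOF
echo "wrote $conf"
```

With the environment variables set, running telegraf --config "$conf" --test gathers the inputs once and prints the metrics to stdout without writing to InfluxDB, which is a low-risk way to check a manual change.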
You MUST set these environment variables before running Telegraf using something similar to the following commands:
- Your token can be found on the Tokens page in your browser:
- Your Organization name can be found on the Settings page in your browser:
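The exact commands depend on your environment. As a minimal sketch, assuming a POSIX shell, the three variables could be exported as follows. Every value shown is a placeholder, and the hostname in CEPH_MGR_SVC_URLS is hypothetical. Port 9283 is the default for the Ceph MGR Prometheus module, but your deployment may use a different one. The loop at the end only checks that nothing was left empty before you start Telegraf.

```shell
# Placeholder values (assumptions); substitute your own before use.
export INFLUX_TOKEN="replace-with-your-token"
export INFLUX_ORG="replace-with-your-org"
# Hypothetical hostname; 9283 is the Ceph MGR Prometheus module default port.
export CEPH_MGR_SVC_URLS="http://ceph-mgr.example.internal:9283/metrics"

# Collect the names of any required variables that are unset or empty.
missing=""
for name in INFLUX_TOKEN INFLUX_ORG CEPH_MGR_SVC_URLS; do
  # Indirect lookup: read the value of the variable named by $name.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    missing="$missing $name"
  fi
done

if [ -n "$missing" ]; then
  echo "missing variables:$missing" >&2
else
  echo "all required variables are set"
fi
```

Exported variables are only visible to processes started from the same shell session, so Telegraf needs to be launched from that session or given the same values through its service environment.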
Key Ceph cluster metrics to monitor
Some of the most important Ceph cluster metrics to monitor proactively include:
- Cluster Latency
- Cluster Capacity
- Cluster Pools
- Cluster I/O
- Objects in Pools
- MON Quorum Status
- MON Quorum Total