CGroups (also known as control groups) are a Linux kernel feature that organizes processes into hierarchical groups in order to limit, account for, and isolate the resource usage (CPU, memory, disk I/O, network, etc.) of each collection of processes.
Why use a Telegraf plugin for CGroup?
With CGroups, you can allocate resources among groups of tasks (processes) running on a system, a common need in both infrastructure and application environments. When you use the CGroup Telegraf Plugin to monitor the CGroups you have configured, the collected metrics help you decide whether to allow or deny a CGroup access to resources. This information can also help you reconfigure your CGroups dynamically on a running system.
How to monitor CGroup using the Telegraf plugin
CGroups allow you to allocate resources (such as CPU time, system memory, network bandwidth, or combinations of these) among user-defined groups of tasks (processes) running on a system. You can monitor the CGroups you configure, deny CGroups access to certain resources, and even reconfigure your CGroups dynamically on a running system.
This Telegraf input plugin captures specific statistics per CGroup. If you have a large number of CGroups, consider restricting paths to the set of CGroups you really want to monitor, to avoid cardinality issues.
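To make the cardinality concern concrete, the following Python sketch counts how many statistic files a set of directory globs and file-name globs would match. It is only an illustration, not the plugin's implementation: the function name `expand_cgroup_files` is hypothetical, and the example path assumes a cgroup v1 memory hierarchy mounted at `/sys/fs/cgroup/memory`.

```python
import glob
import os


def expand_cgroup_files(paths, files):
    """Return the sorted files matched by every path glob joined with every file glob.

    `paths` are directory globs and `files` are file-name globs
    (hypothetical parameters, chosen for illustration only).
    """
    matches = set()
    for path_pattern in paths:
        for directory in glob.glob(path_pattern):
            if not os.path.isdir(directory):
                continue
            for file_pattern in files:
                for candidate in glob.glob(os.path.join(directory, file_pattern)):
                    if os.path.isfile(candidate):
                        matches.add(candidate)
    return sorted(matches)


if __name__ == "__main__":
    # Assumed cgroup v1 layout; adjust the path for your own system.
    broad = expand_cgroup_files(["/sys/fs/cgroup/memory/*"], ["memory.*"])
    print(f"a broad glob matches {len(broad)} files")
```

Each matched file can become at least one field or series, so a wildcard over many child groups multiplies the amount of data collected, while naming a single directory keeps it bounded.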
Key CGroup metrics to use for monitoring
Some of the important CGroup metrics that you should proactively monitor include:
- Memory usage
- Memory limits in bytes
- CPU consumption by container CGroups or by child groups under a CGroup
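As a rough illustration of how the two memory metrics above relate, the sketch below reads a CGroup's memory usage and limit and reports usage as a fraction of the limit. It assumes the cgroup v1 file names `memory.usage_in_bytes` and `memory.limit_in_bytes`; the helper names `read_counter` and `memory_utilization` are hypothetical and are not part of Telegraf.

```python
from pathlib import Path


def read_counter(cgroup_dir, file_name):
    """Read a single-integer cgroup file such as memory.usage_in_bytes."""
    return int((Path(cgroup_dir) / file_name).read_text().strip())


def memory_utilization(cgroup_dir):
    """Return usage / limit for one cgroup, or None when the limit is not positive."""
    # File names below assume a cgroup v1 memory controller.
    usage = read_counter(cgroup_dir, "memory.usage_in_bytes")
    limit = read_counter(cgroup_dir, "memory.limit_in_bytes")
    if limit <= 0:
        return None
    return usage / limit
```

A value close to 1.0 suggests the group is near its memory limit. On cgroup v1, a group with no limit configured typically reports a very large limit value, so the ratio for such a group will be close to zero.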