DiskIO Metrics for Monitoring


Disk I/O refers to the read and write operations that transfer data between a storage device, such as a hard disk drive, and RAM. Monitoring these read/write operations, and how quickly they complete, is how you measure the performance of storage devices.

Why use the DiskIO Telegraf Plugin?

The DiskIO Telegraf Plugin collects the read/write operations of a disk. You can combine these with other metrics, such as CPU usage and free disk space, to get a comprehensive view of your infrastructure.

How to use the DiskIO Telegraf Plugin

By default, the DiskIO Telegraf Plugin gathers metrics for all devices, including disk partitions. Setting the devices option restricts the stats to the specified devices. On systems that support it, device metadata can be added in the form of tags. Currently only Linux is supported, via udev properties. You can view all available properties for a device by running:

  udevadm info -q property -n /dev/sda
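
To decide which udev properties are worth adding as tags, it can help to load the command's KEY=VALUE output into a dictionary. The following is a minimal sketch rather than part of the plugin: the function name parse_udev_properties and the sample text are hypothetical, and real devices will report different properties.

```python
def parse_udev_properties(output: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, as printed by `udevadm info -q property`."""
    props = {}
    for line in output.splitlines():
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = line.strip().partition("=")
        if sep and key:
            props[key] = value
    return props


# Hypothetical sample output, used here instead of calling udevadm.
sample = """\
DEVNAME=/dev/sda
DEVTYPE=disk
ID_MODEL=Example_Disk
"""

props = parse_udev_properties(sample)
print(props["DEVTYPE"])  # disk
```

On a Linux host, the same function could be given the standard output of the udevadm command above, for example as captured with subprocess.run(..., capture_output=True, text=True).stdout.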

Key DiskIO metrics to use for monitoring

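The plugin reports two tags (name and serial) that identify the device, along with the fields below. Some of the important DiskIO metrics that you should proactively monitor include: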

  • name (device name)
  • serial (device serial number)
  • reads (integer, counter)
  • writes (integer, counter)
  • read_bytes (integer, counter, bytes)
  • write_bytes (integer, counter, bytes)
  • read_time (integer, counter, milliseconds)
  • write_time (integer, counter, milliseconds)
  • io_time (integer, counter, milliseconds)
  • weighted_io_time (integer, counter, milliseconds)
  • iops_in_progress (integer, gauge)
  • merged_reads (integer, counter)
  • merged_writes (integer, counter)

Notes on the metrics collected

reads & writes: These values increment when an I/O request completes.

read_bytes & write_bytes: These values count the number of bytes read from or written to this block device.
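
Because these fields are counters, their raw values are rarely useful on their own; what is usually graphed is the change between two samples divided by the elapsed time. The sketch below illustrates that calculation. The function name per_second and the sample values are hypothetical, and treating a counter that goes backwards as "no data" is an assumption about how you want to handle reboots.

```python
def per_second(prev: int, curr: int, interval_s: float) -> float:
    """Rate of a monotonically increasing counter between two samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr - prev
    if delta < 0:
        # Counter went backwards (for example after a reboot); report no rate.
        return 0.0
    return delta / interval_s


# Hypothetical samples taken 10 seconds apart.
prev = {"reads": 1_000, "read_bytes": 4_096_000}
curr = {"reads": 1_500, "read_bytes": 24_576_000}

read_iops = per_second(prev["reads"], curr["reads"], 10.0)
read_throughput = per_second(prev["read_bytes"], curr["read_bytes"], 10.0)
print(read_iops)        # 50.0 reads per second
print(read_throughput)  # 2048000.0 bytes per second
```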

read_time & write_time: These values count the number of milliseconds that I/O requests have waited on this block device. If multiple I/O requests are waiting at the same time, these values increase by more than 1000 ms per second of elapsed time. For example, if 60 read requests wait for an average of 30 ms each, the read_time field will increase by 60 * 30 = 1800 ms.
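
Dividing the increase in read_time by the increase in reads over the same interval recovers the average wait per request. A minimal sketch, using the numbers from the example above; the function name average_wait_ms is hypothetical.

```python
def average_wait_ms(time_delta_ms: int, ops_delta: int) -> float:
    """Mean time per completed request between two samples, in milliseconds."""
    if ops_delta <= 0:
        # No requests completed in the interval, so there is nothing to average.
        return 0.0
    return time_delta_ms / ops_delta


# 60 reads completed while read_time grew by 60 * 30 = 1800 ms.
print(average_wait_ms(1800, 60))  # 30.0
```

The same calculation applies to write_time and writes.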

io_time: This value counts the number of milliseconds during which the device has had I/O requests queued.
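
A common way to use io_time is to express its increase as a share of the sampling interval, which gives the percentage of time the device had requests queued. The sketch below assumes two samples of the counter and a known interval; the function name utilization_percent and the sample values are hypothetical.

```python
def utilization_percent(io_time_prev_ms: int, io_time_curr_ms: int,
                        interval_ms: float) -> float:
    """Share of the interval during which the device had I/O queued."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    busy_ms = io_time_curr_ms - io_time_prev_ms
    if busy_ms < 0:
        # Counter reset between samples; report no utilization.
        return 0.0
    # Clamp to 100 to absorb small timing differences between samples.
    return min(100.0, 100.0 * busy_ms / interval_ms)


# Hypothetical samples: io_time grew by 2500 ms over a 10 second interval.
print(utilization_percent(120_000, 122_500, 10_000))  # 25.0
```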

weighted_io_time: This value counts the number of milliseconds that I/O requests have waited on this block device. If there are multiple I/O requests waiting, this value will increase as the product of the number of milliseconds times the number of requests waiting (see read_time above for an example).
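
Since weighted_io_time grows by the number of waiting requests for every millisecond that passes, dividing its increase by the interval length gives the average number of requests waiting during that interval. A minimal sketch under that reading; the function name average_queue_depth and the sample values are hypothetical.

```python
def average_queue_depth(weighted_prev_ms: int, weighted_curr_ms: int,
                        interval_ms: float) -> float:
    """Average number of waiting I/O requests between two samples."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    delta = weighted_curr_ms - weighted_prev_ms
    if delta < 0:
        # Counter reset between samples; report an empty queue.
        return 0.0
    return delta / interval_ms


# Hypothetical samples: 4 requests waiting for a full 10 second interval
# add 4 * 10_000 = 40_000 ms to weighted_io_time.
print(average_queue_depth(500_000, 540_000, 10_000))  # 4.0
```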

iops_in_progress: This value counts the number of I/O requests that have been issued to the device driver but have not yet completed. It does not include I/O requests that are in the queue but not yet issued to the device driver.

merged_reads & merged_writes: Reads and writes which are adjacent to each other may be merged for efficiency. Thus, two 4K reads may become one 8K read before being handed to the disk, and so will be counted (and queued) as only one I/O. These fields let you know how often this was done.
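
One way to read these fields is as a merge percentage: the number of merges divided by the sum of merges and completed requests over the same interval. This is a sketch of that ratio, not something the plugin computes for you; the function name merge_percent and the sample values are hypothetical.

```python
def merge_percent(merged_delta: int, completed_delta: int) -> float:
    """Percentage of requests that were merged before reaching the disk."""
    total = merged_delta + completed_delta
    if total <= 0:
        # No activity in the interval.
        return 0.0
    return 100.0 * merged_delta / total


# Hypothetical interval: merged_reads grew by 25 and reads grew by 75.
print(merge_percent(25, 75))  # 25.0
```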

For more information, please check out the documentation.

