Google Cloud's operations suite (formerly Google Stackdriver) is a cloud computing systems management service offered by Google that provides performance metrics for both Google Cloud and AWS cloud environments.
Why use the Google Stackdriver Telegraf Plugin?
The Google Stackdriver Telegraf Plugin lets you query over 1,500 metrics from Google Cloud and Amazon Web Services through the Cloud Monitoring API v3 and store them in InfluxDB. You can also add metrics from your infrastructure, networks, applications, and more to get a full view of your entire stack.
Please note that this plugin calls APIs that are chargeable, so using it may incur costs.
How to monitor Google Stackdriver using the Telegraf plugin
Once you have installed InfluxDB and Telegraf, you can configure the Google Stackdriver Telegraf plugin with settings such as:
- Maximum number of API calls to make per second
- Collection delay
- TTL for cached list of metric types
You can also add filters to reduce the number of time series matched.
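The settings above can be sketched as a Telegraf configuration fragment. This is a minimal sketch, not a definitive setup: the option names follow the plugin's sample configuration as best recalled and should be checked against the README for your Telegraf version. The project ID, the values, and the label key in the filter are hypothetical placeholders.

```toml
[[inputs.stackdriver]]
  ## Google Cloud project to query (hypothetical placeholder ID)
  project = "my-example-project"

  ## Only gather time series whose metric type starts with one of these prefixes
  metric_type_prefix_include = ["compute.googleapis.com/"]

  ## Most metrics update at most once per minute, so a short interval gains little
  interval = "1m"

  ## Maximum number of API calls to make per second (illustrative value)
  rate_limit = 14

  ## Collection delay: if set too low, recent points may not be available yet
  delay = "5m"

  ## TTL for the cached list of metric types; also the longest it can take
  ## for newly created metric types to be discovered
  cache_ttl = "1h"

  ## Filters reduce the number of time series matched
  [inputs.stackdriver.filter]
    [[inputs.stackdriver.filter.resource_labels]]
      key = "instance_name"                 # hypothetical label key
      value = 'starts_with("localhost")'
```

Because each API call may be billed, a narrow prefix list combined with label filters is generally a cheaper starting point than collecting everything and discarding data later.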
Key Stackdriver metrics to use for monitoring
Some of the important Stackdriver metrics that you should proactively monitor include:
- Google Cloud metrics, for Google Cloud services such as Compute Engine and BigQuery
- Kubernetes metrics, for Google Kubernetes Engine (GKE)
- Istio metrics, for Istio on Google Kubernetes Engine
- Anthos metrics, for Anthos clusters on VMware
- Metrics for Cloud Monitoring and Cloud Logging agents, Amazon Web Services, open-source, and third-party applications
- Agent metrics, for VM instances running the Monitoring and Logging agents
- AWS metrics, for Amazon Web Services such as Amazon Redshift and Amazon CloudFront
- Knative metrics, for Knative components
- External metrics, for open-source and third-party applications
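As a rough illustration, the categories listed above could be selected by metric type prefix. The prefixes below are assumptions based on Google Cloud's published metric lists and should be verified before use. The project ID is a hypothetical placeholder.

```toml
[[inputs.stackdriver]]
  project = "my-example-project"   # hypothetical placeholder ID
  interval = "1m"

  ## Assumed prefixes, one or more per category; trim this list to what you need
  metric_type_prefix_include = [
    "compute.googleapis.com/",     # Google Cloud: Compute Engine
    "bigquery.googleapis.com/",    # Google Cloud: BigQuery
    "kubernetes.io/",              # Kubernetes (GKE), including Anthos clusters
    "istio.io/",                   # Istio on GKE
    "agent.googleapis.com/",       # Monitoring and Logging agents on VM instances
    "aws.googleapis.com/",         # Amazon Web Services
    "knative.dev/",                # Knative components
    "external.googleapis.com/",    # open-source and third-party applications
  ]
```

Each additional prefix increases the number of time series requested, and therefore the number of chargeable API calls.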