Kubernetes Monitoring Integration
InfluxData for a Highly Available Kubernetes Monitoring Solution
When simple and easy takes you all the way. Full-stack monitoring, high availability with clustering, and business analytics are what get you to “Faster time to awesome”.
Data granularity, push and pull mechanisms, flexible retention periods, data roll-up control, speed, and high availability all matter when you inspect, understand and manage your application environment in depth. Most production environments don’t have a single approach to deploying and managing their applications. You should therefore consider a platform that can handle variation, custom implementations, and business uniqueness, while supporting fast release cycles and evolution over time.
InfluxData’s InfluxEnterprise is the perfect long-term store for time series data in the form of both metrics and events. You can use this data to monitor all aspects of your full stack, including Kubernetes.
Kubernetes is an open source platform designed to automate the deployment, scaling, and management of containerized applications. Using Kubernetes, you can quickly and efficiently respond to customer demand with fast and reliable application deployment, scale-out and new feature rollout. You can even restrict hardware usage to certain resources. Moving from a host-centric to a container-centric infrastructure allows for portability and orchestration, and thereby lets developers ‘cut the cord’ to physical and virtual machines.
In a simplified view, a Kubernetes architecture consists of an API server orchestrating the desired containerized application cluster deployments, abstracted as Objects. A deployment object represents a “record of intent”; the Kubernetes system constantly works to ensure that the object exists as specified in its declared desired state. The Kubernetes architecture components are the master node (API server), nodes and addons.
- The master node runs the control plane with a controller manager, cluster metadata store, scheduler, and a front-end API Server. The master node communicates with the cluster through the kubelet process that runs on each node. It can also reach any node, pod or service through the API server’s proxy functionality.
- Nodes are not inherent to Kubernetes. They are worker machines, part of a pool of virtual or physical resources. The node components are the kubelet agent (which ensures that containers are running in a pod), kube-proxy and the container runtime.
- Addons are pods and services that implement cluster features. Pods are inherent to Kubernetes: they are disposable, and are created, destroyed, and replicated according to the deployment object rules. Each pod encapsulates an application container, storage resources, a unique IP address and options that govern how its containers should run. Because Pods are ‘mortal’, so are their assigned IP addresses; there is therefore no reliable IP address for communicating with the Pods of a cluster. Kubernetes uses the Service abstraction to define a logical set of Pods and a policy by which to access them (a micro-service).
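The “record of intent” idea described above can be illustrated with a toy reconciliation pass. This is a minimal sketch in Python, not Kubernetes code: the `reconcile` function and the pod naming scheme are hypothetical, and a real controller talks to the API server rather than editing a list.

```python
from typing import List


def reconcile(desired_replicas: int, running_pods: List[str], name: str = "web") -> List[str]:
    """Return the pod list after one reconciliation pass (hypothetical helper).

    Compares the desired state with the observed state and creates or
    removes pods until the two match.
    """
    desired_replicas = max(desired_replicas, 0)
    pods = list(running_pods)
    counter = 0
    # Scale up: create pods with unused names until the desired count is reached.
    while len(pods) < desired_replicas:
        candidate = f"{name}-{counter}"
        if candidate not in pods:
            pods.append(candidate)
        counter += 1
    # Scale down: remove surplus pods from the end of the list.
    while len(pods) > desired_replicas:
        pods.pop()
    return pods
```

Calling `reconcile(3, ["web-0"])` adds two pods, and calling it again on the result changes nothing, which is the property that lets the system keep re-running the loop safely.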
As is the case for any modern platform, Kubernetes clusters must be monitored. This means monitoring the master node, nodes, containers, pods, services and the characteristics of the overall cluster. Timely action is of the essence in today’s fast-paced competitive market, which has no tolerance for latency, errors, congestion, or quality degradation. Making matters worse, networks are plagued with security threats and resource misuse issues. Anticipation is the answer to staying ahead of the game, and it requires visibility into what is going on in your production environment. That visibility matters not only at the moment an issue arises: historical insights also prove useful in forecasting. With this in mind, a completely scalable and highly available solution for collecting and storing the time series data used for full-stack monitoring becomes important.
Kubernetes comes with some built-in pipeline metrics:
- Resource, CPU and memory usage metrics via the Metrics API. The Metrics Server collects metrics from the Summary API, exposed by the kubelet on each node.
- A richer set of hardware and OS metrics can be collected via the Prometheus Node Exporter.
However, this is not enough, as it represents only a small part of full-stack monitoring. Full-stack monitoring consists of monitoring your infrastructure, applications, services, nodes, and even traffic flow.
Prometheus monitoring is native to Kubernetes and can collect and store metrics, on a single node, by pulling the exposed data. Its collection model is pull-based (push is possible only through the separate Pushgateway component), so events that occur at irregular intervals can go unnoticed, which can lead to a whole spectrum of escalated service issues that timely action would have avoided. Prometheus is also not suited for high sampling rates and long data retention periods, which may be required for historical reporting and forecasting. Although it is a nice first step, sooner or later you will hit the hard limit of single-node storage and look for support for high availability clusters.
In summary, Kubernetes comes with APIs that expose logs and resource metrics from its objects, which can be pulled and stored for monitoring. This data, in turn, is used to evaluate application performance and resource usage by examining containers, pods, and nodes. However, this built-in monitoring still needs to be complemented so that DevOps and NetOps professionals are well equipped to stay ahead of trouble. Additionally, the need for scalable storage and real-time data analytics is pressing, even more so in environments where containers live for only a short period and are continuously being created by orchestrators like Kubernetes.
Professionals need to constantly reevaluate which tools they should use to guide them in their decisions regarding their Kubernetes implementation effectiveness. Richer, more granular metrics and events from the orchestrator, the infrastructure and the application level can be collected via integration with best-in-class solutions. The next question is: how do we normalize and store everything for analysis and visualization, consistently, in real-time, with rollup control and the appropriate retention policies?
InfluxData’s Time Series Platform makes this job easy by providing full-stack monitoring and setting the foundation as a long-term metric and event store. And as such, it can be used with real-time analytics and visualization tools.
InfluxData Support: Kubernetes Monitoring Architecture
InfluxData supports both pull and push collection of metrics and events from multiple sources, including directly from Kubernetes nodes, the master node and Prometheus exposed endpoints. Your investment in Prometheus can be preserved, since InfluxData can read Prometheus endpoints without code changes. InfluxData’s InfluxEnterprise can store all collected data in clusters for scalability and availability, allowing long retention periods and high sampling rates.
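To make the idea of reading a Prometheus endpoint more concrete, the following minimal sketch converts Prometheus text exposition lines into InfluxDB line protocol. It is an illustration in Python, not how Telegraf is implemented: the function name is hypothetical, every sample is written to a single field called `value` (an assumption made for simplicity), and the optional Prometheus timestamp is dropped so that the database would assign the write time.

```python
import math
import re
from typing import List

# Matches: metric_name{label="value",...} value [timestamp]
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+(\d+))?$')
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def _escape_tag(text: str) -> str:
    """Backslash-escape the characters that are special in line protocol tags."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def prometheus_to_line_protocol(exposition: str) -> List[str]:
    """Convert Prometheus exposition text to line protocol strings (sketch)."""
    lines = []
    for raw in exposition.splitlines():
        raw = raw.strip()
        if not raw or raw.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(raw)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, labels, value, _timestamp = match.groups()
        number = float(value)
        if not math.isfinite(number):
            continue  # NaN and Inf cannot be written as field values
        tags = "".join(
            f",{_escape_tag(key)}={_escape_tag(val)}"
            for key, val in sorted(_LABEL.findall(labels or ""))
        )
        lines.append(f"{name}{tags} value={number}")
    return lines
```

Labels become tags and the sample becomes a field, which is why existing Prometheus instrumentation can be reused without touching application code.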
Figure: Chronograf query listing metrics from Prometheus endpoints.
For a native Kubernetes deployment, InfluxData’s metrics collection agent, Telegraf, can be deployed as a DaemonSet so that it runs on every Kubernetes node. Telegraf can scrape kube-state-metrics from the Kubernetes master, as well as metrics from applications and containers, and store them in InfluxDB.
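As a rough illustration of the DaemonSet approach, the sketch below builds a minimal DaemonSet manifest as a Python dictionary. It is not InfluxData’s published manifest: the `monitoring` namespace, the `app: telegraf` label, the `telegraf:latest` image tag and the `HOSTIP` variable are assumptions chosen for the example, and a real deployment would also need a Telegraf configuration (for example from a ConfigMap) and suitable access permissions.

```python
import json
from typing import Any, Dict


def telegraf_daemonset(namespace: str = "monitoring", image: str = "telegraf:latest") -> Dict[str, Any]:
    """Build a minimal DaemonSet manifest (hypothetical names and defaults)."""
    labels = {"app": "telegraf"}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "telegraf", "namespace": namespace, "labels": labels},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "telegraf",
                            "image": image,
                            # Expose the node's IP so the agent can reach the local kubelet.
                            "env": [
                                {
                                    "name": "HOSTIP",
                                    "valueFrom": {"fieldRef": {"fieldPath": "status.hostIP"}},
                                }
                            ],
                        }
                    ]
                },
            },
        },
    }


def to_json(manifest: Dict[str, Any]) -> str:
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Because a DaemonSet schedules one pod per node, new nodes that join the cluster get an agent without any extra deployment step.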
Figure: Native Kubernetes monitoring with Telegraf deployed as a DaemonSet on each node, sending data to an InfluxDB HA cluster for long-term storage.
Figure: Pre-canned Chronograf dashboard for Kubernetes monitoring, using InfluxData as the datasource.
Additional Telegraf plugins are available for baseline monitoring, e.g.:
- Network: ping plugin, Socket listener plugin, HTTP plugin, Network response plugin
- System: CPU plugin, Disk plugin, DiskIO plugin, Mem plugin, Net plugin, Netstat plugin
- Application: Apache HTTP Server plugin, Apache Tomcat plugin, Docker plugin, Kubernetes plugin
Also available from InfluxData is a long list of plugins that collect metrics and events from tools such as Prometheus, Nagios, and Icinga. There are also plugins that can collect data for traffic and flow monitoring.
With the InfluxData TICK Stack fully installed, collected data can be sent to Kapacitor for alert processing and data transformation, and visualized in Chronograf.
Kubernetes High Availability
How InfluxData Makes It Easier to Work with Kubernetes
InfluxData has added Kubernetes-specific capabilities to make it easier for its users to work with Kubernetes:
- Helm Charts for Faster Node Deployment
kube-influxdb is a collection of Helm Charts for the InfluxData TICK Stack to monitor Kubernetes with InfluxData.
- Native Kubernetes Operators
A Kubernetes operator to manage InfluxDB instances. This Operator is built using the Operator SDK, which is part of the Operator Framework and manages one or more InfluxDB instances deployed on Kubernetes.
- Telegraf Kubernetes input plugin
The Kubernetes input plugin talks to the kubelet API using the /stats/summary endpoint to gather metrics about the running pods and containers for a single host.
- Telegraf, InfluxDB, and Grafana Kubernetes apps
Kubernetes apps are prepackaged applications that can be deployed to Google Kubernetes Engine in minutes.
- Telegraf plugin for Prometheus format
- High Availability (HA) and scalability of monitored data
Large volumes of metrics and events can be preserved in InfluxDB storage clusters, allowing long retention policies together with high data granularity and high series cardinality.
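To show what gathering pod and container metrics from the kubelet’s /stats/summary endpoint might look like, here is a minimal Python sketch that turns an already-fetched summary document into line protocol points. It is not the Telegraf plugin’s code: the function name is hypothetical, only two statistics are extracted, and the measurement, tag and field names are modeled on the plugin’s output and should be treated as illustrative.

```python
from typing import Any, Dict, List


def summary_to_points(summary: Dict[str, Any]) -> List[str]:
    """Convert a kubelet summary document into line protocol strings (sketch).

    Tag values are assumed to contain no spaces, commas or equals signs,
    so no escaping is applied here.
    """
    node = summary["node"]["nodeName"]
    points = []
    for pod in summary.get("pods", []):
        ref = pod["podRef"]
        for container in pod.get("containers", []):
            cpu = container.get("cpu", {}).get("usageNanoCores", 0)
            mem = container.get("memory", {}).get("workingSetBytes", 0)
            points.append(
                "kubernetes_pod_container,"
                f"container_name={container['name']},namespace={ref['namespace']},"
                f"node_name={node},pod_name={ref['name']} "
                # The trailing "i" marks the values as integers in line protocol.
                f"cpu_usage_nanocores={cpu}i,memory_working_set_bytes={mem}i"
            )
    return points
```

Tagging each point with the node, namespace, pod and container names is what later allows dashboards to group or filter by any of them, at the cost of higher series cardinality.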
InfluxData—Monitor More than Just Kubernetes
InfluxDB is the market choice for time series data that can power best-in-class solutions addressing the needs of DevOps, NetOps, security, workflow automation, and business intelligence. Its vibrant community embraced the InfluxDB open source project for its ease of deployment and operation, its speed, and its support for push and pull data collection. In addition, its features allow for downsampling, rollups and long retention periods for historical data, all of which are important to users. InfluxData’s core capabilities meet the needs of both ends of the spectrum: from simple single-node deployments to large, complex high availability clusters and distributed resource-constrained environments.
In summary, InfluxData differentiates itself with these qualitative and quantitative advantages:
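The downsampling, rollup and retention concepts mentioned above can be sketched in a few lines. InfluxDB expresses them through retention policies and continuous queries; the Python below only illustrates the underlying idea, and the function names and the fixed-window mean are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (unix timestamp in seconds, value)


def downsample_mean(points: List[Point], window_s: int) -> List[Point]:
    """Roll raw points up into fixed windows, keeping the mean of each window."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its window.
        buckets[ts - ts % window_s].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def apply_retention(points: List[Point], now_s: int, keep_s: int) -> List[Point]:
    """Drop points older than the retention period."""
    return [(ts, value) for ts, value in points if ts >= now_s - keep_s]
```

A common pattern is to keep raw data for a short period and the rolled-up series for much longer, which trades granularity for storage on older data.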
- Visibility – Telegraf can collect metrics and events via the 200+ collector agents to provide visibility across your infrastructure, clusters, hosts, VMs, data stores and container orchestrators like Kubernetes.
- Clustering and High Availability (HA) – InfluxEnterprise provides clustering, an ideal solution for long-term storage to complement Prometheus’ ephemeral storage.
- Security and Alerting – For Enterprises that need advanced features such as fine-grained security, real-time analytics and alerting, InfluxData provides the perfect complementary stack to Prometheus.
- Predefined Dashboards – Pre-canned dashboards are available in Chronograf for monitoring Kubernetes, applications, networks, and systems.
- Handle both Metrics and Events – Prometheus is good for scraping (or Pull) Metrics but not a good solution for Events (Push). InfluxData enables enterprises to monitor both metrics and events and furthermore blends them with business context as needed.
- Real-Time Analytics – InfluxData’s real-time processing engine, Kapacitor, allows raw, aggregated or transformed metrics to be used for alerting, workflow automation, machine learning, and business intelligence.
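The alerting use case in the list above can be sketched as a simple threshold check over a stream of points. Kapacitor tasks are written in its own TICKscript language, so the Python below is only an illustration of the logic; the function name, the level names and the single critical threshold are assumptions.

```python
from typing import Iterable, List, Tuple


def threshold_alerts(values: Iterable[Tuple[int, float]], crit: float) -> List[Tuple[int, str]]:
    """Emit (timestamp, level) events only when the alert level changes."""
    events = []
    level = "OK"
    for ts, value in values:
        new_level = "CRITICAL" if value > crit else "OK"
        if new_level != level:
            # Report state changes only, to avoid one alert per data point.
            events.append((ts, new_level))
            level = new_level
    return events
```

Reporting only the transitions keeps a sustained problem from producing a flood of identical notifications.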
With a motivated global developer community, InfluxData is continuously pushing the edge of time series storage and processing. It delivers a solution that meets the requirements of even the most sensitive and demanding sectors, such as finance and service providers.
Fundamentally, InfluxData is an enterprise-grade platform for the collection, storage and real-time analysis of time series data. This data’s value is in its ability to ensure the availability and reliability of host-based or container-based infrastructure, services, and applications. Business metrics are also collected in time series data format and applied to measure business performance, which will drive investment decisions. The combination of Kubernetes with InfluxData provides the application environment foundation to support organizations on their journey to growth and modernization.
Content on Kubernetes and InfluxData
- Webinar: How InfluxData makes Kubernetes an even better Master of its components through monitoring: This webinar shows you how to use InfluxData to help Kubernetes orchestrate the scaling out of applications by monitoring all components of the underlying infrastructure.
- Webinar: Service Discovery, pull and Kubernetes: How Kapacitor’s new Service discovery and scraping code will allow any service discovery target that works with Prometheus to work with Kapacitor.
- Blog: Draft for Kubernetes – A Prototyping Tool