How to Monitor Kubernetes K3s Using Telegraf and InfluxDB Cloud

This article was originally published in The New Stack and is reposted here with permission.

A Helm chart can simplify our lives and enable us to see what is happening with our K3s cluster using an external system.

Lightweight Kubernetes, known as K3s, is a Kubernetes distribution with roughly half the memory footprint of a standard installation.

Do you need to monitor your nodes running K3s to know the status of your cluster? Do you also need to know how your pods perform, the resources they consume, and the network traffic they generate? In this article, I will show you how to monitor K3s with Telegraf and InfluxDB Cloud.

I run a blog and a few other resources on Kubernetes. Specifically, these run on a three-node K3s cluster in DigitalOcean, and I use Telegraf and InfluxDB to monitor everything.

I’m going to demonstrate how to monitor the cluster to make sure that everything is running as expected, and how to identify the problem when it is not.

To monitor the cluster I use two components:

InfluxDB Cloud: It’s ideal to monitor from outside the cluster, because if the monitoring runs inside the cluster and a node goes down, the monitoring solution goes down with it, which defeats the purpose. You can get a free InfluxDB account here: https://cloud2.influxdata.com/signup/

Telegraf: We install it with a Helm chart adapted for K3s (the telegraf-ds-k3s chart used below), because it leaves out the Docker engine monitoring that K3s doesn’t need.

Let’s do it…

Configuring InfluxDB Cloud

The first thing we must do is create an account in InfluxDB Cloud. Next, we go to the Data section, click on Buckets, and then on Create Bucket.

Name the bucket and click on Create.

This is what our list of buckets should look like. After successfully creating the bucket, we create an access token to be able to write data to that bucket. To do that we go to the Tokens tab.

In this section, we click on Generate Token and choose the Read/Write Token option.

We specify a name, choose the bucket we want to associate with this token, and click on Save.

Once this is done, the new token appears in the token list.
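
If you prefer the command line over the UI, the influx CLI (v2) can create the same bucket and token. This is just a sketch, assuming the CLI is installed and already authenticated against your InfluxDB Cloud account; <your-org> and <bucket-id> are placeholders for your own values.

# Create the bucket (the organization defaults to the email you signed up with)
$ influx bucket create --name kubernetes --org <your-org>

# Create a read/write token scoped to that bucket
$ influx auth create --org <your-org> --read-bucket <bucket-id> --write-bucket <bucket-id> --description "telegraf-k3s"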

To finish this part, we are going to need our Org ID and the URL to point our Telegraf to.

The Org ID is the email address you used to sign up for InfluxDB Cloud. I get the URL from the browser’s address bar. In my case, when I set up my InfluxDB Cloud account, I chose the western United States region, so my URL looks like this:

https://us-west-2-1.aws.cloud2.influxdata.com
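
Before moving on, it’s worth checking that this URL is reachable from your nodes. A quick sanity check, assuming curl is available on the node; the /ping endpoint of InfluxDB 2.x and Cloud should answer with an HTTP 204 if the instance is reachable:

# Prints only the HTTP status code; expect 204
$ curl -s -o /dev/null -w "%{http_code}\n" https://us-west-2-1.aws.cloud2.influxdata.com/ping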

Now that we’ve configured InfluxDB Cloud, we need to configure the nodes.

As I mentioned above, we are going to use a Helm chart. I modified this Helm chart to adapt it to K3s, because by default it tries to monitor Docker, which this Kubernetes distribution doesn’t use.

If you don’t have Helm installed, you can install it by running this command:

$ curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
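
You can confirm that Helm installed correctly by printing its version:

$ helm version --short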

Once Helm is installed, download the values.yaml file here, or grab the raw file directly on the master node with wget:

$ wget https://raw.githubusercontent.com/xe-nvdk/awesome-helm-charts/main/telegraf-ds-k3s/values.yaml

Now, we have to modify this file a bit. We need to open it and modify the Output section. By default the file looks like this:

## Exposed telegraf configuration
## ref: https://docs.influxdata.com/telegraf/v1.13/administration/configuration/
config:
  # global_tags:
  #   cluster: "mycluster"
  agent:
    interval: "10s"
    round_interval: true
    metric_batch_size: 1000
    metric_buffer_limit: 10000
    collection_jitter: "0s"
    flush_interval: "10s"
    flush_jitter: "0s"
    precision: ""
    debug: false
    quiet: false
    logfile: ""
    hostname: "$HOSTNAME"
    omit_hostname: false
  outputs:
    - influxdb:
        urls:
          - "http://influxdb.monitoring.svc:8086"
        database: "telegraf"
        retention_policy: ""
        timeout: "5s"
        username: ""
        password: ""
        user_agent: "telegraf"
        insecure_skip_verify: false
  monitor_self: false

But since we are going to use InfluxDB Cloud, we must make some adjustments. The modified version will look something like this:

## Exposed telegraf configuration
## ref: https://docs.influxdata.com/telegraf/v1.13/administration/configuration/
config:
  # global_tags:
  #   cluster: "mycluster"
  agent:
    interval: "1m"
    round_interval: true
    metric_batch_size: 1000
    metric_buffer_limit: 10000
    collection_jitter: "0s"
    flush_interval: "10s"
    flush_jitter: "0s"
    precision: ""
    debug: false
    quiet: false
    logfile: ""
    hostname: "$HOSTNAME"
    omit_hostname: false
  outputs:
    - influxdb_v2:
        urls:
          - "https://us-west-2-1.aws.cloud2.influxdata.com"
        bucket: "kubernetes"
        organization: "[email protected]"
        token: "WIX6Fy-v10zUIag_dslfjasfljadsflasdfjasdlñjfasdlkñfj=="
        timeout: "5s"
        insecure_skip_verify: false
  monitor_self: false

If you need to adjust other values, like the collection interval, you can do so by changing the interval value in the agent section. For example, I don’t need data every 10 seconds, so I changed it to 1 minute.
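
As an alternative to editing values.yaml again for a one-off change, Helm can override a single value on the command line with --set. This is a sketch, assuming the config.agent.interval key shown in the values file above and the install command used in the next step:

# --set takes precedence over the value in values.yaml
$ helm upgrade --install telegraf-ds-k3s -f values.yaml \
    --set config.agent.interval="30s" \
    awesome-helm-charts/telegraf-ds-k3s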

Now we come to the moment of truth! We are going to install the Helm Chart and see if everything works as expected. Depending on your K3s configuration, you might need to pass the cluster configuration as a KUBECONFIG environment variable.

$ export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
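
With KUBECONFIG exported, a quick way to confirm that kubectl can reach the cluster is to list the nodes; you should see each of your K3s nodes in Ready status:

$ kubectl get nodes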

Once that’s done, we’re going to add the Awesome-Helm-Charts repo.

$ helm repo add awesome-helm-charts https://xe-nvdk.github.io/awesome-helm-charts/

Then we update the content of the repos that we configured.

$ helm repo update

Finally, we install the chart, passing it the configuration we just modified in the values.yaml file:

$ helm upgrade --install telegraf-ds-k3s -f values.yaml awesome-helm-charts/telegraf-ds-k3s

The terminal should return something similar to this:

Release "telegraf-ds-k3s" does not exist. Installing it now.
NAME: telegraf-ds-k3s
LAST DEPLOYED: Fri Jun 25 22:47:22 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To open a shell session in the container running Telegraf run the following:

- kubectl exec -i -t --namespace default $(kubectl get pods --namespace default -l app.kubernetes.io/name=telegraf-ds -o jsonpath='{.items[0].metadata.name}') /bin/sh

To tail the logs for a Telegraf pod in the Daemonset run the following:

- kubectl logs -f --namespace default $(kubectl get pods --namespace default -l app.kubernetes.io/name=telegraf-ds -o jsonpath='{ .items[0].metadata.name }')

To list the running Telegraf instances run the following:

- kubectl get pods --namespace default -l app.kubernetes.io/name=telegraf-ds -w

This output shows that the Helm chart deployed successfully. Keep in mind that the chart deploys a DaemonSet, which automatically runs a Telegraf pod on each node of the cluster.

To check that everything is running properly use the following command:

$ kubectl get pods

We see that our pod is alive and kicking.

NAME                    READY   STATUS    RESTARTS   AGE
telegraf-ds-k3s-w8qhc   1/1     Running   0          2m29s
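
Because the chart deploys a DaemonSet, a cluster with several nodes should show one telegraf-ds-k3s pod per node. Adding -o wide (with the label selector from the chart notes above) shows which node each pod is scheduled on:

$ kubectl get pods -l app.kubernetes.io/name=telegraf-ds -o wide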

If you want to make sure that Telegraf is working as expected, check its logs:

$ kubectl logs -f telegraf-ds-k3s-w8qhc

The terminal should output something like this:

2021-06-26T02:55:22Z I! Starting Telegraf 1.18.3
2021-06-26T02:55:22Z I! Using config file: /etc/telegraf/telegraf.conf
2021-06-26T02:55:22Z I! Loaded inputs: cpu disk diskio kernel kubernetes mem net processes swap system
2021-06-26T02:55:22Z I! Loaded aggregators:
2021-06-26T02:55:22Z I! Loaded processors:
2021-06-26T02:55:22Z I! Loaded outputs: influxdb_v2
2021-06-26T02:55:22Z I! Tags enabled: host=k3s-master
2021-06-26T02:55:22Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"k3s-master", Flush Interval:10s

Everything seems fine, but now comes the moment of truth. We go to our InfluxDB Cloud account, navigate to the Explore section, and we should see some measurements and, of course, some data when selecting the bucket.
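
If you prefer to verify from a terminal instead of the UI, the influx CLI can run a Flux query against the bucket. This is a sketch, assuming the CLI is configured with your Cloud URL, organization and token; kubernetes_pod_container is one of the measurements written by Telegraf's kubernetes input plugin:

# Show a few recent rows from the kubernetes bucket
$ influx query 'from(bucket: "kubernetes")
    |> range(start: -15m)
    |> filter(fn: (r) => r._measurement == "kubernetes_pod_container")
    |> limit(n: 5)'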

As you can see, this process isn’t as complicated as it might seem. The Helm chart simplifies our lives and from now on we can see what is happening with our cluster using an external system.