How to spin up the TICK Stack in a Kubernetes instance
In this webinar, Jack Zampolin will provide you with a step-by-step process to spin up the TICK stack in a Kubernetes instance.
Here is an unedited transcript of the webinar “How to spin up the TICK Stack in a Kubernetes instance.” This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcription errors.
• Chris Churilo: Director Product Marketing, InfluxData
• Jack Zampolin: Developer Evangelist, InfluxData
Jack Zampolin 00:02.477 Okay, good morning everyone. The topic this morning is deploying the TICK Stack on Kubernetes. My name’s Jack Zampolin. I’m the developer evangelist over at InfluxData. So I do demos. I do some customer-facing stuff and sort of work with a lot of different technologies and the ways that they interact with Influx. So let’s dive into it. So there’s a repository associated with this lecture or talk. It’s tick-charts. And I’m just going to drop the link down here in the Q&A in the chat, I guess. And tick-charts is a series of deployment files for Kubernetes. So I guess if you’re here, you’re probably familiar with Kubernetes. But why did I write these? So I’ve found myself trying to help people deploy the full TICK Stack easily. As many of you are probably aware, it’s difficult to deploy a lot of different services and get them all connected. And using the deployment tools that are out there makes that much easier. So I wanted to have an easy way for anyone who is on Kubernetes to be able to spin up our stack of products and see what sort of metrics and charts and graphs they can pull out of their existing infrastructure with very little effort. So that goes to why Kubernetes. This is a personal interest of mine. Since I started learning development, I’ve really enjoyed Docker, and I’ve found that it’s very easy to do development on. And these more modern frameworks like Kubernetes and Mesos are very practical to do development with, too. And we’ll see sort of how easy it is to deploy a full-stack monitoring solution that can scale to tens of thousands of machines with just a few commands.
Jack Zampolin 02:17.043 And why TICK monitoring on Kubernetes? So there’s a couple of other options out there. Obviously, depending on your cloud provider, you’re going to be presented with a number of different options as far as monitoring. AWS natively hooks into CloudWatch, I believe. They also use Heapster, which is kind of a standard Kubernetes API for metrics. Google Compute Engine uses Stackdriver. And if you’ve got a custom Kubernetes instance, you might be using something else entirely. So there’s a lot of options there. So why would we use the TICK Stack? Well, Influx just released a bunch of software around monitoring Kubernetes specifically. There’s a Telegraf plugin that just got added recently to enable Telegraf to run as a DaemonSet, poll the kubelets, and pull namespace-level all the way down to pod-level data for all of your deployments. I find that really nice: being able to get a full namespace view of things and then drill down into those pod-level statistics really helps with debugging, especially for these larger, complex applications.
Jack Zampolin 03:38.939 So what about Heapster? Heapster is the default monitoring solution that comes with Kubernetes if you spin it up. Heapster also uses InfluxDB as the back end. So why do we need a full custom monitoring solution for Kubernetes if Heapster already exists and they’ve already built it? Well, there’s a couple of issues there. Heapster, especially at scale, is not great, and I know there’s a number of people who have been kind of bitten by that. The schema that it writes to Influx is very non-performant. It’s a little gnarly to go in there and connect a Grafana instance to your Heapster. There are some canned dashboards, but as far as extensibility, it’s a little bit tough. Also, there hasn’t been a whole lot of activity on the project recently. So that, in combination with the new features that we’ve released, made me want to, sort of, spin this up and see how easy it was. So let’s get into an architectural overview.
Jack Zampolin 04:49.470 So I’ve sort of explained what it is and why, and now let’s see how it works. So we’re going to be talking about Kubernetes concepts here. And I’m going to have a diagram and then I’m going to do a live coding session and spin it up and then take questions. So these are the different pieces we’re talking about here. You can see the hosts down at the bottom. Those are the virtual or physical hosts that your Kubernetes cluster is running on top of. And then that line there I’ve drawn to represent sort of the Kubernetes control plane. And then on top of that we have all of our software-defined services. So services, persistent volume claims, deployments, all of those objects are represented here. So we’ve got all of the different services and the way they’re going to look as we page through here. Cool. So when you start off, the first thing you want to deploy is Influx. It’s going to sit there and catch all of the data from your collectors and it’s going to serve data to any visualization tier that you’re going to have.
Jack Zampolin 06:02.109 You need a service to expose it to the cluster as a whole. The Influx service is not a load balancer, it’s just a node port. And it’s an individual instance. And then we also use a persistent volume claim to persist data from Influx. The next thing that we deploy is Kapacitor. Kapacitor needs to talk to Influx. And Influx needs to send things to Kapacitor. So this also has state to hold alerts. So we need a persistent disk under that. So when Kapacitor gets deployed, the first thing it’s going to do is look for InfluxDB where we’ve told it to look and subscribe to it. The next thing that we need to deploy is Telegraf. So we’re going to run Telegraf in two different ways here. One is a DaemonSet. DaemonSets just run a program on each of the host instances. And these Telegraf instances have a few host volumes mounted to them to pull those host-level statistics like CPU, memory, that kind of thing. Those Telegraf instances also continually poll their local kubelets to pull the Kubernetes data out.
Jack Zampolin 07:19.172 And if anyone’s looking, we’re on the tick-charts section, and there’s a number of slides. We’re on slide number 11 if this isn’t switching through for everyone. So once the Telegraf instances are deployed, they’re going to start collecting data and sending it to InfluxDB. And that data is going to flow through the subscription that Kapacitor created on Influx over to Kapacitor. So once that happens, we’ve got most of the nuts and bolts of the stack ready to go. And the next thing that we need to deploy is the visualization tier, which is Chronograf. Chronograf, if you’re familiar with it, is the InfluxDB native visualization engine. Some of you will use Grafana with Influx, but we’ve written our own as well. We just launched a new version of Chronograf that has a graphical user interface for writing TICKscripts. It’s got a lot of canned dashboards, especially ones around Kubernetes that you’ll see here soon. And there’s a lot of very cool features. So that’s what we’ll be using to visualize. That consists of a service, a deployment, and another persistent volume claim. Chronograf does need a little bit of state to run.
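The walk-through so far implies a simple dependency picture: Telegraf writes to InfluxDB, Kapacitor subscribes to InfluxDB, and Chronograf visualizes what InfluxDB serves. A minimal Python sketch of that picture follows. The dictionary and function names are illustrative and do not come from the tick-charts repository, and the computed order is only one valid order, not a requirement of the charts.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each service maps to the set of services it depends on, as described
# in the talk. The names are illustrative labels, not chart names.
DEPENDS_ON = {
    "influxdb": set(),
    "kapacitor": {"influxdb"},   # subscribes to InfluxDB
    "telegraf": {"influxdb"},    # writes collected metrics to InfluxDB
    "chronograf": {"influxdb"},  # reads data for visualization
}


def deploy_order(depends_on):
    """Return one order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(deploy_order(DEPENDS_ON))
```

InfluxDB always comes out first here because everything else depends on it; the relative order of the other three is not significant.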
Jack Zampolin 08:47.396 And then optionally, an Ingress, if you need to hook into, let’s say, an Nginx ingress, or you’re using a Google Cloud or ELB ingress to provide a fully-qualified domain name to this service. So that would be how you hook this into other existing infrastructure. So once that happens, you need to connect Chronograf to InfluxDB and Kapacitor. We’ll be doing that at the end of the demo. And then the other piece of this is there’s a floating Telegraf that polls InfluxDB. We have plugins for Nginx, Redis; those are polling-type plugins for pulling metrics off of other pieces of your infrastructure. Telegraf also offers a StatsD server, so if you have application metrics that are being outputted in StatsD format, you can send them to Telegraf there, and they’ll be persisted in Influx. So that floating Telegraf offers you a lot of flexibility for metrics collection and polling as well. So that’s the full stack there. And once we deploy that, we can, as I said, send application metrics to either Telegraf (StatsD, and you can send a variety of different protocols there) or InfluxDB (line protocol only). So this has a built-in way to manage ingest. It’s got the full stack set up and all connected, and the nice thing about Kubernetes, and Helm in particular, is that it allows you to set some very sensible defaults on this and deploy it entirely in its own namespace so it’s sandboxed. And all it does is observe the behavior of your existing system, so the nice thing about this way of deploying the TICK Stack is it gives you all of the hooks into your existing system, and an easy way to turn it on and off. So now we get to go to the live demo, so I’m going to share my screen here.
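To make the two ingest formats mentioned above concrete, here is a small Python sketch that formats one StatsD metric and one InfluxDB line protocol point. The function names and sample metrics are hypothetical. The sketch assumes tag keys, tag values, and measurement names contain no spaces, commas, or equals signs, since it does not escape them.

```python
def statsd_metric(name, value, metric_type="c"):
    """Format one StatsD metric as name:value|type ("c" is a counter)."""
    return f"{name}:{value}|{metric_type}"


def _format_field(value):
    """Encode one field value using line protocol type rules."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Format one point: measurement,tag=v field=v [timestamp]."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={_format_field(v)}" for k, v in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    print(statsd_metric("page.views", 1))
    print(line_protocol("cpu", {"host": "node1"}, {"usage": 0.5}))
```

The first string is what an application would send to the Telegraf StatsD listener; the second is what it would write directly to InfluxDB.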
Jack Zampolin 11:09.657 Today we’re going to be installing the full TICK Stack from InfluxData on a Kubernetes cluster. I find this a quick and easy way to get a different look at your infrastructure. If you’ve got an existing Kubernetes instance running, it’s an easy way to peek into it from a different angle than your current monitoring setup. And it’s pretty easy to set up, so let’s get started. I’m in the project directory here and to follow along with this, you’re going to need a couple of things: a running Kubernetes cluster with the kubectl command-line tool configured to work with it, and then you’re also going to need something called Helm. Helm is the Kubernetes package manager. Think of it like apt-get for your Kubernetes cluster. It consists of two pieces: a CLI, the helm CLI, and then there’s a server component called Tiller that runs in your cluster and helps manage deployments. You’re also going to need to download this project repo, jackzampolin/tick-charts, and this is just the Helm charts for the InfluxData TICK Stack. We’re also starting with a completely empty Kubernetes instance. Here’s the namespace that we’re going to be installing into, tick. And then we’ve also got nothing in the default namespace, so it’s ready to go. There could be stuff running in here. And this demo spins up entirely in its own namespace. It doesn’t affect the rest of your cluster. So let’s go ahead and get started.
Jack Zampolin 12:37.815 First, we need to package each of these charts. Package makes .tgz files that are easier for shipping up to the server and then we’re going to install these, one by one. I’m going to install Chronograf first. I’m installing all these in namespace tick, but you can install them in any namespace you want. I think this demo is actually set up by default to run in the tick namespace, so you would have to make a couple of small configuration changes to use another one. And we can see here that we’ve created the service and the deployment for Chronograf and it looks like it’s coming up and our external load balancer is pending. So let’s go ahead and get some data behind this Chronograf instance. We’re going to install Telegraf next. So, here in the Telegraf installation, we’ve got a deployment with a single Telegraf instance and we’ve also got Telegraf running as a DaemonSet. The DaemonSet’s going to gather a bunch of host-level statistics and that single instance is going to act as a listener for StatsD points and it’s also going to be polling the Kubernetes API and InfluxDB. Next, we need to install InfluxDB itself. Okay, and that’s just a service and a deployment. There’s a config map associated there as well if you want to change some InfluxDB configuration. And finally, we need Kapacitor. Okay, now that we’ve installed everything let’s go look and see what we’ve installed. It looks like everything’s come up properly, so let’s just take a peek here. Here’s our Telegraf DaemonSet, our Chronograf, InfluxDB, Kapacitor, and Telegraf single deployment, and the associated pods.
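The package-then-install sequence described above can be sketched as follows. This Python snippet only builds and prints the commands; it does not run them. It assumes Helm 2 syntax (the Tiller-based version in use at the time of the talk), where `helm install` takes `--name` and `--namespace`. The chart directory names, release names, and chart version are hypothetical placeholders, so check the repository for the real ones.

```python
import shlex

NAMESPACE = "tick"  # namespace used in the demo
# Assumed chart directory names, listed in the order the demo installs them.
CHARTS = ["chronograf", "telegraf", "influxdb", "kapacitor"]


def helm_commands(chart, release, namespace, version):
    """Build (not run) the Helm 2 package and install commands for one chart."""
    package = ["helm", "package", f"./{chart}"]
    install = [
        "helm", "install",
        "--name", release,          # Helm 2 flag for the release name
        "--namespace", namespace,
        f"./{chart}-{version}.tgz",  # archive produced by `helm package`
    ]
    return [package, install]


if __name__ == "__main__":
    for chart in CHARTS:
        # Hypothetical release name and chart version, for illustration only.
        for cmd in helm_commands(chart, f"my-{chart}", NAMESPACE, "0.1.0"):
            print(shlex.join(cmd))
```

With Helm 3 the release name is a positional argument instead of `--name`, so the install command would need adjusting there.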
Jack Zampolin 15:04.095 And if we go take a peek over at the services, we should see the same. A service for each of the products here. So, the one that we really care about is Chronograf, so let’s go pop over to that load balancer and connect InfluxDB. Grab the InfluxDB connection string from here: it’s the release name, so whatever we named the release, then a dash and the chart name, and then to address it in the cluster we need to add .namespace and then whatever port it’s running on. So I’m going to go right ahead and type this in, over here. Cool. And right off the bat, we see our host list. So this gives us host-level statistics from each of the nodes that are running Kubernetes. There’s also this Kubernetes dashboard here too. So this is pod-level statistics for CPU, memory, and network ingress and egress. One other thing that we can do is connect that Kapacitor instance that we just created. Awesome. And just to test and make sure it’s working properly let’s go ahead and configure this Slack integration down here. I’ve already got a Slack bot that I use for testing like this, so we’re going to go ahead and use that. Okay, and let’s see if we can watch this work on Slack. So we should see that test message pop up right down there. Awesome.
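The connection string described above appears to follow the pattern release name, dash, chart name, then .namespace and the port. A minimal Python sketch assuming that pattern is shown below. The release names `data` and `alerts` are hypothetical; 8086 and 9092 are the default HTTP ports for InfluxDB and Kapacitor respectively.

```python
def service_url(release, chart, namespace, port, scheme="http"):
    """Build the in-cluster address: <release>-<chart>.<namespace>:<port>."""
    return f"{scheme}://{release}-{chart}.{namespace}:{port}"


if __name__ == "__main__":
    # Hypothetical release names; substitute the ones used at install time.
    print(service_url("data", "influxdb", "tick", 8086))
    print(service_url("alerts", "kapacitor", "tick", 9092))
```

These are the kinds of values to paste into the Chronograf connection forms, provided the charts name their services this way.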
Chris Churilo 16:58.720 Okay, in the meantime we have a question from Vinnie.
Jack Zampolin 17:00.322 Okay.
Chris Churilo 17:01.263 And he asks, “Since there’s one instance of Influx, is it configured through Kubernetes to autoscale and will this automatically hook itself up correctly?”
Jack Zampolin 17:09.308 Yeah, that’s an excellent question. So the only element of this that will scale seamlessly is the single Telegraf instance. InfluxDB, in order to scale, needs clustering support. I will be releasing a PetSet for InfluxDB Enterprise to sort of provide that some time soon. I haven’t finished that up yet. But it is difficult to scale the persistent datastores within Kubernetes, so I haven’t configured Influx to autoscale. You can also autoscale the Chronograf web dashboard. Currently, I’m only running one instance of each. There are exposed configuration parameters to run multiple instances of each of these. And because they’re behind a service, they’ll be load balanced correctly. Does that answer your question, Vinnie? Awesome.
Chris Churilo 18:09.702 And then we have three questions in the Q&A panel.
Jack Zampolin 18:14.509 Oh, awesome!
Chris Churilo 18:15.658 Should I show them–?
Jack Zampolin 18:17.609 Yeah, I think I can go ahead and do that. So Adam asks, “This looks great on Kubernetes. I believe InfluxDB is a package on the DC/OS Universe. Are there plans to expand packages available on Universe to include other aspects of the TICK Stack, or maybe a TICK Stack package?” I did talk to someone at work about this the other day. We don’t have a ton of experience in-house in the DC/OS Universe. But once I get those nailed down on Kubernetes (I want to get the Influx PetSet taken care of before I move on), I really do want to take a crack at deploying on Mesos and get a full TICK Stack running there. So yeah, it’s something we’re thinking about and something we’d really like to do. Scott, “Thank you. I really enjoyed this. Definitely going to be reviewing this video once it gets posted. Also, the Slack integration will be really interesting for me.” Cool. I hope we got you taken care of. And then Adam asks, “Is Kapacitor performing streaming analytics, or is it polling periodically to check for alarming conditions?” Great question. So by default, Kapacitor creates a subscription on InfluxDB, and Influx will forward any points it gets over to Kapacitor. So that is actually streaming analytics by default. There’s a couple of different ways you can run Kapacitor tasks. The most Kapacitor-y is streaming. So the data’s streaming in. And then you can also run batch tasks. So this would be more traditional ETL stuff. Let’s say you want to grab a full month from one database, do some munging on it and then forward that over to another database. Kapacitor would be great for that. So it is performing streaming analytics in this configuration, but it does both.
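The difference between the two task styles can be illustrated in plain Python. This is a conceptual sketch only: it does not use Kapacitor or TICKscript, and the function names, threshold, and sample points are made up for illustration.

```python
def stream_alerts(points, threshold):
    """Stream style: check each point as it arrives and alert immediately."""
    for timestamp, value in points:
        if value > threshold:
            yield (timestamp, value)


def batch_mean(points, start, stop):
    """Batch style: select a stored time window, then compute over all of it."""
    window = [value for timestamp, value in points if start <= timestamp < stop]
    return sum(window) / len(window) if window else None


if __name__ == "__main__":
    # Hypothetical (timestamp, cpu_percent) samples.
    samples = [(1, 10), (2, 95), (3, 50)]
    print(list(stream_alerts(samples, threshold=90)))
    print(batch_mean(samples, start=1, stop=3))
```

In the stream case the work happens per point as data flows in through the subscription; in the batch case the work happens on a schedule over data that is already stored.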
Chris Churilo 20:13.934 So, Jack, what you presented looks so surprisingly easy, but I know you put a lot of effort into this. Maybe you can share with everybody. What are some of the traps that they can avoid when they try to do this themselves?
Jack Zampolin 20:29.434 Yeah, so what I found, this is more on the editorial side, when you’re doing stuff for Kubernetes and you’re first learning it, it’s easy to just write these deployment files by hand. And you can say I want this service and I want all this stuff. But if you want to do something repeatable, you very quickly get into a place where you want something like a templating engine. And you don’t want to have to be manually editing these deployment files every time. You just want to put in the pieces that are important to you. And if you’ve worked with Kubernetes, you’ve probably copy-pasted deployment files and then just fixed a couple of values and then deployed a new service. I mean, a node.js web app that runs on port 3000 is not very complicated to deploy. And the horizontal pod autoscaler that it needs, and the service it needs, can easily be templated too. So what I found very helpful for this was using Helm. The templating engine they provide is great for this, and it also makes tearing down and building the deployments very quick. So it shortens your iteration cycle. That’s one thing I would tell people to help out. So did that answer the question, Chris?
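The templating idea above can be sketched without Helm. The example below uses Python's standard `string.Template` as a stand-in for Helm's Go templates, so the syntax differs from a real chart. The app name and image are hypothetical; the port matches the node.js example from the talk.

```python
from string import Template

# A Deployment manifest with the values that change between services
# pulled out as $placeholders.
DEPLOYMENT_TEMPLATE = Template("""\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $name
spec:
  replicas: $replicas
  selector:
    matchLabels:
      app: $name
  template:
    metadata:
      labels:
        app: $name
    spec:
      containers:
        - name: $name
          image: $image
          ports:
            - containerPort: $port
""")


def render_deployment(name, image, port, replicas=1):
    """Fill in the template; substitute() raises KeyError on a missing value."""
    return DEPLOYMENT_TEMPLATE.substitute(
        name=name, image=image, port=port, replicas=replicas
    )


if __name__ == "__main__":
    # Hypothetical app name and image, for illustration only.
    print(render_deployment("web", "example/node-web:latest", 3000))
```

Only the few values that differ between services are passed in, which is the same benefit a Helm chart's values file provides.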
Chris Churilo 21:56.102 Yeah. Anything else that you think people should know about? Because you make it look so effortless.
Jack Zampolin 22:01.954 Yeah. That’s a big part of it. Yeah, and then also, the other part is understanding the applications you’re deploying. I’ve been working with the Influx TICK Stack for a long time. I know all the ports for all these containers just off the top of my head and a lot of the configuration parameters just kind of make sense. So if you’re learning how to do these Kubernetes deployments, pick something you know well and then expand into other areas.
Chris Churilo 22:36.122 That’s a great bit of advice. So we will be also posting a blog that Jack’s written that also gives a lot of the details of what he presented to everybody today. But I do want to give everyone just maybe a couple more minutes to think of some questions, allow you to ask Jack some—
Jack Zampolin 22:53.328 Yeah. Absolutely, yeah, and if anyone has any questions I’m more than happy to answer them.