How to Use InfluxData in IoT Solution
David Simmons is the IoT Developer Evangelist at InfluxData, helping developers around the globe manage the streams of data that their devices produce. He is passionate about IoT and helped to develop the very first IoT Developer Platform before “IoT” was even ‘a thing.’ David has held numerous technical evangelist roles at companies such as DragonFly IOT, Riverbed Technologies, and Sun. He studied Computer Science at the University of New Mexico and has a BA in Technical Writing from Columbia University and a BS in Biological Sciences from UC Santa Barbara.
In this webinar, David Simmons of InfluxData will show you how to collect some basic metrics for monitoring your IoT solution.
Watch the Webinar
Watch the webinar “How to use InfluxData in IoT Solution” by clicking on the download button on the right. This will open the recording.
Here is an unedited transcript of the webinar “How to use InfluxData in IoT Solution.” This is provided for those who prefer reading to watching the webinar. Please note that the transcript is raw. We apologize for any transcription errors.
• Chris Churilo: Director Product Marketing, InfluxData
• David Simmons: Senior Developer Evangelist, InfluxData
David Simmons 00:00.951 Great. So now that we’re finally getting started, my name is David Simmons. I’m the Senior Developer Evangelist here at InfluxData, and I’ll be talking about collecting IoT data in InfluxDB. And at some point, I’m going to give a demo, and so let’s hope that the demo works as always, right? So a little bit of background on Influx. It was founded in 2013. I’m going to go really quickly through this because it’s not really relevant to the actual topic, but I just want to give you a little background of where Influx comes from. It was founded in 2013, basically, to deliver a platform for metrics and events, a Time Series Database. The guiding principles, which I have found throughout all of the products, are developer happiness, ease of development, and scale out, and most importantly time to value. It can be very quick to get things from inception to actual deployment. So far, there are about 70,000 or more active servers, and we’ve got over 300 customers in our cloud and enterprise offerings. All the software is open source. It’s free open source. You can download it and run it to your heart’s content right now. If you want some of the additional things that we offer under our enterprise offering, or if you’d like us to manage the whole thing under our cloud offering, then come talk to us, and we can get you started on that as well. So why is this important for IoT? Well, basically, it’s because of the explosion of IoT and sensors on the network. Up to now, there have been things like laptops, and desktop computers, and servers on the network. Machines are beginning to come online. Wearables have come online. And what’s coming is this tidal wave of connected sensors. Basically, anything and everything that senses anything is going to be connected to the Internet. And that’s why they call it the Internet of Things.
David Simmons 02:10.024 It’s looking to be about 20 plus billion devices by 2020. That was actually an early estimate. Follow on estimates have been as high as 50 billion devices by 2020. So there’s really going to be an explosion of devices, and with the explosion of devices comes an absolute explosion of data. So what is time series data, and how does it relate to IoT? Time series data is really pretty simple as a concept. It’s data that is time-bound or time-stamped, so an event at a time. So if you’re looking at things like CPU utilization, disk utilization, or the status of your garage door, it’s what that reading is at a given time. So right now, in my time zone, it is 11:08, so my garage door is currently closed. So that would be an example of a time series data point. For IoT, it’s generally a sensor reading at a time, right? So the temperature reading, the valve flow rate for a given valve in a factory. Again, the garage door could be an example there as well, right? And so really, IoT data is time series data. It’s the definition of time series data, right? Sensor reading at time. So that’s why Time Series Databases like InfluxDB are so critical to deploying IoT projects. So how does InfluxData help with all of this? Well, it’s extremely efficient at data collection using either the line protocol or the Telegraf plugin base architecture. It allows you to do high volume data collection.
David Simmons 04:07.837 IoT is already generating huge volumes of data. And as the number of sensors increases, the amount of data that is coming out of those sensors is going to increase, and it’s going to increase very quickly. So being able to bring that data into a database, analyze it, query that data, and take action based on that data is key to IoT success. Again, the ease of deployment. We talked about this in one of the first slides. It’s very easy to deploy InfluxDB and the entire TICK Stack for data collection and analysis and gives you that low time to value. It’s very quick to get it up, and it’s very quick to start getting return value from your deployment. One of the keys to that is these dashboards and visualization. So it’s really easy to build useful, readable dashboards that can give you insight into your data. And in fact, I published a blog post last week about the kinds of insights that you can get from your data, and I’ll show some of that a little later, okay?
David Simmons 05:21.463 So how do you get data from an IoT device to InfluxDB? The first and one of the easiest ways is to use InfluxData line protocol. And the line protocol syntax, it looks a little complicated if you look at it this way, I’ll give you that, but it’s basically a measurement with a series of tags and tag values, and then a series of fields and field values. And the chart below sort of gives you a rundown of what those values are, whether they’re required or optional. So the measurement is required, right? That’s the measurement that you’re actually taking. And that’s identified by a string. You can have a tag or a tag set, those are optional, and those are all key value pairs that describe the point. And then you have the field set, or it could be a single field, or a set of fields. And those are required, and you have to have at least one, and that’s that reading. So if we go back to the slide that said time series data is event at time, the field is really the value of the event. And then the last is the time stamp. And that’s optional. If you don’t provide a time stamp, InfluxDB will provide one for you. But if you’re really interested in the time at the sensor, then you’ll provide the time stamp. The key there is that if you’re going to provide a time stamp from the sensor, then you really need to use a time synchronization protocol like NTP to make sure that your server and your sensor have the same idea of time, otherwise things will get a little off.
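Editor's note: the layout described above (a required measurement, optional tags, at least one required field, and an optional time stamp) can be sketched in a few lines of Python. This is a minimal illustration and not code shown in the webinar. The function names, the `environment` measurement, and the tag and field names are all hypothetical, and the escaping covers only the common cases.

```python
def escape_key(text):
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render one field value using line protocol typing rules."""
    if isinstance(value, bool):          # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"               # integers carry a trailing "i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'                # strings are double-quoted


def to_line(measurement, fields, tags=None, timestamp_ns=None):
    """Build one line: measurement[,tag=value...] field=value[,...] [timestamp]."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = measurement.replace(",", "\\,").replace(" ", "\\ ")
    for key in sorted(tags or {}):       # tags are optional; sorted here for stable output
        head += f",{escape_key(key)}={escape_key(str(tags[key]))}"
    field_set = ",".join(
        f"{escape_key(name)}={format_field(value)}" for name, value in fields.items()
    )
    line = f"{head} {field_set}"
    if timestamp_ns is not None:         # optional; the server assigns one if omitted
        line += f" {int(timestamp_ns)}"
    return line


if __name__ == "__main__":
    print(to_line(
        "environment",
        {"temp_f": 74.5, "humidity": 41.2},
        tags={"location": "office", "sensor": "bme280"},
        timestamp_ns=1500000000000000000,
    ))
```

Leaving out `timestamp_ns` matches the case where InfluxDB supplies the time stamp itself, which avoids the clock synchronization concern at the cost of recording arrival time rather than sensor time.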
David Simmons 07:13.339 So the other method is using Telegraf plugins. And there is a huge list of available plugins, and there’s a URL right there. And that’s a very large list of plugins to Telegraf. You can also fairly quickly write your own in Go if that’s your thing. Some of the IoT specific plugins, there’s an AMQP plugin which is basically RabbitMQ. There’s an HTTP listener, which is a Telegraf plugin for line protocol if you want to send your data through Telegraf via line protocol and not directly to InfluxDB using line protocol. And then there’s the MQTT consumer. And that allows you to subscribe to particular MQTT topics, and take those topics and add those messages to InfluxDB. So if you already have an IoT deployment where you’re using MQTT or RabbitMQ to publish your data from your sensors to an MQTT stream, you can very simply grab the Telegraf plugin for MQTT or RabbitMQ and implement that, and then just begin to subscribe to topics that come out of your IoT sensors, and bring those data points directly into InfluxDB.
David Simmons 08:47.045 So a quick example of line protocol. And this was using the HTTP client library that is available in Arduino and Particle.io and a couple of other IoT device libraries. Basically, you create an HTTP header, and I gave the user agent here as Particle HTTP client. You can call it whatever you want. All headers have to be null terminated. And then you fill out a body to the request. And in the HTTP request, you can see that I am again following the line protocol where this is my measurement. Here are my tags. I’ve got two tags. And I’ve got two readings or values. And I’ll fill that string out. And then it’s a simple HTTP POST with that request and a response object and the header object. And I get back a response status. It should be 200 or 204 for success. And that means that those values were successfully inserted into InfluxDB. And it’s really that easy. It’s oftentimes hard to believe that it’s really that easy. But it really is. You can also use the secure HTTPS client library to do secure communications. So you’ll need basic authorization. And that’s the Base64-encoded username and password, and the host that you’re going to, and what you’re sending. And so you set up your host and your endpoint, which is the URL for your database. Again, you fill out your line protocol string with your measurement, your tags, and your values. And there are your values. With this, you actually have to say the length of data that you’re sending and tell it where the endpoint is and send your POST request. And again, this goes directly into InfluxDB, and as you can see, it’s very easy and straightforward to set this up.
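Editor's note: the slides used device-side HTTP client libraries, which are not reproduced in this transcript. As a rough desktop equivalent, the sketch below builds the same kind of request with Python's standard library. It assumes an InfluxDB 1.x style `/write?db=...` endpoint; the host `localhost:8086`, the database name `iot`, and the credentials are placeholders, and the function names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_write_request(base_url, database, line, username=None, password=None):
    """Build (but do not send) an HTTP POST carrying one line protocol string."""
    query = urllib.parse.urlencode({"db": database})
    request = urllib.request.Request(
        url=f"{base_url}/write?{query}",
        data=line.encode("utf-8"),       # the request body is the line protocol text
        method="POST",
    )
    request.add_header("Content-Type", "text/plain; charset=utf-8")
    if username is not None:
        # Basic authorization is the Base64 encoding of "username:password".
        token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def send(request, timeout=5):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so a
    returned status is a success code (InfluxDB 1.x normally answers 204).
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    req = build_write_request(
        "http://localhost:8086",         # placeholder host, assumed for illustration
        "iot",                           # placeholder database name
        "environment,location=office temp_f=74.5",
        username="user",
        password="pass",
    )
    print(req.get_method(), req.full_url)
    # send(req) would perform the write against a running server.
```

Unlike the device libraries mentioned above, `urllib` computes the Content-Length header itself. Using an `https://` base URL is the counterpart of the secure client library.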
David Simmons 11:19.099 The MQTT example is likewise very simple. You’re going to use the MQTT consumer plugin for Telegraf. Again, Telegraf is a plugin-based architecture for gathering data and pulling it into InfluxDB. So you’ll add the MQTT consumer plugin to your Telegraf instance and fill out the required options. So what server are you going to send it to? What’s your QoS? This is zero, one, or two. Then the topics that you’re going to subscribe to, right? So these are topics in your MQTT stream. It’ll just subscribe to the one you want. This one will actually subscribe to all sensor data, but you can subscribe to specific sensor data that’s in your MQTT stream and define whether you want a persistent session. If you want a persistent session, it just basically means if the subscriber, in this case, Telegraf, is offline, these messages will be delivered when it comes back. So this is the way to ensure that you don’t lose data should your Telegraf instance be offline or unable to communicate with the MQTT broker, right? Give it a client ID, and if you’re using a username and password for authenticating to your MQTT broker, you’ll put that in here, and we’re going to use the Influx data format. And that’s really all you have to do to get Telegraf to start pulling data out of your MQTT broker and putting it into InfluxDB. Really simple and straightforward to get this done.
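Editor's note: the difference between subscribing to all sensor data and subscribing to specific sensor data comes down to MQTT topic filters, where `+` matches exactly one topic level and `#` matches everything below a level. The sketch below illustrates those matching rules in plain Python. It is not Telegraf's or any broker's implementation, the `sensors/...` topic names are made up, and it ignores the special handling of topics that begin with `$`.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level; it matches
            # the parent level itself and anything beneath it.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False                 # filter is longer than the topic
        if level != "+" and level != topic_levels[index]:
            return False                 # literal level differs ("+" matches any one level)
    # No "#" was seen, so the filter and topic must have the same depth.
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    for flt in ("sensors/#", "sensors/+/temperature", "sensors/office/pressure"):
        for name in ("sensors/office/temperature", "sensors/lab/pressure"):
            print(f"{flt!r} vs {name!r}: {topic_matches(flt, name)}")
```

A filter such as `sensors/#` corresponds to the "all sensor data" subscription described above, while a literal topic or a `+` filter narrows it to particular readings.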
David Simmons 13:08.560 There are a lot of other options for the MQTT consumer plugin for Telegraf. One is if you want to enable SSL for secure communications. I always recommend using secure communications just because secure is best and especially in IoT. You can also use SSL, but you can skip the chain and host verification in that to speed things up a little bit. And there’s a URL there that will give you a list of all the different data formats supported by Telegraf so that you can choose your data format that you want. Again, you’ll have the ability to have session persistence and to provide a client ID. If you don’t provide a client ID, one will be randomly generated for you. So again, very simple to set up, very fast to set up. It’s that fast time to value, how quickly you can set this up, begin bringing data into your InfluxDB instance, and start seeing results. And speaking of results, here’s a sample of a dashboard that I built for some IoT data that I have been collecting. I have been collecting humidity, pressure, temperature, and light. And there’s three different light values that I’ve been collecting, and I’ll point out this data on pressure. I was running this last Friday afternoon as a severe thunderstorm came through and just as I started collecting this data, you’ll notice this precipitous drop in pressure. Well, it turns out after I finished building this dashboard, and took this picture, and went downstairs, and turned on the news, a tornado went by about three miles from here. And that occurred right at this time, and that accounts for that huge drop in pressure. So I thought that was pretty cool. So I’m going to give you a quick demo of the dashboards and things like that. So let me share a different screen. Can you see that?
Chris Churilo 15:47.243 Yep. It’s a little bit small though, can you make it bigger? You must be on a really big monitor.
David Simmons 15:56.208 Is that better?
Chris Churilo 15:58.316 Just one more.
David Simmons 16:01.926 Tell you what. Any better?
Chris Churilo 16:08.384 Yeah, that should be fine.
David Simmons 16:10.196 Okay. So this is actually real-time data that is streaming right now. The sensors were offline for this little bit here, but this is real sensor data that is being streamed from my desk here. I have a little sensor on my desk that is streaming this data. And I just covered up the light sensor, and you can see those light values over here dropping, right? So the bright yellow line is the lux, the purple line is infrared, and the brown line is broadband or visible light. And by covering it up, I just dropped it all down to near zero. And I have taken my hand away, and you can see them begin to come back up. Now one of the interesting things that I learned by using this dashboard the other day, and I don’t have the—I’m only showing the last 15 minutes of data here. But I was actually looking at this data when I first began running this dashboard over the period of about an hour. And what I noticed was my temperature data would go up to right about close to 80 degrees, and then it would drop down to 74. And it kept doing that over and over, and I was thinking, “Now why is it doing that?” And I’m sitting here watching as the temperature crawled back up to about 78, and then suddenly my air conditioning kicked on. And the temperature began to drop.
David Simmons 17:57.417 So I really got this insight from my data to something that I wasn’t actually measuring. I wasn’t measuring when my air conditioner came on and went off. But it turns out I got a really good view of when my air conditioning was coming on and off by looking at other data that actually wasn’t meant to show that. And that’s the kind of thing you can get by being able to visualize your data and see what’s actually going on. You may find things in your data that you didn’t know were there, that you weren’t measuring but could be important. And that’s what I call these hidden gems in your data. I didn’t know it was there, but now I can suddenly monitor when my air conditioning unit comes on and off just based on the temperature data that I was already collecting, right?
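Editor's note: the air conditioning observation can be turned into a small calculation. Assuming the temperature readings are available as (time, temperature) pairs, the sketch below reports a likely "on" event at each peak where the temperature turns from rising to falling, and a likely "off" event at each trough. The function name, the sample values, and the `min_swing` threshold are illustrative assumptions and were not part of the demo.

```python
def cooling_events(readings, min_swing=1.0):
    """Infer air conditioner on/off times from (time, temperature) readings.

    A reversal only counts once the temperature has moved at least
    `min_swing` degrees away from the last extreme, which filters out
    small sensor noise.
    """
    events = []
    if not readings:
        return events
    direction = 0                        # +1 warming, -1 cooling, 0 not yet known
    pivot_time, pivot_temp = readings[0]
    for when, temp in readings[1:]:
        if direction == 0:
            if abs(temp - pivot_temp) >= min_swing:
                direction = 1 if temp > pivot_temp else -1
                pivot_time, pivot_temp = when, temp
        elif direction == 1:
            if temp >= pivot_temp:
                pivot_time, pivot_temp = when, temp       # still warming: track the peak
            elif pivot_temp - temp >= min_swing:
                events.append((pivot_time, "on"))         # cooling began at the peak
                direction = -1
                pivot_time, pivot_temp = when, temp
        else:
            if temp <= pivot_temp:
                pivot_time, pivot_temp = when, temp       # still cooling: track the trough
            elif temp - pivot_temp >= min_swing:
                events.append((pivot_time, "off"))        # cooling stopped at the trough
                direction = 1
                pivot_time, pivot_temp = when, temp
    return events


if __name__ == "__main__":
    # Made-up readings: (minute, degrees Fahrenheit)
    temps = [74, 76, 78, 80, 78, 76, 74, 76, 78, 80, 77]
    print(cooling_events(list(enumerate(temps))))
```

An event is only reported after the reversal has been confirmed by later readings, so on live data the detection lags the actual switch by however long the temperature takes to move `min_swing` degrees.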
David Simmons 19:10.920 All right. Now let’s see if I can go back to…
David Simmons 19:42.137 And that’s where I sort of [inaudible]. It’s amazing what you can discover when you can actually see your data. You can find things in your data that you didn’t know were there. You can find things that you didn’t even know you were able to measure or that you weren’t measuring by looking at your data. And that’s one of the most powerful things about this, being able to visualize your data. And I will also say that setting this entire demo up that’s collecting all this data took me about three hours to write the software for the sensors, and about 20 minutes to get the dashboards up and running. I mean, that includes getting the entire TICK Stack installed on a server and starting to collect data. So it took me well under half a day to begin to collect all this data, and write a sensor that could collect data, send the data to the InfluxDB instance, stand it all up, bring it all up, get meaningful dashboards, and begin to actually see value in less than half a day. So I think that’s a really powerful story for using InfluxData and the TICK Stack to begin to get value very quickly out of your IoT deployments.
Chris Churilo 21:10.040 And David, you’re actually relatively new to InfluxData as well. So this isn’t something that you’ve been doing for years, and years, and years.
David Simmons 21:17.733 No. I’ve been doing the IoT side for close to 15 years, but it’s been a month and four days that I’ve been at InfluxData. So it’s not like I’m an expert at InfluxDB, and Telegraf, and Chronograf, and all the other parts of this. I’m by no means an expert, but I’m able to bring this up and start getting value out of this that quickly. And I think that also speaks to the power of this platform for IoT deployments.
Chris Churilo 21:57.961 So I imagine you must have collected similar kinds of data before since you’ve been working with IoT devices for many years. Maybe you can draw some comparisons to some other tools or some other ways that you have done it that really shows the power of InfluxData.
David Simmons 22:13.775 Well, that’s actually one of the things that drew me to InfluxData so strongly. I’ve been doing this for a very long time. And what you do with the data coming out of your IoT deployment has always been the single biggest sticking point of getting an actual IoT deployment up and running, all right? Getting the IoT sensors built, and the data collected, and even sending it to an MQTT instance, something like that is fairly straightforward and takes very little time. But what you do with the data after that has always been a sort of, well, we can put it into MySQL, or we can put it into PostgreSQL, or we can... but then, what do you do with it? Because those don’t really perform very well when you start shoving that much data in. If I’m shoving a billion rows of data into something and then trying to get some meaningful dashboard out of it, that was always the part where most IoT people start to mumble, right? Well, after that, we’ll mumble mumble. And they wave their hands and walk away because there wasn’t really such a great answer to that question of where do we put this data and how do we access it in a meaningful way. And really, that’s what I like about the InfluxData stack: it gives a very clear, very useful answer to that question of, great, we’re collecting all this data and now, what do we do with it? And not only is it a clear answer to that, but it’s a simple answer, and it’s not a complicated answer. It takes very little time to get from here’s where we put the data to here’s the usefulness of that data and now we can begin to look at and query that data and take actions on it.
Chris Churilo 24:16.634 That’s really great to hear. And if anybody has any questions for David, please put your questions in the Q and A panel, or the chat panel, or if you want to speak to him directly, just let me know, and I can take you off of mute. So the other thing to remind everybody is that we are an open source software which means that you can go to our website or to GitHub and download the software for free. So you can actually try it out at no charge and actually see for yourself how easy it is to get this thing set up. Another thing to let everybody know is that David’s going to be on the road quite a bit pretty soon at a number of different IoT meetups presenting this, so you can actually get a chance to meet him in person, ask more questions, and actually take a closer look at some of the work that he’s been doing. So, David, what are some of the meetups that are coming up?
David Simmons 25:15.802 Well, first of all, I’ll be at ThingMonk next week in London. So if you happen to be in London and attending the ThingMonk conference, come find me there. The 14th, I will be at the Cincinnati IoT Meetup in Cincinnati, Ohio. On the 19th, currently, I’m scheduled to be in Detroit at the Detroit IoT Meetup. The week after that, the week on the 24th, but I could have the date wrong, I will be at Austin IoT in Austin, Texas. And on the week after that, I believe it’s October 2nd, I will be at the Atlanta IoT Meetup. And finally, the last one which was just scheduled last night, on November 7th, I will be speaking at the Charlotte, North Carolina IoT Meetup.
Chris Churilo 26:12.226 That’s great. So if you guys have other local events that you’d like David to attend so you can see what he’s been doing, just feel free to drop us a line, and we’ll try to get that scheduled or definitely get you connected to David. So as I mentioned, if you have any questions, put them in the chat or Q and A panel. We’d love to answer, well, David would love to answer any of your questions that you might have as they pertain to the demo today, or even just some general questions about InfluxData and all the components of the TICK Stack. Feel free, we’ll keep the lines open probably for the next five minutes.