Using External Data Sources to Enhance Your IoT Data

I write a lot about IoT data, and what to do with it. (Hint: Store it in InfluxDB, of course!) But there’s more to making your IoT data useful than storing, analyzing, and visualizing it, which is what I’ve been writing about so far. Sometimes it helps to augment your IoT data with other sources of data, either to validate your data or to give it a larger context. If you’re building a ‘Digital Twin’, augmenting your IoT data with external data is critical to success. I’ll walk through the basics of what a Digital Twin is, and then we’ll jump right into how to augment your IoT data with data from external sources.

The Digital Twin

You may or may not have heard of the Digital Twin before. The idea is fairly simple: you create a digital representation of a physical ‘thing’ that you can change and adjust to see the effect on performance. Let’s say you make jet engines. Making ‘adjustments’ to your engines in the real world might not be the best idea. So you build a digital model of the jet engine and run it in virtual space, where you can adjust it safely. Up to now, this has been an exercise of limited value, since the digital model and the physical engine often differed greatly.

Enter the world of IoT, where the physical jet engine can feed data directly into the digital model to make it more accurate and capable of giving real-time, real-world results. A model is only as good as the data that you give it. So how can you make your digital jet engine even better? Give it more data! Bring in not just the data from the sensors in your real-world jet engines, but all the parameters you can. Load the data sheets and specifications for each of the thousands of parts in the engine. Now you can see how each part is performing against its specifications. Add weather data and atmospheric data to see how your engine performs under different conditions. The possibilities are nearly limitless.

Augmenting Your IoT Data

This whole thing started with the IoT Demo I built a couple of months ago. As you might recall, I began sampling some environmental data—humidity, pressure, temperature, etc.—and storing it in InfluxDB Cloud. These sensors are now scattered throughout the company. Some are in our headquarters, but the majority of them are scattered around at various employees’ home offices, including mine. The data is interesting, and we’ve learned a few things from it along the way. But we really didn’t have anything to compare the data to.

Then last week, I built a quick demo of how to track Hurricane data from the National Hurricane Center using InfluxDB. And then that got me thinking about what other things we could do with publicly-available real-time datasets. It turns out there are a lot of them, and they contain some really good data. So I went looking some more, and found that I could get current weather readings from any of thousands of National Weather Service observation stations. How cool is that? So I started hacking again.

The first thing I needed was the GPS coordinates of my current location. Thanks to Google Maps, these are easy to come by. I then sent them to the National Weather Service to get a list of nearby observation stations:

  $ curl https://api.weather.gov/points/35.6589,-78.7859/stations
  "observationStations": [
      "https://api.weather.gov/stations/KRDU",
      "https://api.weather.gov/stations/KTTA",
      "https://api.weather.gov/stations/KHRJ",
      "https://api.weather.gov/stations/KJNX",
      "https://api.weather.gov/stations/KIGX",
      "https://api.weather.gov/stations/KPOB",
      "https://api.weather.gov/stations/KFBG",
      "https://api.weather.gov/stations/KSCR",
      "https://api.weather.gov/stations/KTDF",
      ...
  ]

The full output also gives you an indication of how close each station is to your location, so I picked the closest one and kept hacking.
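If you want to do the same thing in code rather than by eye, here is a minimal sketch that pulls the station IDs out of that response. It assumes only the "observationStations" array of URLs shown above; the function name stationIds is made up for illustration, and the sample data is a shortened copy of the output rather than a live call.

```javascript
// Sketch: extract station IDs from a /points/{lat},{lon}/stations response.
// Assumption: the response carries an "observationStations" array of URLs.
function stationIds(response) {
  const urls = response.observationStations || [];
  // Each URL ends in the station ID, e.g. ".../stations/KRDU".
  return urls.map((url) => url.split("/").pop());
}

// Sample data copied from the output above (first three entries only).
const sample = {
  observationStations: [
    "https://api.weather.gov/stations/KRDU",
    "https://api.weather.gov/stations/KTTA",
    "https://api.weather.gov/stations/KHRJ",
  ],
};

const ids = stationIds(sample);
console.log(ids);
```

Any of the resulting IDs can then be dropped into the /stations/{id}/observations URL.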

  $ curl https://api.weather.gov/stations/KRDU/observations
  ...
  "presentWeather": [
      {
          "intensity": null,
          "modifier": null,
          "weather": "rain",
          "rawString": "RA"
      }
  ],
  "temperature": {
      "value": 20.6,
      "unitCode": "unit:degC",
      "qualityControl": "qc:V"
  },
  "dewpoint": {
      "value": 19.999993896484,
      "unitCode": "unit:degC",
      "qualityControl": "qc:V"
  },
  "windDirection": {
      "value": 0,
      "unitCode": "unit:degree_(angle)",
      "qualityControl": "qc:V"
  },
  "windSpeed": {
      "value": 0,
      "unitCode": "unit:m_s-1",
      "qualityControl": "qc:V"
  },
  "windGust": {
      "value": null,
      "unitCode": "unit:m_s-1",
      "qualityControl": "qc:Z"
  },
  "barometricPressure": {
      "value": 100640,
      "unitCode": "unit:Pa",
      "qualityControl": "qc:V"
  },
  ...

A pretty straightforward JSON, so it was just a matter of parsing it, and then shipping it off to my InfluxDB Cloud instance.

    // Output payload: [0] holds the measurement values, [1] holds the location tag.
    var myMsg = {};
    myMsg.payload = [];
    myMsg.payload[0] = {};
    myMsg.payload[1] = {};
    const props = msg.payload.properties;

    // The station is a URL such as https://api.weather.gov/stations/KRDU,
    // so element 4 of the split is the station ID.
    const station = props.station.split("/")[4];
    switch (station) {
        case "KSLC":
            myMsg.payload[1].location = "SaltLakeCityUT";
            break;
        case "KRDU":
            myMsg.payload[1].location = "RaleighNC";
            break;
        case "KBED":
            myMsg.payload[1].location = "FraminghamMA";
            break;
        case "KSFO":
            myMsg.payload[1].location = "SFMktg";
            break;
        default:
            myMsg.payload[1].location = "Unknown";
    }

    // Readings can be null (see windGust above). Skip those, since
    // null * 9/5 + 32 would otherwise be stored as a bogus 32 degrees.
    const temp = props.temperature.value;
    if (temp !== null) {
        myMsg.payload[0].x_temp_c = temp;
        myMsg.payload[0].x_temp_f = temp * 9 / 5 + 32;
    }
    const pressure = props.barometricPressure.value;
    if (pressure !== null) {
        myMsg.payload[0].x_pressure = pressure / 100; // Pa to hPa
    }
    const humidity = props.relativeHumidity.value;
    if (humidity !== null) {
        myMsg.payload[0].x_humidity = humidity;
    }
    return myMsg;
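For a sense of what a point like this looks like once it reaches the database, here is a minimal sketch that renders a set of fields and a location tag as InfluxDB line protocol. The measurement name nws_observation and the helper names are made up for illustration, and the values are taken from the sample observation above. Whatever client or node you use to write to InfluxDB normally builds this string for you.

```javascript
// Sketch: render tags and numeric fields as one InfluxDB line protocol line.
// Assumption: all fields are floats; no timestamp is appended, so the
// server would assign one on write.

// Tag keys, tag values, and field keys escape commas, equals signs, spaces.
function escapeKey(s) {
  return String(s).replace(/([,= ])/g, "\\$1");
}

// Measurement names escape commas and spaces only.
function escapeMeasurement(s) {
  return String(s).replace(/([, ])/g, "\\$1");
}

function toLineProtocol(measurement, tags, fields) {
  const tagPart = Object.entries(tags)
    .map(([k, v]) => `,${escapeKey(k)}=${escapeKey(v)}`)
    .join("");
  const fieldPart = Object.entries(fields)
    // Drop nulls and NaNs: only finite numbers are written.
    .filter(([, v]) => typeof v === "number" && Number.isFinite(v))
    .map(([k, v]) => `${escapeKey(k)}=${v}`)
    .join(",");
  if (fieldPart === "") return null; // a point needs at least one field
  return `${escapeMeasurement(measurement)}${tagPart} ${fieldPart}`;
}

// "nws_observation" is a hypothetical measurement name.
const line = toLineProtocol(
  "nws_observation",
  { location: "RaleighNC" },
  { x_temp_c: 20.6, x_pressure: 1006.4, x_gust: null }
);
console.log(line);
// nws_observation,location=RaleighNC x_temp_c=20.6,x_pressure=1006.4
```

The null gust reading is dropped instead of being written, which mirrors the reason for skipping null values when building the payload.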

See? Really simple! So far I’m only collecting external data from 4 observation stations and correlating it with the existing sensors via tags in the database. And now I have a new dashboard where I can see the differential between the real-time data coming off my sensor (which is indoors) and the observed data from the NWS, which I presume is outdoors.
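The arithmetic behind a differential like this is just a subtraction per reading. Here is a minimal sketch, assuming the outdoor fields carry the x_ prefix used above and the indoor sensor reports the same names without it. The indoor field names and all the readings below are made up for illustration.

```javascript
// Sketch: subtract each outdoor (NWS) reading from the matching indoor one.
// Assumption: outdoor fields are prefixed "x_" and the indoor sensor uses
// the same names without the prefix (e.g. temp_c). Both are hypothetical.
function differential(indoor, outdoor) {
  const diff = {};
  for (const [key, outside] of Object.entries(outdoor)) {
    if (!key.startsWith("x_")) continue;
    const name = key.slice(2); // "x_temp_c" -> "temp_c"
    const inside = indoor[name];
    // Only compare readings that exist as numbers on both sides.
    if (typeof inside === "number" && typeof outside === "number") {
      diff[name] = inside - outside;
    }
  }
  return diff;
}

// Made-up readings, purely for illustration.
const indoorReading = { temp_c: 25, pressure: 1005 };
const outdoorReading = { x_temp_c: 20, x_pressure: 1006, x_humidity: 90 };

const delta = differential(indoorReading, outdoorReading);
console.log(delta);
```

A positive number means the indoor reading is higher than the outdoor one. Humidity is left out here because the made-up indoor sensor has no matching reading.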

[Screenshot: the new dashboard]

It’s pretty plain to see that my air conditioner is broken. I have to admit that I was a little surprised by the difference in atmospheric pressure, though. I’m still puzzled as to how the pressure is lower in my house than it is outdoors, but it could simply be that the observation station is a few miles away, and there is a pressure differential over that distance.

What's Next?

There’s always something next, isn’t there? I’ve been talking to a few of the engineers about writing a Telegraf plugin for the National Weather Service observation stations. You would simply define the observation station and the JSON parameters you want to monitor in your telegraf.conf file, and the plugin would poll the data, format it, and insert it into your InfluxDB instance.

I’d love to hear from you, either here or via my Twitter feed, about what you’d like to see in future blog posts, or about other publicly available data feeds worth exploring.