You’ve heard the hype: the Internet of Things (IoT) is going to connect more people to devices and more devices to the Internet, and generate more data than any major IT shift in history. IoT is going to be bigger than the web, mobile and the cloud, right? It’s still too early to tell for sure, but at InfluxData we help startups and enterprises every day to bring an interconnected world closer to reality.
What does time-series have to do with IoT? Everything, actually. Sensors and devices used in IoT architectures emit time-series data, and a lot of it.
Why are companies building IoT and sensor data solutions?
Whether it’s pH and humidity readings from an agri-sensor, depth and fluid readings from a geo-sensor or voltage and temperature from a power control sensor, these metrics are forming the basis of intelligent businesses. Common use cases we run across are:
- Agro industries are monitoring and trying to control environmental conditions for optimal plant growth.
- Power and utility companies are building smart solutions to reduce resource wastage for residential and commercial customers.
- Research labs and heavy industries are tracking the resources, usage and health of millions of tiny valves and instruments that go into their massive production plants, factories and manufacturing facilities.
- Smart cars are now powerful computers making runtime decisions based on data collected by hundreds of sensors on every vehicle.
Challenges in building IoT and sensor data solutions
The key challenges organizations face while building an IoT solution are:
- Bandwidth – As sensors are generally deployed on-premises and need to communicate over wireless networks, bandwidth constraints prevent sending large packets of data in real time.
- Horsepower – Compute power on sensors is generally limited, so analytics software (programs, databases or even processing logic) needs to have a tiny footprint.
- Concurrency – In industrial IoT, the number of sensors can easily reach hundreds of thousands, each transmitting metrics every minute or so. Anticipating the backend database’s concurrency limits is crucial in the design of such solutions.
- Protocol – As this space is rapidly evolving, there aren’t any definitive standards for communication protocols. MQTT, AMQP, CoAP and others are used depending on the use case, so IoT analytics solutions need to support many communication protocols.
- Scale – Data retention, compression and visualization have their own challenges in a solution with such a large data footprint. Businesses want to plot trends (WoW, MoM, YoY), and aggregating massive data sets can be very compute heavy.
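To get a feel for the concurrency challenge, it helps to turn a fleet size into an average ingest rate. The sketch below is back-of-the-envelope arithmetic only; the function name and the fleet figures (200,000 sensors, 5 metrics each, one report per minute) are assumptions chosen for illustration, not measurements from a real deployment.

```python
def writes_per_second(sensor_count: int, metrics_per_report: int,
                      report_interval_s: float) -> float:
    """Average number of data points reaching the backend each second."""
    return sensor_count * metrics_per_report / report_interval_s


# Hypothetical fleet: 200,000 sensors, 5 metrics each, reporting once a minute.
rate = writes_per_second(200_000, 5, 60)
print(f"{rate:,.0f} points per second")
```

This is an average. If sensors report on the same clock tick rather than spread across the minute, the peak rate the backend has to absorb can be far higher.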
InfluxData for IoT and Sensor Data
InfluxData helps companies manage the large volume of time-series data that IoT sensors emit at scale with targeted features like:
- Integration with sensor-specific data collection libraries like Sensu
- Client API libraries in Go, Perl, Node.js, Java, .NET, Ruby, Scala, Haskell, Python, Lisp, R, SNMP. These libraries allow for custom instrumentation of any app running on the sensor and let it transmit streaming metrics to InfluxDB, natively or via the HTTP API
- Scalability to support hundreds of thousands of sensors transmitting data points at high frequency
- Support for ARM, Arduino, Raspberry Pi and other IoT form factors
- Highly scalable data processing layer for pattern matching, alerting and anomaly detection
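As a rough illustration of what a sensor-side app sends, here is a minimal sketch that encodes readings in InfluxDB's line protocol text format and joins several points into one payload, which keeps per-request overhead low on a constrained link. The helper names, the "power" measurement and the tag and field names are all hypothetical, and the escaping covers only the common cases. A real app would normally use one of the client libraries listed above rather than hand-rolled code like this.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape each of the given special characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    """Render a field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # bool must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Encode one point as: measurement,tag=value field=value timestamp."""
    tag_part = "".join(
        f",{_escape(k, ', =')}={_escape(str(v), ', =')}"
        for k, v in sorted(tags.items())  # sorted tag keys, hypothetical data
    )
    field_part = ",".join(
        f"{_escape(k, ', =')}={_format_field(v)}" for k, v in fields.items()
    )
    return f"{_escape(measurement, ', ')}{tag_part} {field_part} {timestamp_ns}"


# Two hypothetical readings batched into one newline-separated payload.
batch = "\n".join([
    to_line("power", {"site": "plant-a", "sensor": "pc-17"},
            {"voltage": 229.7, "temperature": 41}, 1465839830100400200),
    to_line("power", {"site": "plant-a", "sensor": "pc-18"},
            {"voltage": 230.1, "temperature": 39}, 1465839830100400200),
])
print(batch)
```

In InfluxDB 1.x a payload like this is sent as the body of a POST to the /write endpoint, with the target database given as a query parameter. The sketch stops short of the network call so it can run anywhere.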
Collecting IoT & Sensor Data
A large-scale IoT deployment means having to collect data from disparate systems, applications, datasources, services and infrastructure components. InfluxData’s Telegraf collector supports 30+ inputs and 10+ outputs and can be easily extended to support your sources of data. Telegraf makes it simple to collect data in a format InfluxDB can consume. Here’s why:
- MIT License
- Minimal memory footprint
- Extensible plugin design with 40+ input and output plugins
- Support for datasources like MongoDB, MySQL and Redis
- Messaging systems like Apache Kafka and RabbitMQ
- Third party APIs like Mailchimp, AWS CloudWatch and Google Analytics
- Collects system metrics like CPU, Memory, I/O, etc
The InfluxData platform is also extensible by design, so you can easily integrate other collection agents, like collectd, in conjunction with Telegraf.
Learn more about Telegraf
Storing IoT and Sensor Data
The most common data type in any IoT and sensor data system is going to be time-series data. InfluxDB is designed from the ground up to handle just time-series data and to do it better than any other database. InfluxDB is the “I” in the TICK stack. More specifically, InfluxDB is an open source database written in Go to handle time-series data with high availability and high performance requirements. InfluxDB installs in minutes without external dependencies, yet is flexible and scalable enough for complex deployments. Here’s why InfluxDB is the best choice for storing an IoT solution’s time-series data:
- MIT License
- Simple to install, yet highly extensible
- Purpose-built for time-series data, so no special schema design or custom app logic is required
- Thousands of writes per second with the new TSM1 storage engine
- Horizontal clustering for high availability (in active development)
- A native HTTP API means no server-side code to manage
- Time-centric functions and an easy-to-use SQL-like query language
- Data can be tagged, allowing very flexible querying
- Answer queries in real time: every data point is indexed as it comes in and is available to query in less than 100ms
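To make the tagging and SQL-like query points concrete, here is a hedged sketch that builds an InfluxQL query averaging one field over fixed time windows and filtering on a tag, then wraps it in a URL for the InfluxDB 1.x /query HTTP endpoint. The function name, the host, the "iot" database and the measurement, field and tag names are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, database: str, measurement: str, field: str,
                    tag_key: str, tag_value: str, lookback: str, window: str) -> str:
    """Return a /query URL that averages `field` per time window for one tag value.

    Values are interpolated directly into the query text, so this sketch is
    only suitable for trusted, hard-coded inputs.
    """
    query = (
        f'SELECT MEAN("{field}") FROM "{measurement}" '
        f"WHERE \"{tag_key}\" = '{tag_value}' AND time > now() - {lookback} "
        f"GROUP BY time({window})"
    )
    return f"{base_url}/query?" + urlencode({"db": database, "q": query})


# Hypothetical example: 5-minute mean temperature at one site over the last hour.
url = build_query_url("http://localhost:8086", "iot", "power", "temperature",
                      "site", "plant-a", "1h", "5m")
print(url)
```

Because the site is stored as a tag rather than a field, the same query shape works for any site or sensor by changing only the tag value.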
Learn more about InfluxDB
Visualizing IoT and Sensor Data
If you don’t already have a dashboarding or graphing UI in place, InfluxData provides Chronograf. It’s the “C” in the TICK stack. Chronograf is a downloadable binary you install behind your firewall to collaboratively, yet securely, perform ad-hoc visualizations on your time-series data. Features include:
- Simple installation and configuration
- Tight integration with InfluxDB that makes connecting to your data easy
- Support for ad-hoc visualizations
- Smart query builder designed to work with large datasets
- Collecting multiple graphs into dashboards
- Templating, new graph types and visualizations coming!
Another visualization UI choice that offers tight integration with InfluxDB is the open source Grafana project. Either choice makes it simple to connect to and visualize time-series data.
Learn more about Chronograf
Learn more about InfluxDB & Grafana
Processing IoT and Sensor Data
Inevitably, you are going to want to either alert on or in some way process the time-series data in your IoT deployment. You’ll want to do this either before it gets written to InfluxDB or when it is retrieved. To address this need, the InfluxData platform ships with the open source Kapacitor project. Kapacitor is the “K” in the TICK stack. It’s an alerting and data processing engine specifically designed for time-series data. It lets you define your own custom pipeline to aggregate, select, transform or otherwise process data and then store it back in InfluxDB or trigger an event. Features include:
- MIT licensed
- Stream data from InfluxDB or query from InfluxDB
- Trigger events/alerts based on complex or dynamic criteria
- Perform any transformation currently possible in InfluxQL, for example: SUM, MIN, MAX, etc.
- Store transformed data back into InfluxDB
- Process historical data, for example: backfill data using a processing pipeline
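The aggregate-then-alert idea behind such a pipeline can be sketched in a few lines of plain Python. To be clear, this is not Kapacitor’s API or its scripting language; it is a simplified stand-in with hypothetical function names, readings and threshold. It groups time-ordered points into fixed windows, averages each window, and reports the windows whose mean crosses the threshold.

```python
from statistics import mean
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[int, float]  # (unix timestamp in seconds, value)


def window_means(points: Iterable[Point], window_s: int) -> Iterator[Point]:
    """Group time-ordered points into fixed windows; yield (window_start, mean)."""
    current_start = None
    bucket: List[float] = []
    for ts, value in points:
        start = ts - ts % window_s  # align to the start of the window
        if current_start is not None and start != current_start:
            yield current_start, mean(bucket)
            bucket = []
        current_start = start
        bucket.append(value)
    if bucket:  # flush the final, possibly partial, window
        yield current_start, mean(bucket)


def alerts(points: Iterable[Point], window_s: int, threshold: float) -> List[Point]:
    """Return the windows whose mean value exceeds the threshold."""
    return [(start, avg) for start, avg in window_means(points, window_s)
            if avg > threshold]


# Hypothetical temperature readings; alert when a 60-second mean exceeds 60.0.
readings = [(0, 40.0), (20, 42.0), (60, 70.0), (90, 74.0), (120, 41.0)]
print(alerts(readings, 60, 60.0))  # → [(60, 72.0)]
```

Alerting on a windowed mean instead of on each raw point trades a little latency for fewer false alarms from a single noisy reading.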
Learn more about Kapacitor
InfluxData is used for IoT and sensor data by startups and large enterprises alike. Visit our Testimonials page for a comprehensive list.