Using InfluxDB as a Modern Process Historian
By Sam Dillard / Oct 21, 2021 / InfluxDB, Community, Developer, IIoT
It can be difficult to facilitate interconnectivity within an Industrial Internet of Things (IIoT) environment, especially considering the different needs of the business and IT. OPC, or Open Platform Communications, allows for connectivity of data and monitoring across devices. Implementing OPC can solve significant IoT interconnectivity challenges; however, bridging the gap between the OT side of OPC and the business and IT side of the enterprise isn’t always easy or straightforward, especially when you don’t have the interest, time, human resources, or budget to implement a traditional process historian and its business application integrations.
ThingWorx Kepware is a leading OPC server and gateway application that connects legacy industrial assets with applications, databases, and cloud services. A product from PTC, a trusted InfluxData partner since 2018, ThingWorx Kepware allows you to stream your OPC tag data directly into InfluxDB. Gone are the days of your process historian’s antiquated architecture and user interface. Collecting and storing tag data, at scale, with ThingWorx Kepware and InfluxDB is easy, fast, and inexpensive to implement.
Modern Industrial IoT (IIoT) and Industry 4.0 applications are typically built around one or more messaging or data exchange platforms. Here’s a quick summary of the three most common types:
- OPC: OPC includes OPC-DA, OPC-UA, and oftentimes OPC-AE. These are most common on the plant floor and within an industrial organization's data center.
- MQTT: MQTT is less specific to OT than OPC but provides low-latency methods to pass data between assets and applications that fit well in a hybrid asset/application network.
- RESTful APIs: RESTful APIs are a web-first technology designed for application-to-application or client-server interactions. RESTful APIs fit well within edge-to-cloud architectures or those built and managed by IT or web developers.
Fortunately, ThingWorx Kepware and InfluxDB support all of these methods, providing flexibility and support for your chosen integration.
To Telegraf, or not to Telegraf?
Telegraf is a plugin-driven agent that collects, processes, aggregates, and writes metrics. It supports four categories of plugins: input, output, aggregator, and processor. It is built by the InfluxData team and community, is available as open source, and can help streamline your integration process.
One of the first considerations in building a data pipeline to store metrics in InfluxDB is whether the message content is structured, semi-structured, or unstructured. For those not familiar with those terms:
Structured: Dependent on a data model, structured data is commonly represented as tables and rows, where each row and column has a known meaning and format. A good example would be a CSV file with one column for the ISO 8601 timestamp, another for the normalized OPC tag name, and a third for the numeric value of the tag at that point in time.
Semi-structured: This is structured data which can’t always be expressed in tabular format, and may represent relationships in a hierarchical manner. Consider JSON or XML: the structure and content are flexible, but the rules for interpreting the information are usually encoded in the format itself (field names, stanza groups, etc.). A semi-structured representation of OPC data could be a JSON document which groups tag names and values under a single timestamp and server name.
Unstructured: While this bucket also contains data types like images, PDF documents, and more, for our purposes, unstructured data is simply data where the format and content are not predefined. MQTT message bodies aren’t restricted in content and can therefore carry any format of text or binary information. Unstructured data requires schema modeling, either before it is stored (thus converting it to structured data) or after it is stored via parsing, for example with regular expression field extraction.
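To make the difference between semi-structured and structured data concrete, here is a minimal Python sketch that flattens a JSON document of grouped tag readings into one row per tag. The payload shape and its field names (`server`, `timestamp`, `values`, `id`, `v`) are assumptions chosen for illustration, not a documented Kepware format, so adjust them to match whatever your gateway actually emits.

```python
import json
from datetime import datetime, timezone

# Hypothetical semi-structured payload: one server name and one timestamp
# (milliseconds since the Unix epoch) shared by a group of tag readings.
doc = json.loads("""
{
  "server": "kepserver-01",
  "timestamp": 1634774400000,
  "values": [
    {"id": "Channel1.Device1.Temperature", "v": 71.5},
    {"id": "Channel1.Device1.Pressure", "v": 2.04}
  ]
}
""")


def flatten(doc):
    """Return one (iso_timestamp, server, tag_name, value) row per tag reading."""
    ts = datetime.fromtimestamp(doc["timestamp"] / 1000, tz=timezone.utc)
    iso = ts.isoformat()
    return [(iso, doc["server"], item["id"], item["v"]) for item in doc["values"]]


rows = flatten(doc)
for row in rows:
    print(row)
```

Each resulting row has the same columns in the same order, which is what makes it structured: the shared timestamp and server name are repeated on every row instead of being implied by the hierarchy.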
Ultimately, before you insert PTC Kepware tags into InfluxDB, you will need to transform your data into Line Protocol (LP). Line Protocol is a highly efficient text format, much like CSV, that is optimized for the data parsing that goes on within InfluxDB. Everything you need to understand about Line Protocol is right here in our docs: https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/
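As a rough illustration of what that transformation involves, the following Python sketch formats a single tag reading as one line of line protocol. The measurement name `kepware` and the tag and field keys are hypothetical choices for this example. The escaping covers the common cases (commas, spaces, and equals signs in keys and tag values, quotes and backslashes in string fields); treat it as a sketch and check the linked docs for the full rules.

```python
def _escape(text):
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value):
    """Render a Python value as a line protocol field value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted, with backslashes and quotes escaped.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value,... field=value,... timestamp."""
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{_escape(k)}={_format_field(v)}" for k, v in fields.items())
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "kepware",  # hypothetical measurement name
    {"server": "kepserver-01", "tag_name": "Channel1.Device1.Temperature"},
    {"value": 71.5, "quality": True},
    1634774400000000000,  # nanoseconds since the Unix epoch
)
print(line)
# kepware,server=kepserver-01,tag_name=Channel1.Device1.Temperature value=71.5,quality=true 1634774400000000000
```

Sorting the tags by key is optional, but it keeps the output deterministic.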
How and where you convert to line protocol is an architectural decision. There are opportunities to format unstructured MQTT or semi-structured OPC data within PTC Kepware, or in an external solution like Flow Director or HighByte. However, we highly recommend that you consider implementing a Telegraf gateway: it centralizes data collection, processing, and reformatting, and can even provide a pathway to write out to other services or back to applications like an MQTT broker. The InfluxDB v2 Output Plugin (influxdb_v2) in Telegraf will handle a lot of the connection configuration for both InfluxDB Cloud and InfluxDB 2, which can be a big benefit when it comes to managing HTTPS connections. Telegraf can almost always run on the same machine as your PTC Kepware KepserverEX instance or your MQTT broker, or on an edge gateway product available from many industrial software and hardware vendors.
Whether you use Telegraf or not (again, we REALLY recommend you do), the following methods are available to PTC Kepware users to store tag histories in InfluxDB, and make those histories available to users and applications:
Use the Kepware IoT Gateway Advanced Plugin to stream industrial data directly to an MQTT broker of your choice or one from an InfluxData partner like HiveMQ or Flow Director. The Telegraf MQTT Consumer Plugin will monitor your MQTT topics and automatically store the data sent to those topics by Kepware. You can make it even more powerful by using InfluxDB tasks to process that data and write it back to the broker for Kepware to integrate with other downstream operational applications.
Using the Kepware IoT Gateway Advanced Plugin, you can convert your tag data into line protocol and send it to InfluxDB’s write endpoint on change, on a poll interval, or per tag-side logic. This approach also allows you to enrich the tag data with InfluxDB tags and fields, which will make all future analytics and visualizations in InfluxDB super-powered. Note: sending data directly to an InfluxDB write endpoint with the PTC Kepware REST Client is experimental, especially when sending from an on-premises KepserverEX instance to InfluxDB Cloud. This is one of the scenarios where Telegraf will really make a difference: you can write to Telegraf’s InfluxDB V2 Listener Input Plugin and relay to InfluxDB Cloud with an extra level of reliability, scalability, and security.
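For reference, this is roughly what a write to the InfluxDB 2 HTTP API looks like from the client's side, sketched in Python with only the standard library. The host, organization, bucket, and token values are placeholders, and the helper name `build_write_request` is made up for this example. The request is only constructed here, not sent; pointing `base_url` at a Telegraf InfluxDB V2 Listener instead of InfluxDB itself should work the same way, since the listener accepts the same write path.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url, org, bucket, token, lines, precision="ns"):
    """Build (but do not send) a POST to the InfluxDB 2 /api/v2/write endpoint."""
    query = urlencode({"org": org, "bucket": bucket, "precision": precision})
    body = "\n".join(lines).encode("utf-8")  # one line protocol record per line
    return Request(
        url=f"{base_url.rstrip('/')}/api/v2/write?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
    )


# All values below are placeholders for illustration.
req = build_write_request(
    "http://localhost:8086",
    org="example-org",
    bucket="example-bucket",
    token="EXAMPLE_TOKEN",
    lines=["kepware,tag_name=Channel1.Device1.Temperature value=71.5 1634774400000"],
    precision="ms",  # the timestamp above is in milliseconds
)
print(req.get_method(), req.full_url)
```

To actually send the request you would pass `req` to `urllib.request.urlopen` and handle errors and retries yourself, which is exactly the kind of plumbing Telegraf takes off your hands.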
Kepware OPC Server
You also have the option to connect directly to Kepware’s OPC server with the Telegraf OPC UA Client Input Plugin. This is the best option for advanced users who need to leverage the OPC namespacing within InfluxDB and are willing to take the additional time to configure and manage the OPC connection within Telegraf.
Are you ready?
If you are already a PTC Kepware user, you can download Telegraf, install it in a location with secure access to the KepserverEX application, and use the documentation linked above to configure your preferred integration pathway. We’ve also created an InfluxDB Community Slack channel where you can share tips and tricks and ask questions of the InfluxDB IoT team and our amazing community of Influxers. Looking forward to seeing you there!