How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quality with Node-RED and InfluxDB
Session date: Mar 28, 2023 08:00am (Pacific Time)
American Metal Processing Company (“AMP”) is the US’ largest commercial rotary heat treat facility with customers in the automotive, construction, military, and agriculture industries. They use their atmosphere-protected rotary retort furnaces to provide their clients with three primary hardening services: neutral hardening (quench and temper), carburizing, and carbonitriding.
This furnace style ensures a consistent, uniform heat treatment process vs. traditional batch- or belt-style furnaces; excels at processing high volumes of smaller parts with tight tolerances; and improves the strength and toughness of plain carbon steels. Discover why AMP’s use of Telegraf, InfluxDB, Node-RED, and Grafana allows them to gain 24/7 insights into their plant operations and metallurgical results. Learn how they use time-stamped data to gain accurate metrics about their consumables usage, furnace profiles, and machine status.
Join this webinar as Grant Pinkos dives into:
- American Metal Processing’s approach to heat treating in a digitized environment through connected systems
- Their approach to collecting and measuring sensor data to enable predictive maintenance and improve product quality
- Why they need a time series database for managing and analyzing vast amounts of time-stamped data
Watch the Webinar
Watch the webinar “How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quality with Node-RED and InfluxDB” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
[et_pb_toggle _builder_version=”3.17.6” title=”Transcript” title_font_size=”26” border_width_all=”0px” border_width_bottom=”1px” module_class=”transcript-toggle” closed_toggle_background_color=”rgba(255,255,255,0)”]
Here is an unedited transcript of the webinar “How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quality with Node-RED and InfluxDB”. This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
- Caitlin Croft: Sr. Manager, Customer and Community Marketing, InfluxData
- Grant Pinkos: President, American Metal Processing
Caitlin Croft: 00:00:00.831 Hello, everyone, and welcome to today’s webinar. I’m very excited to be joined today by Grant, who will be talking about how American Metal Processing uses InfluxDB. Once again, my name is Caitlin Croft. Please post any questions you may have in the Q&A or the chat. We’ll answer all of them at the end. And without further ado, I’m going to hand things off to Grant.
Grant Pinkos: 00:00:29.320 Okay. Thank you, Caitlin. Good morning, good afternoon, and good evening to everybody. I’m from American Metal Processing, and I’m here to talk today about how we use InfluxDB. So let’s go ahead and get into it. Okay. Hold on. Trying to advance. There we go. So American Metal Processing was founded in 1945. So just 77 years ago, 78 now. We’re located near Detroit, Michigan, and we are the largest commercial rotary heat treat business in North America. Commercial, meaning other companies that choose not to do their heat treating send it to us. We serve customers in a variety of industries. We’re open 24/7, so round-the-clock data logging. And we only close on major US holidays. As you can see from the photo, we’re less than 20 employees. I think our current headcount is around 18. That’s me right there. I’ve always been a data junkie who was fascinated for the last 20 years with data logging products and software and tools. But I never really found any of them to be particularly engaging.
Grant Pinkos: 00:01:45.728 About three or four years ago, I started developing homegrown solutions using the software and hardware from a company in California called Opto 22. They produce excellent hardware and software. And that quickly led me to discover Node-RED. And that led me to discover InfluxDB and Grafana. So we have been using InfluxDB for about three years, and I have managed to pipe in our data from the previous systems into Influx. So we have roughly 12 years of data now. And like Caitlin said, I’m active in the forums too with the InfluxDB forum, the Grafana forum, the Node-RED forum and the Opto 22 forum. Those are kind of my four go-to’s every day. The people there are very, very helpful, and I try to be helpful too. Okay.
Grant Pinkos: 00:02:42.649 So I’m just going to cover in maybe 60 seconds or a minute, what is heat treating? So heat treating is basically any process that uses a controlled heating and cooling to modify the structure of metals and metal alloys. You’re changing the physical and mechanical properties without altering the shape. There’s probably two dozen types of main heat treating. There’s annealing, normalizing, ferritic nitrocarburizing, goes on and on. But at American Metal Processing, we focus on generally just two. The first one is called through hardening. It involves heating the material to a temperature that transforms its internal structure without melting it. You hold it at that critical temperature for a period of time, and then you rapidly cool it or quench it. And let’s put this in terms that everybody can understand. How about a piece of toast. It’s completely hard throughout. It has the same properties on the inside and the outside.
Grant Pinkos: 00:03:46.273 The other process that we do is a family called case hardening. So it could be carburizing or carbonitriding, but both are known as case hardening. And it basically involves hardening the surface of the metal part while allowing the metal underneath to remain soft. So again, put it in terms we understand. How about French bread. It’s soft on the inside, and it has a nice hard layer on the outside. Okay. So that’s what we do. Our furnace, like I said in the beginning, is called a rotary retort furnace. This is a very large 80-meter long furnace from start to finish. It is a continuous furnace. So that cylindrical object you see in the middle is what’s called the retort. It’s very slowly rotating. And inside of it are our flights in an inverted helix that slowly conveys the parts through the whole furnace through the different flights. And eventually when they get to the last flight, they fall through a discharge hole and into the quench tank. And then they ride up the conveyor. So it’s a continuous process from heating through the cooling.
Grant Pinkos: 00:05:01.178 By contrast, a batch furnace looks something like this. It’s basically like a large dishwasher, if you could think of it that way. It’s got large baskets or trays. You put parts into them. You send them in. It heats them up, keeps them there, and then you quench them in the tank. And that’s the red box in that picture. A belt furnace is also very common, but we do not use it. It basically looks like what you would see at your local pizzeria, except much larger. But it gradually feeds the parts onto the belt. The parts are generally just mounted up onto the belt or laying flat. So the heat is only entering them from one side, whereas in the retort furnace, that’s uniformly heating them. So just a big difference there.
Grant Pinkos: 00:05:48.831 So let’s talk about the furnace itself. It is about 80, I said 80 meters. 80 feet, 25 meters long. It’s got a feed system. It has a washer. It has a retort, which is the main part that’s brought up to the high temperature. And then the quench tank. So in the picture, you can only see the conveyor coming out of the ground because the actual quench tank is below ground. And the quench tank is large. It’s about 6,000 gallons of oil or water, depending on the furnace. Okay. So not shown on this sketch is the burner system. That’s the yellow piping that you see on the outside there. The gas flows, which send the gas into the furnace, are not shown either. And the quench pumps and the cooling system. You can’t really see them in the sketch, but there are quench pumps that control that fluid flow. Okay.
Grant Pinkos: 00:06:50.952 So let’s talk about how we kind of digitized our heat treating equipment. So at the outset, we went to our team members and kind of had various team discussions. And we asked, does everybody in the organization embrace the value of data? The answer should be a resounding yes. It truly has to be a company-wide effort to make sure that everybody sees value in that. Each person in each department is going to probably look at the data differently. Some are going to care more about the parameters that affect the quality. Certainly, the operational parameters of heat treating. But the maintenance crew might look at totally different things. And the finance people may look at other things. And so those three kind of general areas are what we would focus on. We asked, is everyone willing to invest the time to learn what they do not know, and train new people as they come on board? And again, I think it should be understood that this is not something that just one or two people can do, but everybody can do. Ask yourselves, is there a cost ceiling or a budget that you have to follow when you’re implementing the sensors and databases and software? There’s a lot of choices out there. Some are cheap. Some are expensive. But be sure to look carefully at that. Ask yourself, what does everyone in the organization wish they could know which they do not? The answers may surprise you when you ask that question.
Grant Pinkos: 00:08:31.984 So let’s look at our furnace. There it is. And in general, we kind of start with the basics, right? We want to monitor the temperature. That’s critical to the process. It tells us a lot of things, and it’s probably the very first data point that we set up. Second one is the quench tank temperature. If your quench tank is too hot, then you’re not going to get results. So it’s important that that temperature be maintained. The speed of the moving retort matters a lot because it’s important to make sure that the parts are moving through for the right amount of time and not too long. Next is the moving conveyor. That needs to be monitored. We also care not just about one temperature but probably three temperatures in the main heating chamber. The feeder system also needs to be controlled. It can’t be sending too few or too many pounds per minute or per hour. Quench pump system. And there’s several pumps usually that have to be running at all times and make sure that they’re running operationally correct. They’re sending that hot water or hot oil to a heat exchanger. That heat exchanger has inputs and outputs that we want to monitor as well.
Grant Pinkos: 00:09:58.169 We also have gas flows inside of the furnace that also affect the product quality directly. There’s another piece of equipment called an endothermic gas generator. That sends the gas into the retort, but it produces the gas right there on site. So in there, we’ve got several gas and air flows, and we have a blower motor. And to do it right, that temperature has to be held very tightly to produce that endothermic gas. So you can see, all told, we’re upwards of 25 or 30 sensors. And these are just some of them. There’s actually more, but these are the main ones. So when we talk about the signals that we’re sending, we use Modbus extensively. It’s found on a lot of old controllers, going back to the ’70s and ’80s, and it’s very easy, once you understand it, to get the data out of a Modbus controller and send it through Node-RED and send it into Influx. 4 to 20 milliamp signals are also very common. We see those on a lot of pieces of equipment, including older ones. We have certain items like scales that send their data out via RS-232. All three of those things will almost always flow into Node-RED or be collected by Telegraf or something.
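As an illustration of the 4 to 20 milliamp signals mentioned above, here is a minimal Python sketch of the usual linear scaling from loop current to an engineering value. The function name, the sensor range, and the tolerance band around 4 and 20 mA are assumptions for illustration, not details from the talk.

```python
def scale_4_20ma(current_ma: float, low: float, high: float) -> float:
    """Linearly map a 4-20 mA loop current onto the range [low, high].

    4 mA corresponds to `low` and 20 mA to `high`. The accepted band
    (3.8 to 20.5 mA) is an assumed tolerance; readings outside it are
    treated as a wiring or sensor fault.
    """
    if not 3.8 <= current_ma <= 20.5:
        raise ValueError(f"loop current {current_ma} mA is outside the expected range")
    return low + (current_ma - 4.0) / 16.0 * (high - low)


# Hypothetical pressure transmitter ranged 0 to 100 psi.
print(scale_4_20ma(12.0, 0.0, 100.0))
```

A value scaled this way could then be sent on through Node-RED or Telegraf like any other reading.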
Grant Pinkos: 00:11:24.186 MQTT has been very, very helpful. A shoutout to Anthony for setting us up with that because it’s been a game changer. Once you understand MQTT, it’s just wonderful to get the data flowing around that way and capture it and send it into Influx. Certain devices, like compressors, might have an ethernet jack right on them. So you can query them directly and get the information. And again, Telegraf, like Node-RED, can be installed just about anywhere, and it can be used to get the data that you want. And it’s all going into Influx. So just to walk you through one of the signal chains, this is temperatures and pressure sensors. So we usually use type R or type K thermocouples, pressure switches, and digital pressure gauges. Just different things like that. We send them into a temperature or pressure collection device. This could be a Modbus-enabled temperature controller, Opto 22 devices. Their products can take just about any signal under the sun and do something with it. So a PLC, if you have one, perhaps it can take those signals. IoT devices, PC, Raspberry Pi, and the list goes on and on.
Grant Pinkos: 00:12:48.431 We take that data, and we will process it through Telegraf or Node-RED to poll the data every X seconds, perhaps every 30 seconds or every minute or every 15 minutes. And then we use the same two tools, Telegraf and Node-RED, to send it into Influx every Y seconds. I could say every 30 seconds or something. Sometimes X and Y are the same, but sometimes they differ a little bit because we’re doing something with the data outside of Influx, maybe displaying it or whatever. Okay. So this signal chain is very common on our furnaces for all the temperatures and pressures that we’re monitoring. Certainly, the quench tank temperatures, the generator, the heat exchangers, and things like that. Okay.
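The difference between the polling interval and the write interval can be sketched in Python as a small buffer that accepts a sample on every poll and releases a batch once the write interval has elapsed. The class and parameter names are hypothetical, and in the plant described here this job is done by Telegraf or Node-RED rather than by hand-written code.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, measured value)


class BatchingBuffer:
    """Collect polled samples and hand them back in batches for writing."""

    def __init__(self, write_interval_s: float) -> None:
        self.write_interval_s = write_interval_s
        self._samples: List[Sample] = []
        self._last_flush: Optional[float] = None

    def add(self, timestamp_s: float, value: float) -> Optional[List[Sample]]:
        """Store one sample; return the pending batch when it is time to write."""
        if self._last_flush is None:
            # Start the write timer at the first sample.
            self._last_flush = timestamp_s
        self._samples.append((timestamp_s, value))
        if timestamp_s - self._last_flush >= self.write_interval_s:
            batch, self._samples = self._samples, []
            self._last_flush = timestamp_s
            return batch
        return None


# Poll every 30 seconds, write every 60 seconds.
buffer = BatchingBuffer(write_interval_s=60)
for t, temperature in [(0, 1550.0), (30, 1551.0), (60, 1549.5)]:
    batch = buffer.add(t, temperature)
    if batch is not None:
        print(f"writing {len(batch)} samples")
```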
Grant Pinkos: 00:13:37.981 The next signal chain that I want to show is the one we use for motors, variable frequency drives, and compressors. This would be the examples you see on the left, like just the clamps that can monitor the current flowing through a wire. We can also grab maybe a temperature controller or a variable frequency drive that has Modbus capabilities. We can also grab, like I said, a compressor that maybe has an ethernet signal or an RS-232. Again, take that data, send it into a collection device, process it using Telegraf or Node-RED. Maybe we have to, not downsample it, but manipulate it a little bit. In the case of a variable frequency drive that’s being polled via Modbus, there’s really good examples of where we can pull the parameters or the registers on the drive. And, in one fell swoop, we can get the speed of the motor, how much current is being drawn, the direction of the rotation. Sometimes we’re moving the retort backwards and forwards. So that’s important. The temperature of the drive, we can get that and we can get any alarm codes. And using Node-RED, we can convert those to plain English so that it doesn’t say error code 645 or something. It’ll tell us it’s over temperature or current overdrawn or something like that.
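The idea of reading a block of drive registers in one request and turning alarm codes into plain English can be sketched in Python as follows. The register order, the scaling factors, and the alarm table are all hypothetical; a real drive's manual defines its own register map, and the plant does this step in Node-RED.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical alarm table; real codes come from the drive's documentation.
ALARM_TEXT: Dict[int, str] = {
    0: "no alarm",
    645: "drive over temperature",
    646: "overcurrent",
}


@dataclass
class DriveStatus:
    speed_hz: float
    current_a: float
    direction: str
    temperature_c: int
    alarm: str


def decode_drive_registers(regs: List[int]) -> DriveStatus:
    """Decode five consecutive holding registers read in one Modbus request.

    Assumed layout: [speed x100, current x10, direction flag,
    drive temperature, alarm code].
    """
    if len(regs) != 5:
        raise ValueError("expected exactly 5 registers")
    speed_raw, current_raw, direction_raw, temp_raw, alarm_code = regs
    return DriveStatus(
        speed_hz=speed_raw / 100.0,
        current_a=current_raw / 10.0,
        direction="reverse" if direction_raw else "forward",
        temperature_c=temp_raw,
        alarm=ALARM_TEXT.get(alarm_code, f"unknown alarm code {alarm_code}"),
    )


print(decode_drive_registers([6000, 30, 1, 41, 645]))
```

Keeping the code-to-text table in one place means dashboards and alerts can show the readable message instead of the raw number.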
Grant Pinkos: 00:15:11.540 And again, then finally we send it into Influx. Okay. So again, this is used on conveyor systems. Basically, anywhere you have a motor. Pumps, vibratory feeders, which are kind of on and off devices by varying the current, scales, blowers, things like that. Okay. So back to our sketch of the furnace, that, like I said, it can contain up to 30 data points. So it’s a lot, but that’s not where it begins or ends. So we have, not one furnace line, a single line, what we call it, but we have two single lines. So they’re identical twins. We also have another single line that has a pre-wash in front of it. So that would be a lot right there. But we actually have a double furnace line where it’s got one large chamber with two revolving retorts, two pre-washers, two feeder systems, two conveyors. And each can be controlled independently. And we have another double furnace. So that’s a total of seven rotary furnace lines. That’s why we are the largest one in the country. Because just one of these is tough to operate, but to have seven of them is a lot. Okay.
Grant Pinkos: 00:16:23.933 We also have five endothermic gas generators. And so each one is operating independently of the other, sending the gas to each furnace. And again, the composition of the gas is critical to ensuring that the quality of the product comes out perfect. Okay. So all of that constitutes what we call the connected plant. So it basically comes down to one thing, which is that everybody has access to real-time operational data. It shouldn’t be something that’s hidden away in the IT room or is only available to three people in the company, or the plant manager’s desk or someplace like that. It seems like all too often, data is siloed. And we don’t believe in that in any way. So we permeate the data through everywhere in the organization, so.
Grant Pinkos: 00:17:23.162 How do we do that? We use kind of tried and true visualization tools. The first one is InfluxDB’s regular dashboarding tools. They work well. They’re easy to configure, and we’ve got several of those in place. We use Grafana extensively throughout the company. It works just fantastic on mobile devices, on large screen, small screens. It’s got just an unlimited amount of customizability. I just absolutely love Grafana. groov View, which is the product offered by Opto 22, is also fantastic. It’s easy to use, easy to get started with, easy to build your own dashboards or operator panels or whatever it is that you want to use. It can display the data very easily using Node-RED coming out of Influx, or to use to send data into Influx. And again, it’s just one of the great tools out there. And Node-RED dashboarding, which is kind of their legacy product, but also the UI Builder, which we have limited experience with. Those dashboarding tools are also quite easy to get started with, and we use those in different places.
Grant Pinkos: 00:18:51.872 So I’ll try to show some examples here of just a couple of things that we have in place. The first is just a feeder system. That’s the computer that’s controlling the scale and the vibration of the parts to go into the furnace. So it’s basically monitoring kind of just your typical parameters that you would see on a computer system. So the RAM and the temperature and things like that. How much processes are being used, uptime, things like that. These are two Grafana dashboards. Two different ones. The one on the left is as viewed on a mobile phone. And it basically shows the different zones of the furnace. Shows the feed rate. You can see it’s got the sparkline in there so you can see if it’s trending up or trending down. Every operator has this on their phone, so they can quickly look at the data from three hours ago, yesterday, last shift, whatever it was. So extremely helpful. And then in the large panel on the right, it’s what I had mentioned earlier about querying, using Modbus, the parameters that are coming out of a drive. So in this case, the retort is moving 62 seconds forward, and then 48 seconds in reverse or something like that.
Grant Pinkos: 00:20:16.274 And so it’s one thing to go into the Allen-Bradley drive and scroll through parameter B152 and try to see how much current it was drawing. But this is just so much easier to be able to liberate that data out of the Allen-Bradley drive and put it onto Influx and into Grafana. So you can see it. You can go back. You can see, okay, the current is peaking at 3 amps or something like that. Drive temperature, maybe it goes up in the evening, down at night. Who knows? But it’s so easy to build these dashboards using Influx and Grafana. I mean, why wouldn’t you? It just seems so antiquated to go into an old or a proprietary piece of equipment, like a variable frequency drive, and not see all that data that’s available to be seen.
Grant Pinkos: 00:21:10.458 The next two dashboards or screens, I guess, are from two different companies. So the one on the left is from Opto 22. This is using groov View. We’re able to import our own images. So that’s why we’ve got a picture of our furnace in there. We’re able to query our SQL database which stores the work orders and the feed rates and recipe parameters. But then we can use this same screen with the start and stop buttons to start the loader. We can monitor how much weight is on the scale, how many pounds per hour have been fed. All sorts of things. We can trend it on the graph you see down below. We can pipe in our camera system so that we can see a real-time video of what’s happening at that particular furnace. So a lot of possibilities here. On the right, you see a Node-RED dashboard, where I’m basically, again, querying our SQL database in the top section, operating the feeder in the middle. And in the bottom, we’re actually using iframes to insert our Grafana dashboards, and again, our time selectors there as well. Okay.
Grant Pinkos: 00:22:21.802 So it’s just important that when you kind of look at it holistically, that Influx is kind of providing the insight into four areas, right? Maintenance activities, which affects the maintenance planning. The product quality and the relationship between the quality and the process data. The costs of running the business and controlling them. And the monitoring of that data and alerting on it. Okay. So we’re going to walk through a couple of examples that kind of give you some perspective there. So when it comes to maintenance, there’s kind of four general levels. There’s reactive, preventive, analytical, and predictive. Reactive is where you’re running around fixing stuff as soon as it breaks. That’s not a position we or any manufacturing business wants to be in. Preventive, preventative, which is typically calendar-based. Change something every Monday. Refill this every third Monday of the second month or something like that. You could certainly set up calendar reminders to do things, but it’s not a real data-driven way to do things. So we’re trying to get away from that and move into these two, which are the conditions-based maintenance and eventually a predictive model base.
Grant Pinkos: 00:23:50.294 So we do that using Influx. We’re basically trying to pump as much of our process data into Influx so that we can start to identify the trends and parameters that would dictate when we do certain conditions-based maintenance. So I’ll just walk you through a few examples. This is an example of how the thermal effectiveness of a heat exchanger changes over time. If this example looks familiar to some, it’s because it was something that somebody at Influx, Anais, created several years ago and published on the blog. It’s very useful. It’s easy to understand. And essentially, what it does is it measures your temperatures going in and out of the hot side and the cold side of that heat exchanger, puts them through various formulas and joins, and calculates out the thermal effectiveness. And when you do that, you don’t want to check it every minute necessarily, but you could check it once a week or once a day. And you would see over time that the thermal effectiveness of the heat exchanger degrades. And that means it’s time to take it apart and clean it. And so it’s a simple example, but it’s really effective. You don’t have to guess. You know pretty clearly where it was a month ago or a week ago versus today. And it’s a process to take these down, and this gives you a real good perspective of how your heat exchanger is performing over time.
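For readers who want to see the kind of calculation being described, here is a minimal Python sketch of one common textbook form of heat exchanger effectiveness: the temperature drop actually achieved on the hot side divided by the largest drop that is thermodynamically possible. It assumes the hot stream has the smaller heat capacity rate; it is not necessarily the exact formula used in the blog post or at the plant, and the temperatures are made up.

```python
def thermal_effectiveness(hot_in: float, hot_out: float, cold_in: float) -> float:
    """Return hot-side effectiveness: (hot_in - hot_out) / (hot_in - cold_in).

    All temperatures must be in the same unit. Assumes the hot stream
    is the one with the smaller heat capacity rate.
    """
    max_drop = hot_in - cold_in
    if max_drop <= 0:
        raise ValueError("hot inlet must be warmer than cold inlet")
    return (hot_in - hot_out) / max_drop


# Hypothetical readings: quench oil enters at 150, leaves at 110,
# and cooling water enters at 70 (same temperature unit throughout).
print(thermal_effectiveness(150.0, 110.0, 70.0))
```

Computed once a day or once a week and stored alongside the raw temperatures, a falling value of this ratio is the signal that the exchanger is fouling.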
Grant Pinkos: 00:25:28.780 Another simple example that can be used for conditions-based maintenance would be measuring the pressure drop across a filter and determining how long it takes to become clogged. So this would be just measuring the pressure at the inlet of the filter and the pressure at the outlet, computing the pressure drop, and then creating a Grafana alert, for example, to detect when that pressure drop, it becomes greater than a certain value, whatever number of psi it is that indicates the filter is clogged. So again, you’re not changing the filter once a week. You’re changing it when it needs to be changed. So it’s some better perspective, I think, on that. I created a bunch of examples a few months ago, and they’re published at the link you can see there. But these are all using Flux queries to create Grafana alerts. And, again, Grafana’s examples — or Grafana’s tools are amazing for doing this sort of thing, for creating very, very complex multi-dimensional alerts. So you can monitor seven pieces of equipment using one general alert, and the alert will detect when equipment number three or equipment number six is out of compliance, and sends you an alert and does all that. So some good examples there if you want to check them out.
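The filter check described above can be sketched in Python as one rule applied to several pieces of equipment at once, in the spirit of a multi-dimensional alert. The equipment names, pressures, and threshold are hypothetical, and in the talk this logic lives in Flux queries and Grafana alert rules rather than in Python.

```python
from typing import Dict, List, Tuple


def clogged_filters(
    readings: Dict[str, Tuple[float, float]], max_drop_psi: float
) -> List[str]:
    """Return the names of filters whose pressure drop exceeds the limit.

    `readings` maps an equipment name to (inlet_psi, outlet_psi).
    """
    return sorted(
        name
        for name, (inlet_psi, outlet_psi) in readings.items()
        if inlet_psi - outlet_psi > max_drop_psi
    )


readings = {
    "filter_1": (40.0, 38.0),
    "filter_3": (40.0, 31.0),
    "filter_6": (42.0, 30.5),
}
print(clogged_filters(readings, max_drop_psi=8.0))
```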
Grant Pinkos: 00:26:57.611 So when it comes to cost and consumption data, in accounting, accountants will use a term called standard cost, which is basically where you’re trying to guess what the cost to produce the good or service is. And at the end of the quarter, you look at your actual costs, and you basically make a journal entry that corrects your estimate, and you post it to a variance. But that model is because you don’t have good enough data in order to know what it truly costs your business. So we’re trying to get away from all of these things and pump all of that data into Influx. So I’ll show you a few examples of what I mean. So if we ask ourselves, what is our electrical usage when we are idling our always-on equipment? I plotted here a snapshot from November. And you can see very easily, it’s the kilowatt hours that we’re querying directly from our utility provider via their website, which is using, I don’t know, some sort of XML data. But we use Node-RED to log into that website every morning, grab the hourly data, post it into Influx. And so with that, you’re able to see visually that the energy or the electrical usage drops to about 135 kilowatt hours or so when we’re just idling, when the building is dark, lights are off, nobody’s there, but we’re still running fans and maybe quench pumps and maybe one or two other things just to sort of keep things from freezing or seizing up or something.
Grant Pinkos: 00:28:40.755 So just to do that, just to keep the business in its idle state, you can compute the cost because you know how many kilowatt hours you’re using, and you know the cost, so it’s easy to see. So again, a simple example, but it’s easy to visualize once you have the data. You can also figure out how much it takes, electricity-wise, to ramp down your equipment and how much and how long it takes to ramp up. So it’s easy to do using just a simple Flux query. Another example is if I ask how much more is our electrical usage on the hottest days versus the coldest days? So because we’re monitoring all of our equipment in Influx, we can run a query that compares apples to apples. So I wouldn’t want to compare it against, let’s say, December 23rd, which was maybe the coldest day of the year, because we weren’t running any equipment on December 23rd. But on November 19th and June 21st, the same pieces of equipment were running. So I am comparing apples to apples. And the only thing that changed was the outside temperature. So I can see that it was 96 degrees outside in June and it was 22 degrees in November. And because of that, I can then look at the electrical usage for the day and see that it was about, what, 1.7 kilowatt hours, or rather megawatt hours, more than what it was on the coldest day.
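The idle-cost arithmetic is simple enough to sketch in Python: pick out the hours where usage is at or below an idle threshold, add up their kilowatt hours, and multiply by the price per kilowatt hour. The hourly readings, the threshold, and the electricity rate below are assumptions for illustration; only the rough 135 kilowatt hour idle figure comes from the talk.

```python
from typing import List, Tuple


def idle_energy_cost(
    hourly_kwh: List[float], idle_threshold_kwh: float, rate_per_kwh: float
) -> Tuple[int, float]:
    """Return (number of idle hours, cost of the energy used in those hours).

    An hour counts as idle when its usage is at or below the threshold.
    """
    idle_hours = [kwh for kwh in hourly_kwh if kwh <= idle_threshold_kwh]
    return len(idle_hours), sum(idle_hours) * rate_per_kwh


# Hypothetical hourly readings and an assumed rate of 0.10 per kWh.
hours, cost = idle_energy_cost(
    [410.0, 135.0, 130.0, 395.0], idle_threshold_kwh=140.0, rate_per_kwh=0.10
)
print(hours, round(cost, 2))
```

The same pattern extends to the hot-day versus cold-day comparison: sum each day's hourly readings and subtract.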
Grant Pinkos: 00:30:17.080 So, intuitively, I know that this is due to our process chiller, which is a large unit outside of our building that basically has compressors in there that chills water from, I think, 90 degrees down to 70 degrees or whatever. And that has to work harder during the hot days in the summer, draw more electricity. And in the winter, you’re benefiting from the cold temperatures outside, so you don’t have to do that. Okay. So just to sort of summarize kind of the lessons learned, is that everyone in our organization or any organization has to understand and appreciate the value of the data. I think it’s even better if people are excited about it. And I think that most are. It certainly is easy enough to plaster your business with 55-inch monitors, where you can show the data updated real-time for everybody to understand and appreciate and use in their day-to-day work. And again, the parameters that you measure are going to affect the quality, the maintenance, and the cost equations. I’d also just emphasize that you take the time to plan logically your buckets, your measurements, your field names, your tag names. In the very beginning, I kind of ran right in and didn’t understand all of the terminology. And as a result, I had to start over a little bit later. I was still able to capture the data and convert it from the original field and tag structure to the new one. And I did that using Node-RED by the way, so there’s a flow out there to do that. But in any event, it’s just important to logically organize that data according to buckets, measurements, fields, and tags.
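To make the terminology of measurements, tags, and fields concrete, here is a minimal Python sketch that formats one point as InfluxDB line protocol. The measurement, tag, and field names are hypothetical examples rather than the plant's actual schema, and the helper covers only the documented escaping rules and basic field types.

```python
from typing import Dict, Union

FieldValue = Union[bool, int, float, str]


def _escape_key(text: str) -> str:
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value: FieldValue) -> str:
    """Render a field value: booleans, integers (i suffix), floats, or quoted strings."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(
    measurement: str,
    tags: Dict[str, str],
    fields: Dict[str, FieldValue],
    timestamp_ns: int,
) -> str:
    """Build one line: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("a point needs at least one field")
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(
        f",{_escape_key(k)}={_escape_key(str(v))}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(f"{_escape_key(k)}={_format_field(v)}" for k, v in fields.items())
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


print(
    to_line_protocol(
        "furnace_temperature",
        {"line": "single 1", "zone": "2"},
        {"degrees_f": 1550.5, "in_range": True},
        1679990400000000000,
    )
)
```

Tags hold the things you filter and group by (which line, which zone), while fields hold the measured values, which is why deciding on them up front saves a later migration.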
Grant Pinkos: 00:32:11.718 Invest in high-quality sensors and devices that are suitable for your environment. Our environment is hot, and it’s somewhat dirty. So the equipment that we buy needs to be robust and hold up to that environment. I said create signal chains which can be easily diagnosed. I think Telegraf and Node-RED especially make it really easy to look at the data that’s coming in. And if something stops flowing, what’s the reason? Where in the chain did the data stop flowing? It’s easy to see, again, using Node-RED, and it is one of the go-to’s for that. Write meaningful queries and alerts that answer the question of what users wish they could know. The queries that we continue to build week after week continue to blow my mind. We do a lot with elapsed time to show when something was last done, or how many hours or days or minutes have been logged on a certain piece of equipment. So as we do those, I think the users become more and more engaged. This is what I said earlier about putting the data everywhere in your plant, in your office, on your phones. The tools that are out there make this really easy and really quite cheap to do. And I think it brings the value out in the company when you’re diagnosing your problems and you’re troubleshooting. You’re not hiding this data away. You’re using it, and it becomes a huge part of everyone’s day-to-day activity. Okay. So that’s it. I will turn it back to Caitlin for Q&A.
Caitlin Croft: 00:33:59.302 Awesome. Great job, Grant. There are a ton of questions for you. So let’s get started. Can temperature be measured remotely with pro-quality infrared cameras?
Grant Pinkos: 00:34:12.728 Oh, that’s a good question. So we don’t have any infrared cameras. But I believe that the ones I’ve seen do have some sort of output capability. It’s probably like RS-232 or something. And I would think that, yes, if you had it on a wireless signal or something, you could walk around and capture temperature data using an infrared camera and send it right through Node-RED into Influx. So I don’t know enough because we’re not on one, but I would think the answer is yes.
Caitlin Croft: 00:34:45.264 Can Telegraf API be implemented on edge devices based on constrained microcontroller? What are the advantages to using Telegraf instead of MQTT?
Grant Pinkos: 00:34:56.937 So my preferred way to do this is to send everything through MQTT. Receive or subscribe to the MQTT broker in Node-RED. And then parse out the data that you want, or the fields, the tag names, or whatever it is that’s coming across the MQTT broker. And parse it before sending it into Influx. You can do it using Telegraf plugins, but I find it more challenging than using Node-RED. Now maybe that’s just my comfort level. It can change over time. Sometimes I will use Telegraf because it’s an out-of-the-box, easy-to-set-up solution. But for me, I would just skip the Telegraf and have it go straight from MQTT to Node-RED.
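The parsing step described in this answer, taking what arrives from the MQTT broker and splitting it into a measurement, tags, and fields before it goes to Influx, can be sketched in Python as below. The four-level topic layout and the JSON payload shape are assumptions for illustration; in the workflow described here the equivalent logic runs in Node-RED.

```python
import json
from typing import Any, Dict


def parse_mqtt_message(topic: str, payload: bytes) -> Dict[str, Any]:
    """Split an MQTT message into measurement, tags, and numeric fields.

    Assumes topics shaped like site/line/device/measurement and a JSON
    object payload. Non-numeric payload entries are dropped.
    """
    parts = topic.split("/")
    if len(parts) != 4:
        raise ValueError(f"unexpected topic layout: {topic}")
    site, line, device, measurement = parts
    data = json.loads(payload)
    fields = {
        key: value
        for key, value in data.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
    return {
        "measurement": measurement,
        "tags": {"site": site, "line": line, "device": device},
        "fields": fields,
    }


print(
    parse_mqtt_message(
        "plant1/furnace3/quench_tank/temperature",
        b'{"value": 142.5, "unit": "F"}',
    )
)
```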
Caitlin Croft: 00:35:50.913 Cool. Are you doing any stream processing using Node-RED IR or any other service?
Grant Pinkos: 00:36:01.384 I think, by stream processing, meaning Grafana live data, like just streaming? So we do have the MQTT broker capturing or streaming various parameters. And using the Grafana MQTT data source plugin, you can subscribe to a given topic and have it display live right there on Grafana. So it’s not being stored anywhere, but you are displaying it live. We do that in a few cases, but by and large, we store all the data in Influx and view it that way. But we have done both.
Caitlin Croft: 00:36:43.629 Did you try any wireless sensors like Wi-Fi, LoRa, or BLE instead of RS-232 or Modbus?
Grant Pinkos: 00:36:52.554 Yeah, good questions. So I’m very intrigued by the — I think it’s LoRa or LoRaWAN parameters or specification, whatever it is. I found a company. It was through Opto 22 because there was a forum thread on there talking about just that, about wireless sensors. And I did contact a company — I think they were based out of Vietnam — that produces wireless sensors for this. Typically, they just operate on a battery and send data every couple of minutes. You can put them out in the field and have them turn on once a day or something like that. But at this point, we’re largely just ethernet or two-signal wire going everywhere in our plant. Only a handful of wireless things, like using a wireless Raspberry Pi or something that’s just set up next to a piece of equipment that’s grabbing data or something like that. So unfortunately, I have not used a lot of wireless sensors to date.
Caitlin Croft: 00:37:59.781 Are you using Grafana Cloud or on-prem?
Grant Pinkos: 00:38:03.470 Both. So with Grafana Cloud, that’s what we got started with. And we still have it set up and we’re using it. But with the OSS, because you have the ability to go into the config file and open up to allow for unsigned plugins — so in our case, we wanted video feeds going into Grafana from our local cameras. So to do that, I had to use the OSS version. And then there’s other things that are available only on the OSS, mostly in the customization fields. So if you’re kind of the plain vanilla user that just wants to display beautiful graphs and use alerts and things like that, I think Grafana Cloud is an excellent choice. For us, we wanted to go a little bit deeper where we wanted to exchange data in and out a little bit more. And so for that reason, we ended up putting the OSS on-prem as well, so.
Caitlin Croft: 00:39:06.052 What additional software did you have to use to allow InfluxDB dashboards on mobile devices?
Grant Pinkos: 00:39:13.976 So we don’t. We don’t use InfluxDB dashboards on mobile devices, or at least I’ve never tried it. I don’t know what it looks like. But my belief is that it’s not set up to render nicely like Grafana’s dashboards are. So I don’t know if I actually — I don’t know who said that. But no, we only use Opto 22’s groov View on a mobile device, and we use Grafana on a mobile device.
Caitlin Croft: 00:39:41.085 Have you experienced any slow loading in Grafana when you use Flux? Is it deployed in InfluxDB Enterprise, OSS, or Cloud?
Grant Pinkos: 00:39:51.414 So it’s deployed on Influx OSS. And no, I experience no delays. And oftentimes, we’ll go back and do queries on three years of data sampled every 30 seconds or something. And I’ll go across multiple parameters. And it’s always just within seconds to get it back. So I’ve never experienced any performance issues using Flux or anything like that. And, yeah, it’s been great. Now we also have InfluxDB Cloud running IOx. So I’m just now kind of tinkering with the Flight SQL plugin that was just released by the Influx team, and trying to see what our data looks like when it’s viewed in IOx. So nothing to really report there yet, but a couple of months from now, we’ll definitely be into it.
Caitlin Croft: 00:40:51.614 Glad to hear that you’re already using InfluxDB Cloud powered by IOx. There are community office hours this Wednesday talking about using Flight SQL. So I will find a link and share that in the chat. Next question, and there’s a lot more. You’re very popular, Grant. Do you use any — do you only use absolute values of parameters collected by sensors or derivatives or other kinds of post processing, like filtering PID controllers or DFT, etc.?
Grant Pinkos: 00:41:28.499 That was a long question. Can you repeat it one more time? Sorry.
Caitlin Croft: 00:41:32.211 Sure. Do you only use absolute values of parameters collected by sensors or derivatives or other kinds of post processing, like filtering PID controllers or DFT?
Grant Pinkos: 00:41:49.346 So we don’t use just absolute values. We do a lot where we’re comparing an actual to a set point and doing a calculation on that. We can also look at a rate of change. So like in the case of quench temperature, okay, the threshold may be 70 degrees, that when it hits 70 degrees, you have to stop the furnace or whatever. But if the furnace temperature — if the quench temperature is rising at, let’s say, 10 degrees every 5 minutes or something, that tells you that you’re going to have a problem in a matter of minutes. So we will use Influx, and specifically Flux queries to determine things like rate of change or maybe — all the customized functions that are available for Flux, we go and explore. So I don’t know if that answers the question, but we will definitely not just query on an absolute value or alarm on a single value.
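The quench-temperature example above is a small piece of arithmetic: estimate the rate of change from recent samples, then work out how long until the threshold is reached. The sketch below shows that calculation in plain Python rather than Flux. The function names and the sample readings are hypothetical; only the 70-degree threshold and the rise of 10 degrees every 5 minutes come from the answer.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (time in seconds, temperature in degrees)


def rate_per_minute(samples: Sequence[Sample]) -> float:
    """Average rate of change between the first and last sample, in degrees per minute."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (v1 - v0) / ((t1 - t0) / 60.0)


def minutes_until(samples: Sequence[Sample], threshold: float) -> Optional[float]:
    """Minutes until the threshold is reached if the current trend continues.

    Returns 0.0 if the threshold is already met, and None if the value
    is flat or falling (the threshold is never reached on this trend).
    """
    current = samples[-1][1]
    if current >= threshold:
        return 0.0
    rate = rate_per_minute(samples)
    if rate <= 0:
        return None
    return (threshold - current) / rate


# Hypothetical readings: 50 degrees rising to 60 degrees over 5 minutes.
quench = [(0.0, 50.0), (300.0, 60.0)]
print(rate_per_minute(quench))      # 2.0 degrees per minute
print(minutes_until(quench, 70.0))  # 5.0 minutes to the 70-degree threshold
```

An alert on `minutes_until` fires while the reading is still below 70 degrees, which is the point of alarming on the trend rather than on a single value.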
Caitlin Croft: 00:42:58.596 Do you have any experience with vibration sensors or even audio signals collected from different parts of the system?
Grant Pinkos: 00:43:06.268 No. And most of our equipment doesn’t really need to be monitored for vibration. But I’m familiar with, because I see them all the time when I’m looking for other sensors, that there’s a lot of vibration sensors out there. But unfortunately, I don’t have any experience with that.
Caitlin Croft: 00:43:26.198 How tightly is Grafana integrated with InfluxDB? And can other frontends be easily used instead of Grafana? I mean, I can answer that. It’s pretty easy to use other frontends. A lot of our customers will actually build their own UI. And we have a lot of customers that have built their own custom UI and have InfluxDB running in the backend. A lot of people do use the InfluxDB UI, but a lot of people also love Grafana. And there’s a bunch of other visualization tools. So especially, if you start using InfluxDB Cloud powered by IOx, another really good one is Apache Superset, a visualization tool that we’re starting to see people use. Let’s see. Someone says fantastic presentations. So great job. Are you going through an MQTT broker then through Node-RED or directly from IoT devices through Node-RED into InfluxDB?
Grant Pinkos: 00:44:26.949 So the MQTT broker that we use is just Mosquitto. It’s used on-site. And what we’re doing is subscribing to — or various devices are publishing using Node-RED to the MQTT broker. And then we pick that up on other devices that maybe need to use that data. So for example, the temperatures on one piece of equipment may be transmitted to the MQTT broker, but another piece of equipment that needs that temperature data on the other side of the plant would just subscribe to the MQTT topic. So that’s sort of the core use of MQTT. But then along the way, we got the MQTT data source plugin, which allowed us to do just ephemeral streaming of the data and get it into Grafana without a database at all. It’s more limited. You can’t do alerting on it, and you can’t do, I think, as many transformations. But it’s a nice, quick visualization. It almost looks like a hospital, too, because it’s just data flowing across the charts. And so you can get instantly updated information. It updates every two seconds or three seconds. So we use that, so.
Caitlin Croft: 00:45:46.405 Did you evaluate using cloud providers like AWS EC2 or similar Azure or Google Cloud offerings? How can this be compared with using InfluxDB Cloud? I will say this, EC2 is for longer term storage. InfluxDB is a purpose-built time series database. So oftentimes, customers will send their data to InfluxDB for analysis, and then they send it to AWS EC2 for long-term storage. And you can buy InfluxDB on the cloud marketplaces. So on AWS, Azure, and Google marketplaces. Grant, is there anything else that you’d like to add?
Grant Pinkos: 00:46:27.188 Yeah. In the very beginning, we did use EC2 and put our Influx instance on there. And it worked fine. I’m not a huge fan of Amazon products because I find them a little bit difficult to administer from security rules and how you can get data in and out of there. Maybe it’s gotten better. This was three or four years ago. So I just prefer the on-prem version of Influx rather than hosting the OSS version on EC2, so.
Caitlin Croft: 00:47:03.064 Hi, Grant. Ben from Opto 22 here. Great presentation. In day-to-day use, how does your staff manage looking at three different dashboards that you have? Are there any thoughts or plans to consolidate into just one?
Grant Pinkos: 00:47:19.354 So yeah. Good question, Ben. So every furnace or every piece of equipment in the building has its own, let’s say, furnace-specific dashboard. So this is a 24-inch monitor that you can stand out at the furnace and see, okay, here’s the feed rate, here’s the temperature, here’s the quench, here’s the gas, whatever. And so you’re looking at things on a time graph of just that furnace. Now, we have seven furnaces, right, or seven lines. So to consolidate that, we put it all on a 55-inch board where it just shows what we consider the most critical parameters of a given furnace. And we actually lay it out in Grafana so that it mimics the floor layout. So it looks like a map of our building, if you will, but it’s got blocks that change color or the values in the box. It’s a bunch of stat panels basically in Grafana. So that’s the one where — kind of the main shift supervisor task that someone can stand out and look at and get a good overview of the entire plant. Everything should be green, but if something turns yellow or red, they know that they have to go over there and do that. And we have alerts set up to do that as well. So even if he’s not standing there, he’ll hear over the speaker, “Generator 303, temperature alarm.” And by the way, Ben, that was you that helped me do that. It was a text to speech node. So when we get an alert, it doesn’t just fire off a signal. It goes — it actually has a voice over the PA system that says what the alert is so that you know exactly where to go and what to look for. And you can do it in multiple languages and everything. So just having a text to speech node on your alerts is great.
Caitlin Croft: 00:49:14.470 How is the predictive model based maintenance done? What sort of models are you using?
Grant Pinkos: 00:49:22.111 Yeah, that’s a good question. So we’re still in our infancy with this. What we’re doing is we’re looking at parts inside of the furnace that get degraded over, let’s just say, weeks or months. And what we’re trying to do — because it’s inside the furnace, you can’t see it. You just want to know when to pull it out and change it out. So we’re just now at the point where we’re gathering our data to look at something after 35 days or 65 days or 80 days or whatever, to assess the condition. And if we look at it, let’s say it’s a piece of alloy pipe or something that sends the gas, and after 65 days, it’s 50% degraded or something like that, then we can put the parameter of, okay, this many days with this much gas at this temperature led to that condition. So that’s the best way I can think of that we can move to a predictive model. But I think as time goes on, as we get more and more data in the system, we should be able to develop more sophisticated models. But we’re still in our infancy stages there.
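The simplest model consistent with the alloy pipe example is a straight-line extrapolation from one inspection. The sketch below assumes wear is linear in days of service, which is a strong simplification: the answer itself points out that gas volume and temperature also matter. The function name and the 80% replacement limit are hypothetical; the 65 days and 50% figures are the ones from the example.

```python
def days_until_limit(days_in_service: float, degraded_pct: float, limit_pct: float) -> float:
    """Remaining days before a part reaches its replacement limit.

    Assumes degradation grows linearly with days in service, based on a
    single inspection. Returns 0.0 if the limit has already been reached.
    """
    if days_in_service <= 0 or degraded_pct <= 0:
        raise ValueError("need a positive service time and a measurable amount of wear")
    # Linear model: degraded_pct / days_in_service percent of wear per day.
    total_days_to_limit = limit_pct * days_in_service / degraded_pct
    return max(0.0, total_days_to_limit - days_in_service)


# Inspection from the example: 50% degraded after 65 days.
# The 80% replacement limit is an assumed figure for illustration.
remaining = days_until_limit(days_in_service=65, degraded_pct=50, limit_pct=80)
print(remaining)
```

With more inspections at different day counts, gas flows, and temperatures, the single ratio could be replaced by a fitted model, which is the direction the answer describes.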
Caitlin Croft: 00:50:31.111 Have you considered doing any automations based on the data?
Grant Pinkos: 00:50:37.101 Any automations. Oh, yeah, like if we were — like if the pre-wash stopped or something. So that’s where all of the Opto 22 stuff comes in. So because Opto 22’s devices will be monitoring, let’s just say, the movement of the retort. When the retort stops, if it stopped for any reason, you don’t want to keep feeding parts in, right? So it would send a signal right there to stop the feeder immediately. So even if an operator pressed pause or something on the retort, but forgot to stop the loading system from feeding parts in, it would be intelligent enough to do that. So yeah, we’ve developed that level of automation. But there’s probably many more ideas down the road that we can pursue.
Caitlin Croft: 00:51:27.967 What are the main advantages InfluxDB has over Postgres + TimescaleDB? I will say this—
Grant Pinkos: 00:51:36.669 Yeah. Yeah, go ahead. Go ahead.
Caitlin Croft: 00:51:37.590 Oh, before you jump in Grant, I was just going to say InfluxDB is purpose-built for time series data. We don’t have any external dependencies. So we’re just a lot faster for collecting timestamp data, which is important because often, when you start off with time series data, you’re collecting data at maybe even the nanosecond precision because you don’t know where those interesting data points are. So that’s why we say we’re purpose-built because we’re not built on top of anything, like Timescale is built on top of Postgres. And Grant, for you as a community member, what have you seen as the advantages?
Grant Pinkos: 00:52:14.740 Yeah. So, well, prior to even knowing about Influx in the 2015 era, we were using, just for a period of time there, Microsoft SQL as a time series database, I guess, just pumping our data in there. It was not the right tool for the job, I think. That’s why when I finally discovered what a time series database was, I said, “This is the magic. This is where we can store millions or even billions of data points all with a unique timestamp and be able to query data in and out, so.” So how it compares with Postgres, I can’t say because I have limited experience with Postgres. And same with the other one that was mentioned, Timescale Database. I don’t have any experience with that one either.
Caitlin Croft: 00:53:09.554 Can you explain the — sort of on the same line. But can you explain the benefit of using InfluxDB over a relational or document database?
Grant Pinkos: 00:53:19.250 So yeah, that’s kind of my example of what we use with Microsoft SQL. So we have a lot of experience with SQL because we use it in our business for customers, part numbers, work orders, that sort of thing. So we knew the way it works with a relational database. But when it came to pumping in process data, it seems — and there’s experts on this that can answer it much better than I can. But basically, a relational database is structured in the way the engines work and whatnot. I don’t think it lends the same fast querying ability and things like that. And especially when you get into tags and cardinality and stuff like that. I think that it just became obvious. Honestly, it was so long ago, I don’t remember the specifics. But I was like, okay, this is not the right tool for the job, so.
Caitlin Croft: 00:54:18.605 Yeah, it’s interesting. I’ll be honest, you can throw timestamp data, time series data, at a relational database, at a document database, but it just won’t handle it as well. So as I mentioned previously, a lot of times when you start off with a time series project, you’re dealing with a ton of data. You don’t know where the interesting points are. If you’re looking at an industrial machine, you don’t know if you need the data every second, every five seconds, every nanosecond. And so oftentimes, people will collect it at the nanosecond precision at that kind of granularity to begin with. And then over time, they realize that the interesting points, maybe they’re only every five seconds. So maybe you don’t need it at such granular level. So then that’s where things like downsampling in Flux come in really handy. But that’s why you would want something that’s purpose built for time series data to be able to handle the really high ingestion. We do have a lot of people who either try to build their own database or use an existing tool to store the time series data. And it just gets unwieldy pretty fast. So that’s why they turned to us. Let’s see, I’m trying to understand the functionality difference between InfluxDB and Grafana. I think Grafana is for visualization. But you mentioned dashboards in InfluxDB as well. Grant, do you want me to answer that, or do you want to take it?
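Downsampling, as mentioned above, means replacing many fine-grained points with one aggregate per time window. In Flux this is typically done with `aggregateWindow`; the sketch below shows the same idea in plain Python so the mechanics are visible. The function name and the sample data are hypothetical, and timestamps are assumed to be whole seconds.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def downsample_mean(points: Iterable[Tuple[int, float]], window_s: int) -> List[Tuple[int, float]]:
    """Reduce (timestamp_seconds, value) points to one mean value per window.

    Each output timestamp is the start of its window.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp, value in points:
        window_start = timestamp - timestamp % window_s
        buckets[window_start].append(value)
    return [
        (window_start, sum(values) / len(values))
        for window_start, values in sorted(buckets.items())
    ]


# Hypothetical one-second readings reduced to five-second means.
raw = [(0, 1.0), (1, 3.0), (5, 10.0), (9, 20.0)]
print(downsample_mean(raw, window_s=5))  # [(0, 2.0), (5, 15.0)]
```

The mean is only one choice of aggregate; for alarm-style data a maximum or last value per window may preserve the interesting points better.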
Grant Pinkos: 00:55:49.364 I’ll try, and then you can correct me if I say something wrong. So yeah, Influx does have their own dashboarding tools. They’re, like I said, easy to set up and use. There’s a bunch of pre-built ones that may or may not be useful. But Grafana is purely visualization. No data storage whatsoever. It can be piped in with 16 different types of data sources and cross-compared. So you can have Microsoft SQL, Graphite, Prometheus, and Influx all displaying on one Grafana graph if you wanted to, and doing math and stuff. So yeah, Grafana, purely visualization, and Influx, for us, at least, mainly for storage.
Caitlin Croft: 00:56:35.752 Yeah. Absolutely. There is a visualization capability in InfluxDB. That being said, with the launch of InfluxDB Cloud powered by IOx, we are kind of simplifying the visualization capabilities and really making sure that the database itself is super robust, and encouraging people to use things like Grafana and Apache Superset for visualization. Or even, as I said before, creating your own custom UI. All right. We’re completely over time. Grant, do you have a few more minutes?
Grant Pinkos: 00:57:13.837 Yeah, I can keep going. Yeah.
Caitlin Croft: 00:57:15.637 Okay. Because I don’t know if you noticed, but there is a bunch of questions here. So there are a few questions around how you handle security and also how you manage security in Node-RED, MQTT, and InfluxDB.
Grant Pinkos: 00:57:32.183 Yeah. So with Node-RED, if you just go to their forum, there’s a whole bunch of steps that one needs to take to protect Node-RED from the outside world, from hacking and whatnot. And I just installed Node-RED on a device, I think, last weekend. And it’s a different installer now, where it asks you a whole series of questions about security. On the Opto 22 stuff, where Node-RED is running, it’s always bolted down. It’s protected. I’ve never doubted the security there. It’s only when I’m running Node-RED on another device that I would care about that. And then as far as our Influx and Grafana, it’s on-prem, but we do have redundant databases set up so that it’s sending data to two different machines. Or a cron script is backing up the data every 15 minutes or moving it to an S3 storage bucket or just different things like that so that we’re not caught in a vulnerable position.
Caitlin Croft: 00:58:38.356 Perfect. All right. Apologies. I’m just kind of going through the questions here. So how are you optimizing your time series cardinality?
Grant Pinkos: 00:58:50.852 Yeah. So, for us, it goes back to that original planning exercise, where when we came up with the number of possible tags and tag names, that’s where we said, okay, are we at 20 tags, 30 tags, 50 tags, 120, 1000 tags? And at the time, Influx put out pretty good guidance about the limited nature of the cardinality and stuff like that. Now, I know with IOx, it’s changing. So, for us, it’s never been an issue, meaning the way we came up with our tag names and the number of series and everything like that, it was a good planning. Honestly, it probably was two or three months of just planning on paper. Me and others writing out on a whiteboard, okay, how do we want to plan this data and how do we want it to be organized? And kind of doing it that way, so.
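The planning exercise described here comes down to counting series: roughly, each unique combination of measurement and tag values is one series, so tags with many distinct values multiply quickly. The sketch below shows that count for a made-up tag layout. The tag names, values, and function names are all hypothetical, and field keys are ignored for simplicity.

```python
from math import prod
from typing import Iterable, Mapping, Set, Tuple


def worst_case_series(tag_values: Mapping[str, Set[str]]) -> int:
    """Upper bound on series for one measurement: every tag value combined with every other."""
    return prod(len(values) for values in tag_values.values())


def observed_series(points: Iterable[Tuple[str, Mapping[str, str]]]) -> int:
    """Number of distinct (measurement, tag set) combinations actually written."""
    return len({(measurement, tuple(sorted(tags.items()))) for measurement, tags in points})


# Hypothetical layout: seven furnace lines, three zones per line.
layout = {
    "line": {f"furnace{i}" for i in range(1, 8)},
    "zone": {"entry", "mid", "exit"},
}
print(worst_case_series(layout))  # 7 * 3 = 21

written = [
    ("temperature", {"line": "furnace1", "zone": "entry"}),
    ("temperature", {"line": "furnace1", "zone": "entry"}),  # same series, new point
    ("temperature", {"line": "furnace2", "zone": "exit"}),
]
print(observed_series(written))  # 2
```

Running the worst-case count on paper before writing any data is a cheap way to spot a tag, such as a work order number, that would grow without bound.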
Caitlin Croft: 00:59:46.613 Yeah. And if someone’s concerned about cardinality, definitely have a look at InfluxDB Cloud powered by IOx, which Grant just mentioned. One of the biggest features that it brings to our community is unlimited cardinality, which I know is a really exciting thing for our community and what people have been asking for, for a long time. Let’s see. Do you know how many tags you’re monitoring?
Grant Pinkos: 01:00:19.290 I can probably give a good estimate. Probably upwards of 50 or 75 different tags. Something like that.
Caitlin Croft: 01:00:32.220 Just a few.
Grant Pinkos: 01:00:34.052 Yeah. Yeah.
Caitlin Croft: 01:00:37.529 Do you have any redundancy in your InfluxDB setup?
Grant Pinkos: 01:00:43.333 Yeah. So a friend of mine, Toby, I met him, I think, through Opto 22. But he has a similar setup. And what they’ve done is they just have — I think they use Mac Minis or something like that. But they have Influx running on three or four Mac Minis at once. And so it’s a RAID configuration, if you will, sort of. So one can fail and the other three will keep going. We don’t have that yet. Eventually, we’re probably going to have a similar setup. Maybe not using Mac Minis, but some sort of — probably, it’ll adopt the best practices for sending data live to a cloud server and then also an on-prem server that gets backed up, and, again, sent off site and just kind of doing that.
Caitlin Croft: 01:01:35.982 Cool. How many new data points do you have coming in from every sensor? Do you have an idea?
Grant Pinkos: 01:01:44.544 Well, like I said, it’s about 30 data points on a given furnace. And there’s seven lines. So that’s just the heat treating furnaces. That’s over 200. And then five generators. Each one of those has maybe 8 or 10. So you’re pushing already about 3 to 4 hundred sensor readings every minute. Sometimes we query every 10 seconds or every 30 seconds, but usually every minute is enough for what we’re doing. So I don’t know if that answers that question.
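The figures in this answer can be turned into a rough daily volume. The itemized numbers (seven lines at about 30 points, five generators at 8 to 10 points) add up to roughly 250 to 260 readings per interval; the round figure of 300 to 400 is higher, so it may include equipment not itemized here. The helper name below is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def readings_per_day(sensor_count: int, interval_s: int) -> int:
    """Readings written per day when every sensor reports once per interval."""
    return sensor_count * (SECONDS_PER_DAY // interval_s)


furnace_points = 7 * 30          # seven lines, about 30 data points each
generator_points_low = 5 * 8     # five generators, 8 points each
generator_points_high = 5 * 10   # five generators, 10 points each

itemized_low = furnace_points + generator_points_low    # 250
itemized_high = furnace_points + generator_points_high  # 260

# Daily volume at the sampling intervals mentioned: every minute, 30 s, 10 s.
for interval_s in (60, 30, 10):
    print(interval_s, readings_per_day(itemized_high, interval_s))
```

At one reading per minute, even the 400-point upper estimate stays under 600,000 readings a day.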
Caitlin Croft: 01:02:17.069 Cool. Let’s see. What quality of service do you use for MQTT publish and subscription?
Grant Pinkos: 01:02:26.877 Was the question what quality of service? Is that what they were asking?
Caitlin Croft: 01:02:29.242 Quality of service.
Grant Pinkos: 01:02:29.736 Oh, quality of service. Yeah. Yeah. So I forget the numbers. It’s a zero, a one, or a two. And Terry explains this really well in one of his videos. But we use the level two, which has the — I think it’s called The Last Will and Testament, or something, of the message. So the message goes across, and then it comes back, and the other devices — yes, it was delivered or something like that.
Caitlin Croft: 01:02:54.675 Were there any instances where you could have avoided a major breakdown or bad quality product using the data captured?
Grant Pinkos: 01:03:03.501 Yes. So it wasn’t so much the using the data captured. It was knowing when the data was bad. So first of all, we have windstorms and snowstorms and ice storms and things like that that will knock out power. And the building is running 24/7, right? So when power goes out, there’s all sorts of safety switches and things like that. They latch closed. And the only way they can be opened is with an operator. You can’t operate it from a software perspective and say, open the switch or whatever. So there’s a fire code or whatever out there that requires an operator to manually change something over. So the thing is that when this happens and you don’t have anybody in the building, if it did happen while you were closed, you only have maybe 25 or 30 minutes to react, to keep whatever equipment it was that was running or something like that. So for those reasons, yeah, we’ve been able to react very swiftly to an event that was otherwise undetectable, unless you add some other security system in the plant to detect when power was being lost. But we get alerts from Influx or Grafana that show us the minute the data stops flowing. And in the case of a whole furnace, it could be something really important.
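Alerting when data stops flowing is often called a deadman check: compare the time of the last point received from each source against the current time, and flag anything that has been silent too long. The sketch below shows the comparison itself in plain Python. It is not how AMP's Influx or Grafana alerts are configured; the source names, the two-minute limit, and the function name are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Mapping


def silent_sources(
    last_seen: Mapping[str, datetime], now: datetime, max_age: timedelta
) -> List[str]:
    """Names of sources whose most recent data point is older than max_age."""
    return sorted(name for name, seen in last_seen.items() if now - seen > max_age)


# Hypothetical state: furnace2 last reported five minutes ago.
now = datetime(2023, 3, 28, 8, 0, tzinfo=timezone.utc)
last_seen = {
    "furnace1": now - timedelta(seconds=30),
    "furnace2": now - timedelta(minutes=5),
}
print(silent_sources(last_seen, now, max_age=timedelta(minutes=2)))  # ['furnace2']
```

With readings arriving every minute and only 25 or 30 minutes to react, a limit of a few missed intervals leaves most of that window for the operator.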
Caitlin Croft: 01:04:32.683 So there’s a couple of people asking about the recording for this. It is being recorded. And you basically just have to go to the webinar registration page tomorrow morning, and you’ll be able to watch the replay. What are you using for alerting? Are you using Kapacitor, or are you using something else?
Grant Pinkos: 01:04:53.205 No. Well, Grafana sends the notification to Slack. We’ve used email in the past, but once we moved to Slack, it just became much better. There’s a whole thread in the Grafana world on how to customize your Slack alerts. So we can put in links that allow you to silence the alert. Because it’s a Slack alert, you can all contribute or chime in, let’s say, “This was a false alarm,” or, “Jim has taken care of this,” or whatever. So there’s all sorts of benefits, I think, to using Slack.
Caitlin Croft: 01:05:30.205 Awesome. Thank you, Grant. So I apologize we didn’t make it quite through everyone’s questions. I just want to be super cognizant of time. If anyone has any questions that they really want to ask Grant, everyone should have my email. I’m happy to put you in contact with him. Thank you, everyone, for joining today’s session. I think it was clearly very valuable. Grant, I hope we didn’t tire you out with all those questions.
Grant Pinkos: 01:05:56.048 No, it’s good. Thank you. Thank you.
Caitlin Croft: 01:05:58.159 Thank you. And thank you, everyone, for joining today’s webinar. Once again, it will be available for replay tomorrow morning. And if you’re interested to learn more about Flight SQL, be sure to check out our community office hours. I put the YouTube link in the chat. So our amazing DevRels will be on there. And holler if you guys have any questions. If you need any help with InfluxDB, I’m always happy to put you in contact with someone on our team or if you want to go bug Grant with more questions. I know he’s in our Slack, and also I can put you in contact by email. Thank you, everyone, and I hope you have a great day.
Grant Pinkos: 01:06:45.720 Thanks, Caitlin.
Caitlin Croft: 01:06:47.158 Thanks. Bye.
Grant Pinkos: 01:06:47.925 All right. Bye.
President, American Metal Processing
Grant is the President of American Metal Processing located near Detroit, Michigan. He is a chemical engineer and MBA by trade and enjoys IIoT and Industry 4.0. He has held roles in process engineering, finance, and management for the past 25 years. As a firm believer in the saying "If you can't measure it, you can't manage it", he became interested in Time Series databases and found InfluxDB in 2020.