How Factry.IO Built a Data Historian that Changed a Water Treatment Company's Business Model
Session date: Sep 08, 2020 08:00am (Pacific Time)
The Factry.IO Data Historian uses a number of industry-specific protocols and systems, such as OPC-UA, to collect data from industrial control systems and store it in InfluxDB. This allows Factry to process time series data from various sources and turn it into valuable insights for their process industry customers.
In this webinar, Frederik Van Leeckwyck, Co-Founder and Business Development Manager at Factry.IO, will share how these insights were made accessible to all staff members and clients at a company that markets water treatment installations, replacing a spreadsheet-based approach. The company took this a step further and now utilizes the collected data to underpin its transition to a “Water as a Service” business model.
Watch the Webinar
Watch the webinar “How Factry.IO Built a Data Historian that Changed a Water Treatment Company’s Business Model” by filling out the form and clicking on the Watch Webinar button on the right. This will open the recording.
[et_pb_toggle _builder_version=”3.17.6” title=”Transcript” title_font_size=”26” border_width_all=”0px” border_width_bottom=”1px” module_class=”transcript-toggle” closed_toggle_background_color=”rgba(255,255,255,0)”]
Here is an unedited transcript of the webinar “How Factry.IO Built a Data Historian that Changed a Water Treatment Company’s Business Model”. This is provided for those who prefer to read rather than watch the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
- Frederik Van Leeckwyck: Co-Founder and Business Development Manager, Factry.IO
- Chris Churilo: Director Product Marketing, InfluxData
Frederik Van Leeckwyck: 00:00:06.176 All right. Chris, thank you very much. Thanks for the intro. Thanks for the opportunity for us to present. And today, I’d like to take you all through a bit of a story with Factry Historian: how we helped change Ekopak’s business model. This is a bit of a busy slide if you look at it, but in the background, you see one of the Ekopak water treatment installations in construction, so you’ll get to know that during this presentation. A bit about me: I’m the co-founder of Factry together with two of my colleagues. And at the company, I’m responsible for everything related to business development. I have a degree in bioscience engineering, which doesn’t have a lot to do with IT, but still, it’s very useful, definitely, in this context. And for Ekopak, I was initially responsible for the onboarding. So, I helped them with the initial value discovery and value creation in their context using the technologies that we’ll be talking about today. And Ekopak is actually the first client that we helped achieve things with data based on Flux. So already in May 2019, we were experimenting with the language to get useful insights out of their data. And I’ll be sharing those as well today. And when I’m not working, I enjoy spending time outdoors, as you can see in that picture.
Frederik Van Leeckwyck: 00:01:53.591 So the table of contents for today looks like this. We’ll start off with an introduction about Ekopak; what do they do and why do they do what they do. Then I’m going to introduce you to Factry as a company, and then we’ll dive into, first of all, the situation. So, what’s the business context in which Ekopak is operating, what challenges do they face, and why do they face these challenges now. And then the main part will be where we go through iterative value creation together with them based on the data we started collecting with Factry Historian and InfluxDB. And we’ll close off everything with some learnings. So first off, we’ll start with Ekopak. So, their mission is to make industrial water management more sustainable while drastically lowering production costs. I think we’ve all heard news somewhere that water is becoming a scarce resource, and definitely, in the industrial context, you can expect that a lot of water is used. So, companies like Ekopak help make this use of water more sustainable and cheaper as well. And as you can see, this is about industrial water management. So, we’re really talking about an industrial setting, not the municipal water purification we have, for example. This is really about industrial water management, and I’ll explain further what that means.
Frederik Van Leeckwyck: 00:03:33.337 So in their case, Ekopak more or less markets three things: chemicals, disinfection equipment, and processing equipment, which is underlined there. And this equipment is typically built and shipped in these containers, as you can see on the right. So, these containers contain the water treatment equipment, which is finally sent off to clients, and they’re based in Tielt in West Flanders in Belgium. So why do you need water treatment? You might know that water can have different properties, and in order to use it in an industrial process, you need to tune its characteristics for it to be usable. A classic example is, for example, on the left you see cooling and boiling. You might recognize this from when you’re boiling water at home; for example, in a kettle or a coffee machine, you need to descale these things once in a while. That’s because there are minerals in the water. And a company like Ekopak, for example, helps industrial companies to get those minerals out of the water.
Frederik Van Leeckwyck: 00:04:47.533 In the middle, you see that Ekopak helps to tune the characteristics of process water. So process water is water that is used in industrial processes for mass or heat transfer. The heat transfer could be steam or cooling water, and in mass transfer, the water is being used as a solvent. So ideally, you’re going to have to tune the characteristics of the water for it to not cause scaling, for it to be a good solvent for a certain chemical you’re using in the process, or it could even become an ingredient if you’re talking about breweries, for example, where the water quality is of big importance. And lastly, they also help with ultra-filtration equipment that even allows you to remove bacteria and viruses from the water. So, what does the company do then? And this is another picture of one of these containers with the water treatment equipment built in. Ekopak does the design of these water treatment installations. They build and assemble them in the workshop and finally put them in these containers, as you can see. In the end, when they are installed at the client, Ekopak also operates them. And that’s of course where the whole data story starts coming in. And you see this is a container that’s being dropped off at a company called Delta. They are a frozen vegetable producer. So, they’ll definitely have a need for certain water specifications, and Ekopak meets that with this container.
Frederik Van Leeckwyck: 00:06:34.343 And we’ll see later on in this presentation that Ekopak is also looking into financing, and we’ll see more about that later when we talk about the business model. So that’s the part about Ekopak; now a little bit about Factry. So, what we do in a nutshell, without a lot of marketing stuff: what we do, simply, is data collection and data integration for the process industry. The process industry can be characterized as the industry that has long-running processes. So, this is not discrete manufacturing. These are processes that keep on going, typically processing liquids, for example, or producing energy, those kinds of things. And in that area, we have three products: Factry Historian, which we’ll talk about today, for data collection from the industrial control system. OEE and FactryOS are used in production companies where we’re going to measure the overall equipment effectiveness or where we’re going to help the operators digitize their whole production process. And we’re active in steel, food and beverage, renewable energy, water treatment, as we’re going to talk about today, and textile. And you see this move to the tools that we’re talking about today; sometimes this is called the digital factory or Industry 4.0, these kinds of things.
Frederik Van Leeckwyck: 00:08:07.131 In the end, what’s happening is that the OT world, so the classic world of operational technology, of the automation world, is moving ever closer to and overlapping in part with the IT world, where all the big innovations happen. So, the digital factory sits more or less at the overlap between those two worlds. And we always advise there to extract the process data from the industrial control systems as soon as possible, because it’s on the IT level that it is very easy, very fluid, once the data is there, to get the insights, to get to know what should be changed and where things can be improved. So, the digital factory is about bridging the gap between IT and OT. And to give you a little bit of structure on how the data collection works in Factry Historian, I wanted to share with you this slide. So, what we’re doing is we’re going to collect the data from production equipment. We’re going to store it in a time series database, which is InfluxDB in our case; that’s the option that we use. We are going to use Grafana for visualization. And more or less, these components always keep coming back. On the bottom, you see your industrial equipment: SCADA systems, PLCs, what you would encounter in any process industry that is controlling things automatically. We’re going to talk with these controllers, with these PLCs, for example, using industry protocols such as OPC-UA, with collectors. So, these collectors request data from your OPC-UA servers, for example, from the PLCs. Once we get it, we forward it to a backend. This backend will do validation of the data, potentially add some metadata to it, and finally store it into the time series database.
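The collector-to-backend flow described above can be sketched in a few lines of Python. This is an illustrative assumption, not Factry’s actual code: the function names and the tag set are made up, but the final output is genuine InfluxDB line protocol, which is how points are ultimately written to the database.

```python
def validate_point(measurement, value):
    """Basic validation a backend might do before storing a point (illustrative)."""
    if not measurement:
        raise ValueError("measurement name is required")
    if not isinstance(value, (int, float)):
        raise TypeError("only numeric process values are stored here")
    return True

def to_line_protocol(measurement, tags, value, timestamp_ns):
    """Format one point as InfluxDB line protocol: name,tag=... field timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{measurement},{tag_str} value={value} {timestamp_ns}"

# A collector forwards a reading; the backend validates, adds metadata, stores it.
reading = ("RO1-TT01.1", 21.5)          # temperature transmitter on RO unit 1
ts = 1599552000_000_000_000             # nanoseconds since epoch
validate_point(reading[0], reading[1])
line = to_line_protocol(reading[0], {"site": "ekopak", "unit": "RO1"}, reading[1], ts)
print(line)
```

The metadata added by the backend ends up as tags on the point, which is what later makes per-installation filtering and templating cheap.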
Frederik Van Leeckwyck: 00:10:10.998 And then there’s an administration interface for the users in the plant, for example, or for Ekopak, in this case, to administer their collectors and administer their [inaudible]. But you see this data flow; it’s all coming down, or all the data is being stored in InfluxDB in our case. And here you see a couple of these components we’ve seen earlier on. We have the collectors. We have the backend. We have the database, and the whole setup really varies according to the use case. So, if you’re talking about production sites or factories, most often you see that we are talking about data collection speeds of about one value per second, at one hertz, or slower. We’re talking, for a production site, for example, about a couple of thousand measurements, where we would be measuring everything like pressures, temperatures, these kinds of things. In a lot of cases, at least in our case, this is happening on-premise or in the cloud. And earlier, I gave a webinar about how the energy company ANS Energy does that. On the right, you see remote assets. Ekopak is a clear example of remote assets, right. So these containers are shipped to their clients, and they are more or less everywhere, right. Also there, this is still like a mini factory in this container. So we’re going to be collecting data at about one-hertz resolution or sometimes slower, a hundred to a couple of thousand measurements. Again, we’re talking about temperatures, pressures here, etc. And here it makes sense to store data in the cloud.
Frederik Van Leeckwyck: 00:11:56.968 And we have a couple of other cases as well, [inaudible] really quickly, for example, up to 100 hertz, so every 10 milliseconds we get a data point for an indication of torque, for example, and there it would make sense to store the data at the edge. But today, we’re talking about the middle column: remote assets in the process industry, so we’ll be storing our data in the cloud. We have this data. What are we going to do with it? And you see kind of like a timeline here. Typically, in the beginning, the first piece of value that companies in the process industry get from their data is that they’re able to look back, and that’s typically the first thing they do. You have the process engineers looking back: something went wrong, I have the data, I can look at the problem, I can see what went wrong. Then over time, they start evolving to near real-time decision making based on the incoming data. It’s not real-time; that’s what SCADA systems are for. But it becomes close to real-time. And once you’re collecting a certain amount of data and it’s all structured, etc., you can start using it as a basis to predict, and there is a small example of that as well in this case.
Frederik Van Leeckwyck: 00:13:14.878 And then maybe finally, we’re looking for good software engineers to join the team. So if after this presentation you feel like joining, don’t hesitate to reach out. So this is Ekopak’s assembly hall. What you see on the right there is a water treatment installation that’s being constructed. So you see these two blue pumps, and behind that, you see a couple of horizontal, white tubes, which are filters. And these do the actual filtration of the water in order to tune the characteristics for the industrial process at their client. So I’ve already given you this introduction about what they do and how they work: how they design and build, as you’ve seen from the photo, and then operate their installations. And these are remote installations. So what’s happening right now is, first of all, the company is growing, and they’re getting more and more installations. Secondly, their installation size is ever increasing, so they’re getting larger customers, and they have to ensure that there’s water for larger production sites. So that means the complexity of this equipment increases, and therefore, you need more data and a good structure to monitor all of this.
Frederik Van Leeckwyck: 00:14:46.661 Furthermore, as they start having more of these installations in the field, after a while, of course, some things might go wrong. So you’ll always have maintenance on these things, and it becomes ever more valuable to know in advance when a certain piece of equipment will require maintenance rather than being reactive. So because they have all of this data available from all of their installations, they can more easily and better plan their maintenance people so that installations keep running for a longer period of time. So these installations are scattered all over, so remote follow-up is needed, and it helps the process engineers with root cause analysis if something went wrong. And towards the end of this presentation, I’ll show you why, or at least how, they used the collected data to make a change in business model.
Frederik Van Leeckwyck: 00:15:46.489 So what were they doing before we arrived at Ekopak? They were doing remote monitoring already with so-called eWON systems, as you can see on this graph here. So that’s an eWON module. It’s a hardware module that’s plugged into the electrical cabinet. It can talk to Siemens PLCs, for example, and it has some basic level of data logging capability in there. It has a 4G connection, and it opens up the possibility to have a VPN connection to each device. And if you were to look into a certain problem, for example, you would open a VPN, take control of the HMI, so the screen that is controlling the water treatment unit, and use the trending tool there. But you have to do that for every installation separately. So if you get more and more installations, this becomes ever more troublesome, ever more time-consuming, to go into each individual machine separately. This also has the capability to do SMS alerting, so you can put a couple of alarms on there, and because it has a SIM card, you can get some alert functionality in there.
Frederik Van Leeckwyck: 00:17:03.327 So they were already collecting some data centrally in a MySQL database with this eWON module. So they were pushing CSV files over HTTP, or using eWON’s own service called Talk2M, or the DataMailbox, where they had a little bit of data already, but not really at a high resolution. It was just minute-level resolution or even coarser, and it was stored for about 10 days. You could download it, and once it was stored in the MySQL database, it was stored permanently, but at the headquarters of Ekopak. Once it was there in the MySQL database, they were using Excel to pick a time range, start time, end time, and load a couple of measurements into the Excel file, and then you can start playing around, making your graphs, etc. So that was kind of the workflow that Ekopak was using before we set up this system. Now, the problem with a spreadsheet-based approach is that it is really slow to get the insights that you need. And if it’s too slow, then in a lot of cases it doesn’t happen, or it only happens for the most pressing issues. And if we manage to completely remove the use of Excel, then I think we did a really good job.
Frederik Van Leeckwyck: 00:18:44.089 So Excel, we think it’s rather slow for doing your work, for finding the problems that you are trying to solve. Also, as soon as you load data into Excel, it’s already stale. It’s a snapshot. You’ve taken a snapshot into your spreadsheet, and then you start working with it. But as soon as you’ve taken the snapshot, it’s not up to date anymore. So this is a cumbersome way of working. And finally, we think it’s too flexible. It’s so flexible that you can do whatever you want with it. But after a while, these Excel files start living a life of their own in the organization. And that’s actually not what you want. We want to move towards a single source of truth where all the data is readily available. And as I mentioned before, we consider it a success if we don’t see Excel files anymore. If the data is so easily accessible, so ready at your fingertips, that Excel is not needed anymore. And, hopefully, again, during this presentation, we’ll show how this came about.
Frederik Van Leeckwyck: 00:19:48.415 So let’s get to work, and let’s do so iteratively. So the first step that we did was setting up this Factry Historian system with InfluxDB. And in order to not shake things around too much, we started by importing the existing data. So all the data that was already available from the Talk2M DataMailbox from these eWON modules, we retrieved all that into InfluxDB and then showed how you can visualize all of that in Grafana rather than going to Excel to visualize or to get the data that you want. So we didn’t really change a lot in the data pipeline. But now at least the data is stored in InfluxDB and can very easily be visualized with Grafana, using a database, or in InfluxDB 2.0 a bucket, per Ekopak installation in our case. So in this setup, and we’ve seen this figure before, that would look a little bit like this. So we’ve written the collectors: rather than talking to the PLC directly, we would install a collector that talks to the eWON DataMailbox. To do so, we built an eWON collector. It uses the client API for eWON, and it’s open source at the link below.
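The historical-import step above can be sketched as follows. The CSV layout is purely an assumption for illustration (the real DataMailbox export format may differ), and the bucket name is made up; the point is the shape of the transformation: exported rows in, one (bucket, line-protocol point) pair out per row.

```python
import csv, io

# A DataMailbox-style CSV export (format assumed for illustration):
# timestamp in unix seconds, tag name, value.
raw = """ts,tag,value
1557100800,RO1-PT01,3.02
1557100860,RO1-PT01,3.05
1557100800,RO1-TT01,18.4
"""

def csv_to_lines(csv_text, bucket):
    """Convert exported rows into (bucket, line protocol) pairs for import."""
    out = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts_ns = int(row["ts"]) * 1_000_000_000   # InfluxDB timestamps are in ns
        out.append((bucket, f'{row["tag"]} value={row["value"]} {ts_ns}'))
    return out

# One database (1.x) or bucket (2.0) per installation keeps clients separated.
lines = csv_to_lines(raw, bucket="ekopak_delta")
print(lines[0])
```

Keeping one bucket per installation is what later allows per-client Grafana organizations to see only their own data.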
Frederik Van Leeckwyck: 00:21:18.987 So after completing the first step, this is a bit of what you get. They were able to see the same data that they could see before, but whereas they were doing it in Excel, right now it’s very, very easily accessible in the web-based interface of Grafana, for example. So what did we win by helping them with this move? Rather than having the data in Excel, right now the data is very easily available at their fingertips. It’s available in a dashboarding environment. This saves them a tremendous amount of time. And because it’s so easy to visualize or to retrieve the data that you would otherwise load into Excel, you actually do these things. You actually go and investigate. You become curious, and it becomes the go-to tool for people to get new insights about what’s happening with their installations in the field.
Frederik Van Leeckwyck: 00:22:18.011 Now, the second step. What we did so far didn’t really add new functionality; we just made it easier to do what they were doing before. In the second step, the second iteration, we started adding new installations to the Historian system. So we’re going to start logging data from new installations and mostly, as I mentioned before, typically from bigger installations. So, these newer installations are shipped with newer PLCs, and more and more, these newer PLCs have OPC-UA server capability on board, like the Siemens S7-1200 or 1500 PLCs. You can just activate the OPC-UA server and then use OPC-UA to get the data out of these controllers. So, this presents us with a tremendous opportunity, because once you’ve switched the protocol, things become a lot easier to manage. You’re not reliant on this eWON DataMailbox anymore. You can just go to the PLC directly. You can get much higher resolution data. Of course, you should always check whether that’s relevant. It doesn’t make sense to log the temperature of a big vessel of water every second. The temperature is not going to change every second, so it doesn’t make sense to log it at that resolution. But still, we have the possibility to go for really high-resolution data. And we’ll see, with alarms and events for example, that this is really crucial.
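One common way to avoid logging a slow-moving value at full resolution is deadband filtering: only keep a sample when it has moved meaningfully since the last kept one. This is a generic sketch of that idea, not necessarily how Factry Historian implements it; the threshold and sample data are invented.

```python
def deadband_filter(samples, threshold):
    """Keep a sample only when it moved at least `threshold` from the last kept one."""
    kept = []
    last = None
    for ts, value in samples:
        if last is None or abs(value - last) >= threshold:
            kept.append((ts, value))
            last = value
    return kept

# One-hertz temperature readings of a large water vessel: barely changing.
samples = [(0, 20.00), (1, 20.01), (2, 20.02), (3, 20.30), (4, 20.31), (5, 20.61)]
print(deadband_filter(samples, threshold=0.25))
```

The six raw samples collapse to three stored points while every change larger than the deadband is preserved.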
Frederik Van Leeckwyck: 00:23:54.282 So to speak OPC-UA in the water treatment installation, we preconfigure so-called Revolution Pi devices, which are small industrial-grade Raspberry Pis, on which we install our OPC-UA collector. It’s preconfigured to get its configuration. So, what Ekopak does: they have many of these in their headquarters. If they’re building a new electrical cabinet for a new installation, they just take one of these Revolution Pis, plug it in, and make sure it is network-connected. And as soon as the installation starts up, this is a bit what the data flow looks like. So, the RevPi is plugged into the new installation. The Revolution Pi with the OPC-UA collector always keeps trying to find our backend, which is running in the cloud, over the 4G connection. As soon as it manages to reach the backend, the collector signs up and requests its configuration. So, in the administration interface, you then give the IP address or the hostname of the OPC-UA server, and you load the tags or the measurements that you’d like to retrieve from the device. And a couple of seconds later, the data collection starts, and there’s buffering on the Revolution Pi device.
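The on-device buffering mentioned above matters because the 4G link to the cloud can drop. A minimal sketch of that pattern, under stated assumptions (the class and its names are invented; a real collector would also persist the buffer to disk and bound its retries):

```python
from collections import deque

class BufferedSender:
    """Buffer points locally; flush to the backend in order when the link is up."""
    def __init__(self, send_fn, maxlen=10_000):
        self.buffer = deque(maxlen=maxlen)   # oldest points drop if the buffer fills
        self.send_fn = send_fn               # callable that raises while offline

    def submit(self, point):
        self.buffer.append(point)
        self.flush()

    def flush(self):
        while self.buffer:
            try:
                self.send_fn(self.buffer[0])
            except ConnectionError:
                return                       # backend unreachable; keep buffering
            self.buffer.popleft()            # only drop a point once delivered

# Simulate a 4G link that is down, then comes back.
delivered, link_up = [], False
def send(point):
    if not link_up:
        raise ConnectionError
    delivered.append(point)

s = BufferedSender(send)
s.submit(("RO1-TT01", 18.4))   # link down: point stays buffered
link_up = True
s.submit(("RO1-TT01", 18.5))   # link up: both points flush, oldest first
print(delivered)
```

Sending `buffer[0]` before popping means a point is only discarded after a successful delivery, so a mid-flush outage loses nothing.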
Frederik Van Leeckwyck: 00:25:22.464 So this data flow has now become much simpler. We get more control. It’s at a much higher resolution, and in the cloud, everything more or less stays the same. We’re also storing the data in InfluxDB and still visualizing all of this with Grafana. So that’s why I’m showing you the same screenshot. We get the same type of data, but it is at much higher resolution, and administration becomes a lot easier. So, what did we win here? We already won process data at our fingertips; that’s still the case, but right now we get higher-resolution data. We get more data, a simpler data pipeline, and standardization in naming. So, what kind of problems does this solve right now? Where are we now? I mentioned in the beginning of this presentation that it allows these companies to look back. Process engineers, if something went wrong, want to be able to look back at the data. And in a lot of cases, having high-resolution data at your fingertips is a useful tool there. And we’ve also seen that they are ever more interested in preventive maintenance. So, they’re using this data already as a basis to take preventive action.
Frederik Van Leeckwyck: 00:26:44.252 The next step then, because we have this high-resolution data coming in, is that we can start to use it as a system to act on alarms and events. Because the data is so fresh, so up to date, near real-time, we can act on the alarms in the cloud rather than acting on the alarms on the control level, or in addition to acting on the alarms on the control level. Now, we didn’t do that in the dashboarding environment, because at that time, and I think still now, this was not supported in templated dashboards. And even in these small installations, you can get up to a couple of hundred alerts. And this becomes really difficult to manage, to say, “If I get this alert, then I should send an email out there. And this alert, another email, and this alert, etc., etc.” It becomes really, really difficult to keep that managed. So, we’re going to try to template this. And this was not possible yet in Grafana. So, this is an example of how that looks, how this data flow looks. In this case, we’re going to fire an alarm, or we’re going to treat an alarm, for a temperature that’s, for example, too high. And we have two measurements coming into InfluxDB. We have one measurement that looks like the OPC name there, AL-RO1TT, etc., etc. And then we have the analog value as well.
Frederik Van Leeckwyck: 00:28:20.998 So the measurement we see right here just does one thing. It goes to one, true, or high when the alarm is triggered, and it goes back to low, or false, or zero as soon as the alarm event is over. That’s just 1 or 0: the alarm is triggered or not. Then we have the actual analog value of the temperature: 18 degrees centigrade, 22 degrees centigrade, etc. And it’s this analog value that finally triggered the alarm. So, to make this templatable, we have to have good naming. And you see right here that AL stands for alarm, RO1 stands for reverse osmosis unit one, TT01.1 stands for temperature transmitter 1.1, and HH stands for high-high. So this is a really important alarm. You really need to act on this. So, this alarm is triggered at a certain point in time. It is high, and then at a certain point in time later, the alarm shuts off. So, what do we do? The alarm is high; we extract the name of the analog sensor. We put it into the URL. We include the data source. We include the tag there and the from and to timestamps. This complete URL is then sent in an email or an SMS or direct message. And as soon as you click on that, you go to a dashboard. That’s a simple template that gives you a view of the analog value that triggered the alarm. And just by doing this, you can load, for example, a couple of hundred tags, a couple of hundred measurements that you would like to keep track of in each installation, because the naming follows a certain convention. You have immediately all things in place to act on the alarms and to send emails, SMSs, etc. And as soon as you click on those links, you go to a dashboard that shows you the value that triggered the alarm.
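The naming convention described above is what makes this templatable, and the mechanics are easy to sketch. The parsing follows the convention from the talk (AL, unit, transmitter, severity), but the exact splitting rules, the base URL, and the `var-sensor` query parameter are assumptions for illustration, not Factry’s or Grafana’s actual URL scheme.

```python
def parse_alarm_tag(tag):
    """Split an alarm tag like 'AL-RO1TT01.1-HH' per the convention in the talk."""
    prefix, sensor, severity = tag.split("-")
    assert prefix == "AL"                        # alarm tags start with AL
    unit, transmitter = sensor[:3], sensor[3:]   # e.g. 'RO1' + 'TT01.1'
    return {"unit": unit, "sensor": transmitter, "severity": severity}

def alarm_dashboard_url(tag, frm, to, base="https://grafana.example.com/d/alarm"):
    """Build the templated-dashboard link sent by email/SMS (URL shape assumed)."""
    info = parse_alarm_tag(tag)
    sensor = f'{info["unit"]}{info["sensor"]}'   # the analog value behind the alarm bit
    return f"{base}?var-sensor={sensor}&from={frm}&to={to}"

print(parse_alarm_tag("AL-RO1TT01.1-HH"))
print(alarm_dashboard_url("AL-RO1TT01.1-HH", 1599550000000, 1599553600000))
```

Because every alarm tag encodes the analog sensor it belongs to, one template serves hundreds of alarms: the link carries the sensor name and time window, and the dashboard fills in the rest.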
Frederik Van Leeckwyck: 00:30:26.025 So this is super simple, but it actually saves you a lot of trouble. It allows them to very easily get to the data that’s triggering the alarm. And this happens in a matter of seconds. And that’s why the importance of naming is something I’d like to bring to your attention. If you name your data right, and what we typically propose is to follow a hierarchical structure, for example client_installation_sensorID, kind of from top to bottom, if you do this well and you think about this well in the beginning, there are a lot of benefits there in terms of templating for alarming, for events, for dashboards, you name it. Really, I can’t stress this enough: if you get your naming right, it’s going to pay dividends over the time that you’re using all this data. So here we are acting on alarms and events. So on the timeline, I think right now we’re here. It’s not real time, but we’re very close to real time. Maybe a second after the event, this is triggered. So, what did we win here? With the data coming in, we can trust the data coming in, and we use that to constantly look for alarming events and link to a dashboard. And then, because we have the data there, we can make reports towards the end of the month, for example, for the people that are involved in improving these devices. What is the count of the alarms? How many times did we get this alarm, or what’s the total time that this alarm was triggered? These kinds of things become possible because the data is there.
Frederik Van Leeckwyck: 00:32:14.932 Then in the next step, you often see that at a certain point in time, as I mentioned earlier, this data and these dashboards make people curious. And that’s when new questions start popping up. They ask new questions, want new insights. And in a lot of cases, you’re then going to go further than just raw data. You’re going to start interpreting the raw data. And I believe this screenshot gives you a good example of that. What you see are two graphs that display the pressure across a filter. So, on the first graph, you see the values of a pressure sensor before a filter in their installations, and on the bottom graph, you see the values of a pressure sensor after the filter. And if you look at these graphs, I don’t see anything. I just see somewhere around three, and that’s all I see. But I can’t draw any conclusions from these graphs. Now, this was one of the first use cases where we used Flux for calculations across measurements. And with just this query, if you simply subtract the two pressures from each other, what do you get? You get the pressure drop across the filter. And this graph makes a lot more sense. You can see that over time this pressure drop is increasing, which means that your filter is getting more and more dirty. So that means you need to backflush it, or it needs to be replaced, for example. So while those first two graphs might not really tell you a lot, just by subtracting them, you get the value here. And it’s much easier to do this in the dashboarding environment than configuring all of this on the PLC, we think. You can do all that logic afterwards and just get the raw data out first.
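In the talk this join-and-subtract is done with a Flux query across two measurements; the same idea can be shown in plain Python. The sketch and its pressure values are illustrative, not Ekopak’s real data: align the two series on timestamp, subtract, and the fouling trend appears.

```python
def pressure_drop(before, after):
    """Join two timestamped series on timestamp and subtract: drop = before - after."""
    after_by_ts = dict(after)
    return [(ts, round(p - after_by_ts[ts], 3))
            for ts, p in before if ts in after_by_ts]

# Pressure (bar) before and after the filter; each series alone looks flat, ~3 bar.
before = [(0, 3.05), (60, 3.06), (120, 3.08), (180, 3.10)]
after  = [(0, 2.95), (60, 2.90), (120, 2.85), (180, 2.80)]
print(pressure_drop(before, after))
```

Neither input series shows much on its own, but the derived series rises steadily, which is exactly the "filter is getting dirty, time to backflush" signal described above.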
Frederik Van Leeckwyck: 00:34:26.411 And a second example of that, this case, is a bit more difficult. What you see on the top is just a graph of statuses, so the PLC is in a certain state. Typically, they have status codes. I don’t know if you can see this, but here the status code goes from 0 to 160. But in all [inaudible] cases, the interpreted status is one, and in all the other cases it is zero. And that’s why you get this block graph at the bottom. Now, why do we need that? We need that because sometimes, if you look now at the second part of this case, on the top, you see raw sensor data, and in the middle, you see a lot of spikes. And if you were to compute, for example, “Give me the average temperature across this whole time range,” when your question actually is, “Give me the average temperature when our machine was in production,” you would take this jitter with you, and the average would maybe not be the right average. Or the minimum or maximum, something like that. So you want to filter out the data that is not relevant to you. And we did so before: we said everything with status 5 and 10 means that the machine is in production. So, if I want to calculate the average temperature, for example, when my machine is in production, that’s how you would do that.
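The status-filtered average described above (done with Flux in the talk) can be sketched in plain Python. The status codes 5 and 10 are taken from the talk; the sample values are invented to show the effect of excluding the out-of-production spike.

```python
IN_PRODUCTION = {5, 10}   # status codes meaning "in production", per the talk

def avg_in_production(temps, statuses):
    """Average temperature only over samples where the machine was in production."""
    status_by_ts = dict(statuses)
    vals = [t for ts, t in temps if status_by_ts.get(ts) in IN_PRODUCTION]
    return sum(vals) / len(vals)

temps    = [(0, 20.0), (1, 21.0), (2, 95.0), (3, 22.0)]   # spike at ts=2
statuses = [(0, 5), (1, 10), (2, 160), (3, 5)]            # ts=2: not in production
print(avg_in_production(temps, statuses))                 # spike is excluded
```

A naive average over all four samples would be pulled up to 39.5 by the spike; conditioning on the interpreted status gives the answer the process engineer actually asked for.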
Frederik Van Leeckwyck: 00:36:17.759 And we’ve written a blog about how you put these queries together. If there is interest, we can share those links as well. So, by preinterpreting the data rather than just having the raw data available, by doing a simple calculation like subtracting two measurements, or by only visualizing the data when it fits certain conditions, we dramatically simplify the process engineer’s job. The data is preinterpreted for you. It allows you to very quickly see what you want to see. We get feedback that it is very useful for new people to get up to speed very quickly: you dig deeper into the data, you eliminate a little bit of the repetitive work, and you start getting closer to actual insights rather than raw data being visualized. So, what did we win here? We get a lot faster and better insights, preinterpreted dashboards are excellent for new hires, and it helps with knowledge retention. Somebody more experienced in the company can put these preinterpreted dashboards together. The Flux query language comes in really handy there, and this helps in the end to keep the knowledge in the organization and to get people up to speed quickly.
Frederik Van Leeckwyck: 00:37:39.421 And then of course, because all of these are web-based technologies, it becomes really interesting and actually very simple to share data with their clients. So, in their case, they make dedicated Grafana organizations per client. The data is stored in separate InfluxDB databases, so they can very easily share a certain set of data with one client and share another set of data with another client. So things like water quality, availability of the installation to report on the [inaudible] and even invoicing can be done with these shared dashboards, for example. And of course, because the data is available, you can potentially integrate back with the client’s Historian or SCADA systems, either through OPC-UA directly or via the database and then back to the control layer. So, what do they win here, or at least what does Ekopak win here? They win customer intimacy. They can show their clients how they manage their installations. They can show their clients that they’re on top of things, and their clients can even look at the data themselves. And this is definitely not trivial. There’s still a lot of improvement potential there for a lot of companies.
Frederik Van Leeckwyck: 00:38:55.769 And then the sixth step we help them with is to make the move to water as a service. Factry only does the data part here. We’re definitely not responsible for their whole move towards what [inaudible] serviced. So I mentioned in the beginning that Ekopak’s mission is to enable this transition to more sustainable water use, right. So, what are they doing there? Actually, they’re changing their business model, and I think the Rolls-Royce business model is something that’s very close to what Ekopak is doing. What does that look like? Now, the default Ekopak business model looks a bit like this. On the left, you have the organization. That organization is Ekopak. That’s the company. What do they do? They sell [inaudible] the water treatment installation in the container. They sell it to their clients. They also operate it. And, of course, the client then pays Ekopak for the installation. But they’re making a shift. They’re saying, “Okay, water as a service, we’re not going to sell the equipment anymore. The equipment stays ours. We’re going to put it there, and we’re going to invoice them based on the amount of processed water that they requested from the equipment.” And to do that, you need good data. Of course, you need reliable data coming in to know how many cubic meters of treated water have been delivered to the clients. And based on that, they can do their invoicing.
Frederik Van Leeckwyck: 00:40:40.473 So what do you need to be able to make the switch from selling the treatment units to selling treated water? You need Ekopak’s expertise in designing, building, and maintaining these water treatment installations, as we’ve seen before. And you need, of course, financing capabilities. And our part in this is data collection with the tools discussed today. So we’re using InfluxDB and Grafana and OPC-UA, etc., to make sure that this data collection happens in a trustworthy way. And this is an example of what this could look like. So they can build a dashboard. They can show how much water they delivered for a certain client and then use that every month to send an invoice. So many cubic meters of water: “Dear client, this is your invoice.” Right now, you can see this is a dashboard. The client can take a look at it themselves, and the invoicing department can also take a look at it. But of course, it doesn’t really take a lot of imagination to see that this can be entirely automated. The data is essentially available. We trust that it is correct. It’s coming in fairly reliably. So at the end of the month, we know how much water we have produced and how much water is being taken. So automatic invoicing can be done by integrating with the ERP system. We can say at the end of the month, so many cubic meters, send the message to the ERP system, and the ERP system does the invoicing.
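The invoicing step itself is simple arithmetic once the metering data is trusted: take the cumulative flow-meter (totalizer) readings for the period and multiply the delivered volume by the agreed price. A minimal sketch; the readings and the price per cubic meter are invented, not Ekopak’s actual figures:

```python
def monthly_volume(totalizer_readings):
    """Cubic meters delivered in the period, from a cumulative totalizer.
    Assumes the meter did not roll over or get replaced mid-period."""
    return totalizer_readings[-1] - totalizer_readings[0]

def invoice_amount(volume_m3, price_per_m3):
    """Amount to invoice for the delivered volume."""
    return round(volume_m3 * price_per_m3, 2)

# Hypothetical totalizer samples over one month (m3), and a made-up price.
readings = [1200.0, 1250.5, 1330.0]
volume = monthly_volume(readings)    # 130.0 m3 delivered
print(invoice_amount(volume, 2.50))  # 325.0
```

In the automated setup described above, this computation would run at month end and the result would be pushed to the ERP system for invoicing.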
Frederik Van Leeckwyck: 00:42:33.480 And then finally, we can start looking into the future. And we get the response from other companies in the market that this kind of technology stack is ready for AI and machine-learning projects, because you see that in a lot of those projects, a lot of time, sometimes up to 80% of the time, is spent preparing all the data in order to get the project started, to really get the AI step started. [inaudible] centrally available, well-named, and in these kinds of modern IT technologies, this really helps to get these kinds of projects off the ground very quickly. The data scientists know these technologies. They can get to the data easily, and this is a tremendous accelerator for things in the future.
Frederik Van Leeckwyck: 00:43:29.913 So I want to share a little bit of the business outcomes for Ekopak. We’ve seen that they are able to do root cause analysis of problems much faster than they were doing before. It’s become a really useful tool for a lot of people, typically the process engineers in the beginning. Sales is using it. Accounting is using it. They are using the data to schedule maintenance operations. All these kinds of things are now much easier because the data is there, and it’s easily retrieved and easily visualized. And you can see that data moves from being a problem-solving tool (there’s a problem, I’m going to take a look at the data) to becoming an asset. It becomes something you can build on, something you can improve as an organization, and even use to make the change to a completely different, well, to a completely different business model, which is not easy at all.
Frederik Van Leeckwyck: 00:44:32.114 So the takeaways, what I’d like you to have learned during this presentation: forget Excel. You can do all of this without touching Excel. Once the data is there, just make sure that you can visualize it easily and work iteratively. All of this was not planned in the beginning. We had a couple of ideas and a couple of thoughts with Ekopak, of course, of where they wanted to move to. But the questions come because the curiosity comes. So this is kind of an iterative process that keeps getting stronger and stronger. As soon as the data is there, these well-structured things move on very quickly. Think of your naming structure. I believe I stressed that well. Make sure that it is really well thought out and on point. This will be very valuable in the future. Shifting to a data-driven approach takes time. All of this cannot be done from day one. You need the whole organization to move in this direction, and we can see that, and I think we’ve proven, that data has a clear impact on the business, for example by making the change in business model. So with this, I’d like to end this presentation.
Chris Churilo: 00:45:48.618 You have a lot of questions, Frederik. So did you
Frederik Van Leeckwyck: 00:45:53.304 I hopefully answered some questions.
Chris Churilo: 00:45:54.805 You did cut out on a couple of slides, but I think we got the overall understanding of them, so. Initially, I was going to interrupt you, but then I thought you covered what you were intending to. So let’s go through the questions because there are many, and I think they’ll also help to restress the points we might have missed. So Jerry started out with quite a bit of energy, and he asked what are some other stream protocols that are supported natively in this Historian? Are ingest protocols able to be added by a developer or user?
Frederik Van Leeckwyck: 00:46:29.050 Okay. We standardize as much as possible on industry standards such as OPC-UA, Modbus, these kinds of things. In the cases where these are not available, which is happening less and less frequently, it is still possible. We always evaluate it like a build-or-buy decision. There are tools available that help you make the translation from an industry-specific protocol to OPC-UA, for example. And in our experience, in quite a lot of cases, that can be the preferred route. It doesn’t always have to be, and actually, it has happened a couple of times already that we implement protocols ourselves, for example for some piece of obscure machinery. It is well documented how you do that. So you can definitely add it yourself.
Chris Churilo: 00:47:25.423 Okay. Jerry also asked, “Is this data Historian able to deal with nanosecond scale, data streaming,” which I think you’ve answered, but I’ll let you answer again.
Frederik Van Leeckwyck: 00:47:35.540 InfluxDB supports nanosecond resolution, so you can definitely do that. If, for example, an alarm comes on and you get the timestamp at nanosecond resolution, you’re going to be able to store it at that resolution. Retrieving data at nanosecond frequency, that’s something else. We have no experience with that. The fastest we’ve done is every 10 milliseconds right now.
Chris Churilo: 00:48:07.695 Cool. Jerry also asks, “Are you able to do live data streams versus batch data collectors when you’re talking to the PLCs?”
Frederik Van Leeckwyck: 00:48:17.207 I’m not entirely sure what that means. This is a really, really technical question. I can always hope to get that question answered by someone else in our team.
Chris Churilo: 00:48:31.233 So maybe, Jerry, you can just expand on that a little bit. He also asked, “Is the 100-hertz number a hard ceiling?”
Frederik Van Leeckwyck: 00:48:40.356 No. From our experience actually, we could go faster, but the OPC-UA server is blocking in that case. That’s purely experience for us so far.
Chris Churilo: 00:48:54.456 Cool. Let’s see. Does your Historian support time-synchronized samples?
Frederik Van Leeckwyck: 00:49:00.802 It does. Yes, it does.
Chris Churilo: 00:49:03.230 I think we saw actually a demonstration of that. Let’s see. So, Nanda asked, “What protocols does the RevPI support to collect from field devices?”
Frederik Van Leeckwyck: 00:49:15.113 To have a really accurate answer on that, you’d have to ask the manufacturer, which is called Kunbus. That’s a German company. However, they have several hardware modules that can be used to interface with specific protocols. But we don’t use them. We just use the compute module, or the core module, which is just a small Linux installation where we run our collectors, and this is purely done in software. So that might solve part of your question, and the other part might have to be asked directly to Kunbus.
Chris Churilo: 00:50:00.533 Let’s see. Next question is how do you access OPC-UA servers from InfluxDB?
Frederik Van Leeckwyck: 00:50:08.083 If the question relates to how you get data from InfluxDB back into the control layer, if that is what you mean, then currently, there are two routes that we follow. Either we expose the data that is in InfluxDB over OPC-UA, so we create an OPC-UA server that exposes the values. Or we act as a client and write the values to a server. Those are the two strategies that we’ve followed so far. I hope
Chris Churilo: 00:50:43.721 Yes. And you guys wrote the collector, but you open-sourced it, right?
Frederik Van Leeckwyck: 00:50:48.880 Yes. So, we’ve written an open-source OPC-UA collector that acts as a client to get the data from the OPC-UA server, and we’ve also published, maybe more as a proof of concept than something that is production worthy, an OPC-UA server that allows you to expose data over OPC-UA so that you can then connect to it with another client.
Chris Churilo: 00:51:19.636 Yes. So you guys can play with it. There are also a number of other OPC-UA clients available that write into InfluxDB. If you go to our website, we have a list of them. And then recently, we’ve also released a Telegraf OPC-UA plugin as well, so check that out. But that’s just to write data into InfluxDB, as Frederik noted. Okay. Andres asked, “I got a bit lost in how the alarms are set up. Are they made in the Grafana dashboard?”
Frederik Van Leeckwyck: 00:51:52.354 Okay. Maybe that went a bit quick indeed. We found that Grafana was limiting us in doing proper alerting, because there are a couple of hundred alerts per installation that should be configured. And to do that properly, according to us, in Grafana, you need to put an alarm on each individual measurement on its own, and that would take a lot of administration work. So what we did, because the names are standardized, is complex event processing on the data coming into InfluxDB: recognize a pattern in the measurement names, and do the logic from there. Because the naming is well-structured, you can look for patterns with a regular expression, for example, then act on the measurements that fit the regular expression and do your logic from there.
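The pattern-based approach hinges entirely on the standardized naming: instead of configuring hundreds of individual alarms, one rule is written against a name pattern. A small sketch of the idea; the naming convention (site.unit.equipment.signal) and the limit value are hypothetical, not Factry’s actual scheme:

```python
import re

# One rule covers every measurement whose standardized name matches the
# pattern; here, any pressure signal on any reverse-osmosis (RO) unit.
ALERT_RULES = [
    (re.compile(r"^\w+\.RO\d+\.\w+\.pressure$"), {"high": 16.0}),
]

def rules_for(measurement_name):
    """All alert limits that apply to a given measurement name."""
    return [limits for pattern, limits in ALERT_RULES
            if pattern.match(measurement_name)]

print(rules_for("plantA.RO1.pump2.pressure"))  # [{'high': 16.0}]
print(rules_for("plantA.UF1.pump1.flow"))      # [] - no rule applies
```

A new installation that follows the naming convention is then covered by existing alert rules automatically, which is why the naming discipline pays off.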
Chris Churilo: 00:52:54.346 Andres, let’s hope that is good. Let’s see. Sandeep asked, “Ekopak charges as a SaaS model. How does Factry charge in this kind of an environment? What does your pricing model look like to support this kind of SaaS model?”
Frederik Van Leeckwyck: 00:53:11.899 Good question. The same model, more or less. It would be a bit strange for us to sell InfluxDB. What we do is we keep the whole stack running as a service. So Grafana, InfluxDB, all collectors, and we of course share our experience with this industrial connectivity to make sure you get the most out of this total setup for data collection. And then further, we help our clients get value out of it. So once the data is there, you start building integrations. You start building with ERP systems and MES systems. And we have a ton of very interesting projects. A good example may be the [inaudible] webinar that we also did together with InfluxData, which shows you another example of what we did on top of the Historian as a service.
Chris Churilo: 00:54:13.556 Oh, cool. So I just want to remind everybody the presentation will be made available for on-demand viewing later on today. Let’s see. And Jerry responded that he was good with the answers that you provided about the batch and the streaming. So we’re good there. And then Nanda asked about the OPC-UA question the other way around, but I think you explained it both ways, so either bringing data in or exposing that data, you provided both of those examples. Let’s see. So Gerkan asks, “Can MariaDB SQL be used instead of InfluxDB for the Historian database?”
Frederik Van Leeckwyck: 00:54:53.252 That will depend. From my experience many, many years ago, I was storing sensor values in such a relational database, and after a couple of days of data being in there, it started to become really slow. So technically, it is possible, but I don’t think it’s a good idea. Definitely, if you’re moving to production systems, I think you’re going to get stuck.
Chris Churilo: 00:55:28.926 Yeah. I mean, of course, I’m biased because I work at InfluxData, but you can use anything. You can use a spreadsheet. You can use whatever database you have on hand. It can be a traditional database. It can be Elasticsearch. It can be MongoDB. I mean, you can use anything to store time series data, really. You just have to put in a lot of work to make it work for you. And so it’s just a matter of how much time you want to dedicate to creating and managing something to become a time series database, or do you just want to use something that’s already there. In Frederik’s presentation, I mean, even in the Factry Historian, they use, I think, a Postgres database, right, for the contextual information and orders and all that kind of stuff. And then just for time series data, basically the sensor data, they’re using InfluxDB. So that’s kind of how we look at the world. And you’re always welcome to use whatever tools, and I think the most important thing I want to stress is: use the right tool for the job, so that you can just continue to flesh out that huge backlog of features that you probably have to get through. All right. Let’s see. So now
Frederik Van Leeckwyck: 00:56:50.417 I could go with that answer, Chris.
Chris Churilo: 00:56:53.047 So Nanda asked a few more clarifying questions about the OPC stuff, but I think you covered that. Jerry asked, “What about MQTT support and Kafka support?” And I’ll answer that first. InfluxDB itself has support for a lot of different protocols, a lot of mechanisms for bringing data in, a bunch of these pub-sub kind of protocols that are out there, MQTT, RabbitMQ, you name it. The list is long. And they come in the form of Telegraf plugins that the community has built. Alternatively, it’s pretty easy to write directly into InfluxDB. You can also use the client libraries if you want to write directly from your application itself, with whatever language you’re programming in. But the good thing is the Telegraf plugins are really just super plug and play. You just modify the configuration to fit your environment, and then that data just pops in really quickly. So I would take a look at those integrations that we have, Jerry, and I think you’ll be pleasantly surprised. Alternatively, it’s pretty simple to create your own Telegraf plugin. And I think that’s why we’ve been so lucky and so blessed with so many of these that have become available. Okay. Andres asks, “In a large system with many PLCs, do you suggest collecting and sending the data with one single collector such as one Revolution Pi, or do you suggest one Revolution Pi per PLC?”
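For context on “writing directly into InfluxDB”: InfluxDB 1.x accepts writes as line-protocol strings POSTed to its HTTP /write endpoint, and that same format is what Telegraf and the client libraries produce under the hood. A dependency-free sketch of building one such line; the measurement, tag, and field names are invented, and this simplified version handles only float fields and does no escaping of special characters:

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one InfluxDB 1.x line-protocol line.
    Simplified: float fields only, no escaping of spaces/commas in names."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"

line = to_line_protocol(
    "water_flow",
    {"site": "plantA", "unit": "RO1"},  # tags give context (metadata)
    {"m3_per_h": 12.5},                 # the actual sensor value
    1599555600000000000,                # nanosecond-precision timestamp
)
print(line)  # water_flow,site=plantA,unit=RO1 m3_per_h=12.5 1599555600000000000
```

The trailing timestamp also illustrates the nanosecond resolution mentioned earlier: the storage format carries the full precision whether or not you ever sample that fast.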
Frederik Van Leeckwyck: 00:58:26.140 That’s a very good question, and the answer is: it depends. I’m sorry. It depends on the level of criticality. It depends on the number of tags and measurements you’d like to store, the frequency. All that kind of stuff should be taken into account in order to make that decision. In, let’s say, 60 or 70 percent of the cases, and again taking into consideration that we’re not talking about a huge amount of data, you would, for example, use something like Kepware to get all your data centrally available and then just use one collector to get it from there. But then of course you introduce one single point of failure that could have an impact. So there’s no real answer there, no clear-cut answer. It depends on the situation. It could be very logical to split it up; it’s always one collector that talks to one OPC server. So, if you have many PLCs, each with an OPC server, you can also talk directly to them, and that could help distribute it, for example.
Chris Churilo: 00:59:46.880 We are over our time, but I want to make sure that we go through all these questions. If you have to leave, no problem. And you’ll see that we’re going to read these questions and answers out loud, so they’re captured on the recording. All right. So, we have a question about the Admin UI, and I’m not sure if they were asking about the Grafana dashboard or the Factry UI. So, I’m going to ask this person to maybe give us a little bit more detail in that question so we can answer it appropriately. In the meantime, Arkham asks, “Why don’t you use Kapacitor,” which is, for those of you who might be new to InfluxDB, another InfluxDB open-source project, “Why don’t you use Kapacitor to generate alerts and send them as email, SMS, etc.?”
Frederik Van Leeckwyck: 01:00:34.930 We could use that. I would have to check why we in the end didn’t go for that route. Yes, I remember at least partly. We wanted to give Ekopak some level of control over these alarms as well. And to do that, we made it very simple for them. So what we’ve done is we’ve written a daemon that does this kind of stuff behind the scenes. And they’re able to control the settings of this daemon in a Grafana dashboard, in fact. So, we very quickly realized that in our case it would be much quicker to do all of that in a small program than to put those tools together in another way, from our experience. I think what you’re asking is definitely possible with these tools, but we came to the conclusion that in our case it would be faster to write it ourselves.
Chris Churilo: 01:01:56.582 Cool. So, Gerkan responds with, “Well, thank you for your lightning answer. Nobody wants to get stuck in production.” Thank you for that. Let’s see. And Kaitlyn asked, “Does InfluxDB in this context mean you have to give up the open-source code of your application, or was that a company decision to make it open source?” So, let me take a stab at that too. The InfluxDB projects are open source. It’s under the MIT license, and that means that you can include it in your production, your fee-based services, products, applications, just like Frederik did at Factry. And it’s just open source. You don’t pay anything to InfluxData. Of course, being open source, we always appreciate feedback, bug fixes, documentation fixes, additional contributions to code, writing a Telegraf plugin, etc., so that we can continue to make this a very robust solution. Do you also want to answer this, Frederik, in the context of Factry?
Frederik Van Leeckwyck: 01:03:04.372 Yeah. To be honest, we were able to start our company because some of these open-source tools had a sufficient level of maturity that they could be used in an industrial context. And to share back with the community, we open-source some of our projects on a regular basis. That started with the OPC-UA logger to get data into InfluxDB, which was first published four years ago, and an OPC-UA server. Recently, we’ve also published a Grafana plugin to view measurements as a function of distance rather than time, because, for example, if you’re a producer of plastic rolls, you’re interested in what happened at position 10 meters down the roll and not at a point in time. So, these are the kinds of business challenges that we encounter. And if we think it’s suitable to open source something, we could do so. Also in this presentation, like the eWON client, a small project, but still we hope some other people might be able to use it.
Chris Churilo: 01:04:21.243 Yep. That’s the beauty of open source. Okay. Let’s see. So, Tobias asks, “Regarding the naming structure, could you give some examples or hands-on how not to do it and what you’ve learned?” A couple of hands-on typical best practices.
Frederik Van Leeckwyck: 01:04:37.581 Yeah. And how not to do it, there could be a lot of examples. Coming from the industrial context, there are certain standards that we try to follow as much as possible. One of them is the equipment model in ISA-95. And this equipment model shows you how to hierarchically structure your data, either in the form of the name of the measurement, or by using InfluxDB tags to give context, to give metadata to your measurements, or a hybrid form of those. And you can get there if you take, for example, this ISA-95 equipment model as a North Star, more or less. It’s not fixed, but if you think about this hierarchical structure, it can help you get a lot done.
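The ISA-95 equipment model is essentially a fixed hierarchy (enterprise, site, area, work center or unit, equipment) that can be encoded in the measurement name itself, in InfluxDB tags, or in a hybrid of both, as Frederik describes. A hypothetical sketch of the two forms; the level names and separator are invented for illustration, not a prescribed convention:

```python
def hierarchical_name(site, area, unit, equipment, signal, sep="."):
    """Encode the equipment hierarchy in the measurement name itself."""
    return sep.join([site, area, unit, equipment, signal])

def tagged_form(site, area, unit, equipment, signal):
    """Hybrid form: a short measurement name plus InfluxDB tags as metadata."""
    return {
        "measurement": signal,
        "tags": {"site": site, "area": area, "unit": unit, "equipment": equipment},
    }

print(hierarchical_name("plantA", "pretreatment", "RO1", "pump2", "temperature"))
# plantA.pretreatment.RO1.pump2.temperature
print(tagged_form("plantA", "pretreatment", "RO1", "pump2", "temperature")["tags"]["unit"])
# RO1
```

The name-based form makes the regex alerting discussed earlier possible; the tag-based form lets InfluxDB queries group and filter by any level of the hierarchy.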
Chris Churilo: 01:05:37.538 Cool. And just to remind everybody, the session has been recorded, so [inaudible] you will be able to get the recording later on. And the follow-up to the Admin UI question has been sent in: “So, I was asking about the Administration UI that is used as the interface for the backend. Which software is used for this? How does it work in general?”
Frederik Van Leeckwyck: 01:06:00.055 Okay. So, the general flow is that the collectors request their configuration from the central command structure, the backend, we call it. And that also has a couple of endpoints that you can use to administer your collectors, for example, changing the address of the OPC server or the login details, these kinds of things. Or add, edit, and delete the measurements that you’re going to collect with each collector. And all of that is presented in a web-based interface where you can load your tag list, import tags, edit them, pause them, and see what happens. More or less the lifecycle of a tag: it was created successfully, read for so long, then suddenly an error popped up, then this user changed something, then it started working again. So this whole lifecycle stuff is all in a graphical user interface, so that you don’t have to fiddle around with the command line as, for example, a plant operations manager in a production plant. You can do all of this graphically with tools that are familiar to those people.
Chris Churilo: 01:07:32.080 Cool. That was a nice long list of questions for you today. Great presentation, like always, and I really liked, and I’m not the only one, I think Marieke also agreed, that you are treating data as an important asset for your customers. I think that makes a lot of sense because it can do so much in these situations. Frederik, thank you so much, like always. I think we all learned a lot, and it always makes me do a lot of thinking after hearing about some of the use cases that you have with your Factry Historian. So once again, thank you so much. And for everybody who joined us, thanks for the questions. I think it always makes it a lot more fun for Frederik and me when we hear lots and lots of questions, because I think we could talk all day about this stuff.
Frederik Van Leeckwyck: 01:08:30.421 Yes, for a good Belgian beer. Oh, with a good Belgian beer.
Chris Churilo: 01:08:35.172 All right. Well, here’s to getting rid of spreadsheets. And once again, thanks everybody for joining us.
Frederik Van Leeckwyck: 01:08:42.996 All right. Thank you.
Chris Churilo: 01:08:44.137 Bye-bye.
Frederik Van Leeckwyck
Co-Founder and Business Development Manager, Factry.IO
Frederik Van Leeckwyck is the Co-Founder & Business Development Manager at Factry.IO. Factry.IO is a solution that provides real-time & historical insights to everyone in a factory environment, from the plant manager to the operator, all using open-source technologies. His many years of experience in the Industrial IoT industry have made him realize that the industry needed a fresh approach to control & automation.