How eSmart Systems Uses MS Azure & InfluxDB Enterprise to Optimize Energy Investments
Webinar Date: 2018-06-12 08:00:00 (Pacific Time)
eSmart Systems develops digital intelligence for the energy industry and smart communities. Their solutions give grid operators insight into their distribution network, which makes energy trading between local markets possible and monitoring city air quality easy. In this webinar, Erik Åsberg, CTO of eSmart Systems, will share how their solutions use MS Azure and InfluxEnterprise to gather vast amounts of sensor data and analyze it using advanced prediction and optimization models. This results in a completely new way of visualizing data, while helping their customers make decisions faster to save resources and costs.
Here is an unedited transcript of the webinar “How eSmart Systems Uses MS Azure & InfluxDB Enterprise to Optimize Energy Investments”. This is provided for those who prefer reading to watching the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
- Chris Churilo: Director Product Marketing, InfluxData
- Erik Åsberg: CTO, eSmart Systems
Chris Churilo 00:00:03.184 All right. Here we go. Oops. Let me just fix that. All right. Three minutes after the hour. I promised we would get started. Good morning. Good afternoon. My name is Chris Churilo, and I’m working here at InfluxData, and today, I’m happy to host Erik from eSmart Systems. And before we turn it over to Erik, I just want to remind everybody, please put your questions either in the chat or the Q&A panel in the Zoom application, and we are recording this session, so if you want to take another listen to it, you will be able to. We’ll send the automated email tomorrow morning with the link to the recording. I will post the recording after I do a quick edit later today, and it’s actually the same URL that you used to register for the webinar, so it’s super easy for everybody. So with that, I am going to pass the ball to Erik. Erik.
Erik Åsberg 00:00:53.899 Thank you, Chris. Hi, everybody. My name is Erik Åsberg. I’m the CTO at eSmart Systems. I’m also Microsoft’s regional director, which means I have sort of an ambassador role for Microsoft. We are about a hundred and fifty ambassadors worldwide, so it’s sort of an honor to be that as well. I have a background as software developer. I’ve been working with energy and IT for over 20 years, working with everything from power exchanges to participant systems and treasury risk management. And I’m also one of three guys that help design our technical platform, which I will go more into detail in afterwards. So what I want to do is I want to talk to you about eSmart Systems, who we are, and what we do. I will spend some time going through two of our products, and then we will focus on technology, about our platform, and how we are now starting to use InfluxDB as a primary time series data store.
Erik Åsberg 00:01:56.385 So about eSmart Systems: eSmart Systems is a fairly young company. We’re about 5 years old. We are based out of Norway. We also have offices in Denmark, UK, and in the US. The core members of our team have about 20 years’ experience in energy and IT. We have been delivering software all over the world through the years, but we got a chance to start fresh with eSmart. And eSmart is based on two trends, really. It’s about the changes in the energy industry, so new types of power-demanding loads like EVs, and the popularity of distributed generation like PV and wind generation. Those are the changes within the industry. At the same time, there are huge changes in the IT industry, so new technology to handle big data, artificial intelligence, etc. And what we saw was that we could use this new technology really to solve these new challenges that the industry has. So that is sort of the basis for our company. And when we started, we had some goals. We wanted to reduce the need for investments in the power grid by really increasing the utilization factor of the existing infrastructure, by providing high-quality predictions, so letting our customers be one step ahead. That is how we would like them to operate. And to be able to really provide high-quality predictions, and high-quality analytics, we need to make use of technologies for IoT and big data. So I hope you now have the background for where we come from and what we are thinking.
Erik Åsberg 00:03:54.359 Currently, I’m in Halden, a small city south of Oslo. I’m one hour south of Oslo, so it’s the afternoon where I’m sitting. How we like to work in eSmart is we like to define three circles. So one that I will focus on today is the data engineering part. That’s really how we provide the data, how we make them available, how we prepare our data for our analytics and machine learning. So the data engineering really is about how we do our sort of software part. The other part is machine learning. Yeah, quite a few of our staff works with data science and machine learning is the focus on our side. We work with deep learning. We work with predictions, mainly, and recommendations. So these two circles combine then with the domain insights. And that is how we like to work with our customers. We like to invite them into our product development, do small POCs, do small functionality pieces that we work very tight with our customers to let them be part of the whole process, and really utilize their domain insight, as well as we have a certain experience as well, being within the industry for over 20 years, but really, it’s what the users need to know. That’s how we really like to work. And within these three circles, so the center of these three circles, that’s where we like to see that really the magic happens. So that’s how we sort of define our process to—yeah, maybe we can call it the process.
Erik Åsberg 00:05:47.764 So now, I will dive a bit into how we use machine learning and big data analytics within eSmart. We have a growing team of data scientists. Currently, we have four PhDs, four CACs, but well, three of them are doing their PhDs. We use a large variety of methods, but focus mainly within deep learning, and we work very closely with the Microsoft expert teams, and I will come back to you on how we work together with Microsoft on several sort of—yeah, we have several interfaces with Microsoft. But also, within the field of machine learning, we push Microsoft as well, and it’s a good place to be in for us. We work very close with their hardware teams in how to optimize on their GPUs, and even actually force them to upgrade their GPUs. So it’s sort of a win-win because we push them to really deliver what they can and what we need.
Erik Åsberg 00:07:00.283 And more precisely within the field, what do we do? Yeah, we work very hard, with load predictions. And that is where time series comes in. We do substation load predictions, EV charging predictions, and predict peak loads after outages. And it was through this work, really, that InfluxDB came up as maybe a source for us for storing and utilizing time series better than our previous solution. And I’ll come back to that as well. Within the analytics part, we do segmentation and profiling. So only through reading, really, usage profiles, we can identify customer behavior. We can identify customers who have acquired an EV or solar panels, only by analyzing, really, the customer behaviors through really their usage profiles. We do risk monitoring in the form of data aggregations. We estimate risk for outages, and estimate risk for smart meter failures. And what has become very popular for our part is the failure and anomaly detection through, really, image recognition and object recognition, and through the use, really, of drones, but it’s not a drone itself that’s the point, but really the image recognition and the software that we provide. So now you know a bit about how we use machine learning and big data analytics in eSmart.
Erik Åsberg 00:08:39.606 So business-wise, we are focused around three business areas. The largest part is the utility part. That is towards, yeah, the utilities, and the infrastructure part of the market. The second business area that we have is energy markets. That’s more focused on the retailers, and aggregators, and sort of the energy part that is not part of infrastructure, but part of how to really sell and buy energy, and the energy that we ourselves consume, or produce if that’s the case. So they also have solutions for prosumers that really consume and produce energy. And we have a small sort of initiative on the city side. That’s more because yeah, it’s more opportunistic. We see that we have certain possibilities because of our technological platform. But that’s not our focus area, but still, it’s exciting to see how we can use the data that we have from the other business areas, really, to provide value in sort of what we define as safety segments. For instance, healthcare. If you know that your grandmother has not used any water or any electricity and it’s now 11:00 AM and you know she’s at home, maybe you would like to have a notification on your mobile saying, “Hey, might check up on your grandmother. She has not used any electricity or water this morning.” So by using or sort of breaking down these silos, we see that there are other possibilities that we want to explore, but still is not our core business currently.
Erik Åsberg 00:10:35.463 We like to break down these three business areas into six products. I will go through, briefly, the products. Connected Grid™. I will go deeply into that one, as well as Connected Drone™. Those are the two products that we have on the utility side. Connected Prosumer™, really, is about energy flexibility, helping you optimize not as a consumer, but as a provider of energy. Connected Trading™ is a trading backend that we deliver to ETF trading, amongst others. Connected Health™ is on the city side. It’s one of the projects that we have sort of on this side, as well as Connected City™, where it’s city dashboards for pollution in larger cities. Anyway, what is important here is really, we have one platform that supports all these products. So the platform, that I will go into details about afterwards, is really supporting all these six products, all these three business areas. And I will go through the reasons why, but really, it’s a generic platform designed to handle time series in real time [inaudible], and by doing so, we see that we can support quite a few business opportunities.
Erik Åsberg 00:11:59.398 What I will do next is I will go through the Connected Grid, which is our most mature product as well. That’s the first product we put forward, and has been the longest in the market. Then I will go through the Connected Drone™, and then we will dive into the technology. So the Connected Grid is really, it’s a single pane of glass into distribution grid operations, including infrastructure and customers. And the reason we are able to do that is because of the integrations, and you will understand that afterwards. But what we want to provide is really one place where they can look into their operations. We do a lot of IoT data management because, of course, we depend a lot on IoT devices, smart meters or other types of sensors in the power grid, and the quality of that data. The quality of that data is, of course, of essence when we try to provide both high-quality predictions, but also sort of, yeah, the advice and actions that the system wants you, as an operator, to take. We do aggregations, validation, estimation, and editing of values, of course. We have a very flexible workflow editor, which means that every operation within the system is really defined by an editable workflow. That is actually through a UI, so a super user at our customer can really modify the workflows without having to call us to ask for a certain upgrade, or if the energy regulators say that we need to alter this validation process for you to comply with our new rules, then in many cases, they can actually alter the workflow themselves and we don’t have to provide them a new piece of software.
Erik Åsberg 00:14:00.798 And then events and alarms, of course, are important. That all happens in real time. Often the data management is not so often real time. It’s more, yeah, everything from every hour down to every five seconds, but the platform is ready for real time as well. But the events and alarms that we collect from these IoT devices, they are always in real time. We do a lot of transformer load management. I will go into detail about what that is. And then, as the extension of this, we do work order management. We like to say that we don’t do work order management from A to Z, but more like A to D, and the reason for that is it’s difficult to really make our customers change their work order system. It’s really something that their field operators have been using for a long time, and it’s not so easy to replace it. So what we do instead is we integrate with their existing work order management, and in that sense, try to again, provide an overview of really the status of their work orders. So for IoT data collection, it is of essence that the data is of high quality, and then you need to know whether or not you have collected all the data from the devices or not. And you need to know what quality is the data that you have collected. So we provide different dashboards to help our customers really get an overview of their data collection. Some KPIs and some statistics, as well as KPIs and statistics about the quality, and also helping them identify which ones did not deliver the quality that they are after. But really, it’s about, yeah, getting the data in, which is sort of the starting point for everything.
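The collection KPI described above (did every device deliver all of its expected readings?) can be illustrated with a small Python sketch. It is not eSmart's implementation: the function names, the input shape (a dict of meter id to a set of timestamps), and the 95% threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def collection_completeness(readings, start, end, interval):
    """Return the fraction of expected readings received per meter.

    readings: dict mapping meter id -> set of reading timestamps
              (hypothetical input shape, chosen for illustration).
    """
    expected = int((end - start) / interval)  # readings expected in [start, end)
    result = {}
    for meter_id, stamps in readings.items():
        received = sum(1 for t in stamps if start <= t < end)
        result[meter_id] = received / expected if expected else 0.0
    return result


def meters_below(completeness, threshold=0.95):
    """List the meters whose completeness falls below the (assumed) threshold."""
    return sorted(m for m, c in completeness.items() if c < threshold)
```

A dashboard could then show the overall completeness as a KPI and list the output of `meters_below` as the devices that need attention.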
Erik Åsberg 00:16:00.064 We do a lot of transformer load management. So in the energy industry, it is important to know that you keep your transformers within their load limits, and especially in Norway where it’s cold in winter, and a lot of electricity is used for heating, these transformers tend to sort of have load challenges, and when they are overloaded, it’s a risk of really an outage. They can actually catch fire or explode. So it is not only important, but it’s really a safety hazard as well. So what we do is we collect data from the smart meters connected to the transformer, but in addition, we can also collect data from the sensors on the transformer itself. So in that sense, we can utilize the smart meter data, and in that way, we create sort of a virtual sensor by aggregating the smart meter data, and compare that to the actual metering that is done on the transformer itself. And then helping them understand which transformers are overloaded, which transformers have free capacity, and in that sense, really help them get more out of their infrastructure. In many cases, we are seeing the customers of ours start to move transformers around because they see they have free capacity in some which they were not aware of and vice versa in other places.
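The "virtual sensor" idea, summing the smart meter loads behind each transformer and comparing the total with the transformer's limit, can be sketched as follows. This is a minimal illustration under stated assumptions: the meter-to-transformer mapping, the kW ratings, and the 50% "free capacity" cut-off are hypothetical and not taken from the webinar.

```python
from collections import defaultdict


def virtual_sensor_load(meter_readings_kw, meter_to_transformer):
    """Aggregate smart meter loads (kW) into one virtual reading per transformer."""
    totals = defaultdict(float)
    for meter_id, kw in meter_readings_kw.items():
        totals[meter_to_transformer[meter_id]] += kw
    return dict(totals)


def classify_transformers(virtual_kw, rated_kw, free_below=0.5):
    """Label each transformer by utilization; free_below is an assumed cut-off."""
    status = {}
    for transformer_id, rating in rated_kw.items():
        utilization = virtual_kw.get(transformer_id, 0.0) / rating
        if utilization > 1.0:
            status[transformer_id] = "overloaded"
        elif utilization < free_below:
            status[transformer_id] = "free capacity"
        else:
            status[transformer_id] = "normal"
    return status


def sensor_gap(virtual_kw, measured_kw):
    """Difference between the physical sensor and the virtual sensor, per transformer."""
    return {t: measured_kw[t] - virtual_kw.get(t, 0.0) for t in measured_kw}
```

Where a physical sensor exists, a persistent gap between the two readings may point to losses, unmetered load, or a wrong meter-to-transformer mapping.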
Erik Åsberg 00:17:34.962 And as an extension of this, we can also perform control signals. So we have integrations to smart homes, and really yeah, home gateways that we can control. So we can gather real-time information from the usage in those. We use our, then, predictions to say whether or not it’s likely that this transformer will be overloaded. If we see that it’s a risk for overloading, the system will calculate then how much is needed to shut down to prevent that overload. And then it’s usually slow loads that we use, so heated floors or water heaters, things that you can shut off without really the customers noticing at first, and then the system calculates how much is needed to prevent the overload, and actually then execute that control plan as well, so we really do shut off heated floors and water heaters through, really, home automation systems that we also integrate with. And that’s sort of the extreme part of transformer load management, which is called demand response management. But it’s useful, of course, to again, help them get the most out of their existing infrastructure.
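The control-plan step, working out how much load has to be shut off to keep a transformer under its limit, might look like the sketch below. The webinar does not say how loads are selected, so the largest-first greedy choice here is an assumption, as are the function name and the `(load_id, kw)` input shape.

```python
def plan_load_shedding(predicted_kw, limit_kw, controllable_loads):
    """Pick slow loads (heated floors, water heaters) to switch off.

    controllable_loads: list of (load_id, kw) tuples (hypothetical shape).
    Returns the ids to switch off, largest load first, until the predicted
    excess over the limit is covered. If the available loads cannot cover
    the excess, every load is returned.
    """
    excess = predicted_kw - limit_kw
    if excess <= 0:
        return []  # no overload predicted, nothing to do
    plan = []
    shed = 0.0
    for load_id, kw in sorted(controllable_loads, key=lambda item: item[1], reverse=True):
        if shed >= excess:
            break
        plan.append(load_id)
        shed += kw
    return plan
```

A real plan would also have to respect comfort limits and how long each load may stay off, which this sketch ignores.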
Erik Åsberg 00:19:05.746 As well as this, with new types and latest generations of smart meters, new problems arise for the utilities. For instance, how are their data usage? What’s their cost of data usage? Because with the new smart meters, it’s very easy to say, “I want every part of the data, every data point that that smart meter is able to produce.” And the challenge is that at some point, there probably is a mobile connection included. So at some point in that sort of communication line, there is a mobile connection. And when you say that you want every data point possible, you will have cost issues. So we help our utilities get an overview of their data usage, and what you really see here is in your upper left corner are the SIM cards that have exceeded their data plan, and in the lower right corner are the SIM cards that are perfectly okay. So we help them identify which SIM cards that really maybe they should either change the data plan, or they should look at how much data they collect for that specific case.
Erik Åsberg 00:20:20.480 Also, I would like to go through one of our use cases that we’ve done with Jacksonville Electric Authority. That really is about water, but it’s the same product that we provide. So in Jacksonville, when a smart water meter has provided zero values for over 3 months, what Jacksonville Electric Authority does is they send a truck out, and they dig up the water meter, and they check if it’s okay. In very many of the cases, the water meter is perfectly okay. And so they wanted a system that really can tell them or really decrease the number of truck rolls because it’s very costly. So what we do is really reuse smart electricity data together with the smart water data, and then use machine learning to say whether or not it’s likely that this water meter is broken or not. And really what happens is they have reduced their truck rolls by over 80%, saving over $700,000 a year just by this simple comparison of data, which really is kind of advanced, but still, it’s only two data points that we use machine learning to calculate whether or not it’s likely that it’s broken. And with sort of the same approach, we use that for district heating. And so we also can do district heating within Connected Grid because again, there are very many similarities in the infrastructure. So that was a short overview of Connected Grid. The next product I will go through is the Connected Drone.
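The comparison of water and electricity data can be illustrated with a deliberately simple rule. The system described in the webinar uses machine learning; the threshold rule below is only a stand-in for that model to show the shape of the inputs and the decision. The function name and the 2 kWh/day occupancy threshold are assumptions.

```python
def likely_broken_water_meter(water_m3, electricity_kwh, occupied_kwh_per_day=2.0):
    """Rule-based stand-in for the ML model described in the webinar.

    water_m3:        daily water readings for the period (all zero if suspect).
    electricity_kwh: daily electricity readings for the same premises.
    A meter reporting zero water while electricity use suggests the home is
    occupied is flagged as likely broken; low electricity use suggests the
    premises are simply vacant, so no truck roll is needed.
    """
    if any(value > 0 for value in water_m3):
        return False  # meter is reporting consumption, so it works
    if not electricity_kwh:
        return False  # no evidence either way
    average_kwh = sum(electricity_kwh) / len(electricity_kwh)
    return average_kwh >= occupied_kwh_per_day
```

Only meters flagged by such a check would be candidates for a truck roll.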
Erik Åsberg 00:22:13.661 So the challenge today is the power infrastructure and the power lines has become very long, of course. It’s a lot of lines, and it’s an aging infrastructure, as well as the weather conditions in many parts of the world starts to sort of be challenging as well. So that’s sort of the basis, and the Connected Grid sort of holds asset information about every asset and component in the grid. So as an extension to this, we thought, “Okay. How about getting real-time information from the use of drones, really, to provide accurate information about the status of each component?” So that is what we provide with the Connected Drone. We really do take the images that are captured in the field and do the analytics, map them with the correct component in their component registry as well as give them an overview of all their inspections, all the findings. And this does not only count really for drones, but it’s very much easier to use a drone than to try to get a helicopter, and when you finally have that helicopter 18 days later, the weather is bad so you cannot fly anyway. The drone is so much more accessible.
Erik Åsberg 00:23:42.053 So the current inspection models are, of course, slow. You need to require a helicopter. Takes time, and yeah, it’s still very weather-dependent, and it’s very manual. And in some cases, you have to walk the lines. In winter, you have to use skis to walk the lines. And it’s very time-consuming. It’s dangerous, and it’s a very manual process. And the result, really, is handed to them in form of portable hard disks where they get like twenty thousand images that they have to look into. So what really happens is today at the utilities, they have dedicated personnel looking into these images, zooming in and zooming out, trying to identify anomalies and issues with the pictures. It’s very, very time-consuming, and of course, not only is it time-consuming, it’s vital to really fix these problems as soon as possible. In some places in the world, like in Norway, the utilities are penalized if they don’t deliver power to the end users, so it’s sort of a double loss for them. They get penalized from the state as well. So our idea, really, is could we use AI, and big data analytics, and drones to help them with these problems? And could deep learning really find problems automatically? And really, could we use drones as an eye in the sky? And most importantly, could we really predict this before they turn into a critical problem as well?
Erik Åsberg 00:25:34.944 One of the challenges with this sort of mission that we have put upon ourselves is that the utilities have a lot of pictures of their assets, but pictures of anomalies are not so common, and that is really what we need for our algorithms to identify anomalies because our machine learning needs to know that this is an anomaly. And since there are not so many images of anomalies, what we have been forced to do is really, yeah, create these anomalies ourselves. So what we did was we used gaming engines to really produce synthetic data that we can train our AI on. So we take real components, we digitize them, and we put them into the Unreal game engine, and we actually train our algorithms on the images that we produce synthetically. And this has proven to be really quite helpful. We can now find a lot of, or identify a lot of anomalies without really having a lot of real pictures. And as an extension to this work, we also, or our AI team, really, has created this game to improve the quality and the realistic look of the synthetic images. So they use GANs, so generative adversarial networks, to try to refine, really, the synthetic images. And then we have a discriminator on the other side who tries to state if this is a real or if it’s a refined image. And then this becomes sort of a cat and mouse game where the simulator continuously improves itself to fool the discriminator, who tries to really identify if this is true or false. And in that way, we really improve our image recognition.
Erik Åsberg 00:27:51.923 And really, yeah, we use a lot of frameworks to do this. We use frameworks from Microsoft’s research, from Facebook research, and from Google research. And really, you can compare this with—we like to compare this with Tesla autopilot, which is capable of identifying if this is a bicycle, if this is a person walking, if this is another car. So what we do is we do the same thing. We need to identify what type of component, or that it is a certain type of component. But we also need to not only that, but we need to identify if there is anomalies on that component. So it would be like if the Tesla autopilot had to identify not only that it’s the car, but it’s a blue Mercedes with a cracked windshield. So we like to sort of say that yeah, Tesla autopilot is good, but we need to be better. So that was the part that I had around our products, and yeah. If you have questions, please ask them afterwards, or reach out to me at the later stage. Now, we will dive into what we really want to talk about, the technology, how does Influx fit into this? What are our thinking when it comes to our data platform? So that is what I will go through next.
Erik Åsberg 00:29:19.746 So really, we have one platform to support all these products, and as I mentioned, we have a background from energy and IT. Some of us has worked with energy and IT for over 20 years. Working in sort of a traditional energy setup with a relational database in front of a Windows, or behind a Windows frontend, and really see the amounts of data growing, and you tune the databases, and the data keeps on growing, and you keep tuning it, and the data grows again, and you end up buying new hardware. And really, what we saw that this really doesn’t scale. So when we had the chance to start fresh, leave all our legacy behind, and start all over again, we knew that we had to go cloud-based. So really, but then the question comes up, “What really is a cloud-based solution?” Is it a cloud-based solution if you take your 15-year-old system, make a few tweaks, and then host it in the cloud? Some of our competitors might claim that that is a cloud-based solution. We think not. I will come back to what we mean, but really, for us, a cloud-based solution is really something that is cloud-born, designed for utilizing the flexibility and elasticity of the cloud. We spent a lot of time understanding the different technologies. And we ended up choosing Microsoft Azure as our primary platform, and the reason for that is really, at that time, we felt that it was more enterprise-ready than the competitors, and we like our choice. Azure has grown very much in the same pace as us. We’ve always found new services when we need them, and we have a very good cooperation with Microsoft as well. So this has turned out to be a very sort of fruitful cooperation, and it’s also, of course, that way that as long as we host this in Azure, when we get a new customer, Microsoft get a new customer. So it’s really a win-win.
Erik Åsberg 00:31:34.542 But really, what we came up with is the concept of a top system. That was sort of our idea. What you see on the lower side there is really a typical ecosystem within the utility. They have a lot of systems that are extremely good at what they do. The challenge is that they are very silo-based, so they focus only on their task, and that is fair. But the challenge is when the work processes start to span across these silos. That is when it’s really difficult, really, to utilize all the data that you really have because you need context from the other systems. And that’s where we come in. So we ingest data from all these line of business systems, as well as directly from smart meters, from sensors, and from drones as you’ve seen, and from external sources like weather information, even social media (we monitor social media on behalf of our utilities), and market information, really to enrich the data that they have. So it’s all about getting more value out of their existing data. And then, we have our data model inside this that we use to find correlations, to do our analytics on top of the data that they already have. So our idea really is to build this on a PaaS solution because we didn’t want to spend our time doing a lot of work on the hosting side. We want just to say to Microsoft, “Hey, we need a database,” and we need them to make sure that the database is always backed up, that it’s always patched up, that it’s always up and running. We don’t want to spend our time doing that stuff because really, we need to use all our time in making value for our customers on top of those services.
Erik Åsberg 00:33:34.022 So we try to use PaaS as much as we can. This has sort of evolved over the years as well, but what we provide is really business value on top of the PaaS services. And then we offer our data through APIs. We have our own UI that we can provide, but we have customers that really create their own UI as well. And one challenge, really, also that we need to think about is data privacy and data, of course, sensitivity. This is really critical infrastructure that we’re talking about. There are different rules in different countries. There are different regulations. And this is something that we constantly need to think about, as well as data processing agreements, having them available at any time, and making sure that we always comply with what our customers need to comply with because that’s sort of always the challenge with sort of a cloud-based infrastructure.
Erik Åsberg 00:34:38.419 So if we dive even further into the architecture, this is about how it looks. It’s not exactly how it looks, but it’s really close to what it looks like. This is constantly changing. Every icon that you see here, they represent a PaaS service that we consume. They have a certain sort of a functional fit, as well as a price. So we need to make sure that there is a very good functional fit as well as that the price is right. To give you a few highlights, we base it much on queues and [service] buses. So that is really to make the system data-driven. We want the data, really, to trigger the events, not necessarily a user input, or a schedule running in the background. But we need to really make the system data-driven. That is really possible through the cloud-based architecture, and this is what I mean by a typical cloud-born architecture. This is designed for the cloud. It can scale at every one of these icons that you see here. Every element in this architecture can scale. And as you see, we use a variety of storage technologies. InfluxDB is, of course, one of them. That is really pretty new to us. We’re looking at getting this more and more into our architecture, looking at really how can more of the InfluxDB Enterprise fit into our architecture? We have started using Kapacitor. We are looking at how we best can utilize Telegraf as well. But InfluxDB was sort of the basis for our interest in Influx, and it’s turned out to be a very good choice for us. As I mentioned earlier, it came to us through our analytics, and really, we found that this is superior to our previous solution. So that is why we now start using Influx as our primary time series source, really for everything that we do.
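To make the role of InfluxDB as a time series store more concrete, here is a minimal sketch of how one reading could be serialized into InfluxDB line protocol, the text format InfluxDB accepts for writes. The measurement, tag, and field names in the usage note are hypothetical and not eSmart's schema; the helper itself only formats a string and does not talk to a database.

```python
def _escape(value, chars):
    """Backslash-escape the given special characters in a key or tag value."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys are recommended for write performance
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    parts = []
    for key, value in fields.items():
        if isinstance(value, bool):      # check bool before int: bool is an int subclass
            text = "true" if value else "false"
        elif isinstance(value, int):
            text = f"{value}i"           # integers carry an "i" suffix
        elif isinstance(value, float):
            text = repr(value)
        else:                            # strings are double-quoted
            text = '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(_escape(key, ",= ") + "=" + text)
    return f"{head} {','.join(parts)} {timestamp_ns}"
```

For example, `to_line_protocol("load", {"transformer": "T1"}, {"kw": 80.5}, 1528819200000000000)` yields a single line that could be posted to the database's write endpoint or handed to a client library.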
Erik Åsberg 00:36:47.799 Yeah, we have a real-time interface as well, using yeah, standard components like IoT Hub and Event Hubs. And of course, machine learning is a central part. We used to do a lot of Azure ML. Now, we do not use that so much, but we have always written our own ML algorithms and plugged them into our architecture. We now move into [inaudible] and microservices very much. So what I would like to do next is really go through the principles behind this architecture. We have four principles that we sort of set down before we started. And the first one is data availability. Really, the data needs to be available, but it’s not really sufficient to say that, “Hey, let’s put up this data lake and put all our data into that one.” Yeah, you get the data available, but really what added value do you have? It’s available, but still, you need to have some sort of structure. If you still have the same sort of chaos that you had before, then it’s really hard to make sense out of the data. And one more thing is that really, you should utilize all types of technology to make your data available. There’s no one silver bullet that will solve this for you. Really, it’s about choosing the right technology for the right type of data. And that is really possible now because there is a wide variety of different technologies, as well as storage is not the cost element, really, anymore. So having duplicates is really no problem at all. So making data available, but really you need to think about the technology, and you need to think about your data structure within your sort of lake.
Erik Åsberg 00:38:48.151 The next principle is use time series. We have an extremely wide definition of time series. So of course, metering values from sensors is easy. Everyone considers that time series. We also consider the alarms coming from the IoT devices as time series. We also consider messages from customers as time series, even tweets, or even images from drones as time series. And the reason for doing this, really, is we want to establish a timeline. So when we see that this device reports an increase of this and these values as well as these tweets coming in, then this alarm happens. Then we can tell that, “Okay. Something is about to happen.” And our machine learning and analytics will really help us be better at understanding what is about to happen now because we have established a timeline with this extreme definition of time series.
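The single timeline described above, with meter values, alarms, and tweets all treated as time series, can be sketched as a merge of time-ordered streams. This is an illustration only: the `(timestamp, kind, payload)` tuple shape and the function name are assumptions, not eSmart's data model.

```python
import heapq
from datetime import datetime


def build_timeline(*streams):
    """Merge several time-sorted event streams into one timeline.

    Each stream is a list of (timestamp, kind, payload) tuples that is
    already sorted by timestamp (hypothetical event shape).
    """
    return list(heapq.merge(*streams, key=lambda event: event[0]))
```

With meter readings, tweets, and alarms merged this way, a later analytics step can look at what preceded each alarm in one ordered sequence instead of querying three separate stores.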
Erik Åsberg 00:39:54.025 The next principle is: use graphs. So graph, as you all know, is really the technology that is behind Facebook and LinkedIn, keeping track of who’s connected to who, which groups are you in, etc. We use the same thing, of course, to get an overview of the infrastructure itself, but that’s easy because that’s all the physical connections. The IoT devices do not necessarily communicate in the same way, so we have another layer in the graph that tracks the communication paths between the devices. And we have another layer saying, “Okay. How are these assets connected economically?” So do they have contracts, who has contracts with whom, etc. And in that way, we get a sort of complete picture of all the connections between the assets in so many layers. And that is also sort of very important for our analytics. When we have the timeline, and we know the connections between all the assets, that is a very good starting point.
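[Editor's illustration] A layered graph of this kind can be sketched as one adjacency map per layer. This is an assumption-laden sketch rather than the product's data model: the `LayeredGraph` class, the layer names, and the asset names are all hypothetical.

```python
from collections import defaultdict, deque


class LayeredGraph:
    """Undirected graph where every edge belongs to a named layer."""

    def __init__(self):
        # layer name -> node -> set of neighbouring nodes
        self._adj = defaultdict(lambda: defaultdict(set))

    def connect(self, layer, a, b):
        """Add an undirected edge between a and b in the given layer."""
        self._adj[layer][a].add(b)
        self._adj[layer][b].add(a)

    def reachable(self, layer, start):
        """Breadth-first search restricted to the edges of one layer."""
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self._adj[layer][node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen


# Invented assets: the same meter appears in three different layers.
g = LayeredGraph()
g.connect("physical", "substation-A", "transformer-1")
g.connect("physical", "transformer-1", "meter-7")
g.connect("communication", "meter-7", "gateway-3")
g.connect("economic", "meter-7", "retailer-X")

physical = g.reachable("physical", "substation-A")
comms = g.reachable("communication", "meter-7")
```

Keeping the layers separate means a question such as "what is physically downstream of this substation" does not accidentally follow a communication path or a contract.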
Erik Åsberg 00:41:02.308 And then, of course, it’s integration, so that’s sort of part of data availability, but it’s so important that you do not lock your data up, and that’s one thing, but you also need to sort of get hold of the data as well. And there are a wide variety of technologies now that you can utilize, as well as standards because really, it’s about using the standards that the industry provides. In the energy industry, there’s something called the Common Information Model, CIM for short, which really is sort of now starting to pick up amongst all of the different vendors in the industry. And that makes it easier for us to share data between our systems. So yeah, we have the top system concept with our four principles, which really is sort of, yeah, a cloud above their existing infrastructure. If we would look more into logically how this plays out, then really, it’s sort of what you would say, a layer between their line of business systems and their IoT devices. That’s where we sort of reside. We get the information from their line of business systems, as well as from the IoT devices. And that is where we can provide our magic.
Erik Åsberg 00:42:28.104 Just briefly now towards the end, what we do with Microsoft. So we have a very tight cooperation with Microsoft. We have had that since the beginning. It’s sort of evolved over the years, but I think the reason was that we, very early, did our architecture based on Azure, based on PaaS, and over the years, they have used us in many blogs and use cases. And if you search eSmart Systems in Microsoft, you will find a lot of stuff with us there. We’ve also had the chance, really, to talk on their large stages or at their large conferences, and we [inaudible] very closely with their core teams in Redmond, the Azure teams, the AI teams in Cambridge, and the IoT teams in London. So it’s very beneficial for us, of course, to have that sort of cooperation because we get access to the newest long before it’s really available for others. But also, we can provide feedback to them saying that this will not work. The price model of this service just really doesn’t fly. We can push them on hardware, as I mentioned. But now we talked about sort of the technical part, but it’s a very important business part as well because a partner like Microsoft is really dependent on having use cases that they can show to their customers, and that is where we come in. They can say, “Hey. We have this partner eSmart. Look what they have done.” And again, it’s a win-win because when we get a customer, they really get a customer. So it’s a good place to be in. Yeah. That was really all I had for this presentation, so Chris.
Chris Churilo 00:44:29.338 Yeah. So I actually have a couple of questions, but before we get to my questions, I just want to remind everybody if you do have a question, please feel free to throw it into the chat or the Q&A panel. If you just go in through your Zoom application, you’ll find those, and you can type your questions in. So my questions are pretty simple. So how did you first find out about InfluxDB?
Erik Åsberg 00:44:51.837 Yeah. That was interesting because we had our own solution, really, based on Azure Blob Storage. We had a few challenges, but the performance was very, very good. But then our AI team started looking at our predictions, and more predictions at scale, and we had some challenges with our existing solution. And we had done a really thorough evaluation 5 years ago, and then we found that our AI team needed to do a new evaluation of what’s out there today. And it became obvious very quickly that InfluxDB was the one that we needed. So it was really through the AI work that we did that Influx came up.
Chris Churilo 00:45:45.341 So what are you actually storing in InfluxDB?
Erik Åsberg 00:45:50.194 It is time series, really, but as I said, we have a really wide definition of time series. But then it becomes more like sort of pointers. So we do not store an image. We store the image still in, yeah, that is actually in the blob storage as well, but with a reference into Influx. So it is really a timestamp and a value, either a value in form of the pointer, or a real value [inaudible].
Chris Churilo 00:46:22.261 So that way, you can then use those two sets of data to be able to understand if there was a change in the actual thing that you’re looking at versus the image?
Erik Åsberg 00:46:34.060 Exactly. So we can then compare what really happened around that time of the image, or at the time of that tweet. Were there certain values? Were there certain alarms that were triggered at the same time? So we try to establish that timeline, and Influx is really core in that timeline.
Chris Churilo 00:46:55.439 I have a few more questions, but I’m going to let Mark Wong ask his question first. So he says, “Hi, Erik. Thanks for sharing. Can you share some details of how you run InfluxDB at scale on Azure?”
Erik Åsberg 00:47:08.279 Yeah. That is something we do then using the InfluxDB Enterprise. And we host our own clusters. And the reason for that is really—it was the choice between letting Influx host it or ourselves. The challenge with our industry is really the data privacy and things there. We need to make sure that our Influx clusters are within our data processing agreements. So we host our own clusters, really, based on InfluxDB Enterprise and yeah. It’s not a managed service. We host it on the [inaudible], and I can certainly provide more details on that one if you like, but it’s important to us that we have it within our boundaries of our system because of, really, the data privacy issues.
Chris Churilo 00:48:18.377 Mark, did that answer your question? Just let us know. So I actually have a follow-up question to that because you mentioned a couple of times that data privacy is pretty important. So what kind of—oh, and Mark says yes, it did answer his question.
Erik Åsberg 00:48:34.056 Good.
Chris Churilo 00:48:34.433 What kind of data are you collecting that is so sensitive, because maybe where is the personal information being collected when you’re talking about these grid systems? Maybe you could just articulate for us.
Erik Åsberg 00:48:48.435 Absolutely. So it really is about the smart meters. More and more people in the US as well as in Europe are getting smart meters. Some of the latest generations of smart meters can really give you a lot of information. And we get a lot of information about the usage within what is, really, a private household. We get information about the energy usage, the voltage levels, all the issues that they have. By analyzing that data, we can actually find out when the washing machine started or not. So really, we can find out a lot only by analyzing the usage, the power usage, within the household, and that’s where the privacy comes in because this is, at some point, connected to the address of that household. So if you get hold of that data, you can really find out a lot about what happens within that house.
Chris Churilo 00:49:50.098 That makes sense. I mean, I guess, at first blush, I wasn’t thinking that energy could tell a story about myself or my household, but you’re right. It can. It can actually tell you whether the person is at home or not, or how much they’re consuming, etc. So that makes a lot of sense why you have to be so careful with that data that you’re collecting. So another question I had is, so what are some of the—how do I say this? So it seems like time series data is pretty central to machine learning and AI. Maybe you could talk a little bit about that. At least, that’s my understanding. I mean, it seems like if you have the data that tells you the rate of change of something, then that is really central to being able to then take that historical data and then make predictions on it.
Erik Åsberg 00:50:43.420 Exactly. So machine learning works that way: to start training your algorithms, you really have to have a data set where the answer is contained. So you need a data set with the answer included. And now, technology makes so much data available, especially time series because it’s really sort of an easy data structure. It’s not very complicated. And it’s easy to get hold of a lot of time series data. So when we do our predictions, we use time series data in the form of previous or historic meter readings. Weather information is typically one element, as well as a calendar: whether it’s a weekday, whether it’s a holiday, etc. So those three things are sort of the basis for our time series analytics, and yeah. The availability really is the key thing here. That is why it’s become so sort of easy for us to make good predictions with the type of competence that we have amongst our data scientists.
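[Editor's illustration] The three inputs named above (historic metering, weather, calendar) can be combined into one feature row per timestamp before training. This is a minimal sketch with hypothetical names: `feature_row`, the `HOLIDAYS` set, and the sample numbers are invented, and the actual models and features used by eSmart Systems are not described in the webinar.

```python
from datetime import date, datetime

# Hypothetical holiday calendar; a real one would come from an external source.
HOLIDAYS = {date(2018, 5, 17)}


def feature_row(ts, previous_load_kwh, temperature_c):
    """Combine metering history, weather, and calendar into one training row."""
    return {
        "prev_load_kwh": previous_load_kwh,   # historic metering
        "temperature_c": temperature_c,       # weather
        "hour": ts.hour,                      # calendar: time of day
        "is_weekend": ts.weekday() >= 5,      # calendar: Saturday or Sunday
        "is_holiday": ts.date() in HOLIDAYS,  # calendar: listed holiday
    }


holiday_row = feature_row(datetime(2018, 5, 17, 18, 0), 1.4, 9.0)
saturday_row = feature_row(datetime(2018, 6, 16, 7, 0), 0.6, 14.5)
```

The "answer" the speaker mentions would be the load actually measured at `ts`, stored alongside each row as the training target.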
Chris Churilo 00:52:01.204 So you mentioned that you still had plans to potentially use Kapacitor and Telegraf—so how are you ingesting data into InfluxDB today?
Erik Åsberg 00:52:10.038 Through the standard information that is available. So actually, what we have done is we have our own services that ingest data, and we’ve just rewritten sort of the backend of that service to make it fit with Influx instead of the previous storage that we had. So we’ve just sort of, on our front end of that API, nothing has changed. So our services still run as they used to do, but the storage is different. But that is sort of a suboptimal way of doing it because you have things like the Kapacitor and like Telegraf, so we’re now looking into, okay, maybe our sort of existing ingestion service is not the way it should be since we have changed the backend. So we’re now looking into how to better, or improve that, by including Kapacitor and Telegraf.
Chris Churilo 00:53:07.799 Excellent. So just for everybody on the call, so how much data are you ingesting on a—pick your interval, day, minute. How much data is coming into InfluxDB?
Erik Åsberg 00:53:20.215 Oh, that’s a lot. So only in smart meters, you have 24 hourly values from your smart meter every day, and we collect data from, yeah, several million sensors. So that is twenty-four million a day, only by that. But in addition to smart meters, we have sensors that collect data at a much higher frequency, yeah, down to seconds. So we haven’t actually done that calculation of in 1 day, how much data do we collect, but we know it’s quite a lot.
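[Editor's illustration] The arithmetic behind the figure can be checked quickly. Assuming one value per meter per hour, twenty-four million values a day corresponds to one million meters, so "several million" meters would give a multiple of that. The function name and the meter counts below are illustrative assumptions, not figures from eSmart Systems.

```python
def values_per_day(meters, readings_per_meter_per_day=24):
    """Number of values written per day for hourly smart meter readings."""
    return meters * readings_per_meter_per_day


per_million = values_per_day(1_000_000)    # 24,000,000 values per day
three_million = values_per_day(3_000_000)  # 72,000,000 values per day

# Average write rate for one million meters, spread evenly over a day.
avg_per_second = per_million / 86_400      # roughly 278 values per second
```

Sensors reporting every second are a different order of magnitude: a single such sensor produces 86,400 values a day, as much as 3,600 hourly meters.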
Chris Churilo 00:54:00.143 All right. We have a question from Alain Castonguay. He asks, “Ingestion: does that mean you have some VMs and services doing ETL between Stream Analytics and InfluxDB in your diagram? Or did you convince Stream Analytics to target InfluxDB/the write API?”
Erik Åsberg 00:54:18.503 No, yeah. We have done sort of a shortcut now. So our ETL is done within our existing services. So we’ve just sort of exchanged or replaced the storage itself. So our ETL is still being done by the existing services that we have. I’m not quite sure if that was an answer to the question, by the way.
Chris Churilo 00:54:46.771 Alain, if you could just let us know if that answered your question. So when you’re collecting the sensor data, is it typically data that changes all the time, or I mean, is it values that are going up and down all the time, or are there values that you’re trying to see that are at a constant rate?
Erik Åsberg 00:55:07.550 Yeah. That’s a good question because this comes into the cost perspective of things because if you know that it’s the same value, nothing has changed—if we collect that data, say, every other second but it’s the same value, it becomes sort of a cost issue. So especially on the high frequency ingestion, we have rewritten some of our ingestion only to collect changes in data. So we only collect the change, while in other parts like smart meter data that needs to go to billing, it’s important to have every value present. So it differs a bit, but for the high-frequency, we only try to collect the changes. But for smart meter data that, yeah, goes to billing, we need to have every value present.
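[Editor's illustration] Collecting only the changes on a high-frequency stream can be sketched as a small filter in front of the write path. This is a generic sketch, not the ingestion service described in the webinar; the `changes_only` helper, the optional `tolerance` parameter, and the sample readings are hypothetical.

```python
def changes_only(samples, tolerance=0.0):
    """Yield only the samples whose value differs from the last one emitted.

    samples is an iterable of (timestamp, value) pairs in time order.
    A tolerance above zero also suppresses changes smaller than that amount.
    """
    last = None
    for ts, value in samples:
        if last is None or abs(value - last) > tolerance:
            last = value
            yield ts, value


# Invented readings taken every other second.
raw = [(0, 230.0), (2, 230.0), (4, 230.0), (6, 231.5), (8, 231.5), (10, 230.0)]
stored = list(changes_only(raw))
```

In this example six samples shrink to three. Such a filter would not be applied to smart meter data that goes to billing, where, as noted above, every value must be present.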
Chris Churilo 00:56:01.499 So then that billing information, do you have to keep it forever or—?
Erik Åsberg 00:56:05.574 Yeah, yeah. That depends on the regulations, really, and the agreements between the customers and ours. We are very aware that this is not our data. This is our customers’ data. And at the moment, we do not delete any data. We just keep it with a new version. So we have a complete track of every value and every data point in our system. But that really depends on the agreements that we have.
Chris Churilo 00:56:35.646 Cool. So Alain just responded and said, “Yes. Thank you. You answered my question.”
Erik Åsberg 00:56:40.138 Very good.
Chris Churilo 00:56:40.270 So for that. So we are one minute till the top of the hour. Maybe you can just share with everybody on the webinar two things. One is: What advice would you give somebody starting out with InfluxDB? And then the second and final question is: Just share a little bit about what are the plans next for eSmart Systems.
Erik Åsberg 00:57:07.001 Yeah. So for us, it was, we needed to make sure that it was a good fit when we started using Influx. We needed to make sure that we had the right technology, that it could meet our performance requirements. And then we started looking into it. But really, have a good dialogue with you guys, with the Influx team. Use the team. It’s worth the money because what we have seen, or what we have experienced, is that by just picking up the phone and having two hours with the tech guys at Influx, we learn so much more than trying to understand the details ourselves. So sort of from our experience, use the tech guys available at Influx. They are super helpful. And for our plans further, when it comes to the use of InfluxDB Enterprise, of course, looking into Kapacitor. We have started using that, but looking into Telegraf as well, and yeah. We launch our products continuously. We are now currently doing line infrastructure for Alliant Energy in the Midwest, and currently, yeah, trying to expand into the US. That’s our business plan.
Chris Churilo 00:58:37.607 Excellent. Well, thanks so much, and if anybody on the call has any other questions, feel free to shoot me a note and I’ll be happy to forward any remaining questions to Erik. Erik, thank you so much for a great review of your product, and how you use both the Microsoft Azure products as well as the InfluxDB projects. We really appreciate that, and I think I appreciate even more, really the thought and all the engineering work that you guys are doing to make these solutions just so much better for your customers and for the world.
Erik Åsberg 00:59:14.586 Thank you. I thank you so much for having me, for letting me present this.
Chris Churilo 00:59:18.925 Excellent. Well, thanks, everyone, and we’ll do a quick edit of the video, and I’ll post it, and you’ll be able to take another listen to it later on. Thanks, everyone, and have a great rest of your day. Buh-bye.