Gain Better Observability with OpenTelemetry and InfluxDB
Session date: May 02, 2023 08:00am (Pacific Time)
Many developers and DevOps engineers have learned to use their observability data to gain greater insight into their infrastructure. InfluxDB is a purpose-built time series database used to collect metrics and gain observability into apps, servers, containers, and networks. Developers use InfluxDB to improve the quality and efficiency of their CI/CD pipelines. Start using InfluxDB to aggregate infrastructure and application performance monitoring metrics to enable better anomaly detection, root-cause analysis, and alerting.
This session will demonstrate how to record metrics, logs, and traces with one library — OpenTelemetry — and store them in one open source time series database — InfluxDB. Zoe will demonstrate how easy it is to set up the OpenTelemetry Operator for Kubernetes and to store and analyze your data in InfluxDB.
Here is an unedited transcript of the webinar “Gain Better Observability with OpenTelemetry and InfluxDB”, provided for those who prefer to read rather than watch. Please note that the transcript is raw. We apologize for any transcription errors.
- Caitlin Croft: Sr. Manager, Customer and Community Marketing, InfluxData
- Zoe Steinkamp: Developer Advocate, InfluxData
Caitlin Croft: 00:00:00.000 Hello everyone, and welcome to today’s webinar. My name is Caitlin Croft, and I’m joined today by Zoe Steinkamp who’s going to be talking about getting better observability with OpenTelemetry and InfluxDB. Please post any questions you may have in the chat or Zoom Q&A, and we’ll answer them at the end. This session’s being recorded. And without further ado, I’m going to hand things off to Miss Zoe.
Zoe Steinkamp: 00:00:26.302 All right. Awesome. So today we’re going to be going over how to gain better observability with OpenTelemetry and InfluxDB. My name is Zoe Steinkamp. I’m a developer advocate here at InfluxData. If you want to reach out to me on LinkedIn or add me, feel free. As a developer advocate, part of my job is to advocate not only for our company but also for some of the other open source tech that we work with, like the OpenTelemetry project, as well as to gather feedback in general. So first things first, our agenda. We’re going to start with an introduction to OpenTelemetry, which covers logs, traces, and metrics, then an overview of InfluxDB Cloud powered by IOx — that’s the new version we just released. We’ll go over a few of the key features that make it useful for OpenTelemetry data. And finally, we have a project that you can pull from GitHub and follow along with, or you can just watch my slides, and then we’ll do a live demo at the end. Basically, this will let you hook up Jaeger, Grafana, HotROD — which I’ll go into later, but it basically creates fake traces for you, if, like us, you don’t have a server to pull from — and Telegraf. And finally, more learning resources at the end. So first things first, an introduction to OpenTelemetry: logs, traces, and metrics. Logs are records of events or messages generated by applications or systems during their execution. One thing to note here is that for the longest time, InfluxDB has been what we call a TSM engine — more of a time series metrics database. But now, with our new IOx engine, we’re able to store logs and traces too. The project that we’re going to be going over later focuses mainly on the tracing side, but we’ll be collecting logs and metrics as well.
So basically, this diagram shows where these all come from: metrics are aggregatable samples, logging is discrete events, and tracing is request-scoped. That means that when somebody clicks a button on your website, you’re likely to get a trace about it — whether they got a 400 or a 200, whether things went well or badly. Logging can still capture that button press, but traces can be uneven: it could take hours for another person to press that button, so they might not be consistent. Logs and metrics are normally a lot more consistent in that they’re being produced constantly throughout the day. So these are the three big signal types that OpenTelemetry currently tracks, and they’re very important, especially for things like deployments and DevOps. Next is a nice architecture diagram of how this ends up looking. I’ll go over the key points of the OpenTelemetry project in the next slide, but basically what they’re trying to do is bring everything into one space for you. It used to be a pick-it-yourself game of “Hey, choose whatever you want for your traces, your metrics, and your logging, and maybe some other stuff too while you’re at it.”
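The three signal types described here differ mostly in shape and cadence. A minimal Python sketch of that difference — these are illustrative records only, not the OpenTelemetry SDK’s actual API or InfluxDB’s schema:

```python
import time
import uuid

# A metric is an aggregatable numeric sample: name, value, timestamp, tags.
metric = {"name": "http_requests_total", "value": 1, "ts": time.time(),
          "tags": {"service": "frontend", "status": "200"}}

# A log is a discrete event record: a message plus severity and timestamp.
log = {"severity": "INFO", "body": "checkout button clicked", "ts": time.time()}

# A trace is request-scoped: spans share a trace_id so one user action
# (e.g. a button press) can be followed across services, however
# irregularly those actions arrive.
trace_id = uuid.uuid4().hex
span = {"trace_id": trace_id, "span_id": uuid.uuid4().hex[:16],
        "name": "GET /checkout", "start": time.time(), "duration_ms": 42.0}
```

Metrics aggregate cleanly (sum, mean), logs are read individually, and spans are joined on their shared `trace_id` — which is why the three are collected and stored somewhat differently.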
Zoe Steinkamp: 00:03:54.521 And you could use all of these different types of collectors and storage back ends, and it was very much everyone builds it how they want to build it — maybe with the tools they already knew, or whatever tool was popular — but basically people were spread out across the tech that they used for this. The idea with OpenTelemetry is that you use one type of collector, which grabs all three different types of data, so it’s more streamlined. You also know what kind of data to expect, because OpenTelemetry defines what parts of logs, traces, and metrics you’re going to be storing. That being said, you can obviously store larger amounts of data points if you have them — extra things like location-based data and such. But overall, what they’re trying to do is streamline this process. Because it can be frustrating as a DevOps engineer to, say, change jobs or change cloud providers and have to learn brand new tooling for all of the same problems. That’s not super fun, so the idea with the OpenTelemetry project is that it’s all wrapped up into one service. It’s a single, vendor-neutral binary and a vendor-agnostic instrumentation library. Basically, what these words mean is that it doesn’t really care whether you store your data in InfluxDB or in something like a SQL database — it’s relatively agnostic. The one big caveat is that “vendor-neutral collector” means any vendor can create a collector like we have, but you do have to follow their docs. So not every vendor is going to have an OpenTelemetry collector, because they have to build it themselves.
It gives you an end-to-end implementation to generate, emit, process, and export. That’s what I was talking about before: the whole thing with OpenTelemetry is that it covers everything from the start of the collection process all the way to storage, and it’s all streamlined so you don’t have to build it yourself, to an extent. But you still have full control of your data, with the ability to send to multiple destinations in parallel. Like I said before, it doesn’t care whether you’re sending it to Influx or somewhere else — or, as is actually quite common, to multiple places. Open standards, semantic conventions: that’s basically saying that vendors can easily build the data collection agent. It’s not necessarily difficult; it’s just something they have to do. And obviously a path forward no matter where you are in your observability journey. One reason I’m talking so much about this is that this project uses an OpenTelemetry collector that we built ourselves, but also that we’ve been working side by side with the OpenTelemetry project for about two-ish years now — more intensely in the past six months, I would say, but we’ve been following the project, and some of our engineers have been making commits for over a year. So now we’re going to go into an overview of InfluxDB Cloud. This isn’t going to be a massive overview; like I said, we’re going to focus on the key features that make it relevant for this type of data. First things first: the new engine is built on top of Rust, Apache Arrow, Apache Parquet, Arrow Flight, and DataFusion. Apache Parquet gives us a columnar file format, and Apache Arrow lets us build with SQL in mind. The idea is that we’re storing data in that Parquet format, which lets us integrate with more connectors.
And it also allows us to store in a more efficient file format.
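The “send to multiple destinations in parallel” idea above is expressed in the OpenTelemetry Collector as pipelines. A rough configuration sketch — the exporter name and fields follow collector-contrib conventions, but the endpoint, org, and bucket values are placeholders, so check the InfluxDB exporter docs for exact settings:

```yaml
receivers:
  otlp:
    protocols:
      grpc:            # applications send OTLP over gRPC (default port 4317)

exporters:
  influxdb:            # contrib exporter; values below are placeholders
    endpoint: https://us-east-1-1.aws.cloud2.influxdata.com
    token: ${INFLUX_TOKEN}
    org: my-org
    bucket: otel
  logging: {}          # also print telemetry to stdout, in parallel

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [influxdb, logging]
    metrics:
      receivers: [otlp]
      exporters: [influxdb, logging]
```

One receiver feeds any number of exporters, which is what makes the collector vendor-neutral: swapping storage back ends is a config change, not a re-instrumentation.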
Zoe Steinkamp: 00:07:55.225 And Arrow Flight and DataFusion give us SQL connectors. So going forward, you’ll be able to query InfluxDB — instead of with Flux, you’ll be able to query it with SQL. The transport piece is part of Arrow Flight; I’ll go into it when we get to the project. But basically, these are the key technologies that the new engine is built on top of, and you can read a ton of blogs that go into them — Paul especially loves to talk about why he built with all of these, and all of the features they offer. This is also our new architecture and deployment. Some of you probably haven’t seen the old version of this, and that’s okay. Basically, the idea is that your data sources are all timestamped data. It doesn’t matter, like I said, if it’s a metric, an event, a log, or a trace — we can accept it all, as long as it’s timestamped data. Data collection is pretty much the same, I would say: it’s mainly Telegraf and the client libraries, those are the big two. The client libraries are also currently being revamped, because not all of them can do SQL queries yet. But in the next couple of months, most of our top five will be able to, and I think in the next year almost all of them will. Then data storage and transformation. This has changed a bit: we’ve focused a lot more on collection and storage, and then on SQL queries, which can’t yet do quite as much as what Kapacitor used to offer. But we are planning to bring back certain features that Kapacitor offered, and we’re in general building out a lot more documentation and working on the DataFusion library to allow our SQL queries to do things like downsampling. And finally, data visualization and analysis.
So for this one especially, we’ve been focusing a lot more on integrations. When I do visualizations in this project, I’ll be showing you the Jaeger UI as well as Grafana. We’ve focused a little less on our own visualization library, because a lot of people just tend to use these outside tools. So, unlimited cardinality is the first big piece here. What cardinality means: back in the day, you would have, say, a trace, and a trace can be pretty large — it can have approximately 32 to 100-plus tags on it, data points attached to that value. It would say things like where the trace came from, maybe even the server it was going to — it would just have a lot of information, basically. The problem with our old DB is that we would kind of max out at about 30 or so tags before things started to get a little hairy, I suppose you could say. But now we have unlimited cardinality — okay, I’m calling it unlimited, but it’s really up to about 200 tags, which is still a huge amount more and will encompass most data. It will deal with traces and logs perfectly fine, no problem whatsoever. So we’ve solved this cardinality problem: you can now have a much larger number of fields and tags on your data source, which means you can attach more data to it — things like locations, metadata, etc. So that was a big piece, obviously. Then Native SQL Support. I will say that SQL is not necessarily a requirement for dealing with OpenTelemetry data, I’m just going to put that forward. But SQL is really nice in that a lot of people like to use it — I don’t really like to say coding in it, but for lack of a better term, they query with it, they work with it a lot. And so this allows a lot more people to be able to use this platform and query their data back out.
Zoe Steinkamp: 00:11:52.407 And actually, a lot of the integrations that we’re working on, things like Power BI and Tableau, expect the data to come back via SQL — they expect to be able to query in SQL to get data out of databases in general. So that’s another key thing that will help in the future: now that you can query with SQL, you’ll be able to integrate with more features, more integrations, more vendors. High-performance data ingestion — this has always kind of been the case, I would say; we have always been pretty good at handling high ingest rates and query loads. But it’s gotten even better, and within the next couple of months we’re going to have a lot of data coming out about this, basically showing off the capabilities of the new engine — we’re going to do a comparison of it versus the old one, so we can show where it shines and how it’s making things a lot faster. In general, this is a good thing because, again, OpenTelemetry data tends to come in at a pretty high ingest rate. Like I said, traces are not always a consistent signal, but logs and metrics definitely are — they’re noisy; you’re going to get them every single second, and sometimes traces can be the same, especially on a very busy website. Seamless integration with observability tools — this is also what I was already talking about: SQL allows us to integrate more, but we’re also working on more integrations. The biggest one is OpenTelemetry, outside of pandas and a few other data science tools we’re also working on. We also have integrations with Grafana, and we work with tools like Jaeger — we don’t necessarily have what I would consider a full integration there, we don’t work with them as closely, but it’s easy to work with the technology.
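To give a feel for what querying trace data with SQL looks like, here is a runnable sketch using Python’s built-in sqlite3 in place of InfluxDB. The `spans` table and its columns are hypothetical stand-ins, not InfluxDB’s actual schema, but the query shape carries over:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE spans (
    trace_id TEXT, service TEXT, name TEXT,
    duration_ms REAL, time TEXT)""")
conn.executemany(
    "INSERT INTO spans VALUES (?, ?, ?, ?, ?)",
    [("t1", "frontend", "GET /dispatch", 120.5, "2023-05-02T08:00:00Z"),
     ("t1", "redis",    "GetDriver",       8.2, "2023-05-02T08:00:00Z"),
     ("t2", "frontend", "GET /dispatch", 310.0, "2023-05-02T08:01:00Z")])

# A typical observability question: which services are slowest on average?
rows = conn.execute("""
    SELECT service, COUNT(*) AS spans, AVG(duration_ms) AS avg_ms
    FROM spans GROUP BY service ORDER BY avg_ms DESC""").fetchall()
for service, n, avg_ms in rows:
    print(service, n, round(avg_ms, 1))
```

This is the point of the SQL support: aggregations like the one above work the same way in any tool that speaks SQL, which is what unlocks integrations like Power BI and Tableau.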
So now I’m going to go ahead and get into the project. First things first, for those of you who want to grab it, I would just grab it now — I’m trying to remember if it’s linked at the end of this, and I cannot recall, so grab this link now just to make life easy. Basically, this is a GitHub project where we store OpenTelemetry data in InfluxDB. We’re running it with Docker, and as I said before, HotROD basically allows you to create fake traces, for lack of a better term — it creates fake traces for us to use. I’ll show how it works. And then, after the OpenTelemetry data is up in InfluxDB, we go ahead and show it off in the Jaeger UI, and we’ll show it off in the Grafana UI as well. Parts of this can technically be replaced with Telegraf. One thing to note is that we’re currently in the process of getting our OpenTelemetry collector onto the official OpenTelemetry collector list. We’ve been working on this for a while — it hasn’t necessarily been the most straightforward thing, but our engineers are working out the minor kinks, and then they’re going to finally put it up on the OpenTelemetry project. So, as I mentioned before, HotROD creates, as it calls them, rides on demand. It’s creating traces and other such data. We’ll see this when I actually pull the project up, but you basically click on these buttons at the top and it creates traces for you. And the idea here is that these are different websites, different services, so you can compare traces from multiple different sources.
Zoe Steinkamp: 00:15:36.801 So Jaeger is where you can actually start to see these traces. So Jaeger is a completely open-source library that allows you to visualize trace data, for lack of a better term, and it also gives your some pretty awesome features like these kind of trees and such that can tell where your traces are coming from, where you’re getting the most common amount, etc. I’m not going to go in depth on the Jaeger UI just because honestly, it’s just not my forte. But for those of you guys who want to check it out, obviously, this project is using it, and we will go over it briefly as well during the demo. But basically, Jaeger is a super awesome open source library for trace visualization. Grafana. So we also built our own Grafana dashboard for this project. So with this, you’ll be able to see, again, a lot of traces data. We also have a table or two, I think, dedicated to logs and metrics as well. But again, like I said before, this project does focus a little bit more heavy on the traces, but it’s not a super big deal to just start building out some Grafana dashboards about the logs and metrics as well. I’d say, if anything, that’s probably the very much straightforward thing to do. And we have a brand new plugin with Grafana, actually, which will allow you to query in SQL. So that’s a super fun and exciting thing. And in the docs for this project, we talked about how you actually hook up with Grafana using our new — we call it the fight SQL integration. And you’ll also be able to see your data within InfluxDB Cloud. So this is where you’re actually storing that data. So you go ahead and you — for those of you guys who haven’t actually checked out the new cloud product, you haven’t seen this before, basically what we’re doing here is we’re getting into our OTEL bucket, which by the way is in the project with the dashboard. Sorry, the database is called a bucket, and we called it OTEL. From there, we’re asking for the measurement of logs. 
You can see all the different measurement options that we have here. Again, this is all based on how OpenTelemetry tends to expect your data to be stored, and I think we’ve done a little bit of editing on our side as well. Basically, you can see those results here in this table — all the logs that are available. This is a very small screenshot, but you can scroll right and see more data. Again, when we actually get into the demo, it’ll all wrap itself together. So this is a very wordy slide that basically goes over everything inside the readme for the project, but I’m just going to walk us through it. Obviously, you’ll need an InfluxDB Cloud account, because that’s where InfluxDB 3 currently lives. You create two buckets, OTEL and OTEL Archival. One thing to note: the OTEL Archival bucket is optional — the idea with it is to show a cold storage option, with a longer retention policy. Personally, I just do the OTEL bucket. You create an environment file with authentication credentials. You install Flight SQL as per the readme — Flight SQL is what allows you to query your data. You build and run the Docker images as per the readme. You import your dashboard with the JSON that we have provided inside the demo Grafana dashboards — I’ve already done this in my project, but I’ll show you in the readme. And from there, you can create your fake traces by clicking on a customer in the HotROD application. As for the Grafana setup details, those are best read right in the readme, but basically, when you set up Grafana — because you’re setting up a local host, open source version of Grafana — just make sure when you set your credentials that you make them nice and easy, because these are not public.
And when you go ahead, you’ll want to import this dashboard. Again, this is best explained within the readme, but basically you’re going to pick the Flight SQL integration that is now offered, as well as Jaeger, name it OpenTelemetry, and then upload that JSON file.
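The environment file mentioned in the setup steps is just KEY=value pairs. A minimal, runnable sketch of loading one in Python — the variable names and values here are hypothetical placeholders; use the ones the project’s readme actually specifies:

```python
import io
import os

# Contents you might put in the project's .env file (values are placeholders).
env_text = """\
INFLUX_URL=https://us-east-1-1.aws.cloud2.influxdata.com
INFLUX_TOKEN=my-secret-token
INFLUX_ORG=my-org
INFLUX_BUCKET=otel
"""

def load_env(stream):
    """Parse KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

creds = load_env(io.StringIO(env_text))
os.environ.update(creds)  # visible to this process and anything it launches
```

Keeping credentials in an env file rather than in the compose file means the token never gets committed alongside the project sources.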
Zoe Steinkamp: 00:20:00.798 So let’s go into the demo side and take a better look at this — we’re going to do some live coding. All right. Give it a second here; it’s going to go ahead and get Docker up and running. Great. And you can see, obviously, we’ve got some Jaeger, some Grafana. I’m going to move this Zoom thing out of my way. All right. So this is the project — well, specifically, this is a version of the project that my coworker built, forked from InfluxDB observability; we have the main one too. This is the project that, hopefully, you’re following; if not, now’s a great time to grab that link. Here it talks about the credentials you’ll want to add inside the environment file. It’s pretty straightforward — I’ll show where you actually get these from, but it’s your URL inside of Cloud, your token, your organization, your bucket, and, again, that archive bucket if you want it. Then, finally, build the needed Docker images, which I’ve already done — that’s why I only had to run the Docker Compose file. And then you’ll be able to see traces generated by HotROD: browse to HotROD at localhost 8080, query the traces here, and Grafana is available on localhost 3000. So let’s go ahead and check this out. I’ll probably have to refresh some of these, because I’m almost certain they’re a little bit on the old side. All right. We’re going to go ahead and click Rachel’s Floral Designs, and we’re going to refresh all of these pages. Let’s find some traces here. It’s found us two traces. Awesome. We’ll go back to HotROD and maybe get one more. And we’ve got three traces — I guess it actually did keep my old trace. That’s interesting. Let’s see what we can find on the front end. Yeah, like I said, this is not, per se, my forte.
Let’s see here. That’s weird. I thought it would actually do it for me. Oh, there we go. As you can see, we’re getting some traces on the front end. A lot of them are currently going through Redis, and a few through MySQL. So this is a nice little system architecture here. You can also compare traces, if you can actually find them in here. I don’t necessarily want to deal with grabbing trace IDs and such, but basically, this is how the Jaeger UI will look once it’s up and running. It’s pretty straightforward. For those of you who are more comfortable with it, I’m sure you know exactly how you would want to find your data and sort through it. But let’s go ahead and load into Grafana. Okay. Let’s see here. Yeah, there’s my OpenTelemetry one. Oh, dear. Looks like my data might be loading in a little bit — give it a couple seconds. Looks like it might be having issues with the data source. Well, this is live coding for you guys: it was working 30 minutes ago, and of course now it’s not as happy. But as you can see, we’ve got some of our traces down here. I’m not quite sure why this is not finding its data. I’m going to refresh this page one more time — you never know. Okay, maybe not. Sorry, guys. I don’t want to do a bunch of debugging while we’re looking at this, but normally, with this dashboard, you would be able to see all these wonderful graphs, as I showed in the screenshot. I am trying to figure out why they are not working.
Zoe Steinkamp: 00:24:06.188 Let’s see if it’s happier if we change it to Jaeger. Yeah, that did not help at all. So we can see our service latency histogram, and we can see the traces that we’ve already created. We’ll go ahead and create a few more traces while we’re at it, because I’m hoping that maybe eventually — let me go ahead and reload it. There. We’re getting more traces, but unfortunately these are not loading in. I don’t know if maybe I need to reload my dashboard since I’ve restarted this, but obviously my dashboard did reload just fine. I’ll take a look at this — it’s probably me; probably less to do with the project and more to do with me doing something wrong. And then we’ll go ahead and go to Cloud. I’m going to log in here. Awesome. So in here, we should be able to view data from my OTEL buckets. Yeah, there we go. Go ahead and look at logs. We can filter by fields, like their attributes and their names, but for right now we’ll just go ahead and run with this. So as you can see, I have my logs here — quite a few rows of them, actually — and I’m only querying the past hour, since they’ve all been created in roughly the past five minutes or so. And as I was trying to say before, you can scroll this over a bit so you can see the trace ID that goes with each log. And this really can be visualized — you have the option of visualization here — but, for example, this doesn’t really work for visualization. We can see if calls total sum might work. Yeah, no. I would say that most OpenTelemetry data is a little more specific about the visualization libraries you need to use for it. Things like Jaeger or Grafana are much better options than just the graphing that we offer.
Our graphing is more to make sure that you actually have data inside your DB, and I would say the table does a pretty good job of showing that, yes, clearly we have some data here. It’s a shame that our OpenTelemetry dashboard is just not quite working the way it was 20 minutes ago. I think it wants to be on Grafana; I don’t know why it’s so upset. That’s okay. Yay — by me clicking on it, it worked; I didn’t do anything. See, guys, this is magical. So, yeah, this is all of our services running. You can see this scrolls quite a bit, because even in the past two minutes we’ve got quite a few services. See, if you just click around enough, you can fix everything. That’s how this works, right? Well, maybe not everything. Now it’s just saying “color field not found.” I don’t even know what that means. What do you mean, color field not found? I do not think this is what I wanted, but that’s okay — we’ll allow it. Let’s see if we can get this one fixed. That’s not helping. It’s funny in that I don’t even necessarily apply things and it still manages to fix itself. “Not a lot of data found in response.” I was hoping the traces of all things would work. Oh, well. I at least fixed two things by just messing around with them. Like I said, this is just part of live coding — occasionally things don’t quite work as expected. But again, over here you will find that you can find the dashboard to import under this file here.
Zoe Steinkamp: 00:28:00.320 So it’s pretty straightforward — it’s just this demo folder, and then Grafana, and from there you’ll be able to find all of the charts to grab. You can already tell that Docker’s having an adverse effect on my computer. If you would like to access the trace node tree, make sure it’s enabled with the Jaeger data source — which, as far as I know, I’ve done. The images are automatically built and pushed. Yes, that’s right — these images, this OTel collector, are automatically built and pushed to Docker. You can check these two out if you want, but basically they just describe where the files are held. And then this talks more about the OTel Influx collector, which, like I said, we’re currently working on making more widely applicable, I guess is the right word. Eventually this collector and receiver are both going to be available on the OpenTelemetry project; we’re just wrapping them up and putting a bow on them so they’re ready to go. There’s also the Jaeger Query plugin for InfluxDB, which enables querying stored traces via the Jaeger UI. And if you want to run some tests, you can do that as well. So, like I said before, this is a fork of this original project here, done by one of our engineers, Jacob Marble, who has been working a lot on the OTel projects. You can check this out as well — it builds the files a little differently, in the way he does his authentication and such. Oh, speaking of which, actually, real quick: here is where you get your URL. It’s this piece of the URL — US East is where I’m currently working out of. If you want your bucket ID, that is available right here. Really quick, I’m going to see what the final piece was. I think it’s just the token. Influx org, okay, and token. So your organization is this one right here. You can also find it within — let me see here.
Settings, Orgs — so yeah, you can also find it in here; sorry, I was looking for it. Basically, you can find it in there or just grab it out of the URL. And then, when it comes to API tokens, you’re going to want to generate them over here. Quick note: if you create an all-access token, it gives somebody access to everything in the UI — sorry, all of your buckets; it gives them access to absolutely everything, so we warn against doing this. Otherwise, you can do a custom API token. With that — for example, for the OTEL one — do make sure that you give your bucket both read and write permissions. Because otherwise we can’t write the data in, and you can’t get it back out to send onward to Grafana or Jaeger or anything like that. So just make sure that when you create your token, you give it both. And as far as creating buckets, it’s pretty straightforward: you just go over here, give it a name like OTEL 2 or something, put in your preferences for when data should be deleted, and then create it. Very straightforward. So really quick, I’m going to turn off Docker, just because it sometimes seems to mess with my slides — going to turn this project off. There we go; we’ll put that down at the bottom. So, learning resources. This is a try-it-yourself. The Influx community is where that project lives — that’s where both observability projects live: the one that Jacob created, and the one that my coworker Jay created, which provides that Grafana dashboarding. And then influxdata.com is obviously our main website. It’s where you can sign up for the Cloud account, as well as, in general, find more information about us — the things we offer, what we’re used for, etc. Further resources here: things like getting started. Influxdata.com/cloud is always a good start.
The community forums and Slack are both places where you can come to us for questions and answers. This is, again, the link to the Influx community. In general, on GitHub you can also find, obviously, all of our open source projects, all of our other libraries, et cetera. Our book and documentation talk more about how to get stuff up and running, and why we do what we do. Blogs are a great resource to see new features that are coming up, as well as use cases and what our customers are doing. And finally, InfluxDB University is a great learn-at-your-own-pace resource that’s completely free. So you can take a class there on something that you want to learn about and take your time doing it, and it’s completely free. And that is the end of my presentation, so really quick I’m just going to put us on that QR code and then we can go ahead and take some questions. All right. There we go.
Caitlin Croft: 00:32:50.976 Perfect. And also, Zoe, I threw in that GitHub link into the Zoom chat. So if you guys don’t have your phone ready to scan it or you want it on your computer, the link is there. And also, I threw in a link to an upcoming webinar that I mentioned earlier. It’s with Gary, our Product Manager. And so you can come and learn all things InfluxDB 3.0. I’m sure you guys have lots of questions around that, so. All right. Now, let’s jump into the questions. Could OpenTelemetry be a replacement or an enhancement to my current Telegraf setup?
Zoe Steinkamp: 00:33:33.635 So, they are not per se replacements or enhancements. Telegraf itself is an open source ingestion agent, which means it’s pretty wide as to the use cases it can be used for, versus the OpenTelemetry Collector that we’re currently working on, which is just for the OpenTelemetry project. So it focuses entirely on those traces, logs, and metrics. Now, you can get that data with Telegraf — kind of like how it says here, parts of the collector could be replaced with Telegraf. But Telegraf is not technically a part of the official OTel Collector list, and I don’t think it ever would be. Maybe one day it will. So, if you’re using Telegraf for something, the OpenTelemetry Collector just might be the replacement if you’re dealing with OpenTelemetry data, but it would probably never necessarily be an enhancement per se, if that makes sense. They’re just different use cases, for the most part.
Caitlin Croft: 00:34:26.469 Regarding cardinality, what was the previous limit? And what is the new limit in InfluxDB 3.0?
Zoe Steinkamp: 00:34:35.262 Well, previously, the limit was based on the way that we indexed and queried data back out. So, normally you would have a timestamp, a value, fields, and a tag. Tags were the ones where you would run into problems, specifically if you had high-cardinality tags, because we indexed those tags. So, if you had tons of unique tag values attached to a value and a timestamp, things would get really slow, because we were pre-indexing them on the back side, basically, to take into account that you had all these series. Fields, you could have a lot more of, because we weren’t indexing them in general. But that wasn’t helpful if you actually needed to filter on the data. It’s not helpful to throw all your data, basically, in fields and be like, “Okay. Well, it’s all stored, but now I can’t search efficiently on it.” I needed this data indexed so I could actually search back. So I could be like, “I only want the location of my server in the Netherlands. I don’t want all the server information from Sweden” kind of deal. So now, with the new limits, it’s more like 200 columns. So now you can have around 200 fields, basically. And I think nowadays — I’m still learning a little bit about how we redo our schemas nowadays — but basically, it’s less separated between fields and tags, and more just, in general, timestamp, value, everything else. And that everything-else category can have around 200 values before things start to get a little hazy, I suppose you could say. And even then, I think you could push it — you could go above 200 — it might just have some effect on the queryability, that speed.
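To make the tags-versus-fields distinction above concrete, here is a small, hypothetical Python sketch that builds an InfluxDB line protocol point by hand — tags and fields are just the comma- and space-separated sections of the line. The measurement and key names are made up for illustration, and real client libraries handle escaping and types more carefully than this.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build a single InfluxDB line protocol point.

    Format: measurement,tag1=v1 field1=v1,field2=v2 timestamp
    Tags are the filterable dimensions (e.g. location); fields hold the measured values.
    """
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(
        f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}"
        for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"

line = to_line_protocol(
    "server_metrics",
    tags={"location": "netherlands"},  # dimension to filter on, like in the example
    fields={"cpu_load": 0.72},         # the measured value
    timestamp_ns=1683000000000000000,
)
# line == "server_metrics,location=netherlands cpu_load=0.72 1683000000000000000"
```

Every distinct combination of tag values creates a new series, which is why high-cardinality tags were the pain point in older versions.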
Caitlin Croft: 00:36:17.043 Will IOx and its support for OpenTelemetry ever be added to InfluxDB OSS? We would like to use cloud, but we have certain dev and production use cases which make cloud unattainable in about 10% of our deployments.
Zoe Steinkamp: 00:36:36.487 So with that, we are currently, as a company, as a whole I would say, figuring out what we’re going to be doing as far as OSS with IOx. So with that, I can’t really tell you, unfortunately. I do believe that it will eventually reach open source, honestly. But when it comes to support for OpenTelemetry, the OpenTelemetry Collector itself will be open sourced, because that’s what OpenTelemetry is in general — it’s all an open source project. Now, from what I understand, because the collector is just something that pushes data into Influx, there shouldn’t be an issue where you could still use it for the OSS. Now, don’t quote me on that — I’d have to actually go and test it and determine that that’s the case. But in my mind it makes sense that that shouldn’t be an issue. Because the only issue that you currently might run into between version three and the open source 2.x is the fact that we have that SQL support. And that’s more of a problem when you’re getting the data back out, less of a problem with the data in. And this OTel collector is going to be, for the most part, all about streaming the data in. Obviously, we do also have the OTel receiver — the one that goes the opposite direction. And that’ll have to be kind of figured out, basically. But unfortunately, I don’t have a great answer for this right now. We should have an answer for this question, though, and in general the questions around open source, in the next month or two, I would say.
Caitlin Croft: 00:37:54.906 Yeah. I would definitely say keep an eye on our blogs and everything as we are continuing to roll out. We have a lot planned for the rest of this year as far as the rollout of all these new features, so. I know this sounds vague but stay tuned. And what about InfluxDB enterprise on-prem version, will it have OpenTelemetry support?
Zoe Steinkamp: 00:38:21.397 I have absolutely no idea. I want to say hopefully yes it will, but I know less about the engineering plan for on-premises, unfortunately. I’m not quite sure what will be added from InfluxDB IOx. I know there have been definite talks that, yes, IOx will basically be integrated on top of Enterprise to an extent. And so it will reap most of the benefits, I guess you could say. And so I think that’s the hope — that on-prem Enterprise will still receive a lot of these great benefits that IOx has to offer, including things obviously like OpenTelemetry.
Caitlin Croft: 00:38:57.285 What is the protocol used by OpenTelemetry? Is it gRPC?
Zoe Steinkamp: 00:39:03.336 Yes, it is. Actually, I have to admit, I have a second laptop here and I looked this up because I saw this question. So it says here, “The specification defines how OTLP is implemented over gRPC and HTTP/1.1 transports.” You can go and check out their docs, which are quite large, about the protocol details. That would be my suggestion. But yes, it does appear to be built over gRPC.
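As a quick reference on the transports just mentioned: by convention, OTLP over gRPC listens on port 4317 and OTLP over HTTP on port 4318. The tiny helper below is just an illustrative sketch for building those endpoint URLs; the function name and plain-HTTP scheme are assumptions for a local setup, not part of any official SDK.

```python
# Conventional default ports for the two OTLP transports
OTLP_DEFAULT_PORTS = {"grpc": 4317, "http": 4318}

def otlp_endpoint(host, transport="grpc"):
    """Return the conventional OTLP endpoint URL for a host and transport."""
    port = OTLP_DEFAULT_PORTS[transport]
    scheme = "http"  # a local collector often runs without TLS; use https in production
    return f"{scheme}://{host}:{port}"

print(otlp_endpoint("localhost"))          # http://localhost:4317
print(otlp_endpoint("localhost", "http"))  # http://localhost:4318
```

This is handy when wiring an SDK or a collector receiver to the right port for the transport you chose.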
Caitlin Croft: 00:39:29.235 What is the difference between AWS Timestream and InfluxDB?
Zoe Steinkamp: 00:39:35.189 So the big difference between us, and I do have to admit I don’t look at AWS Timestream super often, is that we do have a lot more, what’s the word here? Availability, options, features. There we go, that’s the right word: features. We tend to have a few more features than Timestream. And because we are more focused on being friendly, I suppose you could say, to the open source community and environment, we tend to hook up to things a little bit better — we have a dedicated Grafana integration. We have dedicated integrations with lots of different open source vendors and non-open source ones. Obviously, Grafana has a closed source offering too. And so that is one thing. The other thing is that we’re actively, obviously, working on this project. We wouldn’t have IOx here today if we weren’t actively working on things. And from what I understand, Timestream is a lot more of, for good or for worse, a consistent product. It’s not necessarily being actively improved or doing lots of fun, new things, but it stays consistent. That’s the best I can do because I haven’t looked at it recently.
Caitlin Croft: 00:40:36.803 And I will say this. Amazon has so many products under their suite, under their umbrella, that it’s always mind-boggling to me, personally, when I look at AWS and realize all the new products that they’ve come out with. So, obviously, I’m biased towards InfluxDB - so is Zoe - but we like to say that InfluxDB is purpose-built for time series data. That is all we do. All we do is work on our time series database and all the different components of the platform. It takes a lot to build a database, and we have an entire engineering team working on it. And I will say this: there are also other tools out there that you can use for time series, but they weren’t necessarily engineered for that. So they can’t handle the really high ingestion that is natural with time series data, especially when you’re starting off. And also, we don’t have any external dependencies. There are other time series tools out there that are built on top of Postgres and other platforms, which can slow them down a little bit. And on that, someone just asked, how does InfluxDB compare with Prometheus? And Prometheus is great. I know a lot of people who love Prometheus as an observability or DevOps monitoring tool. However, it can’t scale out. From what I’ve seen, it’s really great for smaller-scale projects. Zoe, is there anything else you’d like to add to that question?
Zoe Steinkamp: 00:42:11.749 So the big thing with us versus Prometheus is we also offer that cloud offering. And from what I understand, Prometheus has an enterprise offering, which can scale out quite well, but I think that’s more of like an on-prem solution for them. And then they have their open source, but they don’t really yet have that middle in-between where there’s the cloud option.
Caitlin Croft: 00:42:32.009 Is it possible to use Flux with InfluxDB IOx, or do we need to switch to SQL?
Zoe Steinkamp: 00:42:38.266 So this is my fault, and I’m sorry, guys, for not saying this a little bit better. Here, let me go back here. So yes, InfluxDB IOx also supports InfluxQL and Flux. We are taking backwards compatibility very seriously. We’re actually currently working on InfluxQL with that backwards compatibility. Apparently, Flux wasn’t such a big deal to do, but InfluxQL is proving a little trickier, and our engineers are currently working on it. So yes, you can still use Flux if that’s your preference. The only thing is — if you’re already using Flux, don’t worry, don’t be afraid, don’t run away. But we are also trying to support possibly newer users who rely a little bit more on SQL. But yes, if you’re using Flux, don’t worry about it. You can still use it.
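To illustrate the query language options just mentioned, here is a hedged side-by-side sketch of roughly equivalent queries — one in Flux and one in SQL — over the last hour of a hypothetical `cpu` measurement in a hypothetical `otel` bucket. The bucket and measurement names are made up, and exact syntax depends on your schema and InfluxDB version.

```python
# Roughly equivalent queries over the last hour of a hypothetical "cpu" measurement.

flux_query = '''
from(bucket: "otel")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu")
'''

sql_query = """
SELECT *
FROM cpu
WHERE time >= now() - INTERVAL '1 hour'
"""

# Both target the same measurement; only the query language differs.
for q in (flux_query, sql_query):
    assert "cpu" in q
```

The practical takeaway is that backwards compatibility means the same data remains reachable whichever language your team already uses.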
Caitlin Croft: 00:43:21.467 Cool. So there’s another question around this, Zoe. Does OpenTelemetry replace Telegraf for network observability or monitoring?
Zoe Steinkamp: 00:43:33.243 Let me really — one sec, I’m looking something up. So the answer is possibly. If you’re currently using Telegraf to get your network observability and monitoring data — and obviously, Telegraf has a lot of different options. You could be doing something like getting it from — I’m trying to remember the exact Telegraf plugin, but we have ones that connect to AWS and such where you can get all of your cloud data, as well as a few other integrations that we offer. Yes, it might be a replacement, but it might not be. Because do remember that the OpenTelemetry collector and receiver and all that — that is within the OpenTelemetry project. And although I think it’s super great and the project’s super great, just remember, it does mean that you’re tying your horse to that OpenTelemetry project and you’re now within that ecosystem. And that might not be okay for your company. Or, honestly, if you’ve already done the work and you like what Telegraf is doing for you — if Telegraf is working perfectly fine, do not feel the need to go make yourself extra work if Telegraf’s doing what you need. This is meant for people who always wanted this connector, or are looking for this connector as they’re building out their observability — I’m going to call it platform — as they’re building up their observability solution. They might be able to use this collector. But if you’re perfectly content with what you have, don’t worry about it.
Caitlin Croft: 00:44:56.430 Can you talk a little bit more about the client library support for InfluxDB 3.0? We’ve already seen cases in the past where the client libraries for InfluxDB are moving to community-only support.
Caitlin Croft: 00:46:13.221 Will the InfluxDB ecosystem with Telegraf and Kapacitor be updated as a whole?
Zoe Steinkamp: 00:46:21.109 Telegraf, to be honest, lives in its own world. Telegraf is always being updated because it’s open source. And again, because — I mean, the output agents are definitely being updated, because they’re being updated for SQL or we’re just making new ones as well, because it’s one of those things where you can just keep creating output and input agents. Kapacitor is in a state of flux, I suppose you could say. And I have to admit, I don’t really know the answer for that one.
Caitlin Croft: 00:46:50.769 I think the team — if people have been following along with the company for a while, you know that we’ve been really putting a lot of our engineering time and effort into making InfluxDB Cloud even more robust, especially with the new storage engine, which we’ve called IOx, and everything else. So I don’t think Kapacitor has been worked on. It’s still there, but it just hasn’t been a focus of the team. How can —
Zoe Steinkamp: 00:47:18.792 Yeah, I would agree with that.
Caitlin Croft: 00:47:20.304 Oh, sorry. Sorry, Zoe. That being said, there’s so many other ways that you can do real-time alerting. I wouldn’t be too worried that — there’s plenty of other options out there. How can you monitor SNMP devices with OpenTelemetry?
Zoe Steinkamp: 00:47:42.554 Let me look this up really quick. I actually think we might have an SNMP Telegraf plugin, weirdly enough.
Caitlin Croft: 00:47:52.574 I think we do. Because I know it’s come up in the past.
Zoe Steinkamp: 00:47:59.321 Yeah. So currently OpenTelemetry doesn’t have anything in particular about SNMP. There are some people who are writing blogs about how to do it, though. It looks like OpenTelemetry has a receiver that you can use for this, so that would probably be your best bet. I’m really quick looking something up here.
Caitlin Croft: 00:48:24.186 I did throw in the SNMP agent protocol monitoring integration. So you get to make out —
Zoe Steinkamp: 00:48:31.418 That’s what I was about to say. Yeah. So that would be — that’s a Telegraf plugin, and that would be a great way to do this. It doesn’t really look like the OpenTelemetry project is focused on SNMP in particular. It looks like you can do it, for sure, because it’s technically agnostic, so it can fit anywhere. But if you want something that’s a little more specific, I would check out the Telegraf SNMP agent that is available.
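For reference, here is a minimal sketch of what the Telegraf SNMP input plugin configuration can look like. The agent address, community string, and OID below are placeholders for illustration, and the full option set lives in the plugin’s documentation.

```toml
# Minimal inputs.snmp sketch -- agent address, community, and OID are placeholders
[[inputs.snmp]]
  agents = ["udp://192.168.1.1:161"]
  version = 2
  community = "public"

  [[inputs.snmp.field]]
    name = "uptime"
    oid = "RFC1213-MIB::sysUpTime.0"
```

Telegraf would then forward the polled values to whatever output you configure, such as InfluxDB.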
Caitlin Croft: 00:48:57.218 It’s a bit of a mouthful having the N and M next to each other. I always miss one of the letters.
Zoe Steinkamp: 00:49:02.819 Yeah. When I searched it, I missed the N, but it still came up with what I wanted. But I was like, “Oh, I think there’s an N missing here.”
Caitlin Croft: 00:49:10.207 It’s like neither of them are silent. Okay. How would I use OpenTelemetry — oh, sorry. How does OpenTelemetry use Telegraf on InfluxDB for network observability and monitoring functionalities?
Zoe Steinkamp: 00:49:32.205 So just to clarify here, OpenTelemetry doesn’t use Telegraf. They’re not the same. Like I said, you could see them as competing in the fact that they do similar things, but neither one of them makes any money, so they don’t totally compete against each other. But they are not working together. The OpenTelemetry collector might remind you a lot of Telegraf collectors — to be honest, that’s kind of what it reminds me of, at least. But Telegraf is normally more specific about the — what’s the word here? The product, the agent, the thing that it’s connecting to. The fact that we have an SNMP agent in particular — that’s all it does. It just collects data from that device. Or when you’re monitoring your cloud stuff, it’s specific to AWS. It’s specific to GCP. It’s very specific in what it works with. Now, there are 300-plus plugins, so the options are quite large, but the OTel collector is meant to work on anything. So that’s meant to work whether you’re a front end chocolate shop or an AWS server infrastructure. It’s meant to work at a large scale or a small scale, on a front end versus a back end. They actually have a lot of architecture diagrams based on whether you’re building out a data science platform or a front end shopping platform. So they’re a lot more agnostic in that way, versus Telegraf, which is a lot more specific about what the plugin is going to do. But one thing they do share in common is they both define the type of data you’re going to get back. So Telegraf normally tells you all the data points you’re going to receive. And normally you can kind of edit it and say, “I don’t want these things,” or, “I want these things,” or whatever. OpenTelemetry does a similar thing where it says, “This is what you’re going to get back for your traces, logs, and metrics. This is the standard that we have set for you to receive this data.” So in that way, they are similar.
Caitlin Croft: 00:51:19.998 How does OTel handle NetFlow data?
Zoe Steinkamp: 00:51:24.607 Let’s see. Let’s see. Sorry, I’m looking it up on my other computer. Let’s see. What do we got?
Caitlin Croft: 00:51:44.650 I love it.
Zoe Steinkamp: 00:51:45.465 They don’t have anything — let’s see. They don’t have anything in particular for it. Again, I think because OpenTelemetry is so broad in its use, they don’t have anything in particular for NetFlow data. [silence]
Zoe Steinkamp: 00:52:11.598 So yeah, for those of you that don’t know, NetFlow stands for network flow data. So I mean, it’s kind of similar, I suppose, to the logs and traces. But I have to admit, I don’t think the OpenTelemetry project is necessarily for NetFlow data. I don’t want to say that definitively, though, because, in all fairness, I don’t work for OpenTelemetry. I’m very familiar with the project; don’t get me wrong. But I’m not a part of their board meetings or anything, so I don’t necessarily know what they’re focused on, use case wise. But I don’t think that’s really what they would be for. But you could explore a little bit further. Maybe ask a question in their community, and they might be able to get back to you.
Caitlin Croft: 00:52:56.097 Awesome. Wow. So that was a lot of questions for Zoe. So if anyone has any other questions, we’ll just keep everything open here for another minute. Zoe, thank you for that presentation and handling all those questions like a champ. I know some people were already asking about the recording and the slides. They’ll be available by tomorrow morning. So what’s really nice is you can basically just go to the same page that you registered for the webinar on, and by tomorrow morning it will change over and the recording and the slides will be made available. And once again, go bug Zoe in the Slack channels, in the workspace. She’s in there. There’s a bunch of other developers on our team in there answering questions, so don’t be shy. And once again, I hope to see you guys on the InfluxDB 3.0 webinar that’s coming up in a few weeks with Gary, who’s our product manager. You can definitely bug him with lots of questions as well. Any last thoughts or comments, Zoe?
Zoe Steinkamp: 00:54:00.427 No. I think we’re good to go. And like she said, please feel free to reach out. And we have lots of other awesome webinars coming up.
Caitlin Croft: 00:54:08.662 Oh, and before we sign off, Zoe, do you want to plug your community office hours?
Zoe Steinkamp: 00:54:14.545 Oh, yeah. So tomorrow, we’re doing our community office hours. And tomorrow is going to be actually led by our head of product, Rick Spencer. He’s going to be talking about — I’m looking up the schedule right now. Let’s see what he’s going to be discussing. But yeah, he’s going to be doing office hours tomorrow. And office hours is an opportunity to learn about small topics and ask us questions live as we’re streamed on YouTube. You’ll see the link in our Slack community as well as — I think it’s on our website and our Reddit community. So we kind of post it all over as we can.
Caitlin Croft: 00:54:45.524 Awesome. Cool. Well, I hope to see all of you guys on a future virtual event or maybe even in-person. Thank you, everyone, and I hope you have a good day.
Developer Advocate, InfluxData
Zoe Steinkamp is a Developer Advocate for InfluxData. She worked at InfluxData as a front end software engineer for over two years. Before InfluxData, she worked as a front end engineer for over 5 years with the original AngularJS. She originally went to a bootcamp for training in Python. Her favorite activities outside of work include traveling and gardening.