Let's Compare: Benchmark Review of InfluxDB and Elasticsearch
In this webinar, Vlasta Hajek, Tomáš Klapka, and Ivan Kudibal will compare the performance and features of InfluxDB and Elasticsearch for common time-series workloads, specifically looking at the rates of data ingestion, on-disk data compression, and query performance. Hear about how Ivan conducted his tests to determine which Time Series DB would best fit your needs. Fifteen minutes are reserved at the end of the talk to ask Ivan directly about his test processes and independent viewpoint.
[et_pb_toggle _builder_version="3.17.6" title="Transcript" title_font_size="26" border_width_all="0px" border_width_bottom="1px" module_class="transcript-toggle" closed_toggle_background_color="rgba(255,255,255,0)"]
Here is an unedited transcript of the webinar “Let’s Compare: Benchmark Review of InfluxDB and Elasticsearch.” This is provided for those who prefer reading to watching the webinar. Please note that the transcript is raw. We apologize for any transcribing errors.
Speakers:
Chris Churilo: Director Product Marketing, InfluxData
Ivan Kudibal: Co-founder and Engineering Manager, Bonitoo
Vlasta Hajek: Software Developer, Bonitoo
Tomáš Klapka: DevOps Engineer, Bonitoo
Chris Churilo 00:00:00.714 All right. It’s three minutes after the hour. So thank you everybody for joining us today. Today we will be going over the benchmarking process as well as the results for InfluxDB 1.4.X versus Elasticsearch. I have some new friends from a company called Bonitoo in Prague, and they’ve been kind enough to help me with these benchmarks. And it’s always important to me to get help from an outside vendor because I really want to make sure that they can take a critical look at the tool, also a critical look at how we conduct these resources-and how we conduct these benchmarks. I’m sorry, a question popped up in the chat panel. And also, I want to be as fair as possible with all the vendors that we conduct the benchmarks against. I know benchmarks are always kind of funny, people feel that they are pretty biased and slanted towards the products that you’re trying to showcase, but we really are trying to make sure as best as we can that we conduct fair benchmarks so that you can feel confident that you are making the right choice. And in addition to that, Ivan will also remind everybody that the benchmarking tool is actually all open source, so everybody can take a look at that if you’re interested. So with that, I am going to hand it over to Ivan and remind everybody that this is being recorded. So if you have to step away, just let me know and we’ll make sure we get this to you. Okay Ivan, I’ll let you take it away.
Ivan Kudibal 00:01:46.604 Hello everybody. Welcome to this webinar. My name is Ivan Kudibal and I run this very small and new company called Bonitoo.io. We are an independent software company located in Prague. And together with me, there will be two more co-speakers, Vlasta Hajek and Tomas Klapka, both engineers who have been engaged in benchmarking InfluxDB against the other databases. So we want to showcase both products, InfluxDB and Elasticsearch, in a fair light and would also like to present an unbiased comparison of the performance of both products. Basically, what we did, we refreshed the benchmark efforts first conducted in 2016 by Robert Winslow. We are using the existing framework at the moment. This framework is a publicly available repository for DB comparisons, and during this presentation, we will show you how to use it so that you can try it yourselves. And well, a few words about Elasticsearch and InfluxDB just at the beginning. Well, why these two databases? Because well, Elasticsearch is part of the ELK stack used primarily as a storage for logs. Sometimes people use it also to store the time series and time purpose measurements. And well, InfluxDB is a Time Series Database. In general, it is designed to store and query generic time series-based measurements.
Ivan Kudibal 00:03:51.485 This webinar will be structured into four main parts. First, we would like to introduce you to the InfluxDB comparison framework. Then we would like to show a little bit of the InfluxDB comparisons from the command line interface point of view. In part three, we will present the benchmarking activities and reports. Finally, there will be conclusions that we were able to make about Elasticsearch and InfluxDB and we can open the Q&A. So in the part one, Vlasta will introduce you to the InfluxDB-comparisons framework.
Vlasta Hajek 00:05:05.062 Okay. Hi, everybody. I’m Vlasta and I will describe the methodology used for benchmarking and some details about the framework. As Ivan told you, we were running the benchmarks as they were run in 2016, now with the latest versions of InfluxDB and Elasticsearch. Concretely, a year ago it was InfluxDB 1.0 and Elasticsearch 5.0, and now it will be InfluxDB 1.4 and Elasticsearch 5.6.3. I will briefly describe the methodology and the framework used to benchmark InfluxDB and Elasticsearch. But the methodology and the framework were originally designed and developed by Robert Winslow, and we have used them without any major modification. Robert already detailed and explained the framework and the distinct approach in his webinars a year ago and if you would like to know more details, I strongly advise you to watch those webinars. They are available on the InfluxData website. Okay. So we can spend hours discussing the best benchmarking methodology. Of course, there are still common properties that each approach must hold. Especially, it must be realistic. It must simulate a real-world use case. The methodology must be fair and unbiased to all databases. And of course, it must be reproducible, so anyone can run the benchmarks and hopefully get the same result.
Vlasta Hajek 00:07:27.210 In our case, we have selected the real-world use case where there is a DevOps engineer or manager maintaining a fleet of tens, or hundreds, maybe even thousands of computers, watching a dashboard displaying metrics gathered from those servers. Metrics such as CPU, kernel info, memory, disk space, disk IO, network, and some information about core processes like Nginx, PostgreSQL, or Redis. Those metrics are collected by agents running on those servers. Let’s say, it could be Telegraf. And they are sent to a central DB. So those metrics would be, for example, gathered every 10 seconds [inaudible] the Telegraf settings, I think. We can see an almost continuous stream of data going into the database. Each measurement has from 6 to 20 values, which means 11.2 values on average. And as there are 9 measurements, it gives us a total of 101 values written during each data collection. So it’s a pretty high number of values written in one shot. Each measurement also has 10 tags, where those tags contain label, location, and system info, such as name, region, data center, version, and so on. The data are generated using a pseudo-random generator with a random walk algorithm to achieve a reproducible variance of data. And the data set can be generated for any length. For the basic comparison, shown here, we used a data set with 100 hosts and the values measured over the course of a day.
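To make the size of this workload concrete, here is a minimal Python sketch of the arithmetic, together with a seeded random walk of the kind described. The counts (100 hosts, 10-second interval, 9 measurements, 101 values per collection, one day) come from the talk; the `random_walk` helper, its step size, and its bounds are illustrative assumptions and not the framework's actual generator.

```python
import random

HOSTS = 100                    # fleet size used for the basic comparison
INTERVAL_S = 10                # collection interval mentioned in the talk
MEASUREMENTS = 9               # measurements per host per collection
VALUES_PER_COLLECTION = 101    # field values per host per collection
SECONDS_PER_DAY = 24 * 60 * 60

# One day of data for the whole fleet.
collections_per_host = SECONDS_PER_DAY // INTERVAL_S
points_per_day = HOSTS * collections_per_host * MEASUREMENTS
values_per_day = HOSTS * collections_per_host * VALUES_PER_COLLECTION


def random_walk(seed, steps, start=50.0, step=1.0, lo=0.0, hi=100.0):
    """Return a reproducible, bounded random walk (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed: the same series on every run
    value = start
    series = []
    for _ in range(steps):
        value += rng.uniform(-step, step)
        value = min(hi, max(lo, value))  # clamp to a plausible range, e.g. CPU %
        series.append(value)
    return series


if __name__ == "__main__":
    print(collections_per_host, points_per_day, values_per_day)
    # Same seed, same data: this is what makes the benchmark input reproducible.
    print(random_walk(seed=42, steps=5) == random_walk(seed=42, steps=5))
```

Seeding the generator is the design choice that matters here: every database under test receives an identical data set, so differences in the results cannot come from the input.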
Vlasta Hajek 00:10:30.265 So what is actually measured in those benchmarks? For the time series use case, and also for other types of data storage, it’s definitely important to benchmark the input data rate. That is, how quickly the database can ingest loads of data, which gives us a view on how much data the database can handle at a time. We measure this in values per second, and it means the higher the value is, the better. It’s also important to know how the data are transferred to disk and what the final disk usage is. How efficiently the database engine uses the provided disk space. Here, the lower the value, the better. And of course, we would like to perform queries to read the data as fast as possible, to see them on our dashboard, and have the dashboard as responsive as possible without any lags. We measure this as the time interval during which the query is processed on the server, and from this we calculate the possible number of queries which can be run in a second. And here we measured-we would like to get the highest query rate possible, so the higher the value, the better. As our use case is time series data, where there are no updates or deletes, we don’t need any other measurements usually done for relational databases.
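The three metrics described above reduce to simple ratios. The following Python sketch shows one way to compute them; the function names and the sample numbers are assumptions made for illustration and are not results from the benchmark.

```python
def ingest_rate(values_written, elapsed_seconds):
    """Values ingested per second (higher is better)."""
    return values_written / elapsed_seconds


def bytes_per_value(disk_bytes, values_written):
    """On-disk footprint per stored value (lower is better)."""
    return disk_bytes / values_written


def query_rate(mean_latency_ms, workers=1):
    """Queries per second implied by the mean response time (higher is better)."""
    return workers * 1000.0 / mean_latency_ms


if __name__ == "__main__":
    # Illustrative inputs only, not measured values.
    print(ingest_rate(values_written=1_000_000, elapsed_seconds=4.0))
    print(bytes_per_value(disk_bytes=2_000_000, values_written=1_000_000))
    print(query_rate(mean_latency_ms=5.0))
```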
Vlasta Hajek 00:12:59.723 So how is the benchmarking framework used? It consists of a set of tools specific for each phase. In the first place, we will generate the data. The input data should be in a wire format specific for the database. For example, for InfluxDB and Elasticsearch, that is plain text or JSON data. And this reduces the further overhead for the load tool, which can then just take the data prepared on the disk and transfer them on the wire to the database, to [inaudible], to some bulk endpoint as fast as possible. To achieve the best performance on the wire, it uses the fasthttp library, which is considered the best for this purpose. And similarly, for queries, we generate query data in native format. The queries, the formula and the search condition, are the data that feed the query benchmarking tools, and they are sent the same way as the bulk load data and under similar conditions. So the query benchmarking tool performs the query, sends the query definition to the server, and measures the actual response time. And to achieve the maximum performance, it doesn’t validate any results. To validate the results, you can do this manually using a special option of the tool which can print the response. It’s pretty-formatted so you can compare the results from both databases and see how they match.
Vlasta Hajek 00:15:59.410 So here is the example of the load data for InfluxDB. InfluxDB uses the so-called line protocol format, which is also nicely readable and writable. It consists of four parts: the name of the measurement, which then composes a series; the set of tags, which are key-value pairs, all of the same type, and they are indexed and ready to be used for querying and grouping; then a set of fields, which are typed data; and lastly the timestamp in nanosecond precision. Here we can see an example of what such a point looks like for memory measurement, network, and PostgreSQL. Elasticsearch is a general purpose document storage. It uses the JSON format, where we have the data collected in key-value pairs. The data are general properties, and we use a different index for each measurement but the same type.
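As a rough illustration of the two wire formats just described, the Python sketch below serializes one point both ways. The tag and field names, the `point` document type, and the `timestamp` property are hypothetical, and escaping of spaces and commas in tag values is deliberately left out, so this is a simplified sketch rather than the framework's serializer.

```python
import json


def format_field(value):
    """Render a field value using line protocol type markers."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"          # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    return '"' + str(value).replace('"', '\\"') + '"'


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """measurement,tag=... field=... timestamp (nanoseconds)."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    head = measurement + ("," + tag_part if tag_part else "")
    field_part = ",".join(f"{k}={format_field(v)}" for k, v in fields.items())
    return f"{head} {field_part} {timestamp_ns}"


def to_es_bulk(measurement, tags, fields, timestamp_ns):
    """Two NDJSON lines for the bulk endpoint: the action, then the document."""
    # One index per measurement, the same type everywhere, as in the talk.
    action = {"index": {"_index": measurement, "_type": "point"}}
    doc = {**tags, **fields, "timestamp": timestamp_ns // 1_000_000}  # ms
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"


if __name__ == "__main__":
    tags = {"hostname": "host_0", "region": "eu-west-1"}
    fields = {"used_percent": 64.5, "total": 8589934592}
    ts = 1451606400000000000
    print(to_line_protocol("mem", tags, fields, ts))
    print(to_es_bulk("mem", tags, fields, ts), end="")
```

Pre-serializing the data this way is what lets the load tools stream prepared bytes straight to the database without per-point formatting cost during the measurement.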
Vlasta Hajek 00:18:14.201 Okay, so… Sorry.
Vlasta Hajek 00:18:45.834 Okay. Sorry. For measuring query performance, we use a query that would be a typical example of a source of data for some monitoring tools. In this case, the query selects the maximum CPU usage for a given host over the course of a random hour interval grouped by minute. InfluxDB uses the InfluxQL language, which is very similar to SQL. And those who are not familiar with it can see that it is nicely readable and writable. Elasticsearch is a JSON-based engine. It also has its own DSL for querying. In this case, the actual query is written in JSON, which is still human readable and writable; however, it requires a little programmatic approach as you need to be very precise and know the exact piece. So now you’ve got the basic understanding of what the benchmarking framework is and what the methodology behind it is, and Tomas will show you the basic usage of the framework.
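The query described here (maximum CPU usage for one host over one hour, grouped by minute) could look roughly as follows in the two query languages. This Python sketch only builds the query text; the measurement name `cpu`, the field `usage_user`, the tag `hostname`, and the `timestamp` property are assumptions, and the exact queries the framework generates may differ.

```python
import json
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def influxql_max_cpu(host, start):
    """InfluxQL: max CPU per minute for one host over one hour."""
    end = start + timedelta(hours=1)
    return (
        "SELECT max(usage_user) FROM cpu "
        f"WHERE hostname = '{host}' "
        f"AND time >= '{start.strftime(TIME_FORMAT)}' "
        f"AND time < '{end.strftime(TIME_FORMAT)}' "
        "GROUP BY time(1m)"
    )


def es_max_cpu(host, start):
    """Elasticsearch query DSL for the same question, as a JSON-ready dict."""
    end = start + timedelta(hours=1)
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"bool": {"filter": [
            {"term": {"hostname": host}},
            {"range": {"timestamp": {
                "gte": start.strftime(TIME_FORMAT),
                "lt": end.strftime(TIME_FORMAT),
            }}},
        ]}},
        "aggs": {"per_minute": {
            # "interval" is the 5.x spelling; newer releases use "fixed_interval".
            "date_histogram": {"field": "timestamp", "interval": "1m"},
            "aggs": {"max_cpu": {"max": {"field": "usage_user"}}},
        }},
    }


if __name__ == "__main__":
    start = datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(influxql_max_cpu("host_0", start))
    print(json.dumps(es_max_cpu("host_0", start), indent=2))
```

The side-by-side output shows the trade-off mentioned in the talk: the InfluxQL form fits on one line, while the JSON form is more verbose and has to be nested precisely.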
Tomas Klapka 00:20:55.066 So firstly, you have to install the Golang binary distribution, which can be easily obtained from the Golang home page. As you can see here, you just click on “Download Go” and select the binary which best suits your needs and your operating system, and then install it according to the installation instructions. Then there is one magical command, which is “go get,” which allows you to get the binaries in one click. So now, for our demo scenario, we will need four tools. One is bulk data gen, the second one bulk load influx for loading the data sets into the database, and bulk query generator for generating queries, and query benchmarker, which will perform the actual benchmark and show us the results. At first, you will generate a very simple data set, which consists of 10 hosts, each point has 9 measurements. And the next one will be the query generator. So now we have those data sets in our directory. Now we can try the data ingestion, but this time I will perform just a dry run to actually check the read speed of my file system and of my storage. It should be higher than the actual ingestion rate.
Tomas Klapka 00:23:58.383 And now we will load the data set from the file system right to the database. It takes some time, and as a result, you can see the mean point rate, which is here, and the mean value rate. The value rate-it means that it’s the point rate multiplied by nine. Okay. And then you will send some queries to the database from our second query data set.
Tomas Klapka 00:24:56.539 And it gives us nice outputs. Here you can see what is important here: it’s the mean, which is seven milliseconds. And the second important value is the total time of all queries which it takes to-so I think that’s all for the very brief introduction to this InfluxDB-comparisons framework. As you can see, it’s quite easy to use. And yes, you can tune every tool with various parameters. So for example, you can see the data generator which-the data generator has some parameters or arguments you can pass to it and then change the actual content of the data set and so on. And you can do the same with bulk load and bulk query generator. So thank you for watching. I hope that it will help you to dive into this comparison framework. And thanks a lot, you can continue.
Vlasta Hajek 00:27:09.461 So now you’ve seen how easy it is to use the benchmarking tools, so you can carry out those benchmarks yourself, if you would like, without special knowledge. It’s in this public repository, there is a README, and you can find the necessary information there as well. So now let’s go to the actual benchmarking and reports. So when we ran the benchmarks, we tried to be fair and unbiased. We used almost the default installation of both products. A small change we had to make, because Elasticsearch is memory-based and-java-based, sorry, is that we needed to set the proper memory limit for this process, and as is also recommended in the available documentation, we set it to half of the total memory of the host. In our case, we set it to 16 gigabytes. There were no other important daemons running on those hosts, except we had Telegraf for collecting the system metrics during the benchmark. But Telegraf is a low resource process-it consumed only 5% of CPU. So while we didn’t change any process or database engine’s configuration, what had to be tuned a little was the actual templates for Elasticsearch. As Elasticsearch is a general-purpose document database, we’ve measured the input rate and query performance for two index templates. First, what is called the default template, where the _all field is disabled and there is no specially defined indexing. The other template used is the aggregation template, actually recommended for using Elasticsearch as a Time Series Database, where the _source and _all fields are both disabled. And we’ve customized it to index the properties of the document which-the properties which correspond to tags, and also to index the timestamps. Other fields which hold the data are not indexed.
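The two tuning steps mentioned here, the heap limit and the aggregation-style index template, can be sketched in Python as below. This is an approximation built from the description in the talk and from the Elasticsearch 5.x template layout (top-level `template` pattern, a mapping per document type); the `point` type, the `timestamp` property, and the tag and field names are assumptions, and it is not the template actually used in the benchmark.

```python
import json


def heap_size_gb(total_ram_gb):
    """Heap limit set to half of the host's memory, as described in the talk."""
    return total_ram_gb // 2


def aggregation_template(tag_names, field_names):
    """Build an aggregation-style index template (5.x layout, hypothetical names)."""
    properties = {"timestamp": {"type": "date"}}          # indexed for range filters
    for name in tag_names:
        properties[name] = {"type": "keyword"}            # tags: indexed, exact match
    for name in field_names:
        # Data fields are not indexed; doc values still allow aggregations.
        properties[name] = {"type": "double", "index": False}
    return {
        "template": "*",                                  # apply to every index
        "mappings": {
            "point": {
                "_all": {"enabled": False},               # no catch-all text field
                "_source": {"enabled": False},            # do not store the raw JSON
                "properties": properties,
            }
        },
    }


if __name__ == "__main__":
    print(heap_size_gb(32))  # 32 GB of RAM is an illustrative host size
    template = aggregation_template(["hostname", "region"], ["usage_user"])
    print(json.dumps(template, indent=2))
```

Disabling `_source` and `_all` is what trades away document retrieval and free-text search in exchange for a smaller on-disk footprint, which is the difference between the two templates discussed above.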
Vlasta Hajek 00:30:58.897 So what hardware did we use? In our benchmarking, we ran the benchmarks on two types of hosts: a cloud-based virtual host and an on-premise bare metal machine. Our goal was to validate whether someone should be worried about the performance of a cloud-based virtual machine compared to bare metal machines when someone would like to follow today’s trend to go to the cloud. We had available HP blades with Intel Xeon E5, type 2640, version 3, running at 2.6 GHz with 16 cores and hyperthreading, so 32 virtual cores in total. So we found that the only drawback of this machine was that it used SCSI drives, not solid-state drives. In AWS, we chose the c4.4xlarge machines with similar parameters. But there wasn’t an exact match, so we had to choose the configuration with the faster CPU, and we chose the EBS SSD drives that are most used nowadays.
Vlasta Hajek 00:32:53.568 So the result is that AWS is comparable, even it is-the results where the database runs on this machine, it’s a little faster, taking advantage of the SSD and also the [inaudible] a higher CPU. So I guess the conclusion is that for performance reasons, there is no need to prefer bare metal over virtual machines in the cloud. So what are the actual results we’ve got when comparing InfluxDB 1.4.2 and Elasticsearch 5.6.3? It’s [inaudible] said that InfluxDB outperforms Elasticsearch in two of the three metrics. In data ingestion, you can see that it outperforms by an order of magnitude. It also proved to be the best disk saver, with 3 times better compression than Elasticsearch’s aggregation template and more than 20 times better than the default template. On the other hand, Elasticsearch has a higher query rate, so it outperformed InfluxDB in this metric, but the result is not significant. It’s about 20%, not much difference. You can also see the difference between the Elasticsearch templates, where the default template has a higher input rate, but consumes much more storage space and is slower in query responding. Toward the end of our benchmarking cycle, Elasticsearch 6.0 was also released, but we haven’t tested it yet. We will publish results later. Also, there will be measurements for cluster deployments and different data sets, especially IoT.
Vlasta Hajek 00:36:14.549 So a small note to vertical scalability. What we noticed in this single deployment, Elasticsearch didn’t show much scalability. The input rate was more or less the same despite using different number of clients. We run from 1 to 32, and the input rate didn’t change much, neither the CPU allocation. On the other hand, InfluxDB showed quite a good vertical scalability, where the input rate grows with the number of clients, and also does the CPU allocation. But again, we use the default installation and we were comparing those. So we can imagine that with some detailed special tweaking, we can get slightly different results.
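The vertical-scalability check described above amounts to repeating the same load with a growing number of concurrent clients and watching whether the input rate follows. A minimal Python harness for that idea is sketched below; `fake_send` is a stand-in that only sleeps, so the printed numbers say nothing about either database, and the framework's own load tools are written in Go and work differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_ingest(send_batch, batches, workers):
    """Send all batches using `workers` threads; return values per second."""
    total_values = sum(len(batch) for batch in batches)
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(send_batch, batches))   # wait for every batch to finish
    elapsed = time.perf_counter() - started
    return total_values / elapsed


def fake_send(batch):
    """Hypothetical sink: pretends each bulk request takes one millisecond."""
    time.sleep(0.001)


if __name__ == "__main__":
    batches = [[0.0] * 100 for _ in range(64)]
    for workers in (1, 2, 4, 8, 16, 32):      # client counts used in the talk
        rate = measure_ingest(fake_send, batches, workers)
        print(f"{workers:>2} workers: {rate:,.0f} values/s")
```

A database that scales vertically shows the rate rising with the worker count until CPU or disk saturates; a flat curve, as reported for Elasticsearch in this setup, suggests the bottleneck is elsewhere.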
Ivan Kudibal 00:37:33.980 Thank you very much, Vlasta. So part four are the conclusions and Q&A. Well, one of the clear conclusions from this presentation based on the data that we have shown: we can clearly say that InfluxDB is the data ingestion winner, and best disk storage saver, as well as on the CPU usage with the growing number of workers was efficient. On the other hand, Elasticsearch is the fastest query responder. But still, InfluxDB for the queries is still performant. So what we can say and recommend to the guy who has responsibility to run the fleet of VMs and that he would be getting the data very close to the DevOps use case? We would recommend the TICK Stack clearly for the excellent performance, for high time to value, almost zero effort setup and maintenance cost, to the solution with less storage also can be good for your money. Finally, the scalability, at least the bigger the fleet, the less the average effort per diem you can get. And the key takeaways from this webinar before we run the Q&A. So after this webinar, you should be able to see the technical paper and blog post. For our detailed reports, visit the blog and download the technical paper. We will discuss a little bit more details, as well as we include the results that we presented today. Try InfluxDB-comparisons framework yourselves, and in case you have different results, whatever, or some issues, concerns, don’t hesitate. Contact us. Post issues at the InfluxDB-comparisons, or just send us an email, and we will be very glad to discuss and answer your concerns. So I would like to thank Tomas and Vlasta for their very great presentation and demo [laughter]. And let’s have a Q&A.
Chris Churilo 00:40:43.820 Awesome. So the good news is, one of the questions was answered by another attendee. So RB asked, “What version are you using?” The InfluxDB version was mentioned earlier but not Elasticsearch, and José was kind enough to say, “Elasticsearch version 5.6.3”, which is what we saw in the presentation as well. So thanks, José, for that. If you have any other questions, let’s see, please put them in the Q&A and the chat. And then, Peter, looks like you have your hand raised, so I’m going to go ahead and promote you so you can actually ask your question directly. Go ahead, Peter.
S1 00:41:47.870 No? Okay. Well, in the meantime, we’ll just wait. Okay. So here’s a great question from RB, “I see old test values in the GitHub repo and the methodology in the README for ES hasn’t changed in over a year. Are these going to be updated?”
Ivan Kudibal 00:42:11.188 Well, I can say some of a short answer and Vlasta can complement with detailed answers. Yes and no. Well, why yes? Because we didn’t pay attention much to the README file. On the other hand, well, no means that we had to add new options to the code in order to get a better variability and flexibility of using the framework. So yes, we plan to improve the README.md over the course of the upcoming weeks. And maybe we’ll see a better version of InfluxDB-comparisons documentation when we have the other webinar sessions.
Chris Churilo 00:43:10.615 Cool. Thanks for seeing that miss out that we had, RB. Sounds like we will get to updating that documentation. You guys have any other questions or Vlasta, you want to comment on that further, please feel free. We’ll leave the lines open probably for the next five minutes.
Vlasta Hajek 00:43:27.561 Yes, I would just add that, as I said in the beginning, we didn’t change anything in the methodology. We didn’t see any reason for this. But as I said, in the near future, we will extend the use cases from DevOps to IoT, to add another real-world use case and see how the databases can handle it. And we will extend the set of databases, including Prometheus, and other databases, Timescale, and I’m sure, remember others. But yes, there’s nothing that we would like to change in the important parts of measuring or the actual data.
Chris Churilo 00:44:39.152 Thank you for that. And as I mentioned, we’ll just keep the lines open for few more minutes. If you have any other questions, feel free to post them. And William and Todd, I’ll reach out to you separately. And we’ll try to get these squared away for you, as far as the trial. And we will post these papers as quick as we can. The first one will be the Elasticsearch then the subsequent three databases that we talked about earlier. And as Vlasta mentioned, we are going to be continuing down the path and going after other technologies and other databases as well. So stay tuned for that. And if ever you have any feedback, you want us to check out some other technology just reach out to us and we’d be more than happy to accommodate. I think the guys at Bonitoo had a pretty fun time working with this tool and, I think they’re pretty anxious to continue doing these benchmarks for all of us.
Ivan Kudibal 00:45:39.607 Absolutely. Thank you, Chris [laughter].
Chris Churilo 00:45:52.242 Oh, RB, great question. So RB asked, “Thanks. Hey, I’ve seen that the Metricbeat for Elasticsearch uses documents with many more values. Does that affect the ingest performance?” Vlasta?
Vlasta Hajek 00:46:08.886 So we didn’t measure this effect of the number of fields yet. We used a fixed set of fields. So there were the 10 tags, and depending on the measurement, there were from 6 to about 25 fields, depending on whether it was memory, CPU or, for example, Nginx. But in the same metric or same measurement, for example, the PostgreSQL metrics, there were the same fields and the same tags for Elasticsearch and also for InfluxDB. So, hopefully, this answered your question.
S1 00:47:22.883 Great. Thanks, Vlasta and RB. As Vlasta mentioned and Ivan mentioned, we will also be continuing to modify the tool and just the way that we’re doing these comparisons. Always wanting to make sure that we can stress all these tools as much as possible and try to conduct the test so that it’s really fair across all these different technologies. So any feedback that you guys have like this or if you think that maybe we should try things a little differently, I think the guys would love to get that, so we can do it and take another stab at this.
S1 00:48:12.161 All right. I will keep the lines open for just a few more minutes, and then we will be busy fixing the videos, so we can post it as well as the paper and get that into the hands of everybody that participated today.
S1 00:49:08.368 All right. Looks like everyone’s signing off, so for those of you that are still on here, we’ll go ahead and close this out and I appreciate you joining us today. And I hope I get to see you guys on another one of our webinars soon. So thanks, everybody.
Vlasta Hajek 00:49:26.689 Thank you, Chris.
Tomas Klapka 00:49:29.177 Thank you.
Ivan Kudibal 00:49:34.129 Thank you and have a great day.