How to Overcome Memory Usage Challenges with the Time Series Index
This article is written by Saiyam Pathak, a software engineer who works on Kubernetes at Walmart Labs as part of a large-scale, multi-cloud Kubernetes project.
InfluxDB is a leading open source time series database. In case you’re unfamiliar with InfluxDB, it is designed to be fast, but it uses an in-memory index, which comes at the cost of RAM usage as your datasets grow. To balance performance and RAM usage, InfluxData introduced a special indexing mechanism for InfluxDB called the time series index (TSI). TSI keeps RAM usage from saturating as data sets grow larger.
InfluxData supports customers using InfluxDB with tens of millions of time series data points. InfluxData’s goal, however, is to expand this capability to hundreds of millions, and eventually, billions of data points. This is why InfluxData has added the new TSI. The aim is to support a large number of time series (a very high cardinality in the number of unique time series that the database stores). Cardinality is a measurement of unique series in your database.
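To make cardinality concrete, here is a minimal sketch that estimates it from a stream of line protocol points. It assumes the series key is the text before the first space on each line (measurement plus tags), that keys contain no escaped spaces, and that tags always appear in the same order. The function name series_cardinality is hypothetical, and InfluxDB's own accounting may differ.

```shell
# series_cardinality: read line protocol on stdin and print the number of
# distinct series keys (the first whitespace-delimited token on each line).
# Assumptions: no escaped spaces in measurement or tag values, and a
# consistent tag order across points. Blank lines and # comments are skipped.
series_cardinality() {
  awk 'NF > 0 && $1 !~ /^#/ { seen[$1] = 1 }
       END { n = 0; for (k in seen) n++; print n }'
}

# Usage (file name is hypothetical):
#   series_cardinality < points.txt
```

Points that share a measurement and tag set count once, while every new tag value adds a series. This is why tags with many distinct values drive index size.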
The Time Series Index (TSI) in context
With InfluxData’s TSI storage engine, users are able to have millions of unique time series. As the documentation states, the goal is that the number of series should not be limited by the amount of memory the server hardware has. Importantly, the number of series that exist in the database should also ideally have a negligible impact on database startup time. The development of the TSI thus represents the most significant technical advancement in the database since InfluxData released the Time-Structured Merge Tree (TSM) storage engine in 2016.
As mentioned before, when InfluxDB ingests data, it stores not only the value but also indexes the measurement and tag information so that it can be queried quickly. In earlier versions, index data could only be stored in memory, which requires a lot of RAM and places an upper bound on the number of series a machine can hold. TSI was developed to remove that upper bound: it stores index data on disk so that we are no longer restricted by RAM, and it uses the operating system’s page cache to pull hot data into memory and let cold data rest on disk.
In this article, we focus on converting TSM storage engine in-memory indexing to time series index (TSI) indexing. This conversion improves performance and reduces memory problems. We will also provide some tips to reduce the memory overload for InfluxDB. Containers are used for this article.
Let’s begin by discussing the default installation of InfluxData’s open source platform, the TICK Stack (an acronym of the platform’s components Telegraf, InfluxDB, Chronograf and Kapacitor).
- Virtual machine with Docker installed: If you haven't installed Docker before, you can do that with official Docker documentation.
- TICK Stack installation: Before we explain the conversion from TSM to TSI indices, we first need to have the TICK Stack installed. The TICK Stack is an open source platform by InfluxData consisting of Telegraf, InfluxDB, Chronograf and Kapacitor.
Platform component definitions and installation
InfluxDB: InfluxDB is a time series database built from the ground up to handle high write and query loads. InfluxDB is meant to be used as a backing store for any use case involving large amounts of time-stamped data, including DevOps monitoring, application metrics, IoT sensor data and real-time analytics.
> docker network create influxdb
<figcaption> Docker network create</figcaption>
> docker run -d --name=influxdb -p 8086:8086 -v $PWD:/var/lib/influxdb --net=influxdb --restart=always influxdb
<figcaption> InfluxDB running</figcaption>
Telegraf: Telegraf is an open source agent written in Go for collecting metrics and data on the system it’s running on or from other services. Telegraf writes data that it collects to InfluxDB in the correct format.
Generate a telegraf.conf file so that we can enable Docker monitoring and change the InfluxDB endpoint:
> docker run --rm telegraf telegraf config > telegraf.conf
Update the telegraf.conf file so that the InfluxDB URL uses the InfluxDB container name, “influxdb,” as the host. To enable Docker monitoring, uncomment the docker endpoint.
<figcaption> InfluxDB url update</figcaption>
<figcaption> Uncomment endpoint</figcaption>
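Since the screenshots of these two edits are not reproduced here, the following sketch checks a telegraf.conf for them. It assumes the generated file uses an `[[outputs.influxdb]]` section with a `urls` entry and an `[[inputs.docker]]` section with an `endpoint` entry, as Telegraf's default config does to our knowledge. The function name check_telegraf_conf is hypothetical.

```shell
# check_telegraf_conf FILE
# Report whether FILE has an uncommented InfluxDB URL that points at the
# "influxdb" container, and an uncommented Docker input with its endpoint.
# Prints "ok" and returns 0 when all three lines are present.
check_telegraf_conf() {
  conf="$1"
  status=0
  if ! grep -Eq '^[[:space:]]*urls[[:space:]]*=.*//influxdb:8086' "$conf"; then
    echo "missing: urls entry pointing at influxdb:8086"
    status=1
  fi
  if ! grep -Eq '^[[:space:]]*\[\[inputs\.docker\]\]' "$conf"; then
    echo "missing: uncommented [[inputs.docker]] section"
    status=1
  fi
  if ! grep -Eq '^[[:space:]]*endpoint[[:space:]]*=.*docker\.sock' "$conf"; then
    echo "missing: uncommented docker endpoint"
    status=1
  fi
  [ "$status" -eq 0 ] && echo "ok"
  return "$status"
}

# Usage:
#   check_telegraf_conf telegraf.conf
```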
Run the Telegraf container:
> docker run -d --name=telegraf --net=influxdb -v /var/run/docker.sock:/var/run/docker.sock -v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro telegraf
<figcaption> Telegraf running</figcaption>
Chronograf: Chronograf is InfluxData’s open source web application. Use Chronograf with the other components of the TICK Stack for alert management, data visualization and database management.
> docker run -d --name=chronograf -p 8000:8888 --net=influxdb --restart=always chronograf --influxdb-url
<figcaption> Chronograf running</figcaption>
<figcaption> TICK Stack running</figcaption>
Now that we have the TICK Stack running, how do we see whether the in-memory index or TSI is being used? To check the index version that InfluxDB is running with, we can simply run the following command:
> docker logs influxdb 2>&1 | grep index_version
As you can see above, the “index_version=inmem” shows that InfluxDB does not have TSI enabled.
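If the logs are long, it can help to reduce them to just the index versions they mention. The helper below is a small sketch that relies only on the `index_version=` key shown above. The function name index_versions is hypothetical.

```shell
# index_versions: read log text on stdin and print each distinct
# index_version value it mentions, one per line.
index_versions() {
  grep -o 'index_version=[A-Za-z0-9_]*' | cut -d= -f2 | sort -u
}

# Usage (not run here; 2>&1 also captures log lines written to stderr):
#   docker logs influxdb 2>&1 | index_versions
```

A single line reading `inmem` means every shard was opened with the in-memory index, and after a successful conversion we would expect `tsi1` instead.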
The five steps to enable TSI indices
There is a five-step procedure to enable TSI indices for an InfluxDB container.
Step 1: Stop and remove the container:
> docker stop influxdb
> docker rm influxdb
<figcaption> Removing the container</figcaption>
Step 2: Start the container again, but with bash as the entrypoint, and pass the environment variable that enables the TSI index version.
> sudo docker run -it --name influx-db --restart unless-stopped \
    -e INFLUXDB_DATA_INDEX_VERSION="tsi1" \
    -v $PWD:/var/lib/influxdb \
    --entrypoint=bash \
    -p 8086:8086 -p 8083:8083 \
    influxdb
Step 3: Run the conversion from TSM to TSI.
> influx_inspect buildtsi -datadir=/var/lib/influxdb/data -waldir=/var/lib/influxdb/wal
<figcaption> TSM to TSI</figcaption>
You can also add the -database flag if you want to convert only one database, or one database at a time.
As you can see, all the shards have been indexed, and you can also see the index folder created:
find /var/lib/influxdb -type d -name index
<figcaption> The index folder created for both the databases</figcaption>
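To confirm that no shard was skipped, we can also look for shard directories that still lack an index folder. This is a sketch that assumes the usual on-disk layout of data/<database>/<retention policy>/<shard id>/ and skips the per-database _series directory. The function name shards_missing_index is hypothetical.

```shell
# shards_missing_index DATA_DIR
# Print every shard directory under DATA_DIR that has no "index"
# subdirectory. Assumed layout: DATA_DIR/<db>/<rp>/<shard_id>/.
shards_missing_index() {
  data_dir="$1"
  for shard in "$data_dir"/*/*/*/; do
    [ -d "$shard" ] || continue                     # glob matched nothing
    case "$shard" in */_series/*) continue ;; esac  # series file, not a shard
    [ -d "${shard}index" ] || printf '%s\n' "${shard%/}"
  done
}

# Usage inside the container:
#   shards_missing_index /var/lib/influxdb/data
```

No output means every shard directory found has an index folder.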
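Step 4: Exit the bash session and remove the container, for example with exit followed by docker rm -f influx-db (the -f flag forces removal in case the restart policy has already restarted the container).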
<figcaption> Remove the container</figcaption>
Step 5: Start the InfluxDB container without the entrypoint flag.
> sudo docker run -itd --name influx-db --restart unless-stopped \
    -e INFLUXDB_DATA_INDEX_VERSION="tsi1" \
    -v $PWD:/var/lib/influxdb \
    -p 8086:8086 -p 8083:8083 \
    influxdb
<figcaption> TSI check</figcaption>
As you can see above, the index has been successfully changed from in-memory to TSI.
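The five steps can also be sketched as one function. This is an illustration, not the procedure used above: it replaces the interactive bash session with a throwaway container whose entrypoint is influx_inspect, and it keeps -it in case buildtsi asks for confirmation. DOCKER, IMAGE and DATA_DIR are hypothetical variables introduced for this sketch, and the image is assumed to be a 1.x build that ships influx_inspect.

```shell
# Nothing runs until convert_to_tsi is called.
# Set DOCKER=echo to print the docker commands instead of running them.
DOCKER="${DOCKER:-docker}"
IMAGE="${IMAGE:-influxdb}"      # assumed: a 1.x image that ships influx_inspect
DATA_DIR="${DATA_DIR:-$PWD}"    # host directory mounted at /var/lib/influxdb

convert_to_tsi() {
  old_name="$1"   # container currently running with the in-memory index
  new_name="$2"   # name for the container started with TSI enabled

  # Step 1: stop and remove the running container.
  $DOCKER stop "$old_name" || return 1
  $DOCKER rm "$old_name" || return 1

  # Steps 2-4: build the TSI index in a one-off container.
  # --rm removes the container when influx_inspect exits.
  $DOCKER run -it --rm \
    -e INFLUXDB_DATA_INDEX_VERSION=tsi1 \
    -v "$DATA_DIR:/var/lib/influxdb" \
    --entrypoint influx_inspect \
    "$IMAGE" buildtsi \
    -datadir=/var/lib/influxdb/data \
    -waldir=/var/lib/influxdb/wal || return 1

  # Step 5: start InfluxDB normally with the TSI index version.
  $DOCKER run -d --name "$new_name" --restart unless-stopped \
    -e INFLUXDB_DATA_INDEX_VERSION=tsi1 \
    -v "$DATA_DIR:/var/lib/influxdb" \
    -p 8086:8086 \
    "$IMAGE"
}

# Dry run, printing the four docker invocations without executing them:
#   DOCKER=echo; convert_to_tsi influxdb influx-db
```

A dry run first is a cheap way to review the volume paths and container names before touching real data.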
A few important points regarding TSI:
- There is no change to the current data.
- While a shard is being indexed, a temporary file (.index) is created. If for whatever reason this process crashes or fails, partials are removed and attempted again. Completed indices are left intact. This helps if the VM or container crashes while performing the conversion operation, preventing data from getting corrupted.
- The -database flag can be used to convert a specific database, or one database at a time. If you do not specify it, all the databases are converted.
- -e INFLUXDB_DATA_INDEX_VERSION="tsi1" should be passed when starting up the InfluxDB container.
- Alternatively, the index version can be set in the influxdb.conf file. First generate the file:
docker run --rm influxdb influxd config > influxdb.conf
- Edit influxdb.conf and change index-version = "inmem" to index-version = "tsi1".
- Start the container with this conf file:
docker run -p 8086:8086 \
    -v $PWD/influxdb.conf:/etc/influxdb/influxdb.conf:ro \
    influxdb -config /etc/influxdb/influxdb.conf
- Indices still grow with the number of series, but memory usage no longer needs to grow linearly with those indices. This is where TSI helps.
- The general behavior to expect is that InfluxDB will use as much memory as is available to maintain an in-memory index and fall back to disk for anything else.
- Without TSI, there is a single in-memory index per database, which can consume all available memory and leave none for anything else. With TSI, once the memory limit is hit, InfluxDB falls back to the on-disk indices. The WAL is still loaded in memory, and indices are paged in as required.
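For the influxdb.conf route mentioned in the list above, the edit can be scripted instead of done by hand. The sketch below assumes the generated file contains an uncommented index-version line. The function name set_index_version is hypothetical.

```shell
# set_index_version FILE VERSION
# Rewrite the index-version line in FILE to VERSION (for example tsi1),
# keeping the line's indentation. Other lines are left untouched.
set_index_version() {
  conf="$1"
  version="$2"
  sed "s/^\([[:space:]]*\)index-version[[:space:]]*=.*/\1index-version = \"$version\"/" \
    "$conf" > "$conf.tmp" &&
    mv "$conf.tmp" "$conf"
}

# Usage:
#   set_index_version influxdb.conf tsi1
```

Changing the setting only affects how shards are opened. Existing shards still need the buildtsi conversion described in Step 3.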
If you face any issues or have any questions regarding TSM to TSI conversion, head over to InfluxDB’s Slack channel and start a discussion. TSI is breaking new ground and helping InfluxDB lead the way to over a billion series.