Backup/Restore of InfluxDB from/to Docker Containers

One of the more exciting developments over the last few years is the emergence of containers, which allow software to be deployed in a securely isolated environment, packaged with all its dependencies and libraries.

Docker has emerged as one of the leading container products in the market, and we’re seeing it used everywhere. InfluxDB is often used to monitor metrics from containers, and the products in the TICK Stack (Telegraf, InfluxDB, Chronograf and Kapacitor) frequently run within containers themselves. In fact, if you’d like to check this out, there is a previous blog article that walks you through setting up an InfluxData Sandbox, which not only runs in containers but also collects metrics on your local system, the container environment and the InfluxDB database. This is a great way to get started with the product.

The Challenge

Sooner or later, you may find yourself running InfluxDB in a Docker container. Part of the appeal of containers is that they provide an isolated environment for an application. By design, when the app you’re running shuts down, the container it runs in shuts down too. That becomes a challenge when you want to restore an InfluxDB database that’s running in a container: to restore from backup, the InfluxDB instance needs to be stopped, but stopping the database also stops the container. So how do you get around this catch-22? That’s what I’ll be covering in this blog.

The Setup

In this section we create the Docker container running InfluxDB and load some test data into it.

I’ll walk through my entire setup to make it easy for you to reproduce these steps in your lab. There is an example Dockerfile for InfluxDB out on GitHub to assist setting up an InfluxDB instance within a Docker container. I’ll be using that to set up my database. I’ve also created a sample dataset stocks.txt, which I’ll be using for this example.

I modified the Dockerfile located in the influxdata-docker repository under influxdb, adding a few additional ports to expose and a COPY statement that copies the stocks.txt file from my local server into the Docker image. That way I’ll have some data to work with. Here is what my Dockerfile looks like:

FROM buildpack-deps:jessie-curl

RUN set -ex && \
    for key in \
        05CE15085FC09D18E99EFB22684A14CF2582E0C5 ; \
    do \
        gpg --keyserver ha.pool.sks-keyservers.net --recv-keys "$key" || \
        gpg --keyserver pgp.mit.edu --recv-keys "$key" || \
        gpg --keyserver keyserver.pgp.com --recv-keys "$key" ; \
    done

ENV INFLUXDB_VERSION 1.3.5
RUN wget -q https://dl.influxdata.com/influxdb/releases/influxdb_${INFLUXDB_VERSION}_amd64.deb.asc && \
    wget -q https://dl.influxdata.com/influxdb/releases/influxdb_${INFLUXDB_VERSION}_amd64.deb && \
    gpg --batch --verify influxdb_${INFLUXDB_VERSION}_amd64.deb.asc influxdb_${INFLUXDB_VERSION}_amd64.deb && \
    dpkg -i influxdb_${INFLUXDB_VERSION}_amd64.deb && \
    rm -f influxdb_${INFLUXDB_VERSION}_amd64.deb*
COPY influxdb.conf /etc/influxdb/influxdb.conf

EXPOSE 8086 8125/udp 8092/udp 8094

VOLUME /var/lib/influxdb

COPY stocks.txt /stocks.txt
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["influxd"]
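
The Dockerfile above copies three files from the build context: influxdb.conf, stocks.txt and entrypoint.sh. If any of them is missing, the build stops at the corresponding COPY step. A minimal sketch of a pre-build check follows; the function name check_build_context is made up for this example and is not part of Docker or of the influxdata-docker repository.

```shell
# Hypothetical helper: verify that the Dockerfile and every file it COPYs
# are present in the build context directory (default: current directory).
check_build_context() {
  local dir="${1:-.}"
  local missing=0
  local f
  for f in Dockerfile influxdb.conf stocks.txt entrypoint.sh; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

It could be chained in front of the build, for example: check_build_context . && docker build -t test_influxdb .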

Working from the directory where your Dockerfile is located, run the following to build an InfluxDB image.

$ docker build -t test_influxdb .

Once the build completes, list the Docker images; you should now see test_influxdb in the list.

$ docker images
REPOSITORY    TAG     IMAGE ID      CREATED      SIZE
test_influxdb latest  7b7def77ff12  5 mins ago   224MB

Start the container and open a bash shell into it. Note that two directories on your local server are mapped to directories within the container. The first is the InfluxDB data directory, and the second is where we will be backing up our data to.

$ export INFLUXDIR="$HOME/influxdb-test"
$ export BACKUPDIR="$HOME/backup-test"

$ CONTAINER_ID=$(docker run --rm \
  --detach \
  -v $INFLUXDIR:/var/lib/influxdb \
  -v $BACKUPDIR:/backups \
  -p 8086 \
  test_influxdb:latest
  )

$ docker exec -it "$CONTAINER_ID" /bin/bash

This should start a terminal session in the container where InfluxDB is running. You should see that the stocks.txt file which was copied into the container when we built it is in the root directory.

# ls -l stocks.txt
-rw-r--r-- 1 root root 3070 Jul 20 01:45 stocks.txt

Now let’s import it into the stocks database, which we’ll then back up.

# influx -import -path=stocks.txt -precision s

2017/09/25 18:58:36 Processed 1 commands
2017/09/25 18:58:36 Processed 38 inserts
2017/09/25 18:58:36 Failed 0 inserts

# influx -execute "select count(*) from stocks.autogen.stock_price"
name: stock_price
time count_high count_low count_open count_volume
---- ---------- --------- ---------- ------------
0    38         38        38        38
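
The contents of stocks.txt are not shown in this post. For readers who want to build a similar file, the sketch below writes a tiny file in the layout that influx -import expects: a DDL section that creates the database, then a DML section with a CONTEXT-DATABASE comment followed by line-protocol points. The measurement and field names match the count output above; the ticker tag, the values and the timestamps are made-up placeholders, not the contents of the real stocks.txt, and write_sample_import is a hypothetical helper name.

```shell
# Hypothetical helper: write a two-point sample import file.
# Timestamps are in seconds, to match `-precision s`.
# All values below are illustrative placeholders.
write_sample_import() {
  local out="${1:-sample_stocks.txt}"
  cat > "$out" <<'EOF'
# DDL
CREATE DATABASE stocks

# DML
# CONTEXT-DATABASE: stocks

stock_price,ticker=XYZ open=10.0,high=10.5,low=9.8,volume=1000i 1500000000
stock_price,ticker=XYZ open=10.2,high=10.9,low=10.1,volume=1200i 1500086400
EOF
}
```

Such a file could then be loaded the same way as above, for example: write_sample_import /sample.txt followed by influx -import -path=/sample.txt -precision s.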

The Process

Now that we have an environment to work with, let’s start by backing up our containerized InfluxDB instance. The backup goes to the host directory we mapped when we started the container, $HOME/backup-test. Below are the steps we’ll follow: first back up the database, then drop it, and finally restore it.

  1. Capture the container id, image name and the port used to communicate with InfluxDB in our container.
  2. Back up InfluxDB to the backup directory defined above when the Docker container was started.
  3. Drop the database from InfluxDB.
  4. Check to make sure the database is gone.
  5. Stop the docker container because InfluxDB must be stopped in order to run a restore.
  6. Run the restore command in an ephemeral container.
  7. Start the InfluxDB container.
  8. Query InfluxDB to show the database is there and the records have been restored.
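
The steps above can be sketched as a handful of shell functions. This is one possible arrangement under stated assumptions, not the only way to script it: it assumes INFLUXDIR and BACKUPDIR are exported as in the setup section, that the image is tagged test_influxdb:latest, and that the database is named stocks. The function names are made up for this example.

```shell
# Sketch of the backup / drop / restore cycle as shell functions.
# Assumes INFLUXDIR and BACKUPDIR are exported as in the setup section.
IMAGE="test_influxdb:latest"
DB="stocks"

# Steps 1 and 7: start InfluxDB in the background; prints the container id.
start_influx() {
  docker run --rm --detach \
    -v "$INFLUXDIR:/var/lib/influxdb" \
    -v "$BACKUPDIR:/backups" \
    -p 8086 "$IMAGE"
}

# Step 2: back up the database from the running container ($1 = container id).
backup_db() {
  docker exec "$1" influxd backup -database "$DB" "/backups/$DB.backup"
}

# Steps 3 and 4: drop the database, then list what remains ($1 = host port).
drop_db() {
  curl -s -XPOST "http://localhost:$1/query" --data-urlencode "q=DROP DATABASE $DB"
  curl -s -G "http://localhost:$1/query" --data-urlencode "q=SHOW DATABASES"
}

# Step 6: run the restore in a throwaway container that shares the same
# host directories, so influxd itself is not running during the restore.
restore_db() {
  docker run --rm --entrypoint /bin/bash \
    -v "$INFLUXDIR:/var/lib/influxdb" \
    -v "$BACKUPDIR:/backups" \
    "$IMAGE" \
    -c "influxd restore -metadir /var/lib/influxdb/meta -datadir /var/lib/influxdb/data -database $DB /backups/$DB.backup"
}
```

A typical sequence would be start_influx, backup_db, drop_db, docker stop on the container id (step 5), restore_db, and start_influx again before querying (step 8).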

The Details

  • First, capture the container id and the ephemeral port of the container.
$ CONTAINER_ID=$(docker ps | grep test_influxdb | cut -c 1-12)
$ PORT=$(docker port "$CONTAINER_ID" 8086 | cut -d: -f2)
  • Second, back up the stocks database to the /backups directory mapped into the container.
$ docker exec "$CONTAINER_ID" influxd backup -database stocks /backups/stocks.backup
  • Run a SHOW DATABASES query, then DROP the database, then run the SHOW DATABASES query again to show it has been dropped.
$ curl "http://localhost:${PORT}/query?q=SHOW+DATABASES"
{"results":[{"statement_id":0,"series":[{"name":"databases","columns":["name"],"values":[["_internal"],["stocks"]]}]}]}

$ curl -XPOST "http://localhost:${PORT}/query?q=DROP+DATABASE+stocks"
{"results":[{"statement_id":0}]}

$ curl "http://localhost:${PORT}/query?q=SHOW+DATABASES"
{"results":[{"statement_id":0,"series":[{"name":"databases","columns":["name"],"values":[["_internal"]]}]}]}
  • Stop the Docker container, which also stops the InfluxDB database.
$ docker stop "$CONTAINER_ID"
  • Run the restore command in an ephemeral container. The docker command below writes to the same host directory that was previously mounted at /var/lib/influxdb.
$ docker run --rm \
  --entrypoint /bin/bash \
  -v $INFLUXDIR:/var/lib/influxdb \
  -v $BACKUPDIR:/backups \
  test_influxdb:latest \
  -c "influxd restore -metadir /var/lib/influxdb/meta -datadir /var/lib/influxdb/data -database stocks /backups/stocks.backup"

Using metastore snapshot: /backups/stocks.backup/meta.00
Restoring from backup /backups/stocks.backup/stocks.*
unpacking /var/lib/influxdb/data/stocks/autogen/3/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/4/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/5/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/6/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/7/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/8/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/9/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/10/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/11/000000001-000000001.tsm
  • Start the container in the background as we had previously and show the restored database.
$ CONTAINER_ID=$(docker run --rm \
  --detach \
  -v $INFLUXDIR:/var/lib/influxdb \
  -v $BACKUPDIR:/backups \
  -p 8086 \
  test_influxdb:latest
  )
$ PORT=$(docker port "$CONTAINER_ID" 8086 | cut -d: -f2)
$ curl -G "http://localhost:${PORT}/query?pretty=true"  --data-urlencode "q=SHOW DATABASES"
{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "databases",
                    "columns": [
                        "name"
                    ],
                    "values": [
                        [
                            "_internal"
                        ],
                        [
                            "stocks"
                        ]
                    ]
                }
            ]
        }
    ]
}
  • Let's also do a count to show that all the records have been restored.
$ curl -G "http://localhost:${PORT}/query?pretty=true" --data-urlencode "db=stocks" --data-urlencode "q=SELECT count(*) FROM \"stock_price\""

{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "stock_price",
                    "columns": [
                        "time",
                        "count_high",
                        "count_low",
                        "count_open",
                        "count_volume"
                    ],
                    "values": [
                        [
                            "1970-01-01T00:00:00Z",
                            38,
                            38,
                            38,
                            38
                        ]
                    ]
                }
            ]
        }
    ]
}
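
Two details of the steps above can be fragile when scripted. First, docker port may print more than one line (for example, one IPv4 and one IPv6 binding), in which case cut -d: -f2 does not reliably yield the port. Second, curl can run before a freshly started influxd is accepting connections. The sketch below addresses both. The names host_port and wait_for_influx are hypothetical helpers, and the default of 30 attempts is an arbitrary choice; the readiness check relies on InfluxDB 1.x answering its /ping endpoint with HTTP 204.

```shell
# Extract the host port from `docker port` output read on stdin.
# Uses only the first line and keeps what follows the last colon.
host_port() {
  head -n 1 | sed 's/.*://'
}

# Poll /ping until InfluxDB answers 204 or the attempts run out.
# $1 = host port, $2 = max attempts (default 30, one second apart).
wait_for_influx() {
  local port="$1"
  local tries="${2:-30}"
  local code
  while [ "$tries" -gt 0 ]; do
    # A refused connection makes curl fail; treat that as "not ready yet".
    code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:$port/ping") || true
    if [ "$code" = "204" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

With these in place, the port lookup becomes PORT=$(docker port "$CONTAINER_ID" 8086 | host_port), and wait_for_influx "$PORT" can be run before the first query against a newly started container.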

Next Steps

I hope this has been useful to you. If you’re new to running InfluxDB in a Docker container, there are plenty of useful links in the setup section above. The official InfluxDB Docker repository is located here. Also, as mentioned previously, check out the InfluxData Sandbox. It’s a quick way to get started with the entire TICK Stack and to start collecting and visualizing metrics from your local system, your InfluxDB instance and the Docker environments that the TICK Stack is running in.

Acknowledgement

The above technique for restoring InfluxDB when it is running in a container was developed by Mark Rushakoff on the InfluxData engineering team.