<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>InfluxData Blog - Ed Bernier</title>
    <description>Posts by Ed Bernier on the InfluxData Blog</description>
    <link>https://www.influxdata.com/blog/author/ed-bernier/</link>
    <language>en-us</language>
    <lastBuildDate>Fri, 18 May 2018 06:00:35 -0700</lastBuildDate>
    <pubDate>Fri, 18 May 2018 06:00:35 -0700</pubDate>
    <ttl>1800</ttl>
    <item>
      <title>Boston Time Series Meetup at Wayfair</title>
      <description>&lt;p&gt;Last night was our second &lt;a href="https://www.meetup.com/Time-Series-boston/"&gt;Boston Time Series Meetup&lt;/a&gt; hosted by Wayfair.&lt;/p&gt;

&lt;p&gt;We had a turnout of around 45 people even with the severe thunderstorms that hit the Boston area a little before the event. &lt;a href="https://twitter.com/ryanbetts"&gt;Ryan Betts&lt;/a&gt;, Yaacov Ankori and I attended from InfluxData and a number of people from Wayfair attended as well. Wayfair provided a great space with 3 large screens and lots of seating room.&lt;/p&gt;

&lt;p&gt;The meetup started with Jim Hagan from Wayfair giving a talk entitled “The Four R’s of Metrics Delivery,” which described the Wayfair environment and the data pipeline they’ve built so far for metric collection, delivery, and use. Wayfair currently uses both Graphite and InfluxDB as time series platforms and sends a diverse set of event trackers, timers, and other system metrics from over 2,000 VMs running hundreds of applications. Their systems are spread across three major data centers, and the data is used by their developers, their business stakeholders, and their internal alerting engine. Most importantly, their 24x7 Ops Monitoring Center uses this data to constantly analyze the vital signs of Wayfair’s IT infrastructure and storefront operations.&lt;/p&gt;

&lt;p&gt;Across the 3 data centers, they use Kafka MirrorMaker to replicate the data to all 3 locations, and they run three 6-node InfluxDB Enterprise clusters, each dedicated to a different workload:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;Storefront metrics&lt;/li&gt;
 	&lt;li&gt;General monitoring of things like Kafka queues, containers, etc.&lt;/li&gt;
 	&lt;li&gt;All other application monitoring&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here is a copy of his presentation on &lt;a href="https://www.slideshare.net/influxdata/wayfair-use-case-the-four-rs-of-metrics-delivery"&gt;SlideShare&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Their InfluxData implementation is still a work in progress, and they are working towards building their next-generation pipeline, which will take advantage of Kafka and the Telegraf streaming service to create a more robust data topology. Here is the &lt;a href="https://tech.wayfair.com/2018/04/time-series-data-at-wayfair/"&gt;blog post&lt;/a&gt; where Jim Hagan describes their environment in more detail. He spoke for about an hour, and the audience was really engaged and asked a lot of questions.&lt;/p&gt;

&lt;p&gt;Next, we had Ryan Betts from the InfluxDB team give a talk on InfluxDB internals and what makes time series databases unique. It was an update to the talk he gave at InfluxDays NYC 2018; you can listen to the &lt;a href="https://www.youtube.com/watch?v=Vtcp-8MMVZ8"&gt;recording&lt;/a&gt; or download the &lt;a href="https://www.slideshare.net/influxdata/influxdb-internals"&gt;slides&lt;/a&gt;. Again, lots of great feedback.&lt;/p&gt;

&lt;p&gt;Overall a good event. The next meetup will be held on &lt;a href="https://www.meetup.com/Time-Series-Boston/events/249366642/"&gt;July 17, 2018 at Wayfair&lt;/a&gt; with a talk from Ben Bianchi of Wayfair who will discuss an exciting new application using time series data to provide real-time intelligence to mission-critical algorithms. Also joining the meetup will be Jacob Lisi from &lt;a href="https://grafana.com/"&gt;Grafana&lt;/a&gt; who will discuss how to most effectively monitor and visualize your &lt;a href="https://kubernetes.io/"&gt;Kubernetes&lt;/a&gt; cluster using the Grafana Kubernetes plugin and PromQL.&lt;/p&gt;

&lt;p&gt;Hopefully, for the next meetup, the weather will cooperate. Hope to see you there!&lt;/p&gt;
</description>
      <pubDate>Fri, 18 May 2018 06:00:35 -0700</pubDate>
      <link>https://www.influxdata.com/blog/boston-time-series-meetup-at-wayfair/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/boston-time-series-meetup-at-wayfair/</guid>
      <category>Use Cases</category>
      <author>Ed Bernier (InfluxData)</author>
    </item>
    <item>
      <title>The Effect of Cardinality on Data Ingest — Part 1</title>
      <description>&lt;p&gt;In my role as a Sales Engineer here at InfluxData, I get to talk to a lot of clients about how they’re using InfluxDB and the rest of the TICK Stack. We have a large number of very large clients using InfluxDB Enterprise for metrics collection, analysis, visualization and alerting in their DevOps area and so we’ve done a lot of scale out testing for these clients. In these tests, we see very linear scale out as we add additional nodes to an InfluxDB Enterprise Cluster. I’ll talk about this in my next blog post.&lt;/p&gt;

&lt;p&gt;Over the last 6 months, I’ve seen more and more large manufacturers, energy companies, and utilities coming to us for collecting metrics from their IoT devices. Many times, they’re working with consulting companies that specialize in building IoT solutions, and these companies bring InfluxDB into the solution because it’s so well-suited to time series applications.&lt;/p&gt;

&lt;p&gt;One thing I’ve noticed with these IoT applications is that there is often a need for a local instance of InfluxDB running in the factory, alerting locally on whatever they’re monitoring. In these cases, the equipment it has to run on is pretty lightweight, so it’s just as important to understand how we scale down as how we scale up. The other thing is that the cardinality of the data can be rather large relative to the amount of data to be ingested. So I thought I’d do some scale-down testing as well as measure the impact of cardinality on write throughput. That’s what this blog post is about. It’s the first in a series I’m doing on performance testing of InfluxDB Enterprise, so if you’re interested in this topic, stay tuned.&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;The Setup&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;For the purposes of this testing, I’ll be spinning up a cluster in AWS using some utilities we’ve built to make this easy. If you haven’t worked with InfluxData’s TICK Stack before, you’ll be surprised how easy it is to install and set up. In fact, one of my peers, David Simmons, wrote another post on that topic: &lt;a href="https://w2.influxdata.com/blog/zero-awesome-in-5-minutes/"&gt;Go from Zero to Awesome in 5 Minutes or Less&lt;/a&gt;. Check it out.&lt;/p&gt;

&lt;p&gt;For running InfluxDB on AWS, we’ve found that R4 instances, which are optimized for memory-intensive applications, work best. These also use SSD storage, which is recommended for your data, wal, and hh directories when running InfluxDB or InfluxDB Enterprise.&lt;/p&gt;
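
For reference, the directories in question are configured in influxdb.conf; here is a minimal sketch using the default paths, which you would place on SSD-backed volumes (the hinted-handoff section applies to InfluxDB Enterprise data nodes only):

```toml
[data]
  # TSM data files and the write-ahead log
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"

[hinted-handoff]
  # queued writes awaiting delivery to other cluster nodes (Enterprise only)
  dir = "/var/lib/influxdb/hh"
```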

&lt;p&gt;For the testing, I’ll be spinning up the following size clusters on AWS:&lt;/p&gt;
&lt;ol&gt;
 	&lt;li&gt;(2) nodes with 2 cores and 15.25 GB of memory (r4.large)&lt;/li&gt;
 	&lt;li&gt;(2) nodes with 4 cores and 30.5 GB of memory (r4.xlarge)&lt;/li&gt;
 	&lt;li&gt;(2) nodes with 8 cores and 61 GB of memory (r4.2xlarge)&lt;/li&gt;
 	&lt;li&gt;(2) nodes with 16 cores and 122 GB of memory (r4.4xlarge)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;And I’ll test these using data with the following cardinalities: 10,000, 100,000, and 1,000,000 series, to see how the number of unique series affects the ingest rate and the heap used.&lt;/p&gt;
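
Since inch derives series cardinality from the product of the per-tag unique-value counts passed to -t, tag shapes like the following would hit each target cardinality (these particular -t values are my illustration, not necessarily the exact ones used in the tests below):

```shell
# series cardinality = product of the unique values per tag given to -t
echo $((10*10*100))      # e.g. -t 10,10,100   yields 10,000 series
echo $((10*100*100))     # e.g. -t 10,100,100  yields 100,000 series
echo $((100*100*100))    # e.g. -t 100,100,100 yields 1,000,000 series
```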

&lt;p&gt;For Part Two of this series, I’ll also scale out to 4, 6, 8 and 10 node clusters and increase the cardinality to show how well InfluxDB Enterprise scales horizontally.&lt;/p&gt;

&lt;p&gt;To generate data for the testing with the correct granularity, I’ll be using a utility developed by one of our engineers called inch, which stands for &lt;strong&gt;IN&lt;/strong&gt;flux ben&lt;strong&gt;CH&lt;/strong&gt;marking. This is an awesome utility for simulating streaming data for benchmarking purposes. It’s written in Go and is available on GitHub at &lt;a href="https://github.com/influxdata/inch"&gt;https://github.com/influxdata/inch&lt;/a&gt;. If you type &lt;code&gt;inch -h&lt;/code&gt; you’ll get help on using the utility. I’ve listed the options below:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;Usage of inch:

-b int
Batch size (default 5000)

-c int
Concurrency (default 1)

-consistency string
Write consistency (default any) (default "any")

-db string
Database to write to (default "stress")

-delay duration
Delay between writes

-dry
Dry run (maximum writer perf of inch on box)

-f int
Fields per point (default 1)

-host string
Host (default "http://localhost:8086")

-m int
Measurements (default 1)

-max-errors int
Terminate process if this many errors encountered

-p int
Points per series (default 100)

-password string
Host Password

-report-host string
Host to send metrics

-report-password string
Password Host to send metrics

-report-tags string
Comma separated k=v tags to report alongside metrics

-report-user string
User for Host to send metrics

-shard-duration string
Set shard duration (default 7d) (default "7d")

-t string
Tag cardinality (default "10,10,10")

-target-latency duration
If set inch will attempt to adapt write delay to meet target

-time duration
Time span to spread writes over

-user string
Host User

-v
Verbose&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using inch, I’ll generate data from two client nodes running on AWS m4.2xlarge nodes which have 8 cores each and 32 GB of memory. I’ll be running 8 streams on each client for a total of 16 concurrent writers.&lt;/p&gt;

&lt;p&gt;The difference in performance was minimal when scaling up to 32 writers, so I decided not to include those numbers.&lt;/p&gt;

&lt;p&gt;In summary, I’ll use the following constants for my testing:&lt;/p&gt;
&lt;ul&gt;
 	&lt;li&gt;(2) m4.2xlarge nodes running 8 write streams each&lt;/li&gt;
 	&lt;li&gt;Batch size for writes = 10,000&lt;/li&gt;
 	&lt;li&gt;Consistency = ANY&lt;/li&gt;
 	&lt;li&gt;Replication Factor = 2&lt;/li&gt;
 	&lt;li&gt;Number of points to write per series = 100,000&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong&gt;The Testing&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;For this test, I’m only using 2-node clusters, which provide high availability, but since writes are replicated across both nodes in the cluster, I’m not testing horizontal scale-out. In fact, due to cluster overhead, performance here will be slightly lower than you’d expect from a single node of InfluxDB. Since most of our customers want high availability, and InfluxDB provides a very high ingest rate even on smaller servers, this is a common configuration we see.&lt;/p&gt;

&lt;p&gt;After spinning up the cluster on AWS, the first thing I did was create my database with a replication factor of 2. I called my database “stress” and used the CLI to create it:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;influx -execute 'create database stress with replication 2'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, I logged into my client nodes and entered the following inch commands to start generating my workload for the 10,000 unique series test:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;inch -v -c 8 -b 10000 -t 1,5000,1 -p 100000 -consistency any
inch -v -c 8 -b 10000 -t 1,1,5000 -p 100000 -consistency any&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now let me explain the command line options for the above inch commands. The -v tells inch to print detailed stats as it’s running, so I can see how many points have been written, the ingest rate, and other details about the test. The -c tells inch how many write streams to run concurrently; I’m running 8 on each client, so 16 concurrent write streams total. The -b sets the batch size; a batch size of 5,000 to 10,000 is recommended for InfluxDB, so I chose 10,000. The -t defines the shape of my data: in other words, the number of tags and how many unique values to generate for each tag. Client one generated 3 tags with the second tag having 5,000 unique values, and client two generated 3 tags with the third tag having 5,000 unique values, for a combined 10,000 unique series overall. The -p indicates how many points to generate per series, and the -consistency option sets my write consistency, which I set to any.&lt;/p&gt;

&lt;p&gt;Here is a sample of what the generated data looks like:&lt;/p&gt;

&lt;p&gt;&lt;img class="alignnone size-full wp-image-209924" src="/images/legacy-uploads/inch-stats.png" alt="" width="396" height="247" /&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;The Details&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Here are the results of my testing. As you can see, vertical scaling was very linear as I tested on systems with more cores. Increasing the cardinality definitely had an impact on the ingestion rate: there is a performance hit while new series are being created for the first time, but once all the series exist, ingestion performance levels off to the rates shown in the chart below.&lt;/p&gt;

&lt;p&gt;&lt;img class="alignnone size-full wp-image-209925" src="/images/legacy-uploads/ingest-rate-vs-series-cardinality.png" alt="" width="951" height="538" /&gt;&lt;/p&gt;
&lt;p class="p1"&gt;&lt;span style="font-family: 'Helvetica Neue',sans-serif;"&gt;I've also included the detailed numbers below:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;img class="alignnone size-full wp-image-209926" src="/images/legacy-uploads/ingest-rate-values.png" alt="" width="645" height="255" /&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;Observations&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;I was pleasantly surprised by how much data a cluster with 2-core nodes could handle, since many IoT use cases have minimally sized servers at the edge of the network, where there’s sometimes a need for some local storage, visualization, and alerting.&lt;/p&gt;

&lt;p&gt;I was also pleased to see how linear the vertical scaling was as cores were added and as the cardinality of the data was increased. The amount of memory needed also grew about 10x as the cardinality was increased 10x from 100,000 to 1,000,000 series; that kind of predictability is good when doing capacity planning for your InfluxDB Enterprise environment.&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;What's Next?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Stay tuned for Part 2 where I’ll test horizontal cluster scale out.&lt;/p&gt;

&lt;p&gt;If you’d also like to see some comparison benchmarks of InfluxDB vs. &lt;a href="https://w2.influxdata.com/resources/benchmarking-influxdb-vs-opentsdb-for-time-series-data-metrics-and-management/?ao_campid=70137000000JgNv"&gt;OpenTSDB&lt;/a&gt;, &lt;a href="https://w2.influxdata.com/resources/benchmarking-influxdb-vs-elasticsearch-for-time-series/?ao_campid=70137000000JgNk"&gt;Elasticsearch&lt;/a&gt;, &lt;a href="https://w2.influxdata.com/resources/benchmarking-influxdb-vs-cassandra-for-time-series-data-metrics-and-management/?ao_campid=70137000000JXPi"&gt;Cassandra&lt;/a&gt; or &lt;a href="https://w2.influxdata.com/resources/benchmarking-influxdb-vs-mongodb-for-time-series-data-metrics-and-management/?ao_campid=70137000000JgNw"&gt;MongoDB&lt;/a&gt;, check out those benchmark results.&lt;/p&gt;
</description>
      <pubDate>Fri, 01 Dec 2017 12:05:11 -0700</pubDate>
      <link>https://www.influxdata.com/blog/the-effect-of-cardinality-on-data-ingest-part-1/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/the-effect-of-cardinality-on-data-ingest-part-1/</guid>
      <category>Product</category>
      <category>Developer</category>
      <category>Company</category>
      <author>Ed Bernier (InfluxData)</author>
    </item>
    <item>
      <title>Backup/Restore of InfluxDB from/to Docker Containers</title>
      <description>&lt;p&gt;One of the more exciting developments of the last few years is the emergence of containers, which allow software to be deployed in a securely isolated environment packaged with all of its dependencies and libraries.&lt;/p&gt;

&lt;p&gt;Docker has emerged as one of the leading container products in the market, and we’re seeing containers used everywhere. InfluxDB is often monitoring metrics from containers, and the products in the TICK Stack (Telegraf, InfluxDB, Chronograf and Kapacitor) are frequently running within a container themselves. In fact, if you’d like to check this out, there is a previous blog article that walks you through setting up an &lt;a href="/blog/announcing-influxdata-sandbox/"&gt;InfluxData Sandbox&lt;/a&gt;, which not only runs in containers but also collects metrics from your local system, the container environment, and the InfluxDB database. This is a great way to get started with the product.&lt;/p&gt;

&lt;h2&gt;&lt;strong&gt;The Challenge&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;Inevitably, you may find yourself running InfluxDB in a Docker container. A big part of what people love about containers is that they provide an isolated environment to run in: when the app you’re running shuts down, by design, the container shuts down with it. The challenge comes when you want to restore an InfluxDB database that’s running in a container. To restore an InfluxDB database from backup, the instance needs to be stopped, and when you’re running InfluxDB in a container, stopping the database shuts the container down. So how do you get around this catch-22? That’s what I’ll be covering in this blog.&lt;/p&gt;

&lt;h2&gt;The Setup&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Creating the Docker container running InfluxDB and loading some test data into it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I’ll walk through my entire setup to make it easy for you to reproduce these steps in your lab. There is an example Dockerfile for InfluxDB on &lt;a href="https://github.com/influxdata/influxdata-docker"&gt;GitHub&lt;/a&gt; to assist in setting up an InfluxDB instance within a Docker container. I’ll be using that to set up my database. I’ve also created a sample dataset, &lt;a href="https://github.com/edbernier/test_data"&gt;stocks.txt&lt;/a&gt;, which I’ll be using for this example.&lt;/p&gt;

&lt;p&gt;I modified the Dockerfile located in the previously mentioned influxdata-docker repository under influxdb, adding a few additional ports to expose and a COPY statement that copies the stocks.txt file from my local server into the Docker container. That way, I’ll have some data to work with. Here is what my Dockerfile looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;FROM buildpack-deps:jessie-curl

RUN set -ex &amp;amp;&amp;amp; \
    for key in \
        05CE15085FC09D18E99EFB22684A14CF2582E0C5 ; \
    do \
        gpg --keyserver ha.pool.sks-keyservers.net --recv-keys "$key" || \
        gpg --keyserver pgp.mit.edu --recv-keys "$key" || \
        gpg --keyserver keyserver.pgp.com --recv-keys "$key" ; \
    done

ENV INFLUXDB_VERSION 1.3.5
RUN wget -q https://dl.influxdata.com/influxdb/releases/influxdb_${INFLUXDB_VERSION}_amd64.deb.asc &amp;amp;&amp;amp; \
    wget -q https://dl.influxdata.com/influxdb/releases/influxdb_${INFLUXDB_VERSION}_amd64.deb &amp;amp;&amp;amp; \
    gpg --batch --verify influxdb_${INFLUXDB_VERSION}_amd64.deb.asc influxdb_${INFLUXDB_VERSION}_amd64.deb &amp;amp;&amp;amp; \
    dpkg -i influxdb_${INFLUXDB_VERSION}_amd64.deb &amp;amp;&amp;amp; \
    rm -f influxdb_${INFLUXDB_VERSION}_amd64.deb*
COPY influxdb.conf /etc/influxdb/influxdb.conf

EXPOSE 8086 8125/udp 8092/udp 8094

VOLUME /var/lib/influxdb

COPY stocks.txt /stocks.txt
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["influxd"]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Working from the directory where your Dockerfile is located, do the following to build an InfluxDB container.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ docker build -t test_influxdb .&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once complete, you can list the Docker images, and you should now see the test_influxdb image listed.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ docker images
REPOSITORY    TAG     IMAGE ID      CREATED      SIZE
test_influxdb latest  7b7def77ff12  5 mins ago   224MB&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the container and open a bash shell into it. &lt;em&gt;Note: two directories located on your local server are mapped to directories within the container. The first one is the InfluxDB data directory, and the second is where we will back up our data.&lt;/em&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ export INFLUXDIR="$HOME/influxdb-test"
$ export BACKUPDIR="$HOME/backup-test"

$ CONTAINER_ID=$(docker run --rm \
  --detach \
  -v $INFLUXDIR:/var/lib/influxdb \
  -v $BACKUPDIR:/backups \
  -p 8086 \
  test_influxdb:latest
  )

$ docker exec -it "$CONTAINER_ID" /bin/bash&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This should start a terminal session in the container where InfluxDB is running. You should see that the stocks.txt file which was copied into the container when we built it is in the root directory.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;# ls -l stocks.txt
-rw-r--r-- 1 root root 3070 Jul 20 01:45 stocks.txt&lt;/code&gt;&lt;/pre&gt;
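
Before importing, it may help to know the shape of the file that influx -import expects: a # DDL section with database setup statements, followed by a # DML section of line-protocol points. Here is a hypothetical sketch of such a file (illustrative tickers and values, not the actual contents of stocks.txt); note the timestamps are in seconds to match the -precision s flag used below:

```text
# DDL
CREATE DATABASE stocks

# DML
# CONTEXT-DATABASE: stocks
stock_price,symbol=AAPL open=153.1,high=155.2,low=152.8,volume=27000000i 1506297600
stock_price,symbol=MSFT open=73.9,high=74.4,low=73.6,volume=14100000i 1506297600
```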
&lt;p&gt;Now let’s import it into the stocks database, which we’ll then backup.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;# influx -import -path=stocks.txt -precision s

2017/09/25 18:58:36 Processed 1 commands
2017/09/25 18:58:36 Processed 38 inserts
2017/09/25 18:58:36 Failed 0 inserts

# influx -execute "select count(*) from stocks.autogen.stock_price"
name: stock_price
time count_high count_low count_open count_volume
---- ---------- --------- ---------- ------------
0    38         38        38        38&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;The Process&lt;/h2&gt;
&lt;p&gt;Now that we have an environment to work with, let’s start by backing up our containerized InfluxDB instance. This will back up to the directory we mapped when we started the container, &lt;code&gt;$HOME/backup-test&lt;/code&gt;. Below are the steps we’ll follow to first back up our database, then drop the database, and finally restore the database that was dropped.&lt;/p&gt;
&lt;ol&gt;
 	&lt;li&gt;Capture the container ID and the port used to communicate with InfluxDB in our container.&lt;/li&gt;
 	&lt;li&gt;Back up InfluxDB to the backup directory defined above when the Docker container was started.&lt;/li&gt;
 	&lt;li&gt;Drop the database from InfluxDB.&lt;/li&gt;
 	&lt;li&gt;Check to make sure the database is gone.&lt;/li&gt;
 	&lt;li&gt;Stop the Docker container, because InfluxDB must be stopped in order to run a restore.&lt;/li&gt;
 	&lt;li&gt;Run the restore command in an ephemeral container.&lt;/li&gt;
 	&lt;li&gt;Start the InfluxDB container.&lt;/li&gt;
 	&lt;li&gt;Query InfluxDB to show the database is there and the records have been restored.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;The Details&lt;/h2&gt;
&lt;ul&gt;
 	&lt;li&gt;First, capture the container id and the ephemeral port of the container.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ CONTAINER_ID=$(docker ps | grep test_influxdb | cut -c 1-12)
$ PORT=$(docker port "$CONTAINER_ID" 8086 | cut -d: -f2)&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
 	&lt;li&gt;Second, back up the stocks database.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ docker exec "$CONTAINER_ID" influxd backup -database stocks /backups/stocks.backup&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
 	&lt;li&gt;Run a &lt;code&gt;SHOW DATABASES&lt;/code&gt; query, then &lt;code&gt;DROP&lt;/code&gt; the database, then run the &lt;code&gt;SHOW DATABASES&lt;/code&gt; query again to show it has been dropped.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ curl "http://localhost:${PORT}/query?q=SHOW+DATABASES"
{"results":[{"statement_id":0,"series":[{"name":"databases","columns":["name"],"values":[["_internal"],["stocks"]]}]}]}

$ curl -XPOST "http://localhost:${PORT}/query?q=DROP+DATABASE+stocks"
{"results":[{"statement_id":0}]}

$ curl "http://localhost:${PORT}/query?q=SHOW+DATABASES"
{"results":[{"statement_id":0,"series":[{"name":"databases","columns":["name"],"values":[["_internal"]]}]}]}&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
 	&lt;li&gt;Stop the Docker container, which will stop the InfluxDB database.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ docker stop "$CONTAINER_ID"&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
 	&lt;li&gt;Run the restore command in an ephemeral container. The docker command below operates on the previously mounted volume mapped to /var/lib/influxdb.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ docker run --rm \
  --entrypoint /bin/bash \
  -v $INFLUXDIR:/var/lib/influxdb \
  -v $BACKUPDIR:/backups \
  test_influxdb:latest \
  -c "influxd restore -metadir /var/lib/influxdb/meta -datadir /var/lib/influxdb/data -database stocks /backups/stocks.backup"

Using metastore snapshot: /backups/stocks.backup/meta.00
Restoring from backup /backups/stocks.backup/stocks.*
unpacking /var/lib/influxdb/data/stocks/autogen/3/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/4/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/5/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/6/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/7/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/8/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/9/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/10/000000001-000000001.tsm
unpacking /var/lib/influxdb/data/stocks/autogen/11/000000001-000000001.tsm&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
 	&lt;li&gt;Start the container in the background as we did previously, and show the restored database.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ CONTAINER_ID=$(docker run --rm \
  --detach \
  -v $INFLUXDIR:/var/lib/influxdb \
  -v $BACKUPDIR:/backups \
  -p 8086 \
  test_influxdb:latest
  )
$ PORT=$(docker port "$CONTAINER_ID" 8086 | cut -d: -f2)
$ curl -G "http://localhost:${PORT}/query?pretty=true"  --data-urlencode "q=SHOW DATABASES"
{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "databases",
                    "columns": [
                        "name"
                    ],
                    "values": [
                        [
                            "_internal"
                        ],
                        [
                            "stocks"
                        ]
                    ]
                }
            ]
        }
    ]
}&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
 	&lt;li&gt;Let's also do a count to show that all the records have been restored.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class="language-markup"&gt;$ curl -G "http://localhost:${PORT}/query?pretty=true" --data-urlencode "db=stocks" --data-urlencode "q=SELECT count(*) FROM \"stock_price\""

{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "stock_price",
                    "columns": [
                        "time",
                        "count_high",
                        "count_low",
                        "count_open",
                        "count_volume"
                    ],
                    "values": [
                        [
                            "1970-01-01T00:00:00Z",
                            38,
                            38,
                            38,
                            38
                        ]
                    ]
                }
            ]
        }
    ]
}&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Next Steps&lt;/h3&gt;

&lt;p&gt;I hope this has been useful to you. If you’re new to running InfluxDB in a Docker container, there were plenty of useful links in the setup section above. The official InfluxDB docker repository is located &lt;a href="https://hub.docker.com/_/influxdb/"&gt;here&lt;/a&gt;. Also, as mentioned previously, check out the &lt;a href="https://github.com/influxdata/sandbox"&gt;InfluxData Sandbox&lt;/a&gt;. It’s a quick way to get started with the entire TICK Stack and very quickly start collecting and visualizing metrics from your local system, your InfluxDB instance and the docker environments that the TICK Stack is running in.&lt;/p&gt;

&lt;h3&gt;Acknowledgement&lt;/h3&gt;

&lt;p&gt;The above technique for restoring InfluxDB when it is running in a container was developed by &lt;a href="https://github.com/mark-rushakoff"&gt;Mark Rushakoff&lt;/a&gt; on the InfluxData engineering team.&lt;/p&gt;
</description>
      <pubDate>Wed, 27 Sep 2017 04:00:10 -0700</pubDate>
      <link>https://www.influxdata.com/blog/backuprestore-of-influxdb-fromto-docker-containers/</link>
      <guid isPermaLink="true">https://www.influxdata.com/blog/backuprestore-of-influxdb-fromto-docker-containers/</guid>
      <category>Product</category>
      <category>Use Cases</category>
      <category>Developer</category>
      <author>Ed Bernier (InfluxData)</author>
    </item>
  </channel>
</rss>
