This is a beginner’s tutorial on how to write static data in batches to InfluxDB using these three methods:
- Uploading data via Chronograf
- Importing directly into InfluxDB
- Using Telegraf and the Tail plugin
The repo for this tutorial is here. For the purposes of this tutorial, I used Bitcoin Historical Data, “BTC.csv” (33 MB), from Kaggle, spanning 2016-12-31 to 2018-06-17 with minute resolution in Unix time (~760,000 points). The data looks like this:
Before starting any project in the TICK Stack, I highly recommend converting your timestamp data to nanosecond (ns) precision. You can specify the precision when writing your data, but some plugins don’t support every type of timestamp, so taking a second to convert your timestamps up front will simplify your life. To convert the timestamps from second (s) precision to ns, I used this hacky script:
```python
import pandas as pd

# import csv
df = pd.read_csv('data/BTC.csv')
df.head()

# convert second precision to nanosecond precision
df["time"] = [str(df["time"][t]) + "000000000" for t in range(len(df))]
df.head()

# export as csv
ns_precision = df
ns_precision.to_csv('data/BTC_ns.csv', index=False)
```
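The same conversion can also be done as a single vectorized operation, which is much faster on ~760,000 rows. A minimal sketch, using a tiny stand-in DataFrame in place of BTC.csv:

```python
import pandas as pd

# stand-in for the real BTC.csv "time" column (second-precision Unix timestamps)
df = pd.DataFrame({"time": [1483228740, 1483228800]})

# appending nine zeros converts seconds to nanoseconds in one vectorized step
df["time"] = df["time"].astype(str) + "000000000"

print(df["time"].tolist())  # ['1483228740000000000', '1483228800000000000']
```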
1. Uploading via Chronograf
Chronograf has two requirements for uploading data: the file must be no bigger than 25 MB and must be written in line protocol. Since BTC.csv is 33 MB, I created BTC_sm.csv, which contains the first 100 rows of BTC.csv. First, I converted the timestamps in BTC_sm.csv to ns Unix time. Then I converted the data to line protocol in chronograf.txt. To convert BTC_sm_ns.csv to line protocol, I used this script:
```python
import pandas as pd

# convert sample data to line protocol (with nanosecond precision)
df = pd.read_csv("data/BTC_sm_ns.csv")

lines = ["price"
         + ",type=BTC"
         + " "
         + "close=" + str(df["close"][d]) + ","
         + "high=" + str(df["high"][d]) + ","
         + "low=" + str(df["low"][d]) + ","
         + "open=" + str(df["open"][d]) + ","
         + "volume=" + str(df["volume"][d])
         + " " + str(df["time"][d])
         for d in range(len(df))]

thefile = open('data/chronograf.txt', 'w')
for item in lines:
    thefile.write("%s\n" % item)
thefile.close()
```

After this conversion, chronograf.txt looks like:
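Each line of line protocol has the shape measurement,tag_set field_set timestamp. The field values below are illustrative rather than copied from the dataset, but the structure matches what the script produces:

```
price,type=BTC close=963.74,high=963.75,low=963.73,open=963.75,volume=0.237 1483228740000000000
price,type=BTC close=963.75,high=963.75,low=963.74,open=963.74,volume=0.442 1483228800000000000
```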
Next, navigate to http://localhost:8888/. Click on the “Data Explorer” tab on the left sidebar. Create a database with
CREATE DATABASE chronograf_upload.
Next, click the “Write Data” icon. Now you can just drag and drop your data in. It’s that simple!
To see if your data successfully uploaded, you can run the following command:
SELECT * FROM "chronograf_upload"."autogen"."price" LIMIT 10
Click on the “Table” tab on the bottom right of the page to see your populated data. Make sure to check that your timestamps have been interpreted correctly. Now you’re ready to explore Chronograf and Dashboarding!
2. Importing directly into InfluxDB
If your data size is larger than 25 MB, I recommend using this method because it’s incredibly easy. To import directly into InfluxDB:
- Create a text file, import.txt, with the following text:
```
# DDL
CREATE DATABASE import

# DML
# CONTEXT-DATABASE: import
```
The DDL creates an autogen database named “import”. The DML specifies which database to use if you’ve already created a database.
- Convert BTC_ns.csv to line protocol (using csv_to_line.py) and append the line protocol to import.txt. It looks like:
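With the line protocol appended, import.txt ends up looking roughly like this (field values illustrative):

```
# DDL
CREATE DATABASE import

# DML
# CONTEXT-DATABASE: import

price,type=BTC close=963.74,high=963.75,low=963.73,open=963.75,volume=0.237 1483228740000000000
price,type=BTC close=963.75,high=963.75,low=963.74,open=963.74,volume=0.442 1483228800000000000
```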
- Start InfluxDB with
- Run the CLI cmd
influx -import -path=yourfilepath -precision=ns
- To verify that you successfully imported your data, start an instance of influx and run:
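A minimal verification session in the influx shell might look like this (the measurement name price comes from the conversion script):

```
$ influx
> precision rfc3339
> USE import
> SELECT * FROM "price" LIMIT 10
```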
Specifying the precision with precision rfc3339 converts the Unix timestamps to human-readable time.
3. Using Telegraf and the Tail plugin
The tail plugin is really meant to read streaming data. For example, you could make continuous requests to an API and append the new data to your line_protocol.txt file, and Telegraf with the Tail plugin would write the new points. However, you can change the read settings in the configuration file (conf) for the tail input so that it can read a static file. We’ll get into that in a bit.
First, make sure that the BTC_ns.csv file has been converted to line protocol, tail.txt. This txt file should be identical in format to chronograf.txt, but with the full dataset (all 767,520 lines).
Next, you will need to create a telegraf.conf file. Navigate to the directory where you want this conf file to live, and run this CLI cmd:
telegraf --input-filter tail --output-filter influxdb config > telegraf.conf
This cmd specifies the input and output filters that you want Telegraf to use. Feel free to give your conf a more descriptive name, like tail.conf, so you can remember later on which plugins your conf includes.
Make these changes to the conf file:
- Specify the precision of your data (line 65). For this example, our BTC data has been converted to ns precision, so:
precision = "ns"
- Since we’re not performing a monitoring task, we don’t care about setting a ‘host’ tag. Set omit_hostname = true so that Telegraf doesn’t set one.
- Navigate to the OUTPUT PLUGIN section.
- Specify server by uncommenting (line 92):
urls = ["http://127.0.0.1:8086"] # required
- Create a database (line 94):
database = "tail" # required
- Navigate to the SERVICES INPUT PLUGIN section.
- Specify the absolute path for your line protocol txt file (line 210):
files = ["/Users/anaisdotis-georgiou/Desktop/Getting_Started_Telegraf/Data/tail.txt"]
- Normally (with the Tail plugin), Telegraf doesn’t start writing to InfluxDB until data has been appended to your .txt file, because the Tail plugin’s primary purpose is to write streaming data to InfluxDB. This is determined by the from_beginning setting, which is normally false so that you don’t write duplicate points. Since our data is static, change this setting to true (line 212):
from_beginning = true
- Specify the input data format (last line):
data_format = "influx"
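Taken together, the edited portions of telegraf.conf end up looking roughly like this sketch (the tail.txt path is a placeholder; substitute your own absolute path, and your line numbers may differ):

```
[agent]
  precision = "ns"
  omit_hostname = true

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"] # required
  database = "tail" # required

[[inputs.tail]]
  files = ["/path/to/tail.txt"]
  from_beginning = true
  data_format = "influx"
```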
- Navigate to the directory with your telegraf.conf and run the following CLI cmd:
telegraf -config telegraf.conf
Start InfluxDB, open an influx session, and verify that your new database “tail” exists and that you successfully wrote the data:
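For example, a quick check in the influx shell:

```
$ influx
> SHOW DATABASES
> USE tail
> SELECT COUNT("close") FROM "price"
```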
I hope this tutorial helps get you started using Telegraf. If you have any questions, please post them on the community site or tweet us @InfluxDB. Thanks!