This is a beginner’s tutorial on writing static data in batches to InfluxDB using these three methods:

  • Uploading data via Chronograf
  • Importing directly into InfluxDB
  • Using Telegraf and the Tail plugin

Before beginning, make sure you’ve experienced Time to Awesome™ and installed the TICK Stack. Verify that the platform is running by executing:

brew services list

The repo for this tutorial is here. For the purposes of this tutorial, I used Bitcoin Historical Data, “BTC.csv” (33 MB), from Kaggle, spanning 2016-12-31 to 2018-06-17 at minute intervals with Unix timestamps (~760,000 points). The data looks like this:
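The CSV has a Unix `time` column plus OHLCV fields. A rough sketch of the layout (the values and column order below are invented for illustration; the column names match the ones used in the scripts later in this post):

```text
time,open,high,low,close,volume
1483228740,966.34,966.37,966.33,966.34,4.25
1483228800,966.43,966.58,966.43,966.58,12.84
```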

Before starting any project in the TICK Stack, I highly recommend converting your timestamp data to nanosecond (ns) precision. You can specify the precision when writing your data, but some plugins don’t support every timestamp precision. Taking a second to convert your timestamps will simplify your life. To convert the Unix timestamps from second (s) precision to ns (the data points arrive at minute intervals, but the epoch values themselves are in seconds), I used this hacky Timestamp_Precision.py:

import pandas as pd

# load the CSV
df = pd.read_csv('data/BTC.csv')
df.head()
# convert the second-precision Unix timestamps to nanosecond precision
# (appending nine zeros is equivalent to multiplying by 10^9)
df["time"] = [str(df["time"][t]) + "000000000" for t in range(len(df))]
df.head()
# export as csv
ns_precision = df
ns_precision.to_csv('data/BTC_ns.csv', index=False)
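As a side note, the list comprehension above works, but pandas can do the same conversion in one vectorized step. A minimal sketch (not from the original script; the two-row frame here is a hypothetical stand-in for data/BTC.csv):

```python
import pandas as pd

# hypothetical two-row frame standing in for data/BTC.csv
df = pd.DataFrame({"time": [1483228740, 1483228800]})

# appending nine zeros converts the second-precision Unix timestamps
# to nanosecond precision (equivalent to multiplying by 10^9)
df["time"] = df["time"].astype(str) + "000000000"

print(df["time"].tolist())
# → ['1483228740000000000', '1483228800000000000']
```

The vectorized form avoids a Python-level loop, which matters when you’re converting all ~760,000 rows.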

1. Uploading via Chronograf

Chronograf has two requirements for uploading data: the file must be no larger than 25 MB, and it must be written in line protocol. Since BTC.csv is 33 MB, I created BTC_sm.csv, which contains the first 100 rows of BTC.csv. First, I converted the timestamps in BTC_sm.csv to ns Unix time. Then I converted the data to line protocol in chronograf.txt. To convert BTC_sm_ns.csv to line protocol, I used csv_to_line.py:

import pandas as pd

# convert the sample CSV (with nanosecond-precision timestamps) to line protocol
df = pd.read_csv("data/BTC_sm_ns.csv")
lines = ["price"
         + ",type=BTC"
         + " "
         + "close=" + str(df["close"][d]) + ","
         + "high=" + str(df["high"][d]) + ","
         + "low=" + str(df["low"][d]) + ","
         + "open=" + str(df["open"][d]) + ","
         + "volume=" + str(df["volume"][d])
         + " " + str(df["time"][d]) for d in range(len(df))]
# write one point per line; the with block closes the file when done
with open('data/chronograf.txt', 'w') as thefile:
    for item in lines:
        thefile.write("%s\n" % item)

After this conversion, chronograf.txt looks like:
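Each line follows the line protocol pattern measurement,tag_set field_set timestamp. With the script above, a single line would look something like this (the numeric values here are invented for illustration):

```text
price,type=BTC close=966.34,high=966.37,low=966.33,open=966.34,volume=4.25 1483228740000000000
```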

Next, navigate to http://localhost:8888/. Click on the “Data Explorer” tab on the left sidebar. Create a database with CREATE DATABASE chronograf_upload.

Next, click the “Write Data” icon. Now you can just drag and drop your data in. It’s that simple!

To see if your data successfully uploaded, you can run the following command:

SELECT * FROM "chronograf_upload"."autogen"."price" LIMIT 10

Click on the “Table” tab on the bottom right of the page to see your populated data. Make sure to check that your timestamps have been interpreted correctly. Now you’re ready to explore Chronograf and Dashboarding!

2. Importing directly into InfluxDB

If your data size is larger than 25 MB, I recommend using this method because it’s incredibly easy. To import directly into InfluxDB:

  • Create import.txt with the following text:
# DDL
CREATE DATABASE import

# DML
# CONTEXT-DATABASE: import

The DDL creates a database named “import” with the default autogen retention policy. The DML specifies which database the points that follow should be written to, in case you’ve already created a database.

  • Convert BTC_ns.csv to line protocol (using csv_to_line.py) and append the line protocol to import.txt. It looks like:
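Putting the DDL/DML header and the line protocol together, import.txt should look roughly like this (the numeric values are invented for illustration):

```text
# DDL
CREATE DATABASE import

# DML
# CONTEXT-DATABASE: import

price,type=BTC close=966.34,high=966.37,low=966.33,open=966.34,volume=4.25 1483228740000000000
price,type=BTC close=966.58,high=966.58,low=966.43,open=966.43,volume=12.84 1483228800000000000
```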

  • Start InfluxDB with influxd
  • Run the CLI cmd
influx -import -path=yourfilepath -precision=ns

  • To verify that you successfully imported your data, start an influx CLI session and run:

Specifying the precision with precision rfc3339 converts the Unix timestamps to human-readable time.
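For example, a verification session might look like the following (an illustrative sketch; the exact query is up to you):

```shell
$ influx
> precision rfc3339
> USE import
> SELECT * FROM "price" LIMIT 10
```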

3. Using Telegraf and the Tail plugin

The Tail plugin is really meant for reading streaming data. For example, you could make continuous requests to an API and append the new data to your line protocol .txt file, and Telegraf with the Tail plugin would write the new points to InfluxDB. However, you can change the read settings in the configuration file (conf) for the tail input so that it reads a static file instead. We’ll get into that in a bit.

First, make sure that the BTC_ns.csv file has been converted to line protocol, tail.txt. This txt file should be identical in format to chronograf.txt, but with the full dataset (all 767,520 lines).

Next, you will need to create a telegraf.conf file. Navigate to the directory where you want this conf file to live. Next run this CLI cmd:

telegraf --input-filter tail --output-filter influxdb config > telegraf.conf

This cmd specifies the input and output filters that you want Telegraf to use. Feel free to name your conf something more descriptive too, like tail.conf, so you can remember what plugins you included in your conf later on.

Make these changes to the conf file:

  • Specify the precision of your data (line 65). For this example, our BTC data has been converted to ns precision, so: precision = "ns"
  • Since we’re not performing a monitoring task, we don’t care about setting a ‘host’ tag. Set omit_hostname = true
    so that Telegraf doesn’t set a ‘host’ tag.
  • Navigate to the OUTPUT PLUGIN section.
  • Specify server by uncommenting (line 92): urls = ["http://127.0.0.1:8086"] # required
  • Create a database (line 94): database = "tail" # required
  • Navigate to the SERVICES INPUT PLUGIN section.
  • Specify the absolute path for your line protocol txt file (line 210):
files = ["/Users/anaisdotis-georgiou/Desktop/Getting_Started_Telegraf/Data/tail.txt"]
  • Normally (with the Tail plugin), Telegraf doesn’t start writing to InfluxDB until data is appended to your .txt file, because the plugin’s primary purpose is writing streaming data. This behavior is controlled by the from_beginning option, which defaults to false so that you don’t write duplicate points. Since our data is static, change this option to true (line 212):
from_beginning = true
  • Specify the data format (last line):
data_format = "influx"
  • Navigate to the directory with your telegraf.conf. Run the following CLI cmd:
telegraf -config telegraf.conf
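Taken together, the edits above touch these sections of telegraf.conf. A condensed sketch (line numbers and exact layout may differ slightly between Telegraf versions; the file path is from my machine, so substitute your own):

```toml
[agent]
  precision = "ns"
  omit_hostname = true

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"] # required
  database = "tail" # required

[[inputs.tail]]
  files = ["/Users/anaisdotis-georgiou/Desktop/Getting_Started_Telegraf/Data/tail.txt"]
  from_beginning = true
  data_format = "influx"
```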

Start InfluxDB, open an influx CLI session, and verify that your new database “tail” exists and that you successfully wrote the data:
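As before, you can sanity-check with the influx CLI (an illustrative session; the count you see will depend on how much of the file Telegraf has read so far):

```shell
$ influx
> SHOW DATABASES
> USE tail
> SELECT COUNT("close") FROM "price"
```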

I hope this tutorial helps get you started using Telegraf. If you have any questions, please post them on the community site or tweet us @InfluxDB. Thanks!
