Getting Started with Python and InfluxDB v2.0

By Anais Dotis-Georgiou / Product, Use Cases, Developer, Getting Started
Dec 17, 2019

Navigate to:

(Update: InfluxDB 3.0 moved away from Flux and a built-in task engine. Users can use external tools, like Python-based Quix, to create tasks in InfluxDB 3.0.)

With 200+ plugins, Telegraf has a wide variety of data collection applications. However, sometimes you need to collect custom data or maybe you want to integrate external tools into your time data analysis. In that case, it makes sense to take advantage of InfluxDB’s Client Libraries. Today we will focus on how to use the latest InfluxDB Python Client Library with InfluxDB v2.0. If you are running InfluxDB v1.x, please take a look at this tutorial instead.

InfluxDB Python Client Library

The InfluxDB Python Client has undergone some pretty big improvements since v1.x. It is faster - much faster - and easier to use. It supports multiprocessing and allows you to return your query as a Pandas DataFrame. The WriteAPI supports synchronous, asynchronous and batching writes into InfluxDB v2.0. The WriteAPI also supports 4 different write options. The QueryAPI also supports several query options, with the ability to interrupt a query stream.

Requirements

This tutorial was executed on a MacOS system with Python 3 installed via Homebrew and Conda for Python 3.6 and Python 3.7, respectively. Python 3.7 is required to execute multiprocessing. I recommend setting up additional tooling like virtualenv, pyenv, or conda-env to simplify Python and Client installations. Otherwise, the full requirements can be found here.

Installation

To install the InfluxDB Python Client Library, simply run:

pip install influxdb-client

If you already have the client installed, you can upgrade it with:

pip3 install --upgrade influxdb-client

Gathering Auth Parameters

In order to use the client, you’ll need to gather the following parameters:

Bucket Name or ID:

Follow this documentation to create a bucket. To view your buckets either use the UI or execute

influx -t <your-token> -o <your-org> bucket find

Token:

Follow this documentation to create a token. To view your tokens either use the UI or execute

influx auth find

Org:

To view your orgs either use the UI or execute

influx org find

Imports and connections

Import the Client:

from influxdb_client import InfluxDBClient

Establish a Connection:

client = InfluxDBClient(url="http://localhost:9999", token=token, org=org)

Writes

For this tutorial, we will explore all the ways we can write data related to the water levels of Coyote Creek. Our schema will look like this:

Bucket: “my-bucket” Measurement: “h2o_feet” Tag Key: “location” Tag Value: “coyote_creek” Field Key: “water_level” Field Value: 1

First, instantiate the WriteAPI:

write_api = client.write_api()

The WriteAPI also supports 4 different write options: Line Protocol String, Line Protocol Bytes, Data Point Structure, and Dictionary Style.

Write Option One - Line Protocol String:

Line protocol is the ingest format for Influx. It is useful when you already have data that is converted into line protocol (e.g. txt files).

write_api.write("my-bucket", "my-org", ["h2o_feet,location=coyote_creek water_level=1"])

Write Option Two - Line Protocol Bytes:

This is line protocol that is already encoded with UTF-8 as byte array. The client converts line protocol string into this format internally.

write_api.write("my-bucket", "my-org", ["h2o_feet,location=coyote_creek water_level=1".encode()])

Write Option Three - Data Point structure:

This structure is useful when constructing line protocol in client code. The Point() class ensures that your data is properly serialized into line protocol.

write_api.write("my-bucket", "my-org", Point("h2o_feet").tag("location", "coyote_creek").field("water_level", 1).time(1))

Write Option Four - Dictionary Style:

This is the most Python compatible way to represent data. Dictionaries are converted back into Point() structure internally.

write_api.write("my-bucket", "my-org", [{"measurement": "h2o_feet", "tags": {"location": "coyote_creek"}, "fields": {"water_level": 1}, "time": 1}])

Now that we know about all the ways that we can write to the database, let’s explore how we execute those writes. We have a few tuning options available to us. We can specify synchronous and asynchronous writes. We can also tune the batching of our writes.

Specify Write Options: Synchronous Writes:

First you’ll need to import your write_option method with:

from influxdb_client.client.write_api import SYNCHRONOUS

To specify a synchronous write, simply instantiate the WriteAPI with that option as the parameter.

write_api = client.write_api(write_options=SYNCHRONOUS)

Specify Write Options: Asynchronous Writes:

First you’ll need to import your write_option method with:

from influxdb_client.client.write_api import ASYNCHRONOUS

To specify an asynchronous write, simply instantiate the WriteAPI with that option as the parameter.

write_api = client.write_api(write_options=ASYNCHRONOUS)

Specify Write Options: Batching

The default instance of WriteApi uses batching. You can specify the following batch parameters:

batch_size: the number of data points to collect in a batch
flush_interval: the number of milliseconds before the batch is written
jitter_interval: the number of milliseconds to increase the batch flush interval by a random amount
retry_interval: the number of milliseconds to retry unsuccessful write. The retry interval is used when the InfluxDB server does not specify "Retry-After" header

Please look at the glossary to learn more about the functions of these parameters.

Example of batching options:

write_api = client.write_api(write_options=WriteOptions(batch_size=500, flush_interval=10_000, jitter_interval=2_000, retry_interval=5_000))

Querying the database

You have two options when it comes to querying Influx. You can either query using a table structure or query using a stream. You can also interrupt a stream after you’ve returned your required data. However, in order to perform a query we must first instantiate the QueryAPI.

First, instantiate the QueryAPI:

query_api = client.query_api()

Generate a Flux query:

You must query your data with Flux. If you’ve never written a Flux query, I recommend exploring the following resources:

Training Video: Introduction to Flux and Data Scripting
Getting Started Documentation
Using the Flux Query Builder in the UI
Flux Spec

A Flux query for the data point we just wrote would look like:

query = ' from(bucket:"my-bucket")\
|> range(start: -10m)\
|> filter(fn:(r) => r._measurement == "h2o_level")\
|> filter(fn: (r) => r.location == "coyote_creek")\
|> filter(fn:(r) => r._field == "water_level" )'

We can then use the table structure to return our data as:

Query Option One - Table Structure:

result = client.query_api().query(org=my-org, query=query)

results = []
for table in result:
    for record in table.records:
        results.append((record.get_value(), record.get_field()))

print(results)
[(Water_level, 1)]

The Flux Object has the following methods for accessing your data:

.get_measurement()
.get_field()
.values.get("<your tags>")
.get_time()
.get_start()
.get_stop()
.get_measurement()

Query Option Two - Stream

records = client.query_api.query_stream(org= "my-org", query=query)

for record in records:
    print(f'Temperature in {record["location"]} is {record["_value"]}')

Temperature in coyote_creek is 1.

Script for writing and querying points to InfluxDB with the Python Client

Now that we understand all of our options, let’s look at what a full script might look like to write and query points to InfluxDB with the Python Client. You can find the repo with this script here.

from influxdb_client import InfluxDBClient

org = "my-org"
bucket = "my-bucket"
token = $my-token
query = 'from(bucket: "my-bucket")\
|> range(start: -10m)\
|> filter(fn: (r) => r._measurement == "h2o_level")\
|> filter(fn: (r) => r._field == "water_level")\
|> filter(fn: (r) => r.location == "coyote_creek")'

#establish a connection
client = InfluxDBClient(url="http://localhost:9999", token=token, org=org)

#instantiate the WriteAPI and QueryAPI
write_api = client.write_api()
query_api = client.query_api()

#create and write the point
p = Point("h2o_level").tag("location", "coyote_creek").field("water_level", 1)
write_api.write(bucket=bucket,org=org,record=p)
#return the table and print the result
result = client.query_api().query(org=org, query=query)
results = []
for table in result:
    for record in table.records:
        results.append((record.get_value(), record.get_field()))
print(results)

I hope this tutorial helps you get started with Influx. As always, if you run into hurdles, please share them on our community site or Slack channel. We’d love to get your feedback and help you with any problems you run into.

Navigate to:

Try InfluxDB Cloud

Stop flying blind

Getting Started with Python and InfluxDB v2.0

By Anais Dotis-Georgiou / Product, Use Cases, Developer, Getting Started
Dec 17, 2019

Navigate to:

InfluxDB Python Client Library

Requirements

Installation

Gathering Auth Parameters

Imports and connections

Writes

Querying the database

Script for writing and querying points to InfluxDB with the Python Client

Try InfluxDB Cloud

Learn more about InfluxDB

Performance Benchmarking: InfluxDB 3.0 vs. InfluxDB Open Source

InfluxDB for Industrial IoT:  
A Live Demonstration

How Time Series Databases and Data Lakes Work Together

Data Warehousing

Network Monitoring

Time Series Data Analysis: Definitions and Best Techniques in 2024

Navigate to:

Try InfluxDB Cloud

Stop flying blind

Get Updates

Getting Started with Python and InfluxDB v2.0

By Anais Dotis-Georgiou / Product, Use Cases, Developer, Getting Started Dec 17, 2019

Navigate to:

InfluxDB Python Client Library

Requirements

Installation

Gathering Auth Parameters

Imports and connections

Writes

Querying the database

Script for writing and querying points to InfluxDB with the Python Client

Try InfluxDB Cloud

Learn more about InfluxDB

Performance Benchmarking: InfluxDB 3.0 vs. InfluxDB Open Source

InfluxDB for Industrial IoT: A Live Demonstration

How Time Series Databases and Data Lakes Work Together

Data Warehousing

Network Monitoring

Time Series Data Analysis: Definitions and Best Techniques in 2024

By Anais Dotis-Georgiou / Product, Use Cases, Developer, Getting Started
Dec 17, 2019

InfluxDB for Industrial IoT:  
A Live Demonstration