Towards Data Science | Processing Time Series Data in Real-Time with InfluxDB and Structured Streaming

Publication: Towards Data Science
Title: Processing Time Series Data in Real-Time with InfluxDB and Structured Streaming
Author: Vibhor Nigam

Abstract: In this article, published by Towards Data Science, Vibhor Nigam shows how to use open source InfluxDB with Spark Structured Streaming to process, store, and visualize data in real time. He provides a detailed walkthrough for setting up a single-node instance of InfluxDB and for extending Spark's ForeachWriter to write to InfluxDB. The author also discusses what developers should keep in mind when working with an InfluxDB database. In his conclusion, he writes: “I found InfluxDB to be highly efficient in data storage and very easy to use. The compaction algorithms of InfluxDB are very powerful and compress data to almost half of it. In my data itself, I have seen compression resulting in a reduction from around 67GB to 35GB.”
