Webinar Recap: Saving the Holidays with Quix and InfluxDB: The Open Telemetry Anomaly Detection Story

Just in time for your holiday viewing! Learn how to solve real-time time series processing challenges with Quix—the stream processing framework using Kafka and Python—and purpose-built time series database InfluxDB. “Saving the Holidays with Quix and InfluxDB: The Open Telemetry Anomaly Detection Story,” presented by Tun Shwe, VP of Data at Quix, and Jay Clifford, Developer Advocate for InfluxDB, also includes a live demo of a data pipeline that detects anomalies in a fictional factory setup.

Main topics covered

  • Introduction to Quix and InfluxDB
  • The concept of data plumbing and event streaming
  • Overview of time series data
  • Detecting anomalies in a factory setup
  • The concept of observability and OpenTelemetry
  • Q&A session

Takeaways

Save the holidays with effective data management

The webinar illustrated that Quix and InfluxDB work in tandem to manage and analyze data effectively, even during the peak volumes and velocities associated with seasonal holiday demands. Quix handles real-time data processing, while InfluxDB provides long-term data storage; together, they form a comprehensive data management solution.
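
As a rough illustration of the storage side of that pairing (not shown in the webinar), the sketch below writes a processed reading to an InfluxDB 3.0 database and queries it back with SQL using the influxdb3-python client; the host, token, database, and measurement names are placeholders.

```python
from influxdb_client_3 import InfluxDBClient3, Point

# Placeholder connection details for an InfluxDB 3.0 instance.
client = InfluxDBClient3(
    host="https://us-east-1-1.aws.cloud2.influxdata.com",
    token="MY_TOKEN",
    database="factory",
)

# Write one processed vibration reading as a point.
point = (
    Point("machine_vibration")
    .tag("machine_id", "robot-arm-1")
    .field("rms", 0.42)
)
client.write(point)

# Query recent readings back with SQL for dashboards or anomaly scoring.
table = client.query(
    "SELECT * FROM machine_vibration WHERE time >= now() - INTERVAL '1 hour'",
    language="sql",
)
print(table.to_pandas())
```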

Quix is an open source, cloud-native library designed to process data in Kafka using Python. “Quix is a processor; it’s a transformer,” said Shwe. As such, Quix helps with the ingestion, transformation, and storage of machine data. Machine learning models help detect potential malfunctions, laying the foundation for a scalable data pipeline. InfluxDB is purpose-built for the high write and query loads associated with time series data and pipelines of this nature.
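
To make that concrete, here is a minimal Quix Streams sketch, assuming a Kafka broker and JSON-encoded machine telemetry on an input topic; the broker address, topic names, and payload fields are illustrative rather than taken from the demo, and the exact API may differ between library versions.

```python
from quixstreams import Application

# Connect to a Kafka broker (address and consumer group are placeholders).
app = Application(broker_address="localhost:9092", consumer_group="vibration-processor")

input_topic = app.topic("machine-telemetry", value_deserializer="json")
output_topic = app.topic("machine-telemetry-enriched", value_serializer="json")

sdf = app.dataframe(input_topic)

# Example transformation: derive an RMS vibration feature from raw samples.
sdf = sdf.apply(
    lambda row: {
        **row,
        "vibration_rms": (sum(s * s for s in row["samples"]) / len(row["samples"])) ** 0.5,
    }
)

# Keep only readings above an illustrative threshold, then publish downstream.
sdf = sdf.filter(lambda row: row["vibration_rms"] > 0.1)
sdf.to_topic(output_topic)

if __name__ == "__main__":
    app.run(sdf)
```

In the webinar's pipeline, the records arriving on the input topic came from the factory's robot arms via HiveMQ and MQTT, and the transformed stream was written on to InfluxDB.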

Open source technologies shape the future of data processing

Another point discussed was the importance of open source. InfluxDB and Quix are built on open source technologies and benefit from community contributions. InfluxDB 3.0’s architecture includes the open source technologies Apache Arrow Flight, DataFusion, Apache Arrow, and Apache Parquet, which together make up the FDAP stack. Clifford explained, “[DataFusion and Apache Arrow] give us millisecond return times on your queries. And then we have Parquet as our open data format for storage.” Shwe emphasized Quix’s continued commitment to the Python developer community and its efforts to grow that community and its open source contributions. Although there is no open source version of Quix Cloud at this time, Shwe said, “The more people we have shouting about it and sharing … the sooner we can work on that.”

They also highlighted the integration of other open source platforms, such as Grafana and HiveMQ, into the data pipeline. This integration demonstrated the interoperability and flexibility that open source tools bring to customized data solutions.

OpenTelemetry is a powerful tool

The webinar featured OpenTelemetry, an open source tool that aids in monitoring data pipelines by collecting traces, metrics, and logs from applications. Clifford explained that OpenTelemetry is more than just a standard: it is a toolkit for orchestrating data collection, processing, and export. “OpenTelemetry is the unification,” he said, referring to how the tool brings logs, metrics, and traces together across various systems.

Using OpenTelemetry, developers can gain better visibility and control over their data pipelines, ensuring smooth operations and reliable data processing.
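
As a small, tracing-only illustration (not taken from the demo), the sketch below uses the OpenTelemetry Python SDK to wrap a pipeline step in a span and export it to the console; in a real deployment the exporter would typically be OTLP pointed at a collector, with metrics and logs instrumented alongside traces.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Configure a tracer provider that prints finished spans to the console.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("factory.pipeline")

def process_batch(readings):
    # Wrap a pipeline step in a span so its latency and attributes are traced.
    with tracer.start_as_current_span("process_batch") as span:
        span.set_attribute("batch.size", len(readings))
        # ... transformation / anomaly scoring would happen here ...
        return readings

process_batch([{"machine": "robot-arm-1", "rms": 0.42}])
```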

Insights surfaced

  • Used in conjunction, Quix and InfluxDB can build a real-time data processing pipeline.
  • Quix allows for real-time data ingestion and transformation, while InfluxDB provides a time series database for storing and querying data.
  • The pipeline demonstrated in the session uses an MQTT broker to feed machine data into Quix, which writes the data to InfluxDB. Then, the data is queried, a machine learning model processes it for anomaly detection, and the results are stored back in InfluxDB.
  • OpenTelemetry can monitor the data pipeline and provide insights into its performance.
  • The anomaly detection model used in the demo is an autoencoder neural network trained to learn a compressed representation of the input data; a rough sketch of the idea follows this list.
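
The demo's model code wasn't covered in detail, so the following self-contained sketch only illustrates the reconstruction-error idea: a small autoencoder (here, scikit-learn's MLPRegressor trained to reproduce its input through a narrow bottleneck) learns normal vibration features from synthetic data, and readings it reconstructs poorly are flagged as anomalies. All feature names, values, and thresholds are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(42)

# Synthetic "normal" vibration feature windows (e.g. RMS, peak, kurtosis, mean).
normal = rng.normal(loc=[0.4, 1.0, 3.0, 0.0], scale=0.05, size=(500, 4))

# An autoencoder is a network trained to reproduce its input through a narrow
# bottleneck; MLPRegressor with a small hidden layer works for illustration.
autoencoder = MLPRegressor(hidden_layer_sizes=(8, 2, 8), max_iter=3000, random_state=0)
autoencoder.fit(normal, normal)

def reconstruction_error(model, X):
    # Mean squared error between the input and its reconstruction, per sample.
    return np.mean((model.predict(X) - X) ** 2, axis=1)

# Flag anything whose error is far above what was seen on normal data.
train_err = reconstruction_error(autoencoder, normal)
threshold = train_err.mean() + 3 * train_err.std()

anomalous = rng.normal(loc=[0.9, 2.5, 6.0, 0.5], scale=0.05, size=(5, 4))
print(reconstruction_error(autoencoder, anomalous) > threshold)  # expect mostly True
```

Because the model only ever sees normal operation during training, anything it cannot reproduce well, such as unusual vibration from a failing machine, shows up as a high reconstruction error.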

Key quotes

  • “I’m driven to create a new normal where data is processed as soon as it is generated.” — Tun Shwe
  • “I’m driven to make observability and IoT solutions accessible to all.” — Jay Clifford
  • “What we’ve done … is we’ve taken all that data from those robot arms, and thanks to HiveMQ and through MQTT, we’ve been able to ingest, transform, and store that machine data.” — Tun Shwe
  • “We’ve deployed an initial machine learning model to detect these potential malfunctions using vibration data from the machines.” — Jay Clifford
  • “We’ve provided the foundations of a scalable data pipeline.” — Tun Shwe
  • “This is where Quix comes into its own. And I think where people really excel with Quix is where they see it as more than just a collection agent, but also something where you can build out a task engine, build out data enrichment.” — Jay Clifford

Watch Tun and Jay’s discussion and demo here: