Use Cases for Time-Series Data

OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering their users to provision resources through a web interface.

Many open source projects power OpenStack, including the core services Swift (object storage), Keystone (identity), Nova (compute), Neutron (networking), Cinder (block storage), and Glance (image service). OpenStack has quickly evolved into a platform of choice for many enterprises building private or hybrid clouds, even with the dominance of public clouds.

Why are companies building OpenStack solutions?

Many organizations don’t want to lock themselves into a single vendor, regardless of how good or bad that vendor is. There are also many cases in which controlling the infrastructure gives you a business advantage in controlling your product margins: you can use specialized infrastructure that is better optimized for your workloads and customers than general-purpose infrastructure.
Having said that, the most important point is that OpenStack is not a product and shouldn’t be measured as such. It’s an ecosystem with strong foundations behind it.

When there is a common baseline of infrastructure, it’s much easier for an entire industry to plug into it and support it individually. This applies to all layers of the stack, from storage and networking through higher-level services such as big data services and even analytics.

In this context, the big cost saving is that all the major infrastructure providers have been adding support for OpenStack, which drives the cost down for three reasons:

  • The complexity of supporting a common infrastructure is, by definition, lower than that of supporting diverse infrastructures that have no common ground.
  • OpenStack has become a marketplace for all the data center providers, increasing competition and in this way driving down the cost of each provider.
  • A common platform lowers the barrier to entry for more players, and as a result we’re starting to see more startups providing new products and offering the ability to plug into OpenStack.

Challenges in building OpenStack solutions

Even though there is wide adoption, the maturity of OpenStack is often debated because it is powered primarily by open source projects. Key challenges in building OpenStack management solutions include:

  • The need for cross domain technical expertise
  • Robust, enterprise-grade open source software is hard to maintain without support from multiple vendors
  • Because it’s very much middleware-driven, most solutions have to be custom integrated
  • Steep learning curve

Due to these challenges, every enterprise adopter wants a solution with full visibility into, and control over, the various layers of OpenStack. However, collecting metrics data from many distributed components can be a challenge without a big ecosystem. Scaling the backend to “datacenter level” requires very high concurrency and storage in the range of billions of data points.

InfluxData for OpenStack

InfluxDB has become synonymous with OpenStack deployments, with both major platform and services vendors using it as an integral part of their OpenStack solutions. This is because of targeted features like:

  • Telegraf – 40+ metric collection plugins for applications (Apache, Memcached, Redis, HAProxy, Nginx, etc.), databases (PostgreSQL, MySQL, Cassandra, MongoDB, etc.), infrastructure (system stats, proc stats, Docker, etc.), sensors (collectd, Sensu, etc.), and networks (UDP, netstats, etc.)
  • Seamless integration with third-party native metric collectors like collectd, cAdvisor (Docker), Perfmon, StatsD, Prometheus, etc.
  • Chronograf and Grafana visualization options for custom dashboards
  • Ability to support collecting metrics from 1000s of servers and network devices
  • Built-in plugins for HAProxy, Nginx, Apache, etc.
  • Docker and cAdvisor instrumentation support

 

Collecting OpenStack Data

An OpenStack deployment means having to collect data from disparate systems, services, and components. InfluxData’s Telegraf collector supports 30+ inputs and 10+ outputs and can be easily extended to support your sources of data. Telegraf makes it simple to collect data in a format InfluxDB can consume. Here’s why:

  • MIT License
  • Minimal memory footprint
  • Extensible plugin design with 40+ input and output plugins
  • Support for data sources like MongoDB, MySQL, and Redis
  • Support for messaging systems like Apache Kafka and RabbitMQ
  • Support for third-party APIs like Mailchimp, AWS CloudWatch, and Google Analytics
  • Collects system metrics like CPU, Memory, I/O, etc


The InfluxData platform is also extensible by design, so you can easily integrate other collection agents, like collectd, alongside Telegraf.
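
To make "a format InfluxDB can consume" more concrete, here is a minimal Python sketch that renders one metric as a line of InfluxDB line protocol, the text format that collection agents send to InfluxDB. The measurement name, tag names, and field names are hypothetical examples, and the helper functions are illustrative only; they are not part of Telegraf or of any InfluxData client library.

```python
def escape_key(text):
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def escape_measurement(text):
    """Escape commas and spaces in a measurement name."""
    for ch in (",", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def format_field_value(value):
    """Render a Python value as a line protocol field value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement,tag=value field=value [timestamp]."""
    if not fields:
        raise ValueError("at least one field is required")
    tag_part = "".join(
        f",{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_key(k)}={format_field_value(v)}" for k, v in fields.items()
    )
    line = f"{escape_measurement(measurement)}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    # Hypothetical CPU sample from a compute node.
    print(to_line_protocol(
        "cpu",
        {"host": "compute-01", "service": "nova"},
        {"usage_idle": 87.5, "cores": 8},
        1465839830100400200,
    ))
```

Tags such as the host and service names are what later allow the data to be filtered and grouped at query time.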

Learn more about Telegraf

Storing OpenStack Data

Most of the data in an OpenStack monitoring system is going to be in a time-series format. InfluxDB is designed from the ground up to handle just time-series data and to do it better than any other database. InfluxDB is the “I” in the TICK stack. More specifically, InfluxDB is an open source database written in Go to handle time-series data with high availability and high performance requirements. InfluxDB installs in minutes without external dependencies, yet is flexible and scalable enough for complex deployments. Here’s why InfluxDB is the best choice for storing a custom monitoring solution’s time-series data:

  • MIT License
  • Simple to install, yet highly extensible
  • Purpose built for time-series data, no special schema design or custom app logic required
  • Thousands of writes per second with the new TSM1 storage engine
  • Horizontal clustering for high availability (in active development)
  • A native HTTP API means no server side code to manage
  • Time-centric functions and an easy-to-use SQL-like query language
  • Data can be tagged, allowing very flexible querying
  • Answer queries in real time, with every data point indexed as it comes in and immediately available in less than 100ms
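
As an illustration of the HTTP API and tag-based querying listed above, the following Python sketch builds (but does not send) a write request and a query URL. It assumes the InfluxDB 1.x-style /write and /query endpoints; the base URL, the database name "openstack", and the "cpu" measurement with its "host" and "service" tags are hypothetical placeholders, not values from a real deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8086"  # assumption: a local InfluxDB on its default port
DATABASE = "openstack"              # hypothetical database name


def build_write_request(base_url, database, lines):
    """Build a POST to /write whose body is newline-separated line protocol."""
    url = f"{base_url}/write?{urlencode({'db': database})}"
    body = "\n".join(lines).encode("utf-8")
    return Request(url, data=body, method="POST")


def build_query_url(base_url, database, influxql):
    """Build a GET URL for /query carrying an InfluxQL statement."""
    return f"{base_url}/query?{urlencode({'db': database, 'q': influxql})}"


# Filter on one tag, group by another tag and by time: mean idle CPU
# per host for the (hypothetical) nova service, in 5-minute buckets.
QUERY = (
    'SELECT MEAN("usage_idle") FROM "cpu" '
    "WHERE \"service\" = 'nova' AND time > now() - 1h "
    'GROUP BY time(5m), "host"'
)

if __name__ == "__main__":
    req = build_write_request(
        BASE_URL, DATABASE, ["cpu,host=compute-01,service=nova usage_idle=87.5"]
    )
    print(req.get_method(), req.full_url)
    print(build_query_url(BASE_URL, DATABASE, QUERY))
```

To actually send the write, the request object could be passed to urllib.request.urlopen against a running server; the sketch stops short of that so it runs without one.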


Learn more about InfluxDB

Visualizing OpenStack Data

If you don’t already have a dashboarding or graphing UI in place, InfluxData provides Chronograf. It’s the “C” in the TICK stack. Chronograf is a downloadable binary you install behind your firewall to collaboratively, yet securely, perform ad-hoc visualizations on your time-series data. Features include:

  • Simple installation and configuration
  • Tight integration with InfluxDB, making it easy to connect to your data
  • Support for ad-hoc visualizations
  • Smart query builder designed to work with large datasets
  • Collecting multiple graphs into dashboards
  • Templating, new graph types and visualizations coming!


Another visualization UI choice that offers tight integration with InfluxDB is the open source Grafana project. Either choice makes it simple to connect to and visualize time-series data.

Learn more about Chronograf
Learn more about InfluxDB & Grafana

Processing OpenStack Data

Inevitably, you are going to want to alert on, or in some way process, the time-series data being sent by the components in your OpenStack deployment. You’ll want to do this either before it gets written to InfluxDB or when it is retrieved. To address this need, the InfluxData platform ships with the open source Kapacitor project. Kapacitor is the “K” in the TICK stack. It’s an alerting and data processing engine specifically designed for time-series data. It lets you define your own custom pipeline to aggregate, select, transform, or otherwise process data and then store it back in InfluxDB or trigger an event. Features include:

  • MIT licensed
  • Stream data from InfluxDB or query from InfluxDB
  • Trigger events/alerts based on complex or dynamic criteria
  • Perform any transformation currently possible in InfluxQL, for example: SUM, MIN, MAX, etc.
  • Store transformed data back into InfluxDB
  • Process historical data, for example: backfill data using a processing pipeline
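
The "aggregate, then trigger an event" idea can be sketched in a few lines of plain Python. This is not Kapacitor code and does not use its TICKscript language; it is only a conceptual illustration of a windowed mean followed by a threshold check. The Point type, the host names, and the threshold are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Point(NamedTuple):
    time_s: int   # timestamp in seconds
    host: str     # tag identifying the source
    value: float  # e.g. CPU usage percent


def windowed_mean(points, window_s):
    """Group points into fixed windows per host; return {(host, window_start): mean}."""
    buckets = defaultdict(list)
    for p in points:
        window_start = p.time_s - (p.time_s % window_s)
        buckets[(p.host, window_start)].append(p.value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def threshold_alerts(means, threshold):
    """Return (host, window_start, mean) for every window whose mean exceeds the threshold."""
    return sorted(
        (host, start, mean)
        for (host, start), mean in means.items()
        if mean > threshold
    )


if __name__ == "__main__":
    samples = [
        Point(0, "compute-01", 50.0),
        Point(30, "compute-01", 100.0),
        Point(60, "compute-01", 10.0),
        Point(10, "compute-02", 95.0),
    ]
    for host, start, mean in threshold_alerts(windowed_mean(samples, 60), 90.0):
        print(f"ALERT host={host} window_start={start} mean={mean}")
```

Averaging over a window before comparing against the threshold keeps a single spike (such as the 100.0 sample on compute-01) from raising an alert on its own.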


Learn more about Kapacitor

Testimonials

InfluxData is used by many OpenStack vendors and practitioners. Check out the Mirantis Fuel project or visit our Testimonials page for a comprehensive list.

