How to Run a Time Series Database on Azure
By Al Sargent / May 20, 2020 / Community, Developer, InfluxDB Enterprise, Azure
Today we’re pleased to announce the general availability of InfluxDB Enterprise on Microsoft’s Azure Marketplace. This brings the industry’s leading time series database to Azure Marketplace, which lets you:
- Instantly provision and deploy InfluxDB in any of 33 Azure regions
- Pay for InfluxDB with your Microsoft Azure account using integrated billing
By running InfluxDB Enterprise on Azure, you get:
- Broad observability of over 900 IoT, on-prem and cloud technologies
- Highly configurable queries, anomaly detection and alerting to quickly find and fix production issues with your IoT devices and application infrastructure
- Data flexibility, including both warm and cold data analytics, non-expiring data storage, and scalability to billions of events
We’ll dive into all of these below, but first, let’s take a step back in case you’re not familiar with time series databases.
What to look for in a time series database?
If you’re looking for a time series database, here are three things to look for:
Deployment flexibility. Time series databases are often used to collect data from IoT sensors and equipment deployed at the edge of a network. In many cases, it makes sense to first store data on-device, then forward it to a central time series database, as a way to protect against network outages.
Time series databases are also used to monitor server infrastructure, network equipment and application performance. For many companies, the move to a cloud provider like Azure takes multiple years. These companies need a hybrid cloud architecture that lets them use the same database to store monitoring data on-prem or in the cloud.
A third type of use case for time series databases is storing user experience data. In order to locate a database close to your target markets, as well as to manage privacy and compliance issues, you need the flexibility to store your time series data in your choice of countries or states.
Flexible analytics. At the end of the day, you’re storing time series data to run your business better. A time series database should provide the ability to run continuous queries so that you can be instantly alerted of issues related to, say, data from a medical device, a connected car, or an unresponsive application. You should have the ability to aggregate data so that it takes up less disk storage, detect trends from noisy data sets, and extrapolate data to forecast outages and capacity issues.
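To make the idea of aggregating data to save storage concrete, here is a minimal sketch of downsampling in plain Python. It is not how InfluxDB implements continuous queries; the `downsample` helper and the sample readings are hypothetical, and it simply averages raw points into fixed-width time windows.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed-width time windows.

    Returns a list of (window_start, mean_value) tuples sorted by window start.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical sensor readings: one every 10 seconds for a minute.
readings = [(0, 20.0), (10, 22.0), (20, 24.0), (30, 30.0), (40, 32.0), (50, 34.0)]

# Six raw points collapse into two 30-second averages.
summary = downsample(readings, window_seconds=30)
print(summary)
```

Storing only the per-window averages trades detail for disk space, which is the usual reason to aggregate older data.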
Scalability. Time series data is typically produced in volumes that are orders of magnitude larger than relational data. And this makes sense: relational databases are great for storing transactional data, such as credit card purchases. But even on your busiest shopping days, you’re probably not making more than a dozen purchases a day.
In contrast, companies collecting sensor data from a connected car, or memory usage of a Docker container, might collect data points a dozen times a minute. Over a 24-hour period, that’s over a thousand times more data points than your shopping spree.
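The comparison above can be worked through as simple arithmetic. The sketch below assumes exactly a dozen purchases per day and a dozen samples per minute, which are the round figures used in the text rather than measured values.

```python
purchases_per_day = 12    # "a dozen purchases a day"
samples_per_minute = 12   # "a dozen times a minute"
minutes_per_day = 24 * 60

# Data points collected by one sensor or container in 24 hours.
samples_per_day = samples_per_minute * minutes_per_day

# How many times more data points than purchases.
ratio = samples_per_day / purchases_per_day
print(samples_per_day, ratio)
```

That is for a single source. A fleet of cars or containers multiplies the volume again.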
The scalability required here calls for a database that can ingest large amounts of data, store it for a long time and query it quickly.
Why use InfluxDB on Azure?
So, how does InfluxDB measure up on each of these three criteria? Let’s take a look.
Deployment flexibility. Customers choose Azure because it excels at hybrid deployments that exist on-premises, in the cloud and at the edge. InfluxDB can run on your on-premises infrastructure, on edge devices, and in 33 Azure regions worldwide:
| Amsterdam, Netherlands | Busan, Korea | Blue Ridge, Virginia |
| Dubai, UAE | Canberra, Australia | Boydton, Virginia |
| Dublin, Ireland | Chennai, India | Chicago, Illinois |
| Paris, France | Hong Kong | Des Moines, Iowa |
| Frankfurt, Germany | Melbourne, Australia | Quebec City, Canada |
| Liverpool, England | Mumbai, India | Quincy, Washington |
| Oslo, Norway | Osaka, Japan | Salt Lake City, Utah |
| Portsmouth, England | Pune, India | San Antonio, Texas |
| Pretoria, South Africa | Seoul, Korea | San Francisco, California |
| Zurich, Switzerland | Singapore | Sao Paulo, Brazil |
| | Sydney, Australia | Toronto, Canada |
Flexible analytics. InfluxDB provides a broad range of data analytics options to help you gain insights into your business. These include:
- Warm analytics using continuous queries to analyze incoming time series data and immediately alert on issues
- Cold analytics storing time series data as long as you need, whether it’s months to determine SLA compliance or years for regulatory purposes
- Contextualization of data using InfluxDB tags to represent hierarchies, relationships and properties, with no schema design needed
- Ad-hoc analysis of IoT and monitoring data using Data Explorer to analyze data by any type of asset tag
Figure 1 - InfluxDB Data Explorer lets you do ad-hoc analysis of your time series data.
- Calculating percentiles to detect SLA compliance failures
- Windowing and aggregating data to pick out insights from noisy data sets
- Enriching monitoring data with business data in SQL databases, like account name, type, or size, to detect anomalies by business measures
- Forecasting with Holt-Winters to predict outages and capacity issues
- Geographically tracking monitoring metrics to better determine which regions are experiencing problems
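As an illustration of the percentile item in the list above, here is a minimal sketch of an SLA check using a nearest-rank percentile in plain Python. It is not InfluxDB’s own implementation; the `percentile` helper, the latency values and the 200 ms threshold are all hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical response times in milliseconds for one reporting window.
latencies_ms = [120, 95, 110, 480, 105, 130, 98, 102, 115, 125]
SLA_P95_MS = 200  # hypothetical SLA threshold

p95 = percentile(latencies_ms, 95)
sla_met = p95 <= SLA_P95_MS
print(f"p95={p95} ms, SLA met: {sla_met}")
```

A single slow request is enough to push the 95th percentile over the threshold in a window this small, which is why percentiles are usually computed over larger windows.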
Scalability. Regardless of which cloud you run on (or if you run on-premises), InfluxDB is engineered for scalability. Across our customer base and all deployment modes, Hulu sends 1 million metrics per second to InfluxDB, and so does Wayfair. For each, that’s more than 86 billion metrics per day. CERN, which operates the world’s most powerful particle accelerator, sends 3.4 terabytes per second to InfluxDB. And Tesla uses InfluxDB to track real-time data on over 50,000 Powerwalls.
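As a quick check on what a sustained 1 million metrics per second means over a day, the rate can simply be multiplied out. This assumes the rate holds constant for all 24 hours, which real workloads may not do.

```python
metrics_per_second = 1_000_000
seconds_per_day = 24 * 60 * 60

# Total metrics written in one day at a constant rate.
metrics_per_day = metrics_per_second * seconds_per_day
print(f"{metrics_per_day:,}")
```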
There’s a lot more we can say about InfluxDB’s scalability, and if you’d like to go deeper, you can review these performance comparisons of InfluxDB versus MongoDB, Elasticsearch, OpenTSDB, Cassandra, and Graphite.
What can I monitor with InfluxDB on Azure?
A lot! InfluxDB can monitor and analyze over 900 different cloud, on-prem and IoT technologies so you can ensure all your systems perform at their best. This includes nearly 200 Telegraf plugins and over 700 FluentD plugins, giving you a broad range of alternatives.
Additionally, you can send monitoring data straight into InfluxDB using client libraries, which are available in several different languages.
Here are Telegraf’s Microsoft- and Azure-specific plugins:
- Azure Event Hub (input)
- Azure Storage Queues (input)
- SQL Server (input)
- Windows Services (input)
- Windows Performance Counters (input)
- Azure Application Insights (output)
- Azure Monitor (output)
And here are Telegraf’s IoT plugins:
- IPMI Sensors
- JTI OpenConfig
- Neptune Apex
- Linux monitoring sensors
- Server temperature
Best of all, Telegraf is open source. So if there’s something you want to monitor that Telegraf doesn’t already support, you can build your own plugin. In fact, most Telegraf plugins are built by an open-source community of nearly 700 contributors.
Figure 2 - Just a few of the nearly 700 Telegraf contributors
How does InfluxDB integrate with Azure?
There are three main ways in which InfluxDB integrates with Azure.
Second, data egress. InfluxDB can use Telegraf to send data to Azure Application Insights and Azure Monitor. This enables both Azure Monitor and Application Insights to benefit from Telegraf’s broad monitoring coverage of over 200 plugins.
Third, and most significantly, Azure Monitor can store data in InfluxDB. This provides a couple of benefits:
- It lets you store data for years, at no additional cost. This is important for consumer and industrial IoT use cases, where data often needs to be stored longer than several years to comply with regulatory requirements and manage liability risks. For example, regulators might need to access old data to understand why an oil rig or consumer product has stopped working.
- When Azure Monitor stores data in InfluxDB Enterprise, you benefit from the sophisticated analytics available via InfluxQL and Flux that we discussed above.
How much does InfluxDB cost on Azure?
InfluxDB Enterprise in the Azure Marketplace has flexible hourly pricing starting at $0.64/core/hour. Beyond hourly pricing, running InfluxDB Enterprise from the Marketplace can reduce your total cost of ownership for managing time series data, compared with setting it up on your own. That’s because:
- Launching InfluxDB Enterprise from Azure Marketplace simplifies provisioning and deployment.
- InfluxDB is highly scalable and frees you from time-consuming tuning and optimization projects.
- InfluxDB is engineered for developer productivity, letting you quickly get started with ingesting, analyzing and acting on time series data. We call this "time to awesome."
If you’d like to discuss annual contracts or larger commitments, email us at [email protected].
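For a rough sense of what hourly pricing adds up to, here is a minimal sketch based on the starting rate of $0.64/core/hour quoted above. The 8-core cluster size and 730 hours per month are hypothetical inputs, and the estimate ignores any Azure infrastructure charges, taxes or negotiated discounts.

```python
HOURLY_RATE_PER_CORE = 0.64  # USD, the starting Marketplace price quoted in this post


def estimate_cost(cores, hours):
    """Estimate the software charge in USD for a number of cores over a number of hours."""
    return round(HOURLY_RATE_PER_CORE * cores * hours, 2)


# Hypothetical deployment: 8 cores running for an average month of 730 hours.
monthly = estimate_cost(cores=8, hours=730)
print(f"${monthly:,.2f} per month")
```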
Why run InfluxDB through Azure Marketplace?
When you access InfluxDB through Azure Marketplace, you consolidate all your InfluxDB expenditures, for all database instances across all teams, into a single bill. This lets you eliminate tedious tasks so you can focus on what matters: gaining insights into your business using time series data.
To be more specific, InfluxDB on Azure provides:
Developer focus. Without Azure Marketplace billing, your developers and SREs have to put Azure charges on their credit cards and later expense them. That’s one more administrative task to pull them from their work. Our Azure Marketplace integration gives them a fast and easy way to spin up the time series database instances needed to be productive. And they don’t need to deal with time-consuming expense reports.
Pain-free budget reports. All your InfluxDB charges, for all accounts, appear on your existing Azure bill, so they’re easier to track. For engineering and IT managers, this makes it easier to generate budget spending reports so they can focus on what matters: delivering projects on time and keeping critical services running.
Streamlined purchase process. There’s no need to work with your purchasing department to onboard InfluxData as a new vendor or deal with the hassle of setting up a new contract. If you’re already using Azure, then you remain with a single vendor, Microsoft, for both your Azure and InfluxDB usage.
Streamlined support. When you’re trying to keep a development project on track, or a critical service running, you know that any downtime can throw a wrench into your plans. When outages occur, the InfluxDB and Microsoft Azure teams work closely to quickly pinpoint the root cause and minimize downtime. Also, since InfluxDB is more than just a database, it covers a broad range of the time series data pipeline, including data acquisition, analysis, visualization, alerting, and (because we don’t believe in lock-in) data egress. This lets you avoid the time-consuming process of coordinating multiple vendors to address service outages.
How to use InfluxDB on Azure?
To run InfluxDB on Azure, just head over to InfluxDB Enterprise in the Azure Marketplace. Click “get it now,” and you’re off to the races.