Save 96% on Data Storage Costs
By Rick Spencer / Jul 05, 2023 / InfluxDB
Painful tradeoffs between utility and costs
Users with real-time and other analytic workloads want or need to keep large volumes of historical data to aid in important activities, such as ad hoc historical trend analysis and training AI models.
However, storing this much data in a way that keeps it easily queryable becomes prohibitively expensive. As a result, users must trade data availability, usability, and fidelity against storage costs.
That is, until now. With InfluxDB 3.0, users don’t need to choose between keeping the data they need and controlling storage costs. We designed it so that users can keep and query large quantities of data cost-effectively.
How InfluxDB works
InfluxDB 3.0 uses Parquet files as the underlying file storage. InfluxDB engineers spent two years fine-tuning InfluxDB 3.0 to squeeze every bit of efficiency out of Parquet. The result of their efforts is that data stored in InfluxDB 3.0 uses less space on disk for time series data than any other database.
Not satisfied with compression gains alone, the InfluxDB engineers took it a step further, designing InfluxDB 3.0 for a distributed environment. That means that InfluxDB can store those Parquet files in inexpensive object storage, while maintaining high performance querying of that data. This effectively eliminates the need to make trade-offs between data storage costs, availability, and fidelity.
The math is simple
So, why does the title claim to save up to 96% on storage costs? We didn’t pull the 96% figure from the air. It comes from a real customer’s experience. This customer was collecting large amounts of IoT device data using the TSM storage engine in InfluxDB OSS 1.8 and 2.0.
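Let x be the cost of storing the data on attached disks with the TSM engine. InfluxDB 3.0 compresses the same data to about one half the space on disk, and object storage costs about 7.69% as much as attached disks. The new storage cost is therefore:
new_cost = x * compression_factor * ratio_of_object_storage_costs
= x * 1/2 * 7.69%
~= x * 4%
In other words, the customer pays roughly 4% of the original storage cost, a saving of about 96%.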
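The arithmetic above can be checked with a short Python sketch. The function and variable names are hypothetical and chosen only for illustration. The one-half compression factor and the 7.69% cost ratio are the figures quoted in this post, not measured values.

```python
def relative_storage_cost(compression_factor: float, object_storage_cost_ratio: float) -> float:
    """Return the fraction of the original storage cost that remains."""
    return compression_factor * object_storage_cost_ratio


# Figures quoted in this post (assumed inputs, not measurements).
compression_factor = 0.5            # same data takes about half the space on disk
object_storage_cost_ratio = 0.0769  # object storage costs about 7.69% of attached disk

remaining = relative_storage_cost(compression_factor, object_storage_cost_ratio)
savings = 1 - remaining

print(f"Remaining cost: {remaining:.1%}")  # Remaining cost: 3.8%
print(f"Savings: {savings:.1%}")           # Savings: 96.2%
```

Changing either input shows how sensitive the result is. For example, a less favorable compression factor of 0.8 would still leave about 6% of the original cost with the same object storage pricing.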
It’s important to note here that InfluxDB 1.8 already compresses data efficiently on disk. If the customer were using a solution like ClickHouse, the savings would be much more pronounced.
The benefits of InfluxDB 3.0 over its open source predecessors are significant. Try it for yourself.