Why Use a Purpose-Built Time Series Database
By
Cole Bowden
Developer
Dec 24, 2025
A time series database has a straightforward definition: it’s a database purpose-built for efficiently ingesting, storing, and querying time series data. Time series data is any data with a timestamp, collected regularly or periodically, that you’ll often visualize on graphs where the X-axis is time. That definition doesn’t quite tell you what sets a time series database apart from other types of databases, though. This blog dives into the details of how various databases are architected to help you understand why you’d want to use a time series database for these workloads.
A quick database history lesson
Databases have existed in some capacity since the 1960s, but the first true databases in the modern sense were relational, transactional databases like Ingres and IBM’s System R. These databases store and represent data in rows and columns, and the paradigms underpinning them still exist in their successors, transactional databases like Postgres and MySQL. You might see these databases described as handling OLTP (Online Transaction Processing) because they can quickly process and write new transactions to a dataset. They do this by writing each new row to storage as one cohesive unit appended to the end of the table, so the database’s size has a negligible impact on how long it takes to add a new row, no matter how large the table grows.
The downside is that when you analyze larger volumes of data, even a simple filter like WHERE userID = 1 in a query could require opening up every single row to check whether its userID is 1. Modern transactional databases have a number of strategies to speed up queries like this, the most prominent being indexing, which can make retrieving single rows efficient. However, as you scale queries up and try to retrieve more rows for analysis, performance slows down because the engine needs to read a lot of unnecessary data on disk just to find the data the query is looking for.
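As a rough sketch of what this looks like in practice (the users table and columns here are hypothetical), an index makes a single-row lookup fast, but a broader analytical query still has to read whole rows from disk:

-- Hypothetical row-oriented table
CREATE TABLE users (
    user_id   BIGINT PRIMARY KEY,
    country   TEXT,
    signup_at TIMESTAMPTZ
);

-- The primary key index makes this single-row lookup fast
SELECT * FROM users WHERE user_id = 1;

-- But an analytical query still reads entire rows off disk,
-- even though it only needs one of the columns
SELECT country, count(*) AS signups
FROM users
GROUP BY country;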
For scenarios where that became undesirable, the next logical step in database development was columnar databases, which store data in columns instead of rows. You’ll also see them described as handling OLAP (Online Analytical Processing). In a columnar database, data for a given column is colocated in storage next to the other data in that column. When you start querying your data with filters, joins, aggregations, and more sophisticated analytical logic, storing data in columns has a massive upside. The WHERE userID = 1 example from above only requires looking at the userID column, and if that column is sorted or indexed, the engine can quickly find exactly the rows the query is looking for and then retrieve the other columns’ data for just those rows.
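For a rough sense of why this helps, a query like the sketch below (again with hypothetical table and column names) only needs to touch the columns it actually mentions:

-- Only the order_date, region, and amount columns are read from storage;
-- every other column in the table is skipped entirely
SELECT region, avg(amount) AS avg_order_value
FROM purchases
WHERE order_date >= DATE '2025-01-01'
GROUP BY region;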
The downside is that when you write a row, you have to find the “end” of each column and append that row’s data to each one, which is a slower process. It also means that retrieving a single row requires pulling that row’s data out of each column in storage, which is also a little slower. With indexing and a wide variety of other optimizations, this isn’t a big deal if you’re writing a single row, but as write volume and frequency increase, you can hit bottlenecks, and both write and read performance can suffer, sometimes greatly.
NoSQL & time series databases
As data has grown exponentially and use cases have become increasingly varied, the need for more specialized databases has driven even more divergence. Google’s Bigtable, a non-relational (often called NoSQL) database described publicly in 2006, used key-value storage rather than rows and columns. Since then, countless NoSQL databases, including prominent names like MongoDB, Cassandra, and Neo4j, have emerged. These databases serve a wide variety of purposes, though many have the performance characteristics of OLTP databases: designed to handle and write large volumes of data, but for more specialized use cases, whether that means more powerful and flexible schemas or modeling data in more intuitive ways.
Some NoSQL databases use columnar storage to achieve efficient analytical query performance while not being strictly relational under the hood, tailoring performance to their specific use case. The most prominent example happens to be our favorite: InfluxDB.
What time series databases do differently
Every database is built for a purpose, and time series databases are no different. Because they’re purpose-built to handle time series workloads, they’re able to do a lot of things that general-purpose transactional and analytical databases can’t.
In many general use cases, a single missing datapoint can be catastrophic. This means general-purpose databases need many checks in place to ensure that every write is handled correctly, with no interference from other writes and zero risk of data loss. With time series data, a single reading from a sensor taking 7,200 measurements per hour (two per second) likely isn’t mission-critical, and one missing datapoint won’t impact a user’s ability to monitor trends.
By not committing to full ACID compliance in favor of eventual consistency, a time series database can avoid tricky snags like lock contention. This paradigm shift enables huge improvements in write performance, allowing time series databases to significantly outpace even highly performant OLTP databases in write throughput. It also lets a time series database make new data available for querying in real time. There is a tradeoff between performance and durability: many databases have to absolutely maximize durability for scenarios where a single row represents important business data, while time series databases can confidently sacrifice a small amount of durability for a massive gain in performance.
As another example, nearly all general-purpose databases use indexing to speed up their query performance. Even with sparse indexing, a strategy where only certain values are indexed, increasing cardinality (the number of unique values) can hamper performance. When the number of unique indexed values is so large that loading the index into memory takes up a meaningful portion of available memory, query performance suffers. If the index uses all of the memory, the database falls over.
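As a rough illustration (the table and index here are hypothetical), indexing a column that holds a unique value per row means the index grows in lockstep with the data itself, which is exactly the high-cardinality case that strains memory:

-- Hypothetical: each reading is tagged with a device UUID, and the fleet
-- has millions of devices, so this index holds millions of unique entries
-- that all need to fit in memory to stay fast
CREATE TABLE readings (
    device_id   UUID,
    temperature DOUBLE PRECISION,
    time        TIMESTAMPTZ
);

CREATE INDEX idx_readings_device ON readings (device_id);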
InfluxDB 3 solves this problem by eschewing indexes. Because relevant data lives in columnar storage, there is no risk of running low on memory because high cardinality bloated an index. A lack of indexing would normally hurt performance in a general-purpose database, but InfluxDB partitions and stores data on disk sorted by timestamp, with numerous optimizations for time-based queries, so performance doesn’t suffer. Queries that filter on a time range can efficiently prune all unnecessary data, and aggregations over large time periods are faster than they would be in a standard database. The absence of indexes means InfluxDB can handle unlimited cardinality, so it can store and query datasets with any number of unique values: UUIDs, IP addresses, time series data enriched with relational data, and more. What would be a pitfall elsewhere instead accelerates performance.
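For example, a time-filtered query shaped like the sketch below (using the hypothetical sensor_data table that also appears in the query examples below) lets the engine skip every partition whose time range falls outside the filter, even if sensor_id has millions of unique values:

-- Only partitions covering the last hour are read; the unbounded
-- cardinality of sensor_id needs no index at all
SELECT sensor_id, max(temperature) AS max_temp
FROM sensor_data
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY sensor_id;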
Time-based functionality
The specific purpose of time series databases also means that they come with more powerful functionality and syntax for navigating time-based queries.
Take this example query, written for Postgres, which finds the 95th percentile of temperature measured by each sensor over each of the last 24 hours, filling gaps for any hours where data may be missing:
WITH hours AS (
    -- One bucket for each of the last 24 hours
    SELECT generate_series(
        date_trunc('hour', now() - interval '24 hours'),
        date_trunc('hour', now()),
        interval '1 hour'
    ) AS hour_bucket
),
sensors AS (
    -- Every sensor that has reported
    SELECT DISTINCT sensor_id FROM sensor_data
),
hour_sensor AS (
    -- Every (hour, sensor) combination, so missing hours still appear
    SELECT h.hour_bucket, s.sensor_id
    FROM hours h
    CROSS JOIN sensors s
),
agg AS (
    -- The 95th-percentile calculation per sensor per hour
    SELECT
        sensor_id,
        date_trunc('hour', time) AS hour_bucket,
        percentile_cont(0.95) WITHIN GROUP (ORDER BY temperature) AS p95
    FROM sensor_data
    WHERE time >= now() - interval '24 hours'
    GROUP BY sensor_id, hour_bucket
)
SELECT
    hs.hour_bucket,
    hs.sensor_id,
    COALESCE(a.p95, 0) AS p95
FROM hour_sensor hs
LEFT JOIN agg a USING (hour_bucket, sensor_id)
ORDER BY hs.sensor_id, hs.hour_bucket;
In InfluxDB 3, the same query can be expressed as:
SELECT
    date_bin_gapfill(INTERVAL '1 hour', time) AS hour,
    sensor_id,
    interpolate(percentile(temperature, 95)) AS p95
FROM sensor_data
WHERE time >= NOW() - INTERVAL '24 hours'
GROUP BY hour, sensor_id;
Nanosecond timestamp precision, a wide variety of date and time functions, and easy integrations via Telegraf with all of the common methods for collecting time series data make InfluxDB easy to use. InfluxDB 3 makes recent writes immediately available for real-time analysis by sending them to a queryable, in-memory buffer, a tradeoff that fully ACID-compliant databases generally don’t make. Built-in tools for data lifecycle management, like retention policies and easy downsampling, also make it easy to save big on storage costs.
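As one sketch of what downsampling can look like (reusing the hypothetical sensor_data table from above), older high-resolution data can be rolled up into hourly summaries before the raw rows age out under a retention policy:

-- Roll raw readings up into one row per sensor per hour
SELECT
    date_bin(INTERVAL '1 hour', time) AS hour,
    sensor_id,
    avg(temperature) AS avg_temp
FROM sensor_data
WHERE time >= now() - INTERVAL '30 days'
GROUP BY hour, sensor_id;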
The tradeoff to all of these upsides is that a time series database is a specific tool that isn’t right for every use case. If any single row of data represents a piece of mission-critical business information, you shouldn’t use a time series database. If you have a small volume of data that a simple Postgres installation can handle well, you don’t need a time series database. If your data doesn’t have timestamps, you can’t use a time series database.
The takeaway
For scenarios where you’re collecting and analyzing large volumes of data over time, a time series database like InfluxDB is purpose-built to be the best tool for the job. Write throughput is unparalleled, datasets with high cardinality pose no issue, queries and analytics are highly performant, and you have a full suite of tools that make it easier to work with time series data. When a time series database is right for your workloads, no other type of database can compare. Get started with InfluxDB 3 Core or Enterprise for free today.