Redis vs AWS Redshift

A detailed comparison

Compare Redis and AWS Redshift for time series and OLAP workloads

Choosing the right database is a critical decision when building any software application. Every database has different strengths and weaknesses when it comes to performance, so it is important to decide which one offers the most benefits and the fewest downsides for your specific use case and data model. Below you will find an overview of the key concepts, architecture, features, use cases, and pricing models of Redis and AWS Redshift so you can quickly see how they compare against each other.

The primary purpose of this article is to compare how Redis and AWS Redshift perform for workloads involving time series data, not for all possible use cases. Time series data typically presents a unique challenge for database performance because of the high volume of data being written and the query patterns used to access that data. This article doesn’t intend to make the case for which database is better; it simply provides an overview of each database so you can make an informed decision.

Redis vs AWS Redshift Breakdown


	Redis	AWS Redshift
Database Model	In-memory database	Data warehouse
Architecture	Redis can be deployed on-premises, in the cloud, or as a managed service	AWS Redshift utilizes a columnar storage format for fast querying and supports standard SQL. Redshift uses a distributed, shared-nothing architecture, where data is partitioned across multiple compute nodes. Each node is further divided into slices, with each slice processing a subset of data in parallel. Redshift can be deployed in a single-node or multi-node cluster, with the latter providing better performance for large datasets.
License	BSD 3-Clause	Closed source
Use Cases	Caching, message brokering, real-time analytics, session storage, geospatial data processing	Business analytics, large-scale data processing, real-time dashboards, data integration, machine learning
Scalability	Horizontally scalable via partitioning and clustering, supports data replication	Supports scaling storage and compute independently, with support for adding or removing nodes as needed

Redis Overview

Redis, which stands for Remote Dictionary Server, is an open-source, in-memory data structure store that can be used as a database, cache, and message broker. It was created by Salvatore Sanfilippo in 2009 and has since gained significant popularity due to its high performance and flexibility. Redis supports various data structures, such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, HyperLogLogs, and geospatial indexes with radius queries.

AWS Redshift Overview

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. It was launched in 2012 as part of the AWS suite of products. Redshift is designed for analytic workloads and integrates with various data loading and ETL tools, as well as business intelligence and reporting tools. It uses columnar storage to optimize storage costs and improve query performance.

Redis for Time Series Data

Redis has a dedicated module for working with time series data called RedisTimeSeries. RedisTimeSeries offers functionality like downsampling, data retention policies, and specialized queries for time series data in Redis. Because Redis is an in-memory database, it is very fast for reading and writing time series data, but RAM costs more than disk, so Redis can become expensive depending on the size of your dataset. If your use case doesn’t require extremely fast response times, you could save money by going with a more traditional time series database.
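
To make the terms downsampling and retention concrete, here is a minimal pure-Python sketch of the two ideas. It does not call Redis or RedisTimeSeries; in the module itself these behaviors are configured with commands such as TS.CREATE (RETENTION option) and TS.CREATERULE (AGGREGATION option). The function names and the sample readings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def apply_retention(samples, now_ms, retention_ms):
    """Keep only samples whose timestamp is within retention_ms of now_ms."""
    cutoff = now_ms - retention_ms
    return [(ts, value) for ts, value in samples if ts >= cutoff]


def downsample_avg(samples, bucket_ms):
    """Average samples into fixed-width buckets keyed by bucket start time."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % bucket_ms].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical sensor readings as (timestamp in milliseconds, value).
readings = [(0, 10.0), (500, 20.0), (1000, 30.0), (1500, 50.0), (2500, 70.0)]

# Retention: drop anything older than 2 seconds relative to "now".
recent = apply_retention(readings, now_ms=2500, retention_ms=2000)

# Downsampling: collapse the raw readings into one average per second.
per_second = downsample_avg(readings, bucket_ms=1000)
```

Storing only the downsampled series for older data is one way to limit how much RAM a long history consumes.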

AWS Redshift for Time Series Data

AWS Redshift can be used for time series workloads, although it is optimized for more general data warehouse use cases. Users can apply date and time functions to aggregate, filter, and transform time series data. For data with a fixed retention period, the Redshift documentation also describes a ‘time-series tables’ pattern: each table has the same schema but holds a different time range, so expired data can be removed by dropping the oldest table.
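
One way to picture time-series tables is one table per month plus a view that unions the months still within the retention period. The sketch below only builds SQL strings and does not connect to Redshift. The table prefix `metrics`, the `_recent` view name, and the monthly granularity are assumptions chosen for illustration.

```python
from datetime import date


def month_table(prefix, year, month):
    """Name of the table holding one month of data, e.g. metrics_2024_03."""
    return f"{prefix}_{year:04d}_{month:02d}"


def last_n_months(today, n):
    """Return (year, month) pairs for the n most recent months, oldest first."""
    months = []
    year, month = today.year, today.month
    for _ in range(n):
        months.append((year, month))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return list(reversed(months))


def union_view_sql(prefix, months):
    """Build a view that presents the retained monthly tables as one relation."""
    selects = [f"SELECT * FROM {month_table(prefix, y, m)}" for y, m in months]
    body = "\nUNION ALL\n".join(selects)
    return f"CREATE OR REPLACE VIEW {prefix}_recent AS\n{body};"


def drop_expired_sql(prefix, year, month):
    """Build the statement that removes a month that has aged out."""
    return f"DROP TABLE IF EXISTS {month_table(prefix, year, month)};"


# Hypothetical three-month retention window evaluated on 10 January 2024.
retained = last_n_months(date(2024, 1, 10), 3)
view_sql = union_view_sql("metrics", retained)
drop_sql = drop_expired_sql("metrics", 2023, 10)
```

In this sketch the view is rebuilt before the expired table is dropped, so the view no longer refers to the table being removed.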

Redis Key Concepts

In-memory store: Redis stores data in memory, which allows for faster data access and manipulation compared to disk-based databases.
Data structures: Redis supports a wide range of data structures, including strings, hashes, lists, sets, and more, which provide flexibility in how data is modeled and stored.
Persistence: Redis offers optional data persistence, allowing data to be periodically saved to disk as snapshots (RDB) or written to an append-only log (AOF) for durability.
Pub/Sub: Redis provides a publish/subscribe messaging system, enabling real-time communication between clients without the need for a separate message broker.

AWS Redshift Key Concepts

Cluster: A Redshift cluster is a set of nodes, which consists of a leader node and one or more compute nodes. The leader node manages communication with client applications and coordinates query execution among compute nodes.
Compute Node: These nodes store data and execute queries in parallel. The number of compute nodes in a cluster affects its storage capacity and query performance.
Columnar Storage: Redshift uses a columnar storage format, which stores data in columns rather than rows. This format improves query performance and reduces storage space requirements.
Node slices: Compute nodes are divided into slices. Each slice is allocated an equal portion of the node’s memory and disk space, where it processes a portion of the loaded data.
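
The effect of columnar storage can be illustrated with a small pure-Python sketch, which is not Redshift code. It pivots a hypothetical table from rows into columns, aggregates one column without touching the others, and applies run-length encoding to show why a column of similar values tends to compress well. Run-length encoding is used here only as a simple example; it is not a claim about which encoding Redshift would choose for a given column.

```python
# Hypothetical fact table in row-oriented form.
rows = [
    {"ts": 1, "region": "eu", "amount": 10},
    {"ts": 2, "region": "eu", "amount": 5},
    {"ts": 3, "region": "us", "amount": 7},
]


def to_columns(table_rows):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [row[name] for row in table_rows] for name in table_rows[0]}


def run_length_encode(values):
    """Collapse consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# An aggregate over one column reads only that column's values.
total = sum(columns["amount"])

# Adjacent repeated values in a column shrink to a single pair.
region_rle = run_length_encode(columns["region"])
```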

Redis Architecture

Redis is a NoSQL database that uses a key-value data model, where each key is associated with a value stored as one of Redis’ supported data structures. Commands are executed on a single main thread, which simplifies the internal architecture and avoids lock contention. Redis can be deployed as a standalone server, as a master-replica setup, or as a cluster for scalability and high availability. Redis Cluster mode automatically shards data across multiple nodes, providing data partitioning and fault tolerance.
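
Sharding in Redis Cluster works by mapping every key to one of 16384 hash slots, computed as the CRC16 of the key modulo 16384. If a key contains a non-empty `{...}` hash tag, only the tag is hashed. The sketch below reproduces that calculation locally using Python's `binascii.crc_hqx`, which with an initial value of 0 matches the CRC16 variant described in the Redis Cluster specification. The function names are hypothetical, and how slots are assigned to nodes is left out.

```python
import binascii

HASH_SLOTS = 16384


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is hashed.

    If the key has a '{' followed later by a '}' with at least one byte
    in between, only the bytes between them are hashed. Otherwise the
    whole key is hashed.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: bytes) -> int:
    """Compute the hash slot for a key: CRC16(key or tag) mod 16384."""
    return binascii.crc_hqx(hash_tag(key), 0) % HASH_SLOTS


# Keys sharing a hash tag land in the same slot, so they live on the same node.
followers_slot = key_slot(b"{user1000}.followers")
following_slot = key_slot(b"{user1000}.following")
```

Hash tags matter for multi-key operations, which in cluster mode require all involved keys to be in the same slot.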

AWS Redshift Architecture

Redshift uses a distributed, shared-nothing architecture. A cluster consists of a leader node and one or more compute nodes. The leader node is responsible for coordinating query execution, while compute nodes store data and execute queries in parallel. Data is stored in a columnar format, which improves query performance and reduces storage space requirements. Redshift uses Massively Parallel Processing (MPP) to distribute and execute queries across multiple nodes, allowing it to scale horizontally and provide high performance for large-scale data warehousing workloads.
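
The leader and compute node roles can be sketched as a scatter-gather aggregation. This is a simplified in-process model, not Redshift's implementation: rows are hashed to slices on a hypothetical `order_id` key, each slice sums only its local rows, and a leader function merges the partial results. The thread pool stands in for slices working in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def distribute(rows, num_slices, key):
    """Assign each row to a slice by hashing its distribution key."""
    slices = [[] for _ in range(num_slices)]
    for row in rows:
        index = crc32(str(row[key]).encode()) % num_slices
        slices[index].append(row)
    return slices


def slice_partial_sums(slice_rows):
    """Work done on one slice: sum amounts per region over local rows only."""
    partial = {}
    for row in slice_rows:
        partial[row["region"]] = partial.get(row["region"], 0) + row["amount"]
    return partial


def leader_query(slices):
    """Leader role: run the per-slice step in parallel, then merge results."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(slice_partial_sums, slices))
    merged = {}
    for partial in partials:
        for region, subtotal in partial.items():
            merged[region] = merged.get(region, 0) + subtotal
    return merged


# Hypothetical orders: odd ids belong to "eu", even ids to "us".
rows = [
    {"order_id": i, "region": "eu" if i % 2 else "us", "amount": i}
    for i in range(1, 9)
]
slices = distribute(rows, num_slices=4, key="order_id")
totals = leader_query(slices)
```

Because no slice needs another slice's rows to compute its partial sums, adding slices spreads the scan work without adding coordination beyond the final merge.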

Redis Features

Atomicity

Redis supports atomic operations on complex data types, allowing developers to perform powerful operations without worrying about race conditions or other concurrent processing issues.

Broad data structure support

Redis supports a range of data structures such as lists, sets, sorted sets, hashes, bitmaps, HyperLogLogs, and geospatial indexes. This flexibility allows developers to use Redis for a wide variety of tasks by choosing the data structure whose performance characteristics best fit their data.

Pub/Sub messaging

Redis provides a publish/subscribe messaging system for real-time communication between clients.

Lua Scripting

Developers can run Lua scripts in the Redis server, enabling complex operations to be executed atomically in the server itself, reducing network round trips.

AWS Redshift Features

Scalability

Redshift allows you to scale your cluster up or down by adding or removing compute nodes, enabling you to adjust your storage capacity and query performance based on your needs.

Performance

Redshift’s columnar storage format and MPP architecture enable it to deliver high-performance query execution for large-scale data warehousing workloads.

Security

Redshift provides a range of security features, including encryption at rest and in transit, network isolation using Amazon Virtual Private Cloud (VPC), and integration with AWS Identity and Access Management (IAM) for access control.

Redis Use Cases

Caching

Redis is often used as a cache to store frequently accessed data and reduce the load on other databases or services, improving application performance and reducing latency.

Task queues

Redis can be used to implement task queues, which are useful for managing tasks that take longer to process and should be executed asynchronously. This is particularly common in web applications, where background tasks can be processed independently of the request/response cycle.

Real-time analysis and machine learning

Redis’ high performance and low-latency data access make it suitable for real-time analysis and machine learning applications, such as processing streaming data, media streaming, and handling time-series data. These can be built on Redis’ data structures and capabilities, such as sorted sets scored by timestamp and pub/sub messaging.
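
To show why a sorted set suits time-series data, the sketch below models one in plain Python, using timestamps as scores so that a time window becomes a range lookup. It is an in-process stand-in and does not talk to Redis; against a real server the equivalent operations would be commands such as ZADD and ZRANGEBYSCORE. The class name and the sample values are hypothetical.

```python
import bisect


class SortedSetSeries:
    """In-process stand-in for a sorted set whose scores are timestamps.

    Unlike a real sorted set, this sketch does not enforce unique members.
    """

    def __init__(self):
        self._timestamps = []  # kept in ascending order
        self._members = []     # members aligned with _timestamps

    def add(self, timestamp, member):
        """Insert a member at the position given by its timestamp score."""
        index = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(index, timestamp)
        self._members.insert(index, member)

    def range_by_score(self, start, end):
        """Return (timestamp, member) pairs with start <= timestamp <= end."""
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._members[lo:hi]))


series = SortedSetSeries()
series.add(1000, "cpu=0.40")
series.add(3000, "cpu=0.90")
series.add(2000, "cpu=0.55")  # arrives out of order, still stored in order

window = series.range_by_score(1500, 3000)
```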

AWS Redshift Use Cases

Data Warehousing

Redshift is designed for large-scale data warehousing workloads, providing a scalable and high-performance solution for storing and analyzing structured data.

Business Intelligence and Reporting

Redshift integrates with various BI and reporting tools, enabling organizations to gain insights from their data and make data-driven decisions.

ETL and Data Integration

Redshift supports data loading and extract, transform, and load (ETL) processes, allowing you to integrate data from various sources and prepare it for analysis.

Redis Pricing Model

Redis is open-source software, which means it can be deployed and used freely on your own infrastructure. However, there are also managed Redis services available, such as Redis Enterprise, which offer additional features, support, and ease of deployment. Pricing for these services typically depends on factors like the size of the instance, data storage, and data transfer.

AWS Redshift Pricing Model

Amazon Redshift offers two pricing models: On-Demand and Reserved Instances. With On-Demand pricing, you pay for the capacity you use on an hourly basis, with no long-term commitments. Reserved Instances offer the option to reserve capacity for a one- or three-year term, with a lower hourly rate compared to On-Demand pricing. In addition to these pricing models, you can also choose between different node types, which offer different amounts of storage, memory, and compute resources.
