Infrastructure Monitoring with InfluxDB | Live Demonstration

Watch Now

CAP Theorem

CAP theorem is a computer science theory related to the tradeoffs involved with designing distributed databases

What is CAP Theorem?

When network failure occurs, CAP theorem, also called Brewer’s theorem, describes a trade-off between three components of a distributed datadatabase. These systems can only have two of three desired components: consistency, availability, and partition tolerance.

Coined by Eric Brewer, CAP theorem demonstrates that distributed databases are a key component in performance. More specifically, Brewer realized that the basic unit of structured data—a node—would vary when collected from different locations. This variation is a significant issue within distributed systems. Brewer identified this as a network failure relating to a lack of consistency (C), availability (A), and partition tolerance (P). These are the essential components of the CAP theorem.

Consistency (C) is essential within systems because it is imperative that primary nodes and secondary nodes view the same data. Users, therefore, must identify what consistency means within systems. In basic terms, consistency is when communication within systems is introduced in one way and ends in that same way. If, however, something changes within that communication, the consistency will no longer exist and may result in errors. In order for systems to work effectively, eventual consistency must be constant.

Availability (A) within systems refers to response rate. Essentially, every client receiving a response 100% of the time demonstrates availability. This response should not be negated because of nodes, even when multiple nodes are in use. The idea is that commands are being written and executed. When this does not happen, a system is considered to lack availability. It must be noted Consistency (C) and Availability (A), or CA, can exist concurrently within relational databases as systems are CAPable of having consistency with nodes while remaining available (A) to serve client databases.

Partition Tolerance (P) is the ability of a system to continue to work even when errors may exist. A system working on a single node would result in a total failure if an error happens. A system made up of network partitions can continue to operate despite an error occurring. A network partition optimizes subnets that allows them to operate independently or when a network error occurs. This can only occur, however, if dispensed operating software is Partition Tolerant (P). Put simply, if a communication breakdown happens between two nodes, the system would continue to work despite these two nodes being unable to communicate with each other.

In essence, CAP theorem explains a system can only achieve two out of the three characteristics mentioned, and achieving all three characteristics within one system is not a possibility. In order to truly understand CAP theorem, a discussion of non-relational databases (NoSQL) must be considered. Non-relational databases are collections of nodes that quickly spread across growing networks. NoSQL databases are important as they are defined by the CAP characteristics they encompass: CP databases, AP databases, and CA databases.

Database types in CAP Theorem

CP databases are consistent and partition-tolerant but lack availability. When communication occurs between nodes, the system is able to shut down the node yet keep communication between other nodes within the system consistent if an error happens. As a result of the shut down, availability cannot occur since the error has made a response impossible.

AP databases are available and partition-tolerant but lack consistency. In this model, all nodes remain available for communication between systems, but there is an impact on consistency. Information may lack reliability, or partition may result in a database returning to an older or previous version. This allows for all systems to be functional and available but does not result in consistency.

CA databases are consistent and available, but lack partition tolerance. Essentially, communication exists between all nodes. As long as partition does not exist between any two nodes within the system, consistency and availability will remain in place. The CA database option, however, is the least desirable as it does not account for the ability to shut down errors. In essence, without partition tolerance in place within the network, the entire system would be inoperable if an error occurs.

It is important to mention that NoSQL databases do not require a specific framework. They are known for being user friendly and having wide availability and scalability.

Final word about the CAP Theorem

CAP theorem is an idea derived from the computer science field. It explains that it is not possible to still provide consistency, availability, and partition tolerance congruently when network failure occurs within distributed communication systems. You must choose between two of the three characteristics in the event of a network failure and understand that partition tolerance — the ability of a network to continue to function despite an error — is a strong factor when deciding between CP or AP databases.

FAQs

What is the CAP Theorem in NoSQL?

CAP theorem explains that NoSQL databases cannot provide consistency and high availability concurrently. Computer scientist Eric Brewer discovered this theory.

What Does C stand for in the CAP Theorem?

CAP is an acronym. The ‘C’ stands for consistency, the ‘A’ stands for availability, and the ‘P’ stands for partition tolerance.

Take charge of your operations and lower storage costs by 90%

Get Started for Free Run a Proof of Concept

No credit card required.

quote-shape

Related resources


DBU logo

Free InfluxDB Training

Jump start your InfluxDB journey with free self-paced & instructor-led training.

dbu-illustration