Each day, multibillion-dollar organizations and the everyday person alike rely on the cloud to consistently and simply exchange data. Cloud computing has made it very easy for us to use applications without having to think about infrastructure. Needless to say, most software applications on the cloud are designed to make use and take advantage of its distributed and decentralized approach to networking.
The database is a critical part of any software system. In a distributed system, data may be stored in multiple computers located in the same physical location, or dispersed over a network of interconnected computers. Were such a system to be exposed to ransomware, malicious file deletion, or similar, the entire network could be at risk. Thus, it is important to design them (the data stores) and the entire application around the data store, in such a way that the user experience is not affected should something go wrong.
There is no silver bullet which can help us find the perfect data store for our needs. This is where the CAP theorem helps us. The CAP theorem (also known as Brewer’s theorem) gives us a framework for thinking about the trade-offs that are involved when choosing a distributed data store for our cloud computing needs. Studying this informs us about the various problems one might face, and how to think about them (and potentially solve them). It helps us write robust software for the cloud era.
What is the CAP theorem?
The CAP theorem states that out of three desirable properties, it is impossible for a distributed data store to guarantee more than two. These three desirable properties are:
- Partition tolerance
In this blog, we will explore these properties in the context of the CAP theorem.
In a distributed database, the data is physically stored in different computers. The computers may be present in a single data center or spread across different locations. The different computers are connected via network. These different computers are called nodes, and they can communicate with each other using messages to keep the state of the system in sync.
The clients can read and write via any node that is a part of the system. When a node gets a request, it performs operations before responding to the client. A non-error response indicates that the transaction has been completed successfully.
For example, let us assume that we have a two-node cluster. Let us call the nodes `node_1` and `node_2`. This cluster is keeping track of a value `v=0`. This is the initial state of the cluster.
Consistency is the property that ensures every transaction changes the state of the database from a valid state to another valid state. This means that the system returns the correct response to every query.
Therefore, any read operation after a write operation should fetch the latest state, regardless of which nodes processed the write and read.
Let us illustrate this with the example of our two-node cluster:
- The current state of the cluster is `v=0`
- A client queries a node(`node_1`) with a transaction changing the state of the cluster to `v=1`
- The node communicates with the other node(`node_2`), and they both reconcile the new state of the cluster. The new state is `v=1`
- A new client queries the cluster for the current state — since both the nodes have the updated data, the cluster responds with the updated state
This means that once a client writes `v=1` to the cluster (using any node), the subsequent state of the entire cluster should be `v=1`. This state should be independent of which nodes were used for the transaction.
Availability means that the entire database should be available to respond to requests even if some nodes are down. For our cluster, all this means is that if one of the nodes is down, the cluster should continue to respond to queries (reads and writes).
For example, if `node_1` is down, then all the requests that will be re-routed to `node_2` should be able to process the queries correctly without errors.
A network partition happens when some of the nodes are not able to communicate with the rest of the nodes. This means that the messages the nodes use to reconcile the state of the cluster are either not reaching them or being delayed to a point where they are no longer useful.
A partition-tolerant system will continue to be in service during a network partition.
The CAP theorem — proof
The CAP Theorem states that out of the three properties, we can guarantee only two.
What happens if our system satisfies all properties?
The easiest way to test this claim is to introduce a partition into the system.
How does a write operation look in this case?
- Initial state of the cluster `v=0`
- Client talks to `node_1` and changes the state to `v=1`
- The nodes cannot talk to each other so `node_2` does not get the updated state of the cluster
- Now when a client queries `node_2`, the client will not receive the updated state of the cluster. (`v=0`)
We see a trade-off here. In order for the system to be available during a network partition, the state of the database was not consistent. Each node (or partition) had its own state. This can become problematic for clients. On the other hand, if the system were to stop responding to requests during a partition, the system would not be able to guarantee partition tolerance.
Thus, a system can not guarantee all three properties.
Most distributed database systems are partition tolerant by default, because this is the expectation in the cloud-native landscape. This leaves us with consistency and availability as parameters we can change and adjust as needed. Some database systems even allow the user to select the behavior of the database by using configuration parameters. We should also understand that these aren’t absolutes. There is a spectrum between availability and consistency, and many systems offer users the choice to select a point along that continuum.
Apache Cassandra is a very good example of a distributed database system that has chosen to guarantee availability, partition-tolerance, and trade-off consistency. Cassandra guarantees eventual consistency, which means that after a certain amount of time, all nodes will have the same cluster state. A very popular example of an eventually-consistent system is the domain name system (DNS). That is why any updates to the DNS need time before they propagate across the entire planet.
Cassandra allows a user to choose the consistency level during reads/writes. If you choose to have strong consistency over your writes, then the system will make sure all updates are propagated to all replicas before marking the transaction as a success. Similarly, if you choose to have strong consistency during reads, the system would make sure that all replicas have the same copy of the data before responding to the client. Both these situations trade availability for consistency.
The CAP theorem is all about trade-offs. Since different systems have different requirements, the trade-offs that they choose will be different. Some databases may be required to absolutely guarantee availability, while relaxing database consistency needs. Others may need to guarantee consistency at the cost of not being highly available.
The goal is to understand the theory so we may be more informed about the trade-offs.
We choose consistency over availability when data consistency is very important. This is a typical approach in financial institutions. In these situations, we want the data to be very consistent and are willing to tolerate downtime. Banks typically will have a predetermined downtime window where they are allowed to perform database maintenance.
On the other hand, we choose availability when the uptime of the system is more important than consistency. This is typically seen in systems which collect large amounts of data, such as a metrics database, which collects telemetry data from smart devices or distributed microservices. In this scenario, reads are not followed closely by the writes, and some inconsistency is okay as long as the system is not refusing any writes. Dropping data would lead to more inaccuracies in understanding the monitored system.
However, these are just generic scenarios, and there are companies that may run Cassandra (which is an eventually-consistent database) for all their data storage needs — including banking data.
Explore innovations and best practices in cloud backup and data management — such as the CAP theorem for distributed databases — in the Tech/Engineering section of Druva’s blog archive.