What is CAP theorem ?
CAP theorem demonstrates a distributed system can not accomodate 3 status at the same time. These status are consistency, availability and partition tolerance. In means we can prefer only both of them at the same time.
Let's take a look at these 3 terms first.
What is consistency ?
Consistency is a specification which a system will return most last updated data.
What is availability?
Availability is meaning a system can respond the request at every request.
What is partition tolerance?
Partition tolerance is meaining a distributed system can respond even if some of nodes have failures.
So, Why can we prefer only two of them ?
Consistency and Availability (CA)
Let's talk about a distributed system. This distributed system have multiple nodes within communication. Sometimes there can be problems on communication system and some datas may not be updated at every nodes. So this condition breaks the consistancy.
Our system may break down the responsing until all datas are synchronized. This will break the availability condition.
As a result if we prefer consistency and availability for our distributed system our system can not have partition tolerance.
Availability and Partition Tolerance (AP)
Let's talk about a distributed system and all nodes in the system are in communication and they are creating responses for the requests. Sometimes there is a problem about communication but we need to create responses for the requests. This means our system is always avail but it does not promise to respond with last updated data. Because nodes may not be synchronized because of communication failure.
A distributed system with AP does not promise the consistency but it promise always create a response even if there is a problem at some of nodes or failure of communication between nodes.
As a result an AP distributed system can not promise consistency but accomodates always availability and partition tolerance.
Consistency and Partition Tolerance (CP)
Let's talk about a distributed system which takes care of consistency. This distributed system promises to respond last updated data always. If there is a problem about synchronization between nodes, the system willl be shut down until consistency ensured. As a result a distributed system which take care of consistency promises the responding last updated data always but does not promise the availability always.
How can I chose a database management system ?
We can talk about relational database systems like Postgress, Oracle, MSSQL for CA database management system. They are working on single server. It is clear there is no partition tolerance in here. They are following ACID model. Also we can talk abot Redis for CA system. It works on RAM and has high availability. Also it is consistant because it follows ACID model.
If we are looking for a database system which take care of consistency and partition tolerance MongoDB is one of most famous one. It is working with a master node and other nodes always sync the datas within each other. All responses are being created from one master node and if there is a problem at master node one of replicas will be master immediatelly.
If our distributed system needs to be avail at every time we can take a look at Cassandra. Cassandra is a masterless database management system. All nodes in Cassandra cluster can answer all requests to provide availability. If there is a failure at one of nodes, one of other nodes will responde of the request which belongs to failured node.
That is all in this article.
Have a good system building
Burak Hamdi TUFAN