Yash: Papa, do
you have few minutes!!
Me: Yes. What is
the issue!!
Yash: Nothing. I was wondering what is CAP theorem? I was
trying to understand it but it is confusing. It talks about Consistency, Availability
and Partition Tolerance in Distributed Systems. I have little idea about
distributed systems but CAP Theorem ….
Me: It should not
be a challenge. First tell me what do you know about Distributed Systems?
Yash: In distributed systems, various components of software
systems run on different machines. So, if one component needs to be scaled up,
that can be done individually without affecting remaining components.
Me: OK. Give me
an example.
Yash: Let us
assume, I have a Java program which also has database. So as you explained to
me in MVC pattern (link to MVC here) there will be few distinct components:
- JSP s representing View
- Some servlets which will be holding controller logic
- Java Beans which will be talking to database
- Database to hold data
and....
Me: Hold on. Add
one more component in your list.
- Images which will be displayed in browser with JSPs.
Yash: OK. So I
have 5 components. So potentially I can host my application on five different
computers. One machine for each component. And suppose with time, due to
increased data, I can add one more machine for database. So, totaling 6
machines. Here I am taking benefit of distributive computing.
Me: Fantastic!!
Me: Should be
talk about CAP Theorem?
Yash: Yes. I want
to know it.
Me: OK. So we
will start from non-computer life. One
day you wake up and had a great idea to start your own company. Its name will be AskMe. It has very simple
service to offer. People will call AskMe and give their reviews about
restaurants. Also they can ask AskMe to give reviews about a particular
restaurant as given by other reviewers.
Since you were very enthusiastic about this idea. , so you
talk with your friends in school. Few of your friends become member of AskMe.
To keep track of each member and review you started to write details in a Note
book.
AskMe service is unique and people liked the idea of
restaurants review over phone, word of mouth helped. AskMe started to gain new
customers. With increased customer base, you started to miss calls, which is
not good for business.
You rope in your brother in AskMe business. You wanted that
customer should have only one telephone number for AskMe, so got another line
with same telephone number and internally rote the calls depending upon who is
free (You and your brother). You also gave a separate note book to your brother
where he can jot down reviews and also read the reviews if asked. You were
happy.
But within few days, you realized that some restaurants are
in your note book and some other in your brother’s note book. It means AskMe is
not giving correct reviews to its customer. You and your brother think through
the problem at hand and decided that every day in morning, both need to tally
each other note book, and ensure that both note books are in sync.
This arrangement worked but after few days, you and your
brother have small fight and your brother decided not to sync his note book
with yours. AskMe is in trouble. Also during non-fight days you noticed that
AskMe cannot start working till both of you sync your booklets and invariably
both note books are out of sync during day time.
Now let us list the challenges AskMe is facing:
- Both notebooks may not be in sync all the time. This may not happen because one of you may decide not to sync due to any reason
- If during day time, if you both decide to sync your note books, during that time AskMe service will not be available
- One of you may not show at work which will result in missed calls
Yash: It seems to
me that out of these three challenges only two can be solved at a given point
of time.
- If AskMe decide to sync all notes book all the time, few calls will be missed.
- If at least one of us decides not to show for work again few calls will be missed.
- If AskMe add more people to take calls, there will be more time consumed to sync up the note books, again few calls will be missed.
It seems to me that there is no perfect solution.
Me: This is CAP
Theorem. C is for Consistency, A is for Availability, and P is for Partition Tolerance.
In any distributed computer system, all three cannot be achieved.
Yash: Does it mean, all distributed computer systems are not
perfect in terms of Consistency, Availability and Partition Tolerance?
Me: Yes, but
there are techniques to reduce the impact.