Unlimited-Data. moved to lab.itbee.vn : The CAP Theorem... Again

Today looks to be (again) the day of the CAP theorem^[1]^[2], so let’s do a quick summary:

We had Coda Hale’s ☞ You can’t sacrifice partition tolerance:

Of the CAP theorem’s Consistency, Availability, and Partition Tolerance, Partition Tolerance is mandatory in distributed systems. You cannot not choose it. Instead of CAP, you should think about your availability in terms of yield (percent of requests answered successfully) and harvest (percent of required data actually included in the responses) and which of these two your system will sacrifice when failures happen.
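To make the yield and harvest terms in the quote above concrete, here is a minimal Python sketch. The `Response` record and its fields (`answered`, `shards_needed`, `shards_included`) are hypothetical names made up for illustration, not anything from Coda Hale's post, and measuring harvest in "shards" over answered requests only is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One request's outcome (hypothetical record, for illustration only)."""
    answered: bool        # did the request get a successful reply?
    shards_needed: int    # shards holding data the full answer requires
    shards_included: int  # shards that actually contributed to the reply


def yield_pct(responses):
    """Percent of requests answered successfully."""
    if not responses:
        return 0.0
    answered = sum(1 for r in responses if r.answered)
    return 100.0 * answered / len(responses)


def harvest_pct(responses):
    """Percent of required data included, counted over answered requests only."""
    answered = [r for r in responses if r.answered]
    needed = sum(r.shards_needed for r in answered)
    if needed == 0:
        return 0.0
    included = sum(r.shards_included for r in answered)
    return 100.0 * included / needed


# Two made-up ways a system might behave while one of four shards is unreachable:
# answer everything from partial data, or refuse what cannot be answered fully.
partial_answers = [Response(True, 4, 3) for _ in range(10)]
refuse_some = [Response(True, 4, 4)] * 7 + [Response(False, 4, 0)] * 3

print(yield_pct(partial_answers), harvest_pct(partial_answers))
print(yield_pct(refuse_some), harvest_pct(refuse_some))
```

In this toy data the first system keeps its yield and gives up harvest, while the second keeps its harvest and gives up yield, which is the choice the quote asks designers to make explicitly.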
Jeff Darcy followed up with ☞ Another CAP article:

It seems to me that there is a consensus emerging. Even if Gilbert and Lynch only formally proved a narrower version of Brewer’s original conjecture, that conjecture and the tradeoffs it implies are still alive and well and highly relevant to the design of real working systems that serve real business needs.

and ☞ Reactions to Coda’s CAP post:

The last point is whether CAP really boils down to “two out of three” or not. Of course not, even though I’ve probably said that myself a couple of times. The reason is merely pedagogical. It’s a pretty good approximation, much like teaching Newtonian physics or ideal gases in chemistry. You have to get people to understand the basic shape of things before you start talking about the exceptions and special cases, and “two out of three” is a good approximation. Sure, you can trade off just a little of one for a little of another instead of purely either/or, but only after you thoroughly understand and appreciate why the simpler form doesn’t suffice. The last thing we need is people with learner’s permits trying to build exotic race cars. They just give the doubters and trolls more ammunition with which to suppress innovation.

Henry Robinson’s ☞ CAP Confusion: Problems with ‘partition tolerance’ popped up too:

Not ‘choosing’ P is analogous to building a network that will never experience multiple correlated failures. This is unreasonable for a distributed system – precisely for all the valid reasons that are laid out in the CACM post about correlated failures, OS bugs and cluster disasters – so what a designer has to do is to decide between maintaining consistency and availability. Dr. Stonebraker tells us to choose consistency, in fact, because availability will unavoidably be impacted by large failure incidents. This is a legitimate design choice, and one that the traditional RDBMS lineage of systems has explored to its fullest, but it implicitly protects us neither from availability problems stemming from smaller failure incidents, nor from the high cost of maintaining sequential consistency.

Many of the above articles were referring to Michael Stonebraker’s ☞ Errors in Database Systems, Eventual Consistency, and the CAP Theorem:

In summary, one should not throw out the C so quickly, since there are real error scenarios where CAP does not apply and it seems like a bad tradeoff in many of the other situations.

So, we have pretty much come full circle. I just hope that Eric Brewer will ☞ follow up:

I really need to write an updated CAP theorem paper

[1] Michael Stonebraker’s clarifications on the CAP theorem and Daniel Abadi’s older but related Problems with CAP
[2] Nati Shalom’s ☞ NoCAP and my own NoCAP… is wrong (NB: make sure you also read the comments)

Original title and link: The CAP Theorem… Again (NoSQL databases © myNoSQL)


Saturday 23 October 2010
