Tuesday 1 February 2011

Why Netflix Picked Amazon SimpleDB, Hadoop/HBase, and Cassandra


Yury Izrailevsky[1]:



The reason why we use multiple NoSQL solutions is because each one is best suited for a specific set of use cases. For example, HBase is naturally integrated with the Hadoop platform, whereas Cassandra is best for cross-regional deployments and scaling with no single points of failure. Adopting the non-relational model in general is not easy, and Netflix has been paying a steep pioneer tax while integrating these rapidly evolving and still maturing NoSQL products. There is a learning curve and an operational overhead. Still, the scalability, availability and performance advantages of the NoSQL persistence model are evident and are paying for themselves already, and will be central to our long-term cloud strategy.



Summarizing the pros for each of the 3 solutions:



  • Amazon SimpleDB Pros


    • highly durable, writes spanning multiple availability zones
    • handy query and data formats
    • batch operations
    • consistent reads
    • hosted solution

  • HBase Pros


    • dynamic partitioning model
    • built-in support for compression
    • range queries
    • support for distributed counters
    • strong consistency
    • interoperability with Hadoop

  • Cassandra Pros


    • no dedicated name nodes
    • no practical architectural limitations on data sizes, row/column counts, etc.
    • flexible data model
    • no underlying storage format requirements like HDFS
    • uniquely flexible consistency and replication models
    • cross-datacenter and cross-regional replication
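The "flexible consistency" point for Cassandra is easier to see with a bit of arithmetic. Below is a minimal sketch, not anything taken from Netflix's post: it assumes a single datacenter, looks only at the ONE, QUORUM and ALL consistency levels, and ignores mechanisms such as hinted handoff and read repair. The function names are made up for illustration. The idea is that a read is guaranteed to overlap the latest acknowledged write when the replicas read plus the replicas written exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a given consistency level.

    Only ONE, QUORUM and ALL are modelled here (a simplification).
    """
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]


def read_overlaps_write(read_level: str, write_level: str,
                        replication_factor: int) -> bool:
    """True when R + W > RF, so every read set intersects every write set."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # hypothetical replication factor
    for read_level, write_level in [("ONE", "ONE"),
                                    ("QUORUM", "QUORUM"),
                                    ("ONE", "ALL")]:
        print(read_level, write_level,
              read_overlaps_write(read_level, write_level, rf))
```

With a replication factor of 3, ONE/ONE trades consistency for latency and availability, while QUORUM/QUORUM or ONE/ALL give overlapping reads and writes. Picking that trade-off per operation is what the bullet above refers to.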

I hope the next post will be about the “small” issues Netflix ran into when adopting each of these systems. In the past they’ve shared some of the challenges of a hybrid Oracle/Amazon SimpleDB solution.





  1. Yury Izrailevsky: Netflix Director of Cloud and Systems Infrastructure
     




Original title and link: Why Netflix Picked Amazon SimpleDB, Hadoop/HBase, and Cassandra (NoSQL databases © myNoSQL)



