Saturday, 12 June 2010

Comparing Document Databases to Key-Value Stores

Comparing Document Databases to Key-Value Stores: "

Oren Eini has an interesting ☞ post emphasizing the main differences between document databases (f.e. CouchDB, MongoDB, etc.) and key-value stores (f.e. Redis, Project Voldemort, Tokyo Cabinet):




The major benefit of using a document database comes from the fact that while it has all the benefits of a key/value store, you aren’t limited to just querying by key.




One of the main advantages of data transparency (as opposed to opaque data) is that the engine will be able to perform additional work without having to translate the data into an intermediary or a format that it understands. Querying by non primary key is such an example. The various document stores provide different implementation flavors depending on index creation time, index update strategy, etc. Oren goes on and covers the query behavior for CouchDB, Raven and MongoDB:




In the first case (nb indexes prepared ahead of time), you define an indexing function (in Raven’s case, a Linq query, in CouchDB case, a JavaScript function) and the server will run it to prepare the results, once the results are prepared, they can be served to the client with minimal computation. CouchDB and Raven differs in the method they use to update those indexes, Raven will update the index immediately on document change, and queries to indexes will never wait. […] With CouchDB, a view is updated at view query time, which may lead to a long wait time on the first time a view is accessed if there were a lot of changes in the meanwhile. […]



Note that in both CouchDB and Raven’s cases, indexes do not affect write speed, since in both cases this is done at a background task.



MongoDB, on the other hand, allows ad-hoc querying, and relies on indexes defined on the document values to help it achieve reasonable performance when the data size grows large enough. MongoDB’s indexes behave in much the same way RDBMS indexes behave, that is, they are updated as part or the insert process, so large number of indexes is going to affect write performance.




Another good resource explaining the differences between MongoDB and CouchDB queries is Rick Osbourne’s ☞ article.



After RavenDB made his appearance in the NoSQL space we’ll probably have to compare it to the existing CouchDB and MongoDB features.



This is not to say that some of this functionality cannot be achieved with pure key-value stores, but these seem to be focused mainly on single/multi key lookups and most probably you’ll have to build this additional layer by yourself.



"

No comments:

Post a Comment