What does "Document-oriented" vs. Key-Value mean when talking about MongoDB vs Cassandra?
What does going with a document based NoSQL option buy you over a KV store, and vice-versa?
A key-value store provides the simplest possible data model and is exactly what the name suggests: a storage system that stores values indexed by a key. You can only query by key, and the values are opaque: the store knows nothing about their contents. This allows very fast reads and writes (in the simplest case, a single disk access), and I see this model as a kind of non-volatile cache (i.e. well suited if you need fast access by key to long-lived data).
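To make the "opaque value" point concrete, here is a minimal in-memory sketch. The `KeyValueStore` class and its `put`/`get` methods are hypothetical names made up for illustration, not the API of Cassandra or any other real store. The application serializes the blog post itself and the store only ever sees bytes.

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # Values are plain bytes; the store never looks inside them.
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # The only supported lookup is by exact key.
        return self._data.get(key)


store = KeyValueStore()
post = {"title": "Hello NoSQL", "tags": ["nosql", "intro"]}

# Serialization is the application's job, not the store's.
store.put("post:1", json.dumps(post).encode("utf-8"))

raw = store.get("post:1")
loaded = json.loads(raw.decode("utf-8"))
```

Note that there is no way to ask this store for "all posts tagged nosql": it would have to understand the bytes to answer that.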
A document-oriented database extends the previous model: values are stored in a structured format (a document, hence the name) that the database can understand. For example, a document could be a blog post together with its comments and tags, stored in a denormalized way. Since the data is transparent to the store, it can do more work (like indexing fields of the document) and you're not limited to querying by key. As hinted above, such databases allow you to fetch an entire page's data with a single query and are well suited to content-oriented applications (which is why big sites like Facebook or Amazon like them).
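As a rough sketch of what "the store can understand the document" buys you, the toy class below keeps whole documents as dictionaries and maintains a secondary index on a field. `DocumentStore`, `create_index` and `find` are hypothetical names chosen for this example; real document databases such as MongoDB expose their own, different APIs.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set


class DocumentStore:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}
        # field name -> (field value -> ids of documents holding that value)
        self._indexes: Dict[str, Dict[Any, Set[str]]] = {}

    @staticmethod
    def _values(doc: Dict[str, Any], field: str) -> List[Any]:
        # A list-valued field (e.g. tags) contributes one entry per element.
        value = doc.get(field)
        if value is None:
            return []
        return list(value) if isinstance(value, list) else [value]

    def create_index(self, field: str) -> None:
        index: Dict[Any, Set[str]] = defaultdict(set)
        for doc_id, doc in self._docs.items():
            for value in self._values(doc, field):
                index[value].add(doc_id)
        self._indexes[field] = index

    def insert(self, doc_id: str, doc: Dict[str, Any]) -> None:
        if doc_id in self._docs:
            raise ValueError("updates are not handled in this sketch")
        self._docs[doc_id] = doc
        # The store can read the document, so it keeps its indexes current.
        for field, index in self._indexes.items():
            for value in self._values(doc, field):
                index[value].add(doc_id)

    def find(self, field: str, value: Any) -> List[Dict[str, Any]]:
        if field in self._indexes:
            ids = sorted(self._indexes[field].get(value, set()))
            return [self._docs[i] for i in ids]
        # No index on this field: fall back to scanning every document.
        return [doc for _, doc in sorted(self._docs.items())
                if value in self._values(doc, field)]


db = DocumentStore()
db.create_index("tags")
db.insert("post:1", {"title": "Hello NoSQL", "tags": ["nosql", "intro"],
                     "comments": [{"author": "ann", "text": "Nice"}]})
db.insert("post:2", {"title": "Scaling out", "tags": ["nosql", "scaling"],
                     "comments": []})

# One query returns whole posts, comments and tags included.
titles = [doc["title"] for doc in db.find("tags", "nosql")]
```

The post, its comments and its tags travel together as one document, and the query goes by a field inside the value rather than by key.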
Other kinds of NoSQL databases include column-oriented stores, graph databases and even object databases. But this goes beyond the question.
See also
- Comparing Document Databases to Key-Value Stores
- Analysis of the NoSQL Landscape
Well, I've been investigating NoSQL myself for the past month or so. I think it could generally be stated something like this:
- KV stores don't know anything about the content of the value stored for a key.
- Document-based stores let you define secondary indexes within the value content, as the db knows the document structure (e.g. tags of a blog post).
- NoSQL solutions each have specific features that should be taken into consideration, such as:
  - Special data types in a KV store (e.g. lists with left/right push/pop, and sets, as in Redis)
  - Easy scaling of the cluster up/down, as Riak says it has (I haven't tried it ... yet)
  - Pluggable data store, as in Voldemort
  - Built-in web configuration and web app support, as in CouchDB / CouchApp
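
To illustrate the "special data types" point from the list above, here is a small sketch of a KV store whose values are lists that can be pushed and popped at either end. The method names mirror the Redis commands LPUSH, RPUSH, LPOP and RPOP, but `ListStore` is a hypothetical in-memory stand-in, not the Redis client API.

```python
from collections import deque
from typing import Deque, Dict, Optional


class ListStore:
    """Toy KV store with list-typed values (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._lists: Dict[str, Deque[str]] = {}

    def lpush(self, key: str, value: str) -> int:
        # Push at the head; returns the new length of the list.
        items = self._lists.setdefault(key, deque())
        items.appendleft(value)
        return len(items)

    def rpush(self, key: str, value: str) -> int:
        # Push at the tail; returns the new length of the list.
        items = self._lists.setdefault(key, deque())
        items.append(value)
        return len(items)

    def lpop(self, key: str) -> Optional[str]:
        # Pop from the head, or None if the list is missing or empty.
        items = self._lists.get(key)
        return items.popleft() if items else None

    def rpop(self, key: str) -> Optional[str]:
        # Pop from the tail, or None if the list is missing or empty.
        items = self._lists.get(key)
        return items.pop() if items else None


jobs = ListStore()
jobs.rpush("queue", "job1")
jobs.rpush("queue", "job2")
length = jobs.lpush("queue", "urgent")  # list is now: urgent, job1, job2

first = jobs.lpop("queue")
last = jobs.rpop("queue")
```

Here the store still looks values up by key only, but it understands enough about the value's type to modify it in place, which makes it usable as a simple queue or stack.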