When to use MongoDB [closed]
I'm writing an application that doesn't necessarily need scaling abilities as it won't be collecting large amounts data at the beginning. (However, if I'm lucky, I could down the road potentially.)
I will be running my web server and database on the same box (for now).
That being said, I am looking for performance and efficiency.
The main part of my application will be loading blog articles. Using an RDBMS (MySQL) I will make 6 queries (2 of the queries being joins), just to load a single blog article page.
select blog
select blog_album
select blog_tags
select blog_notes
select blog_comments (join with users)
select blog_author_participants (join with users)
However, with MongoDB I can de-normalize and flatten 6 tables into just 2 tables/collections and minimizes my queries to potentially just one 1 query,
users
blogs
->blog_album
->blog_tags
->blog_notes
->blog_comments
->blog_author_participants
Now, going with the MongoDB schema, there will be some data redundancy. However, hard drive space is cheaper than CPU/servers.
1.) Would this be a good scenario to use MongoDB?
2.) Do you only benefit in performance using MongoDB when scaling beyond a single server?
3.) Are there any durability risks using MongoDB? I hear that there is potential for loss of data while performing inserts - as insert are written to memory first, then to the database.
4.) Should this stop me from using MongoDB in production?
You would use MongoDB when you have a use case that matches its strengths.
Do you need a schema-less document store? Nope, you have a stable schema.
Do you need automatic sharding? Nope, you don't have extraordinary data needs or budget for horizontally scaling hardware.
Do you need map/reduce data processing? Not for something like a blog.
So why are you even considering it?
However, with MongoDB I can de-normalize and flatten 6 tables into just 2 tables/collections and minimizes my queries to potentially just one 1 query
But you can easily query MySQL for 6 tables worth of information related to a single blog post with a single properly crafted SQL statement.
however hard drive space is cheaper than CPU/servers.
If performance and scaling is a priority then you are going to be concerned with having enough RAM to fit everything into main memory and enough CPU cores to run queries. An enterprise grade RAID 10 array is a requirement, don't get me wrong, but as soon as your database software (MongoDB or MySQL) needs to scan an index that can't fit into main memory you'll be in for a world of pain assuming a large active database. :)
I like MongoDB, but it's big strength in my mind is map/reduce and its document-orientation. You require neither of those features. MySQL is time-tested in large scale deployments and supports partitioning (but I would argue that your database would have to be in the order of 50-100 GB before you can realize substantial gain from partitioning vs a single (plus passive backup) server with tons (64 GB+) of RAM. I would also argue that if performance is truly a concern then MySQL would be preferable as you would have supreme control over your indexes.
That's not to say that MongoDB isn't high performance, but its place probably isn't serving blogs. Your concern with inserts is valid as well. MongoDB is not an ACID system. Google transactions in both systems and compare.
Here is a good explanation: http://mod.erni.st/nosql-if-only-it-was-that-easy/
The last paragraph summarizes it:
What am I going to build my next app on? Probably Postgres. Will I use NoSQL? Maybe. I might also use Hadoop and Hive. I might keep everything in flat files. Maybe I’ll start hacking on Maglev. I’ll use whatever is best for the job. If I need reporting, I won’t be using any NoSQL. If I need caching, I’ll probably use Tokyo Tyrant. If I need ACIDity, I won’t use NoSQL. If I need a ton of counters, I’ll use Redis. If I need transactions, I’ll use Postgres. If I have a ton of a single type of documents, I’ll probably use Mongo. If I need to write 1 billion objects a day, I’d probably use Voldemort. If I need full text search, I’d probably use Solr. If I need full text search of volatile data, I’d probably use Sphinx.
NoSQL vs. RDBMS: Apples and Oranges?
I would advise you to read up a little on what NoSQL is and what it does before you decide whether you can use it. You can't take a normal database and turn it into a NoSQL thing just like that. The way you work with the data is completely different.
NoSQL definitely has its uses. But it's definitely not the answer for everything. The main advantage of NoSQL is the easily changeable data model.
Advantages of using mongodb ( as per Moshe Kaplan
published in dzone
article)
- Schema-less design
- Scalability in managing Tera bytes of data
- Rapid replicaSet with high availability feature
- Sharding enables linear and scale out growth w/o running out of budget
- Support high write load
- Use of Data locality for query processing
MongoDB meets Consistency
& Partitioning
requirements in CAP theory ( Consistency, Availability and Partitioning)
Related SE questions:
What are the advantages of using a schema-free database like MongoDB compared to a relational database?
When to Redis? When to MongoDB?