How to stop insertion of duplicate documents in a MongoDB collection
Suppose we have a MongoDB collection that contains three documents:
db.collection.find()
{ _id:'...', user: 'A', title: 'Physics', Bank: 'Bank_A' }
{ _id:'...', user: 'A', title: 'Chemistry', Bank: 'Bank_B' }
{ _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }
We have a document:
doc = { user: 'B', title: 'Chemistry', Bank:'Bank_A' }
If we run
db.collection.insert(doc)
the duplicate document is inserted into the database:
{ _id:'...', user: 'A', title: 'Physics', Bank: 'Bank_A' }
{ _id:'...', user: 'A', title: 'Chemistry', Bank: 'Bank_B' }
{ _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }
{ _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }
How can this duplicate be prevented? Which field(s) should be indexed, or is there another approach?
Don't use insert. Use update with upsert: true.
update looks for a document that matches your query and modifies the fields you specify; with upsert: true, it inserts a new document when no document matches the query.
db.collection.update(
<query>,
<update>,
{
upsert: <boolean>,
multi: <boolean>,
writeConcern: <document>
}
)
So, for your example, you could use something like this:
db.collection.update(doc, doc, {upsert:true})
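To make the "match, then modify, otherwise insert" behaviour concrete, here is a minimal in-memory sketch in plain JavaScript. It is only a model of the idea and not the MongoDB API: the helper names matches and upsert are hypothetical, and the array stands in for the collection from the question.

```javascript
// In-memory model of upsert semantics. This is NOT the MongoDB API;
// `matches` and `upsert` are hypothetical helper names used for illustration.

// True when every field in `query` has the same value in `doc`.
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

// Modify the first matching document, or insert a new one if none matches.
function upsert(collection, query, fields) {
  const existing = collection.find((d) => matches(d, query));
  if (existing) {
    Object.assign(existing, fields); // modify in place, no new document
    return "updated";
  }
  collection.push({ ...query, ...fields }); // nothing matched, so insert
  return "inserted";
}

const collection = [
  { user: "A", title: "Physics", Bank: "Bank_A" },
  { user: "A", title: "Chemistry", Bank: "Bank_B" },
  { user: "B", title: "Chemistry", Bank: "Bank_A" },
];
const doc = { user: "B", title: "Chemistry", Bank: "Bank_A" };

// The document already exists, so nothing new is added.
const firstResult = upsert(collection, doc, doc);
// No document matches this query, so a new one is inserted.
const secondResult = upsert(
  collection,
  { user: "C", title: "Maths", Bank: "Bank_A" },
  {}
);
```

In this model the first call leaves the collection at three documents and the second grows it to four, which is the behaviour the upsert option is meant to give you.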
You should use a compound index on the set of fields that uniquely identify a document within your MongoDB collection. For example, if you decide that the combination of user, title and Bank are your unique key you would issue the following command:
db.collection.createIndex( { user: 1, title: 1, Bank: 1 }, {unique:true} )
Note that this should be done after you have removed any previously stored duplicates; otherwise the unique index cannot be built.
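As a rough illustration of that clean-up step, the sketch below groups documents by the chosen key fields and lists the extra copies. It is plain JavaScript over an array that stands in for the collection, not a MongoDB query; the name findDuplicates and the numeric _id values are assumptions made for the example (the real _id values are elided in the question).

```javascript
// Plain-JavaScript sketch, not a MongoDB query.
// `findDuplicates` is a hypothetical helper; the _id values are made up.

// Group documents by the values of `keyFields` and return the groups
// that contain more than one document.
function findDuplicates(docs, keyFields) {
  const groups = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(keyFields.map((field) => doc[field]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  return [...groups.values()].filter((group) => group.length > 1);
}

const storedDocs = [
  { _id: 1, user: "A", title: "Physics", Bank: "Bank_A" },
  { _id: 2, user: "A", title: "Chemistry", Bank: "Bank_B" },
  { _id: 3, user: "B", title: "Chemistry", Bank: "Bank_A" },
  { _id: 4, user: "B", title: "Chemistry", Bank: "Bank_A" },
];

const duplicateGroups = findDuplicates(storedDocs, ["user", "title", "Bank"]);

// Keep the first document of each group; the rest are candidates for removal.
const idsToRemove = duplicateGroups.flatMap((group) =>
  group.slice(1).map((d) => d._id)
);
```

Which copy to keep is a design choice; this sketch keeps the first one it sees in each group.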
http://docs.mongodb.org/manual/tutorial/create-a-compound-index/
http://docs.mongodb.org/manual/tutorial/create-a-unique-index/
An update to the answers above:
use db.collection.updateOne() instead of db.collection.update(),
and db.collection.createIndexes() instead of db.collection.ensureIndex().
Update:
the methods update() and ensureIndex() have been deprecated since mongodb 2.*; for more details, see the source at ./mongodb/lib/collection.js.
For update(), the recommended methods are updateOne, updateMany, or bulkWrite.
For ensureIndex(), the recommended method is createIndexes.
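One thing to watch when switching to updateOne: it expects update operators such as $set or $setOnInsert rather than a plain replacement document, so update(doc, doc, {upsert:true}) does not carry over literally. The sketch below only builds the three arguments and does not talk to a database; buildInsertIfAbsent is a hypothetical helper name, and the choice of $setOnInsert is one possible approach rather than the only one.

```javascript
// Builds arguments for an "insert only if absent" upsert.
// `buildInsertIfAbsent` is a hypothetical helper; it does not contact MongoDB.
function buildInsertIfAbsent(doc, keyFields) {
  const filter = {};
  for (const field of keyFields) {
    filter[field] = doc[field]; // match on the fields that define uniqueness
  }
  return {
    filter,
    update: { $setOnInsert: doc }, // applied only when the upsert inserts
    options: { upsert: true },
  };
}

const newDoc = { user: "B", title: "Chemistry", Bank: "Bank_A" };
const upsertArgs = buildInsertIfAbsent(newDoc, ["user", "title", "Bank"]);

// In a shell or driver session, these would be passed along the lines of:
// db.collection.updateOne(upsertArgs.filter, upsertArgs.update, upsertArgs.options)
```

Combined with a unique index on the same key fields, this keeps the existing document untouched when it is already present.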