MongoDB - Find documents with earliest occurrence of duplicate value
Solution 1:
You can do the following in an aggregation pipeline:
1. $unwind links so the documents are at the links level
2. $sort on isoDate so the earliest document comes first
3. $group by links to get the count within each group and the identifier of the first document. In your example, title is taken as the unique identifier.
4. $match with count > 1 to keep only the titles whose link is shared with another document
5. $group to dedupe the unique identifiers found in step 3
6. $lookup back to the original documents and do some cosmetics with $replaceRoot
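db.collection.aggregate([
  // 1. One document per (document, link) pair
  {
    "$unwind": "$links"
  },
  // 2. Oldest first, so $first below picks the earliest occurrence
  {
    $sort: {
      isoDate: 1
    }
  },
  // 3. Per link: title of the earliest document and number of occurrences
  {
    $group: {
      _id: "$links",
      first: {
        $first: "$title"
      },
      count: {
        $sum: 1
      }
    }
  },
  // 4. Keep only links that appear more than once
  {
    $match: {
      count: {
        $gt: 1
      }
    }
  },
  // 5. Dedupe titles that are the earliest for several links
  {
    $group: {
      _id: "$first"
    }
  },
  // 6. Fetch the original documents and promote them to the root
  {
    "$lookup": {
      "from": "collection",
      "localField": "_id",
      "foreignField": "title",
      "as": "rawDocument"
    }
  },
  {
    "$unwind": "$rawDocument"
  },
  {
    "$replaceRoot": {
      "newRoot": "$rawDocument"
    }
  }
])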
Here is the Mongo playground for your reference.
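To check the logic without a database, the same steps can be mimicked in plain JavaScript. This is only a rough sketch: the sample documents and the function name earliestDuplicates are made up for illustration, and isoDate is assumed to be stored as an ISO 8601 string in a single time zone (with real Date values the comparison would need to change).

```javascript
// Hypothetical sample documents; only the field names (title, links, isoDate)
// come from the pipeline above.
const docs = [
  { title: "A", isoDate: "2021-01-01T00:00:00Z", links: ["x.com", "y.com"] },
  { title: "B", isoDate: "2021-01-02T00:00:00Z", links: ["x.com"] },
  { title: "C", isoDate: "2021-01-03T00:00:00Z", links: ["z.com"] },
  { title: "D", isoDate: "2021-01-04T00:00:00Z", links: ["y.com", "w.com"] },
  { title: "E", isoDate: "2021-01-05T00:00:00Z", links: ["w.com"] },
];

// In-memory imitation of the pipeline, not a MongoDB API.
function earliestDuplicates(collection) {
  // $unwind: one row per (document, link) pair.
  const rows = collection.flatMap((doc) =>
    doc.links.map((link) => ({ link, title: doc.title, isoDate: doc.isoDate }))
  );

  // $sort: oldest first (assumes ISO 8601 strings, which sort chronologically).
  rows.sort((a, b) => a.isoDate.localeCompare(b.isoDate));

  // $group by link: remember the first title seen and count occurrences.
  const byLink = new Map();
  for (const { link, title } of rows) {
    const entry = byLink.get(link) || { first: title, count: 0 };
    entry.count += 1;
    byLink.set(link, entry);
  }

  // $match count > 1, then $group on the title to dedupe.
  const titles = new Set();
  for (const { first, count } of byLink.values()) {
    if (count > 1) titles.add(first);
  }

  // $lookup + $replaceRoot: return the original documents.
  return collection.filter((doc) => titles.has(doc.title));
}

console.log(earliestDuplicates(docs).map((doc) => doc.title));
```

With this sample data, x.com and y.com are both first seen in A and w.com is first seen in D, so only A and D are returned; B and E are later duplicates and C shares no link. Note that the sketch returns documents in input order, whereas the real pipeline gives no ordering guarantee after $group, so add a final $sort if order matters.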