MongoDB - Find documents with earliest occurrence of duplicate value

Solution 1:

You can do the following in an aggregation pipeline:

1. $unwind links so that each document is expanded into one document per link.
2. $sort on isoDate in ascending order so that the earliest document comes first.
3. $group by links to get the number of documents sharing each link and the identifier of the earliest one. In your example, title is taken as the unique identifier.
4. $match with count > 1 to keep only the links that are shared by more than one document.
5. $group again to dedupe the unique identifiers found in step 3.
6. $lookup back the original documents and tidy up the output with $replaceRoot.

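db.collection.aggregate([
  // 1. One document per element of the links array
  {
    $unwind: "$links"
  },
  // 2. Oldest documents first, so $first below picks the earliest
  {
    $sort: {
      isoDate: 1
    }
  },
  // 3. Per link: title of the earliest document and number of occurrences
  {
    $group: {
      _id: "$links",
      first: {
        $first: "$title"
      },
      count: {
        $sum: 1
      }
    }
  },
  // 4. Keep only links that occur more than once
  {
    $match: {
      count: {
        $gt: 1
      }
    }
  },
  // 5. Dedupe titles that were earliest for several links
  {
    $group: {
      _id: "$first"
    }
  },
  // 6. Fetch the original documents by title and promote them to the root
  {
    $lookup: {
      from: "collection",
      localField: "_id",
      foreignField: "title",
      as: "rawDocument"
    }
  },
  {
    $unwind: "$rawDocument"
  },
  {
    $replaceRoot: {
      newRoot: "$rawDocument"
    }
  }
])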

Here is the Mongo playground for your reference.
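
To make the data flow easier to follow, here is a rough in-memory sketch of what the stages above compute, written in plain JavaScript rather than against MongoDB. The sample documents and the function name earliestOfDuplicates are made up for illustration; only the field names title, isoDate and links come from the pipeline, and isoDate is assumed to be an ISO 8601 string.

```javascript
// Hypothetical sample documents; only the field names match the pipeline.
const docs = [
  { title: "A", isoDate: "2021-01-01T00:00:00Z", links: ["x.com", "y.com"] },
  { title: "B", isoDate: "2021-02-01T00:00:00Z", links: ["x.com"] },
  { title: "C", isoDate: "2021-03-01T00:00:00Z", links: ["y.com", "w.com"] },
  { title: "D", isoDate: "2021-01-15T00:00:00Z", links: ["w.com"] },
  { title: "E", isoDate: "2021-04-01T00:00:00Z", links: ["z.com"] },
];

// Illustrative helper (not a MongoDB API): mirrors the pipeline in memory.
function earliestOfDuplicates(collection) {
  // $unwind + $sort: one row per (document, link), oldest first.
  const rows = collection
    .flatMap((doc) => doc.links.map((link) => ({ link, doc })))
    .sort((a, b) => new Date(a.doc.isoDate) - new Date(b.doc.isoDate));

  // First $group: per link, remember the earliest title and count the rows.
  const byLink = new Map();
  for (const { link, doc } of rows) {
    const entry = byLink.get(link);
    if (entry) {
      entry.count += 1;
    } else {
      byLink.set(link, { first: doc.title, count: 1 });
    }
  }

  // $match + second $group: keep shared links only, dedupe the titles.
  const titles = new Set();
  for (const { first, count } of byLink.values()) {
    if (count > 1) titles.add(first);
  }

  // $lookup + $replaceRoot: return the original documents for those titles.
  return collection.filter((doc) => titles.has(doc.title));
}

console.log(earliestOfDuplicates(docs).map((d) => d.title));
```

With this sample data the result contains the documents titled A and D: A is the earliest document for both x.com and y.com (which is why the dedupe step is needed), and D is the earliest for w.com. E is dropped because z.com appears only once.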
