MongoDB aggregation pipeline: how to limit a $group $push
I am not able to limit the number of elements pushed in a $group stage of the aggregation pipeline. Is this possible? Small example:
Data:
[
{
"submitted": date,
"loc": { "lng": 13.739251, "lat": 51.049893 },
"name": "first",
"preview": "my first"
},
{
"submitted": date,
"loc": { "lng": 13.639241, "lat": 51.149883 },
"name": "second",
"preview": "my second"
},
{
"submitted": date,
"loc": { "lng": 13.715422, "lat": 51.056384 },
"name": "nearpoint2",
"preview": "my nearpoint2"
}
]
Here is my aggregation pipeline:
var pipeline = [
//I want to limit the data to a certain area
{ $match: {
loc: {
$geoWithin: {
$box: [
[locBottomLeft.lng, locBottomLeft.lat],
[locUpperRight.lng, locUpperRight.lat]
]
}
}
}},
// I just want to get the latest entries
{ $sort: { submitted: -1 } },
// I group by name
{
$group: {
_id: "$name", // get name
submitted: { $max: "$submitted" }, // get the latest date
locs: { $push: "$loc" }, // push every loc into an array THIS SHOULD BE LIMITED TO AN AMOUNT 5 or 10
preview: { $first: "$preview" }
}
},
// Limit the query to at most 10 entries.
{ $limit: 10 }
];
How can I limit the locs array to 10 or any other size? I tried something with $each and $slice but that does not seem to work.
Solution 1:
Suppose the bottom-left and upper-right coordinates are [0, 0] and [100, 100] respectively. From MongoDB 3.2 you can use the $slice operator to return a subset of an array, which is what you want.
db.collection.aggregate([
{ "$match": {
"loc": {
"$geoWithin": {
"$box": [
[0, 0],
[100, 100]
]
}
}
}},
// Sort newest first (as in the question) so that "$first" picks the latest preview
{ "$sort": { "submitted": -1 } },
{ "$group": {
"_id": "$name",
"submitted": { "$max": "$submitted" },
"preview": { "$first": "$preview" },
"locs": { "$push": "$loc" }
}},
{ "$project": {
"locs": { "$slice": [ "$locs", 5 ] },
"preview": 1,
"submitted": 1
}},
{ "$limit": 10 }
])
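As a rough sketch of how the pipeline above could be parameterised with the question's own bounding box, the helper below assembles the same stages as a plain array. The function name buildSlicePipeline and the maxLocs parameter are hypothetical, introduced only for illustration. The $sort stage is taken from the question so that $first sees the newest document first. It assumes MongoDB 3.2 or later for the $slice aggregation operator.

```javascript
// Hypothetical helper (not part of the original answer): builds the
// $match / $sort / $group / $project / $limit stages for a given box.
function buildSlicePipeline(bottomLeft, upperRight, maxLocs) {
  return [
    // Restrict documents to the bounding box
    { $match: { loc: { $geoWithin: { $box: [
      [bottomLeft.lng, bottomLeft.lat],
      [upperRight.lng, upperRight.lat]
    ] } } } },
    // Newest first, as in the question's pipeline
    { $sort: { submitted: -1 } },
    // Collect every loc per name; the array is trimmed in the next stage
    { $group: {
      _id: "$name",
      submitted: { $max: "$submitted" },
      preview: { $first: "$preview" },
      locs: { $push: "$loc" }
    } },
    // Keep only the first maxLocs elements of locs
    { $project: {
      submitted: 1,
      preview: 1,
      locs: { $slice: ["$locs", maxLocs] }
    } },
    { $limit: 10 }
  ];
}

const slicePipeline = buildSlicePipeline({ lng: 0, lat: 0 }, { lng: 100, lat: 100 }, 5);
// In the mongo shell: db.collection.aggregate(slicePipeline)
```

Note that $push still builds the full array for each group before $slice trims it, so this limits the output size rather than the work done inside $group.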
Solution 2:
Starting in Mongo 5.2, it's a perfect use case for the new $topN aggregation accumulator:
// { submitted: ISODate("2021-12-05"), group: "group1", value: "plop" }
// { submitted: ISODate("2021-12-07"), group: "group2", value: "smthg" }
// { submitted: ISODate("2021-12-06"), group: "group1", value: "world" }
// { submitted: ISODate("2021-12-12"), group: "group1", value: "hello" }
db.collection.aggregate([
{ $group: {
_id: "$group",
top: { $topN: { n: 2, sortBy: { submitted: -1 }, output: "$value" } }
}}
])
// { _id: "group1", top: [ "hello", "world" ] }
// { _id: "group2", top: [ "smthg" ] }
This applies a $topN group accumulation that:
- takes for each group the top 2 (n: 2) elements
- top 2, as defined by sortBy: { submitted: -1 } (reverse chronological)
- and for each grouped record extracts the field value (output: "$value")
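Applied to the documents from the question, the same idea might look like the sketch below. The maxLocs constant and the variable name groupStage are assumptions made for illustration. The sketch also uses $top, the single-result sibling of $topN that was likewise introduced in MongoDB 5.2, in place of $first, so that no preceding $sort stage is needed.

```javascript
// Assumed cap on the number of locs kept per name (5 or 10 in the question)
const maxLocs = 10;

// $group stage for the question's schema (name, submitted, loc, preview)
const groupStage = {
  $group: {
    _id: "$name",
    // Latest submission date in the group
    submitted: { $max: "$submitted" },
    // Preview of the most recently submitted document
    preview: { $top: { sortBy: { submitted: -1 }, output: "$preview" } },
    // At most maxLocs locations, newest first
    locs: { $topN: { n: maxLocs, sortBy: { submitted: -1 }, output: "$loc" } }
  }
};

// In the mongo shell, after the $match stage from the question:
// db.collection.aggregate([ matchStage, groupStage, { $limit: 10 } ])
```

Here matchStage stands for the $geoWithin filter from the question. Unlike $push followed by $slice, the accumulator keeps at most n elements per group while grouping.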