MongoDB - get documents with max attribute per group in a collection
My data looks like this:
session, age, firstName, lastName
1, 28, John, Doe
1, 21, Donna, Keren
2, 32, Jenna, Haze
2, 52, Tommy, Lee
..
..
I'd like to get all the rows which are the largest (by age) per session. So So for the above input my output would look like:
sessionid, age, firstName, lastName
1, 28, John, Doe
2, 52, Tommy, Lee
because John has the largest age in the session = 1 group and Tommy has the largest age on the session=2 group.
I need to export the result to a file (csv) and it may contain lots of records.
How can I achieve this?
MongoDB aggregation offers the $max
operator, but in your case you want the "whole" record as it is. So the appropriate thing to do here is $sort
and then use the $first
operator within a $group
statement:
db.collection.aggregate([
{ "$sort": { "session": 1, "age": -1 } },
{ "$group": {
"_id": "$session",
"age": { "$first": "$age" },
"firstName": { "$first" "$firstName" },
"lastName": { "$first": "$lastName" }
}}
])
So the "sorting" gets the order right, and the "grouping" picks the first occurrence within the "grouping" key where those fields exist.
Mostly $first
here because the $sort
is done in reverse order. You can also use $last
when in an ascending order as well.