MongoDB: Get sample of nested array using Aggregate

Solution 1:

Change the last $addFields stage to this.

Pros: It "works."

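Cons: The picks are not guaranteed to be unique, because each index is drawn independently and the same item can be chosen more than once. Guaranteeing uniqueness is a lot more work. If the friends array is MUCH larger than the number of picks (the size of the $range), duplicates are unlikely. Also note that $rand requires MongoDB 4.4.2 or later.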

    ,{$addFields: {friends: {$reduce: { // overwrite friends array...
        // The size of $range is the number of things you want to pick:
        input: {$range:[0,4]},
        initialValue: [],
        in: {
            $let: {
                // qq will be a random # between 0 and size-1 thanks to mult
                // and floor, so we do not have to do qq-1 to get to zero-based
                // indexing on the $friends array
                vars: {qq: {$floor:{$multiply:[{$rand: {}},{$size:"$friends"}]}} },

                // $concat only works for strings, but $concatArrays can be used
                // (creatively) on other types. Here $slice returns an array of
                // 1 item which we easily pass to $concatArrays to build
                // the overall result:
                in: {$concatArrays: [ "$$value", {$slice:["$friends","$$qq",1]} ]}
            }}
    }}}}
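To put a rough number on the duplicate risk in the Cons above, here is a minimal sketch in plain JavaScript. It assumes each draw is uniform and independent, which is how the `$rand` expression is used here; `probAllUnique` is a hypothetical helper name, not part of MongoDB.

```javascript
// Probability that k independent, uniform draws (with replacement)
// from an array of n items are all distinct.
// Hypothetical helper for illustration only.
function probAllUnique(n, k) {
    if (k > n) return 0; // more picks than items: a repeat is certain
    let p = 1;
    for (let i = 0; i < k; i++) {
        // The (i+1)-th draw must avoid the i items already drawn.
        p *= (n - i) / n;
    }
    return p;
}

console.log(probAllUnique(10, 4));   // about 0.504
console.log(probAllUnique(1000, 4)); // about 0.994
```

So with 4 picks from only 10 friends, roughly half of the documents would contain a repeat, while with 1000 friends repeats become rare.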

UPDATED

This version keeps state in the $reduce chain and will not pick duplicates. It does so by shrinking the candidate list each time an item is randomly chosen. The friends array must contain more items than the number of picks: the 3-argument form of $slice rejects a count of 0, which is what the shrink step would ask for once only one candidate is left. The output is a little nested (i.e. friends is set not to the random sample itself but to an object containing picks and the remaining aa list), but that is easily reformatted after the fact. In MongoDB 5.0 we could finish it off with:

    {$addFields: {friends: {$getField: {field: "picks", input: {$reduce: {

but many people are not yet on 5.0.
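On servers older than 5.0, one way to do that reformatting is an extra stage appended after the `$reduce` stage shown next. This is only a sketch: the variable name `unwrapPicks` is hypothetical, and it assumes `friends` holds the `{aa, picks}` object produced by that stage.

```javascript
// Append this stage after the $reduce stage to replace the
// {aa, picks} object with just the picks array.
const unwrapPicks = {$addFields: {friends: "$friends.picks"}};
```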

    {$addFields: {friends: {$reduce: {
        // $range is the number of things you want to pick:
        input: {$range:[0,6]},

        // This is classic use of $reduce to iterate over something AND
        // preserve state.  We start with picks as empty and aa being the
        // original friends array:
        initialValue: {aa: "$friends", picks: []},

        in: {
            $let: {
                // idx will be a random # between 0 and size-1 thanks to mult
                // and floor, so we do not have to do idx-1 to get to zero-based
                // indexing on the $friends array.  idx and sz will be eval'd
                // each time reduce turns the crank through the input range:
                vars: {idx: {$floor:{$multiply:[{$rand: {}},{$size:"$$value.aa"}]}},
                       // cannot set sz and then use it in same vars; oh well
                       sz: {$size:"$$value.aa"}
                      },

                in: {
                    // Add to our picks list:
                    picks: {$concatArrays: [ "$$value.picks", {$slice:["$$value.aa","$$idx",1]} ]},

                    // And now shrink the input candidate array.
                    // Sadly, we cannot do $slice:[array,pos,0] to yield an empty
                    // array and keep the $concatArrays logic tight; thus we have
                    // to test for front and end special conditions.
                    // This whole bit is to extract the chosen item from the aa
                    // array by splicing together a new one MINUS the target.
                    // This will change the value of $$sz (-1) as we crank thru
                    // the picks.  This ensures we only pick UNPICKED items from
                    // $$value.aa!
                    
                    aa: {$cond: [{$eq:["$$idx",0]}, // if

                         // idx 0: Take from idx 1 and count size - 1:
                         {$slice:["$$value.aa",1,{$subtract:["$$sz",1]}]}, // then

                         // idx last: Take from idx 0 and ALSO count size - 1:
                         {$cond: [ // else
                             {$eq:["$$idx",{$subtract:["$$sz",1]}]}, // if
                             {$slice:["$$value.aa",0,{$subtract:["$$sz",1]}]}, // then

                             // else not 0 or last item, e.g. idx = 3
                             {$concatArrays: [
                                 // Start at 0, count idx; this will land
                                 // us BEFORE the target item (because idx
                                 // is zero-based):
                                 {$slice:["$$value.aa",0,"$$idx"]},

                                 // Jump over the target (+1), and take up to
                                 // sz-2 items; $slice stops at the end of the
                                 // array, so this grabs everything after the target:
                                 {$slice:["$$value.aa",{$add:["$$idx",1]},{$subtract:["$$sz",2]}]}
                             ]}
                         ]}
                    ]}
                }
            }}
        }}
    }}

    ]);
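The state-keeping idea in the stage above may be easier to follow outside the pipeline. The following plain-JavaScript mirror is a sketch, not MongoDB code: `sampleWithoutReplacement` and its `rand` parameter are hypothetical names, the accumulator plays the role of `$$value`, and an empty-array guard is added that the pipeline version does not have.

```javascript
// Plain-JavaScript mirror of the $reduce stage (illustration only).
// rand is injectable so the behavior can be checked deterministically.
function sampleWithoutReplacement(friends, npicks, rand = Math.random) {
    // Equivalent of {$range: [0, npicks]}: only its length matters.
    const range = Array.from({length: npicks}, (_, i) => i);

    return range.reduce((value) => {
        const sz = value.aa.length;
        if (sz === 0) return value;          // guard: nothing left to pick
        const idx = Math.floor(rand() * sz); // 0 .. sz-1
        return {
            // Add the chosen item to the picks list.
            picks: value.picks.concat(value.aa.slice(idx, idx + 1)),
            // Rebuild the candidates MINUS the chosen item.
            aa: value.aa.slice(0, idx).concat(value.aa.slice(idx + 1)),
        };
    }, {aa: friends, picks: []});
}

const result = sampleWithoutReplacement(["a", "b", "c", "d", "e", "f", "g", "h"], 6);
console.log(result.picks); // six distinct letters
console.log(result.aa);    // the two letters that were not chosen
```

JavaScript's `slice` happily returns an empty array, so the front and end special cases needed with `$slice` disappear here; the shape of the accumulator is otherwise the same.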

Starting in MongoDB 4.4 (released July 2020), you may opt to use the $function operator. The JavaScript splice method does all the work of the multiple $slice operations in the previous example.

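    {$addFields: {friends: {$function: {
        body: function(candidates, npicks) {
            var picks = [];
            // Stop early if the candidates run out before npicks is reached:
            for(var i = 0; i < npicks && candidates.length > 0; i++) {
                var idx = Math.floor(Math.random() * candidates.length);
                // splice removes 1 item at idx and returns it in an array:
                picks.push(candidates.splice(idx,1)[0]);
            }
            return picks;
        },
        args: [ "$friends", 4], // 4 is num to pick
        lang: "js"
    }}}}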
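For readers less familiar with `splice`, this small sketch shows the two effects the function body relies on: it returns the removed items as an array and shrinks the original array in place. The sample names are made up for illustration.

```javascript
// splice(start, deleteCount) removes items in place and returns them.
const candidates = ["ann", "bob", "cy", "dee"];
const removed = candidates.splice(2, 1);

console.log(removed);    // the removed item, wrapped in an array
console.log(candidates); // the original array, now one item shorter
```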