Generating a Structure for Aggregation
So here's a question. What I want to do is generate a data structure given a set of input values.
Since this is a multi-language question, let's consider the input list to be an array of key/value pairs, that is, an array of Hash, Map, Dictionary or whatever term floats your boat. I'll keep all the notation here as JSON, hoping that's universal enough to translate / decode.
So for input, let's say we have this:
[ { "4": 10 }, { "7": 9 }, { "90": 7 }, { "1": 8 } ]
Maybe a little redundant, but let's stick with that.
So from that input, I want to get to this structure. I'm giving a whole structure, but the important part is what gets returned for the value under "weight":
[
  { "$project": {
    "user_id": 1,
    "content": 1,
    "date": 1,
    "weight": { "$cond": [
      { "$eq": ["$user_id", 4] },
      10,
      { "$cond": [
        { "$eq": ["$user_id", 7] },
        9,
        { "$cond": [
          { "$eq": ["$user_id", 90] },
          7,
          { "$cond": [
            { "$eq": ["$user_id", 1] },
            8,
            0
          ]}
        ]}
      ]}
    ]}
  }}
]
So the solution I'm looking for populates the structure content for "weight" as shown in the structure by using the input as shown.
Yes, the values that look like numbers in the structure must be numbers and not strings, so whatever the language implementation, the JSON-encoded version must look exactly the same.
Alternatively, give me a better approach to get to the same result of assigning the weight values based on the matching user_id.
Does anyone have an approach to this?
Would be happy with any language implementation as I think it is fair to just see how the structure can be created.
I'll try to add one myself, but kudos go to the good implementations.
Happy coding.
When I had a moment to think about this, I ran back home to Perl and worked this out:
use Modern::Perl;
use Moose::Autobox;
use JSON;

my $encoder = JSON->new->pretty;
my $input = [ { 4 => 10 }, { 7 => 9 }, { 90 => 7 }, { 1 => 8 } ];
my $stack = [];

# Walk the input back to front so the first pair ends up outermost
foreach my $item ( reverse @{$input} ) {
    while ( my ( $key, $value ) = each %{$item} ) {
        # First two of the three "$cond" arguments: the test and the weight
        my $rec = {
            '$cond' => [
                { '$eq' => [ '$user_id', int($key) ] },
                $value
            ]
        };
        if ( $stack->length == 0 ) {
            # Innermost level: fall through to the default weight of 0
            $rec->{'$cond'}->push( 0 );
        } else {
            # Otherwise nest the previously built document as the "else" branch
            my $last = $stack->pop;
            $rec->{'$cond'}->push( $last );
        }
        $stack->push( $rec );
    }
}

say $encoder->encode( $stack->[0] );
So the process was blindingly simple.
1. Go through each item in the array and get the key and value for the entry.
2. Create a new "document" whose array argument to the "$cond" key holds just two of the required three entries: the condition that tests "$user_id" and the "weight" value to return.
3. Test the length of the outer stack variable. If it is empty (first time through), push the value 0, as seen in the innermost nested element, onto the end of the "$cond" array in the document. If something is already there (length > 0), pop that value and push it as the third entry of the "$cond" array instead.
4. Put that document back as the value of stack and repeat for the next item.
So there are a few things in the listing such as reversing the order of the input, which isn't required but produces a natural order in the nested output. Also, my choice for that outside "stack" was an array because the test operators seemed simple. But it really is just a singular value that keeps getting re-used, augmented and replaced.
Also the JSON printing is just there to show the output. All that is really wanted is the resulting value of stack to be merged into the structure.
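Then I converted the logic to Ruby, since that was the language used by the OP whose question inspired this way of generating the nested structure:
require 'json'

input = [ { 4 => 10 }, { 7 => 9 }, { 90 => 7 }, { 1 => 8 } ]
stack = []

# Walk the input back to front so the first pair ends up outermost
input.reverse_each do |item|
  item.each do |key, value|
    # First two of the three "$cond" arguments: the test and the weight
    rec = {
      '$cond' => [
        { '$eq' => [ '$user_id', key ] },
        value
      ]
    }
    if stack.length == 0
      # Innermost level: fall through to the default weight of 0
      rec['$cond'].push( 0 )
    else
      # Otherwise nest the previously built document as the "else" branch
      last = stack.pop
      rec['$cond'].push( last )
    end
    stack.push( rec )
  end
end

puts JSON.pretty_generate(stack[0])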
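Since the "stack" only ever holds a single value, the same nesting can also be written as a fold. This is just a minimal sketch of that idea, not a replacement for the listing above: the method name nested_cond and its default argument are made up for illustration, and Integer(key) is there on the assumption that keys might arrive as strings (for example after parsing the JSON input) and must come out as numbers.

```ruby
require 'json'

# Hypothetical helper (name is an assumption): folds the key/value pairs
# from last to first, so the first input pair becomes the outermost "$cond".
def nested_cond(pairs, default = 0)
  pairs.flat_map(&:to_a).reverse.inject(default) do |inner, (key, value)|
    # Integer() turns "4" into 4 and leaves 4 unchanged
    { '$cond' => [ { '$eq' => [ '$user_id', Integer(key) ] }, value, inner ] }
  end
end

input = [ { 4 => 10 }, { 7 => 9 }, { 90 => 7 }, { 1 => 8 } ]
weight = nested_cond(input)
puts JSON.pretty_generate(weight)
```

The accumulator starts as the default weight 0 and is wrapped once per pair, which is what the pop and push on the stack were doing.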
And then eventually into the final form to generate the pipeline that the OP wanted:
require 'json'

userWeights = [ { 4 => 10 }, { 7 => 9 }, { 90 => 7 }, { 1 => 8 } ]
stack = []

# Build the nested "$cond" expression, innermost level first
userWeights.reverse_each do |item|
  item.each do |key, value|
    rec = {
      '$cond' => [
        { '$eq' => [ '$user_id', key ] },
        value
      ]
    }
    if stack.length == 0
      rec['$cond'].push( 0 )
    else
      last = stack.pop
      rec['$cond'].push( last )
    end
    stack.push( rec )
  end
end

# Project the computed weight, then sort by weight and date, both descending
pipeline = [
  { '$project' => {
    'user_id' => 1,
    'content' => 1,
    'date' => 1,
    'weight' => stack[0]
  }},
  { '$sort' => { 'weight' => -1, 'date' => -1 } }
]

puts JSON.pretty_generate( pipeline )
So that was a way to generate a structure to be passed into aggregate in order to apply "weights" that are specific to a user_id and sort the results in the collection.
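To sanity-check what the generated expression does without a server, here is a rough stand-in evaluator. It is purely an illustration: eval_weight is a made-up name, it only understands the "$cond" / "$eq" shapes generated above, and it is not how MongoDB itself evaluates the pipeline. The sample documents are invented.

```ruby
# Toy evaluator (an assumption for illustration): handles only nested
# "$cond" expressions whose test is an "$eq" on a "$field" reference.
def eval_weight(expr, doc)
  return expr unless expr.is_a?(Hash)        # a literal, e.g. the final 0
  condition, if_true, if_false = expr['$cond']
  field, expected = condition['$eq']
  actual = doc[field.sub(/\A\$/, '')]        # '$user_id' -> 'user_id'
  actual == expected ? if_true : eval_weight(if_false, doc)
end

weight_expr = { '$cond' => [ { '$eq' => ['$user_id', 4] }, 10,
                { '$cond' => [ { '$eq' => ['$user_id', 7] }, 9, 0 ] } ] }

# Invented sample documents
docs = [ { 'user_id' => 7 }, { 'user_id' => 4 }, { 'user_id' => 55 } ]

weights = docs.map { |doc| eval_weight(weight_expr, doc) }
sorted  = docs.sort_by { |doc| -eval_weight(weight_expr, doc) }

p weights
p sorted.map { |doc| doc['user_id'] }
```

A user_id that matches none of the tests falls through to the innermost 0, so unlisted users sort after every weighted one.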
First, thank you Neil for your help with this; it worked out great for me and it's really fast. For those who use Mongoid, this is what I used to create the weight parameter, where recommended_user_ids is an array:
def self.project_recommended_weight(recommended_user_ids)
  return {} unless recommended_user_ids.present?
  { :weight => create_weight_statement(recommended_user_ids.reverse) }
end

# Recursively nests one "$cond" per id; the weight is the 1-based position
def self.create_weight_statement(recommended_user_ids, index = 0)
  return 0 if index == recommended_user_ids.count
  { "$cond" => [
    { "$eq" => ["$user_id", recommended_user_ids[index]] },
    index + 1,
    create_weight_statement(recommended_user_ids, index + 1)
  ] }
end
So to add this to the pipeline simply merge the hash like this:
{"$project" => {:id => 1,:posted_at => 1}.merge(project_recommended_weight(options[:recommended_user_ids]))}
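For anyone who wants to see what the recursion produces, here is a plain-Ruby copy of the helper that runs outside Rails and Mongoid. It is only a sketch: it is written as a top-level method rather than a class method, and the ids are made up.

```ruby
require 'json'

# Plain-Ruby copy of the recursive helper, without Rails or Mongoid
def create_weight_statement(ids, index = 0)
  return 0 if index == ids.count
  { '$cond' => [
    { '$eq' => ['$user_id', ids[index]] },
    index + 1,
    create_weight_statement(ids, index + 1)
  ] }
end

# Made-up ids, most recommended first
recommended_user_ids = [4, 7, 90]
statement = create_weight_statement(recommended_user_ids.reverse)
puts JSON.pretty_generate(statement)
```

Because the list is reversed before the call, the last id gets weight 1 and the first id gets the highest weight, so with a descending sort the first recommended user comes out on top.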