Mongo how to $lookup with DBRef
I have a trouble(/(ㄒoㄒ)/~~). Suppose that collection A is
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"bid" : [
DBRef("B", ObjectId("582abcd85d2dfa67f44127e0")),
DBRef("B", ObjectId("582abcd85d2dfa67f44127e1"))
]
}
and Collection B:
{
"_id" : ObjectId("582abcd85d2dfa67f44127e0"),
"status" : NumberInt(1),
"seq" : NumberInt(0)
},
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"status" : NumberInt(1),
"seq" : NumberInt(0)
}
I don't know how to $lookup the 'bid'. I tried
db.A.aggregate(
[
{$unwind: {path: "$bid"}},
{$lookup: {from: "B", localField: "bid", foreignField: "_id", as: "bs"}},
]
)
and
db.A.aggregate(
[
{$unwind: {path: "$bid"}},
{$lookup: {from: "B", localField: "bid.$id", foreignField: "_id", as: "bs"}},
]
)
but it doesn't work. Anybody can help? Thanks.
Solution 1:
Actually, the other answer is wrong. It is possible to do a lookup on a DBref field within your aggregator, and you don't need mapreduce for that.
Solution
db.A.aggregate([
{
$project: {
B_fk: {
$map: {
input: {
$map: {
input:"$bid",
in: {
$arrayElemAt: [{$objectToArray: "$$this"}, 1]
},
}
},
in: "$$this.v"}},
}
},
{
$lookup: {
from:"B",
localField:"B_fk",
foreignField:"_id",
as:"B"
}
}
])
result
{
"_id" : ObjectId("59bb79df1e9c00162566f581"),
"B_fk" : null,
"B" : [ ]
},
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"B_fk" : [
ObjectId("582abcd85d2dfa67f44127e0"),
ObjectId("582abcd85d2dfa67f44127e1")
],
"B" : [
{
"_id" : ObjectId("582abcd85d2dfa67f44127e0"),
"status" : NumberInt("1"),
"seq" : NumberInt("0")
}
]
}
Short Explanation
Loop through the DBRefs with $map, break each DBref into an array, keep only the $id field, then get rid of the k:v format with $$this.v, keeping only the ObjectId and removing all the rest. You can now lookup on the ObjectId.
Step-by-step Explanation
Within the aggregator, a DBRef BSON type can be handled like an object, with two or three fields (ref, id, and db).
If you do:
db.A.aggregate([
{
$project: {
First_DBref_as_array: {$objectToArray:{$arrayElemAt:["$bid",0]}},
Second_DBref_as_array: {$objectToArray:{$arrayElemAt:["$bid",1]}},
}
},
])
This is the result:
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"First_DBref_as_array : [
{
"k" : "$ref",
"v" : "B"
},
{
"k" : "$id",
"v" : ObjectId("582abcd85d2dfa67f44127e0")
}
],
"Second_DBref_as_array" : [
{
"k" : "$ref",
"v" : "B"
},
{
"k" : "$id",
"v" : ObjectId("582abcd85d2dfa67f44127e0")
}
]
}
Once you have transformed a dbref into an array, you can get rid of the useless fields by querying only the value at index 1, like this:
db.A.aggregate([
{
$project: {
First_DBref_as_array: {$arrayElemAt: [{$objectToArray:{$arrayElemAt:["$bid",0]}},1]},
Second_DBref_as_array: {$arrayElemAt: [{$objectToArray:{$arrayElemAt:["$bid",0]}},1]},
}
},
])
result:
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"First_DBref_as_array" : {
"k" : "$id",
"v" : ObjectId("582abcd85d2dfa67f44127e0")
},
"Second_DBref_as_array" : {
"k" : "$id",
"v" : ObjectId("582abcd85d2dfa67f44127e0")
}
}
Then you can get finally to the value you want by pointing to "$myvalue.v", just like this
db.A.aggregate([
{
$project: {
first_DBref_as_array: {$arrayElemAt: [{$objectToArray:{$arrayElemAt:["$bid",0]}},1]},
second_DBref_as_array: {$arrayElemAt: [{$objectToArray:{$arrayElemAt:["$bid",0]}},1]},
}
},
{
$project: {
first_DBref_as_ObjectId: "$first_DBref_as_array.v",
second_DBref_as_ObjectId: "$second_DBref_as_array.v"
}
}
])
result:
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"first_DBref_as_ObjectId" : ObjectId("582abcd85d2dfa67f44127e0"),
"second_DBref_as_ObjectId" : ObjectId("582abcd85d2dfa67f44127e0")
}
Obviously, in a normal pipeline, you don't need all these redundant steps, using a nested $map, you can get to the same result in one go :
db.A.aggregate([
{
$project: {
B_fk: { $map : {input: { $map: { input:"$bid",
in: { $arrayElemAt: [{$objectToArray: "$$this"}, 1 ]}, } },
in: "$$this.v"}},
}
},
])
result:
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"B_fk" : [
ObjectId("582abcd85d2dfa67f44127e0"),
ObjectId("582abcd85d2dfa67f44127e1")
]
}
I hope the explanation is clear enough, if not feel free to ask.
Solution 2:
As of mongoDB 3.4, this is not possible. You can't use DBRef in the aggregation pipeline, except in the $match stage.
I strongly recommend you to get rid of DBRef and switch to manual references. However, if you really need to keep DBRef, here is an (ugly) solution:
first, create a new collection named "C", where DBRefs are replaced by their Ids using mapReduce:
db.A.mapReduce(
function() {
var key = this._id;
var value = [];
for ( var index = 0; index < this.bid.length; index++){
value.push(this.bid[index].$id);
}
emit(key, value);
},
function(key,values) {
return values;
},
{
"query": {},
"out": "C"
}
)
then, run your aggregation query on the new "C" collection:
db.C.aggregate([
{
$unwind:"$value"
},
{
$lookup:{
from:"B",
localField:"value",
foreignField:"_id",
as:"bs"
}
}
]);
output:
{
"_id":ObjectId("582abcd85d2dfa67f44127e1"),
"value":ObjectId("582abcd85d2dfa67f44127e0"),
"bs":[
{
"_id":ObjectId("582abcd85d2dfa67f44127e0"),
"status":1,
"seq":0
}
]
}{
"_id":ObjectId("582abcd85d2dfa67f44127e1"),
"value":ObjectId("582abcd85d2dfa67f44127e1"),
"bs":[
{
"_id":ObjectId("582abcd85d2dfa67f44127e1"),
"status":1,
"seq":0
}
]
}