Aggregate $lookup with C#
I have the following MongoDB query working:
db.Entity.aggregate(
    [
        {
            "$match": { "Id": "12345" }
        },
        {
            "$lookup": {
                "from": "OtherCollection",
                "localField": "otherCollectionId",
                "foreignField": "Id",
                "as": "ent"
            }
        },
        {
            "$project": {
                "Name": 1,
                "Date": 1,
                "OtherObject": { "$arrayElemAt": [ "$ent", 0 ] }
            }
        },
        {
            "$sort": {
                "OtherObject.Profile.Name": 1
            }
        }
    ]
)
This retrieves a list of objects joined with a matching object from another collection.
Does anybody know how I can use this in C# using either LINQ or by using this exact string?
I tried using the following code, but it can't seem to find the types for QueryDocument and MongoCursor. I think they've been deprecated?
BsonDocument document = MongoDB.Bson.Serialization.BsonSerializer.Deserialize<BsonDocument>("{ name : value }");
QueryDocument queryDoc = new QueryDocument(document);
MongoCursor toReturn = _connectionCollection.Find(queryDoc);
Solution 1:
There is no need to parse the JSON. Everything here can actually be done directly with either LINQ or the Aggregate Fluent interfaces.
I am just using some demonstration classes here, because the question does not really give much to go on.
Setup
Basically we have two collections here, being entities:
{ "_id" : ObjectId("5b08ceb40a8a7614c70a5710"), "name" : "A" }
{ "_id" : ObjectId("5b08ceb40a8a7614c70a5711"), "name" : "B" }
and others:
{
    "_id" : ObjectId("5b08cef10a8a7614c70a5712"),
    "entity" : ObjectId("5b08ceb40a8a7614c70a5710"),
    "name" : "Sub-A"
}
{
    "_id" : ObjectId("5b08cefd0a8a7614c70a5713"),
    "entity" : ObjectId("5b08ceb40a8a7614c70a5711"),
    "name" : "Sub-B"
}
And a couple of classes to bind them to, just as very basic examples:
public class Entity
{
    public ObjectId id;
    public string name { get; set; }
}

public class Other
{
    public ObjectId id;
    public ObjectId entity { get; set; }
    public string name { get; set; }
}

public class EntityWithOthers
{
    public ObjectId id;
    public string name { get; set; }
    public IEnumerable<Other> others;
}

public class EntityWithOther
{
    public ObjectId id;
    public string name { get; set; }
    public Other others;
}
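The queries that follow refer to entities and others as collection handles without showing where they come from. A minimal sketch of obtaining them is below. The connection string, the database name "test" and the DemoCollections helper are assumptions made for illustration; only the collection names are taken from the setup above. Entity and Other are repeated in compact form purely so the sketch compiles on its own.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

// Same shapes as the demonstration classes above, repeated so the sketch is self-contained.
public class Entity
{
    public ObjectId id;
    public string name { get; set; }
}

public class Other
{
    public ObjectId id;
    public ObjectId entity { get; set; }
    public string name { get; set; }
}

// Hypothetical helper, not part of the driver.
public static class DemoCollections
{
    // Assumed defaults: a local server and a database called "test".
    public static (IMongoCollection<Entity> entities, IMongoCollection<Other> others) Open(
        string connectionString = "mongodb://localhost:27017",
        string databaseName = "test")
    {
        var client = new MongoClient(connectionString);
        var database = client.GetDatabase(databaseName);

        // Typed handles; no query is sent to the server until one is executed.
        return (database.GetCollection<Entity>("entities"),
                database.GetCollection<Other>("others"));
    }
}
```

With that in place, var (entities, others) = DemoCollections.Open(); gives the two variables used in the queries.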
Queries
Fluent Interface
var listNames = new[] { "A", "B" };

var query = entities.Aggregate()
    .Match(p => listNames.Contains(p.name))
    .Lookup(
        foreignCollection: others,
        localField: e => e.id,
        foreignField: f => f.entity,
        @as: (EntityWithOthers eo) => eo.others
    )
    .Project(p => new { p.id, p.name, other = p.others.First() })
    .Sort(new BsonDocument("other.name", -1))
    .ToList();
Request sent to server:
[
    { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
    { "$lookup" : {
        "from" : "others",
        "localField" : "_id",
        "foreignField" : "entity",
        "as" : "others"
    } },
    { "$project" : {
        "id" : "$_id",
        "name" : "$name",
        "other" : { "$arrayElemAt" : [ "$others", 0 ] },
        "_id" : 0
    } },
    { "$sort" : { "other.name" : -1 } }
]
This is probably the easiest to understand, since the fluent interface is basically the same as the general BSON structure. The $lookup stage has all the same arguments, and the $arrayElemAt is represented with First(). For the $sort you can simply supply a BSON document or other valid expression.
An alternative is the newer expressive form of $lookup with a sub-pipeline statement, available in MongoDB 3.6 and above.
BsonArray subpipeline = new BsonArray();

subpipeline.Add(
    new BsonDocument("$match", new BsonDocument(
        "$expr", new BsonDocument(
            "$eq", new BsonArray { "$$entity", "$entity" }
        )
    ))
);

var lookup = new BsonDocument("$lookup",
    new BsonDocument("from", "others")
        .Add("let", new BsonDocument("entity", "$_id"))
        .Add("pipeline", subpipeline)
        .Add("as", "others")
);

var query = entities.Aggregate()
    .Match(p => listNames.Contains(p.name))
    .AppendStage<EntityWithOthers>(lookup)
    .Unwind<EntityWithOthers, EntityWithOther>(p => p.others)
    .SortByDescending(p => p.others.name)
    .ToList();
Request sent to server:
[
    { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
    { "$lookup" : {
        "from" : "others",
        "let" : { "entity" : "$_id" },
        "pipeline" : [
            { "$match" : { "$expr" : { "$eq" : [ "$$entity", "$entity" ] } } }
        ],
        "as" : "others"
    } },
    { "$unwind" : "$others" },
    { "$sort" : { "others.name" : -1 } }
]
The Fluent "Builder" does not support the syntax directly yet, nor do LINQ Expressions support the $expr operator. However, you can still construct the stage using BsonDocument and BsonArray or other valid expressions. Here we also "type" the $unwind result in order to apply a $sort using an expression rather than a BsonDocument as shown earlier.
Aside from other uses, a primary task of a "sub-pipeline" is to reduce the documents returned in the target array of $lookup. Also, the $unwind here serves the purpose of actually being "merged" into the $lookup statement on server execution, so this is typically more efficient than just grabbing the first element of the resulting array.
Queryable GroupJoin
var query = entities.AsQueryable()
    .Where(p => listNames.Contains(p.name))
    .GroupJoin(
        others.AsQueryable(),
        p => p.id,
        o => o.entity,
        (p, o) => new { p.id, p.name, other = o.First() }
    )
    .OrderByDescending(p => p.other.name);
Request sent to server:
[
    { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
    { "$lookup" : {
        "from" : "others",
        "localField" : "_id",
        "foreignField" : "entity",
        "as" : "o"
    } },
    { "$project" : {
        "id" : "$_id",
        "name" : "$name",
        "other" : { "$arrayElemAt" : [ "$o", 0 ] },
        "_id" : 0
    } },
    { "$sort" : { "other.name" : -1 } }
]
This is almost identical, just using a different interface, and it produces a slightly different BSON statement, really only because of the simplified naming in the functional statements. This does bring up the other possibility of simply using an $unwind as produced from a SelectMany():
var query = entities.AsQueryable()
    .Where(p => listNames.Contains(p.name))
    .GroupJoin(
        others.AsQueryable(),
        p => p.id,
        o => o.entity,
        (p, o) => new { p.id, p.name, other = o }
    )
    .SelectMany(p => p.other, (p, other) => new { p.id, p.name, other })
    .OrderByDescending(p => p.other.name);
Request sent to server:
[
    { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
    { "$lookup" : {
        "from" : "others",
        "localField" : "_id",
        "foreignField" : "entity",
        "as" : "o"
    } },
    { "$project" : {
        "id" : "$_id",
        "name" : "$name",
        "other" : "$o",
        "_id" : 0
    } },
    { "$unwind" : "$other" },
    { "$project" : {
        "id" : "$id",
        "name" : "$name",
        "other" : "$other",
        "_id" : 0
    } },
    { "$sort" : { "other.name" : -1 } }
]
Normally, placing an $unwind directly following $lookup is actually an "optimized pattern" for the aggregation framework. However, the .NET driver does mess this up in this combination by forcing a $project in between rather than using the implied naming on the "as". If not for that, this is actually better than the $arrayElemAt when you know you have "one" related result. If you want the $unwind "coalescence", then you are better off using the fluent interface, or a different form as demonstrated later.
Queryable Natural
var query = from p in entities.AsQueryable()
            where listNames.Contains(p.name)
            join o in others.AsQueryable() on p.id equals o.entity into joined
            select new { p.id, p.name, other = joined.First() }
            into p
            orderby p.other.name descending
            select p;
Request sent to server:
[
    { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
    { "$lookup" : {
        "from" : "others",
        "localField" : "_id",
        "foreignField" : "entity",
        "as" : "joined"
    } },
    { "$project" : {
        "id" : "$_id",
        "name" : "$name",
        "other" : { "$arrayElemAt" : [ "$joined", 0 ] },
        "_id" : 0
    } },
    { "$sort" : { "other.name" : -1 } }
]
This is all pretty familiar and really just comes down to functional naming. The same goes for using the $unwind option:
var query = from p in entities.AsQueryable()
            where listNames.Contains(p.name)
            join o in others.AsQueryable() on p.id equals o.entity into joined
            from sub_o in joined.DefaultIfEmpty()
            select new { p.id, p.name, other = sub_o }
            into p
            orderby p.other.name descending
            select p;
Request sent to server:
[
    { "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
    { "$lookup" : {
        "from" : "others",
        "localField" : "_id",
        "foreignField" : "entity",
        "as" : "joined"
    } },
    { "$unwind" : {
        "path" : "$joined", "preserveNullAndEmptyArrays" : true
    } },
    { "$project" : {
        "id" : "$_id",
        "name" : "$name",
        "other" : "$joined",
        "_id" : 0
    } },
    { "$sort" : { "other.name" : -1 } }
]
This one actually is using the "optimized coalescence" form, with the $unwind directly following the $lookup. The translator still insists on adding a $project, since we need the intermediate select in order to make the statement valid.
Summary
So there are quite a few ways to arrive at what is basically the same query statement with exactly the same results. Whilst you "could" parse the JSON to BsonDocument form and feed this to the fluent Aggregate() command, it's generally better to use the natural builders or the LINQ interfaces, as they do easily map onto the same statement.
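For completeness, here is a sketch of the "exact string" route the question asks about, using the stages from the question parsed with BsonDocument.Parse() and run through the collection's Aggregate() overload that takes a pipeline definition. The RawPipeline class and its method names are hypothetical, and the collection is assumed to be an untyped IMongoCollection<BsonDocument> handle on the question's Entity collection.

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical helper, not part of the driver.
public static class RawPipeline
{
    // The four stages from the question, each parsed from its JSON text.
    // The driver's JSON parser accepts single-quoted names and strings.
    public static BsonDocument[] Stages() => new[]
    {
        BsonDocument.Parse("{ '$match': { 'Id': '12345' } }"),
        BsonDocument.Parse(@"{ '$lookup': {
            'from': 'OtherCollection',
            'localField': 'otherCollectionId',
            'foreignField': 'Id',
            'as': 'ent'
        } }"),
        BsonDocument.Parse(@"{ '$project': {
            'Name': 1,
            'Date': 1,
            'OtherObject': { '$arrayElemAt': [ '$ent', 0 ] }
        } }"),
        BsonDocument.Parse("{ '$sort': { 'OtherObject.Profile.Name': 1 } }")
    };

    // Runs the parsed stages and returns the raw documents.
    public static List<BsonDocument> Run(IMongoCollection<BsonDocument> collection)
    {
        // A BsonDocument[] converts implicitly to a PipelineDefinition.
        PipelineDefinition<BsonDocument, BsonDocument> pipeline = Stages();
        return collection.Aggregate(pipeline).ToList();
    }
}
```

This keeps the shell query intact, but the results come back as untyped BsonDocument values, which is the main trade-off against the typed builders and LINQ forms shown earlier.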
The options with $unwind are largely shown because, even with a "singular" match, that "coalescence" form is actually far more optimal than using $arrayElemAt to take the "first" array element. This becomes even more important with considerations of things like the BSON Limit, where the $lookup target array could cause the parent document to exceed 16MB without further filtering. There is another post here, "Aggregate $lookup Total size of documents in matching pipeline exceeds maximum document size", where I actually discuss how to avoid that limit being hit by using such options or other Lookup() syntax available to the fluent interface only at this time.