Fighting cartesian product (x-join) when using NHibernate 3.0.0

I'm bad at math but I kind get idea what cartesian product is.
Here is my situation (simplified):

public class Project{
 public IList<Partner> Partners{get;set;}
}
public class Partner{
 public IList<PartnerCosts> Costs{get;set;}
 public IList<Address> Addresses{get;set;}
}
public class PartnerCosts{
 public Money Total{get;set;}
}
public class Money{
 public decimal Amount{get;set;}
 public int CurrencyCode{get;set;}
}
public class Address{
 public string Street{get;set;}
}

My aim is to effectively load entire Project.

Problem of course is:

  • If I try to eager load partners and their costs, query returns gazillion rows
  • If I lazy load Partner.Costs, db gets request spammed (which is a bit faster than first approach)

As I read, common workaround is to use MultiQueries, but I kind a just don't get it.
So I'm hoping to learn through this exact example.

How to effectively load whole Project?

P.s. I'm using NHibernate 3.0.0.
Please, do not post answers with hql or string fashioned criteria api approaches.


Solution 1:

Ok, I wrote an example for myself reflecting your structure and this should work:

int projectId = 1; // replace that with the id you want
// required for the joins in QueryOver
Project pAlias = null;
Partner paAlias = null;
PartnerCosts pcAlias = null;
Address aAlias = null;
Money mAlias = null;

// Query to load the desired project and nothing else    
var projects = repo.Session.QueryOver<Project>(() => pAlias)
    .Where(p => p.Id == projectId)
    .Future<Project>();

// Query to load the Partners with the Costs (and the Money)
var partners = repo.Session.QueryOver<Partner>(() => paAlias)
    .JoinAlias(p => p.Project, () => pAlias)
    .Left.JoinAlias(() => paAlias.Costs, () => pcAlias)
    .JoinAlias(() => pcAlias.Money, () => mAlias)
    .Where(() => pAlias.Id == projectId)
    .Future<Partner>();

// Query to load the Partners with the Addresses
var partners2 = repo.Session.QueryOver<Partner>(() => paAlias)
    .JoinAlias(o => o.Project, () => pAlias)
    .Left.JoinAlias(() => paAlias.Addresses, () => aAlias)
    .Where(() => pAlias.Id == projectId)
    .Future<Partner>();

// when this is executed, the three queries are executed in one roundtrip
var list = projects.ToList();
Project project = list.FirstOrDefault();

My classes had different names but reflected the exact same structure. I replaced the names and I hope there are no typos.

Explanation:

The aliases are required for the joins. I defined three queries to load the Project you want, the Partners with their Costs and the Partners with their Addresses. By using the .Futures() I basically tell NHibernate to execute them in one roundtrip at the moment when I actually want the results, using projects.ToList().

This will result in three SQL statements that are indeed executed in one roundtrip. The three statements will return the following results: 1) 1 row with your Project 2) x rows with the Partners and their Costs (and the Money), where x is the total number of Costs for the Project's Partners 3) y rows with the Partners and their Addresses, where y is the total number of Addresses for the Project's Partners

Your db should return 1+x+y rows, instead of x*y rows, which would be a cartesian product. I do hope that your DB actually supports that feature.

Solution 2:

If you are using Linq on your NHibernate, you can simplify cartesian-prevention with this:

int projectId = 1;
var p1 = sess.Query<Project>().Where(x => x.ProjectId == projectId);


p1.FetchMany(x => x.Partners).ToFuture();

sess.Query<Partner>()
.Where(x => x.Project.ProjectId == projectId)
.FetchMany(x => x.Costs)
    .ThenFetch(x => x.Total)
.ToFuture();

sess.Query<Partner>()
.Where(x => x.Project.ProjectId == projectId)
.FetchMany(x => x.Addresses)
.ToFuture();


Project p = p1.ToFuture().Single();

Detailed explanation here: http://www.ienablemuch.com/2012/08/solving-nhibernate-thenfetchmany.html

Solution 3:

Instead of eager fetch multiple collections and get a nasty Cartesian product:

Person expectedPerson = session.Query<Person>()
    .FetchMany(p => p.Phones)
        .ThenFetch(p => p.PhoneType)
    .FetchMany(p => p.Addresses)
    .Where(x => x.Id == person.Id)
    .ToList().First();

You should batch children objects in one database call:

// create the first query
var query = session.Query<Person>()
      .Where(x => x.Id == person.Id);
// batch the collections
query
   .FetchMany(x => x.Addresses)
   .ToFuture();
query
   .FetchMany(x => x.Phones)
   .ThenFetch(p => p.PhoneType)
   .ToFuture();
// execute the queries in one roundtrip
Person expectedPerson = query.ToFuture().ToList().First();

I just wrote a blog post about it which explains how to avoid that using Linq, QueryOver or HQL http://blog.raffaeu.com/archive/2014/07/04/nhibernate-fetch-strategies/