JPA eager fetch does not join

What exactly does JPA's fetch strategy control? I can't detect any difference between eager and lazy. In both cases JPA/Hibernate does not automatically join many-to-one relationships.

Example: Person has a single address. An address can belong to many people. The JPA annotated entity classes look like:

@Entity
public class Person {
    @Id
    public Integer id;

    public String name;

    @ManyToOne(fetch=FetchType.LAZY or EAGER)
    public Address address;
}

@Entity
public class Address {
    @Id
    public Integer id;

    public String name;
}

If I use the JPA query:

select p from Person p where ...

JPA/Hibernate generates one SQL query to select from Person table, and then a distinct address query for each person:

select ... from Person where ...
select ... from Address where id=1
select ... from Address where id=2
select ... from Address where id=3

This is very bad for large result sets. If there are 1000 people it generates 1001 queries (1 from Person and 1000 distinct from Address). I know this because I'm looking at MySQL's query log. It was my understanding that setting address's fetch type to eager will cause JPA/Hibernate to automatically query with a join. However, regardless of the fetch type, it still generates distinct queries for relationships.

Only when I explicitly tell it to join does it actually join:

select p, a from Person p left join p.address a where ...

Am I missing something here? I now have to hand code every query so that it left joins the many-to-one relationships. I'm using Hibernate's JPA implementation with MySQL.

Edit: It appears (see Hibernate FAQ here and here) that FetchType does not impact JPA queries. So in my case I have explicitly tell it to join.


Solution 1:

JPA doesn't provide any specification on mapping annotations to select fetch strategy. In general, related entities can be fetched in any one of the ways given below

  • SELECT => one query for root entities + one query for related mapped entity/collection of each root entity = (n+1) queries
  • SUBSELECT => one query for root entities + second query for related mapped entity/collection of all root entities retrieved in first query = 2 queries
  • JOIN => one query to fetch both root entities and all of their mapped entity/collection = 1 query

So SELECT and JOIN are two extremes and SUBSELECT falls in between. One can choose suitable strategy based on her/his domain model.

By default SELECT is used by both JPA/EclipseLink and Hibernate. This can be overridden by using:

@Fetch(FetchMode.JOIN) 
@Fetch(FetchMode.SUBSELECT)

in Hibernate. It also allows to set SELECT mode explicitly using @Fetch(FetchMode.SELECT) which can be tuned by using batch size e.g. @BatchSize(size=10).

Corresponding annotations in EclipseLink are:

@JoinFetch
@BatchFetch

Solution 2:

"mxc" is right. fetchType just specifies when the relation should be resolved.

To optimize eager loading by using an outer join you have to add

@Fetch(FetchMode.JOIN)

to your field. This is a hibernate specific annotation.

Solution 3:

The fetchType attribute controls whether the annotated field is fetched immediately when the primary entity is fetched. It does not necessarily dictate how the fetch statement is constructed, the actual sql implementation depends on the provider you are using toplink/hibernate etc.

If you set fetchType=EAGER This means that the annotated field is populated with its values at the same time as the other fields in the entity. So if you open an entitymanager retrieve your person objects and then close the entitymanager, subsequently doing a person.address will not result in a lazy load exception being thrown.

If you set fetchType=LAZY the field is only populated when it is accessed. If you have closed the entitymanager by then a lazy load exception will be thrown if you do a person.address. To load the field you need to put the entity back into an entitymangers context with em.merge(), then do the field access and then close the entitymanager.

You might want lazy loading when constructing a customer class with a collection for customer orders. If you retrieved every order for a customer when you wanted to get a customer list this may be a expensive database operation when you only looking for customer name and contact details. Best to leave the db access till later.

For the second part of the question - how to get hibernate to generate optimised SQL?

Hibernate should allow you to provide hints as to how to construct the most efficient query but I suspect there is something wrong with your table construction. Is the relationship established in the tables? Hibernate may have decided that a simple query will be quicker than a join especially if indexes etc are missing.