Which Java Type do you use for JPA collections and why?

Which of the following collection types do you use in your JPA domain model and why:

  • java.util.Collection
  • java.util.List
  • java.util.Set

I was wondering whether there are some ground rules for this.

UPDATE: I know the difference between a Set and a List. A List allows duplicates and has an order; a Set cannot contain duplicate elements and does not define an order. I'm asking this question in the context of JPA. If you strictly follow the definition, you should always end up using the Set type, since your collection is stored in a relational database, where you can't have duplicates and where you have to define an order yourself, i.e. the order in your Java List is not necessarily preserved in the DB.

For example, most of the time I'm using the List type, not because it has an order or allows duplicates (which I can't have anyway), but because some of the components in my component library require a list.


As your own question suggests, the key is the domain, not JPA. JPA is just a framework which you can (and should) use in the way that best fits your problem. Choosing a suboptimal solution because of a framework (or its limits) is usually a warning bell.

When I need a set and never care about order, I use a Set. When for some reason order is important (ordered list, ordering by date, etc.), I use a List.

You seem to be well aware of the difference between Collection, Set, and List. Which one you pick depends only on your needs. You can use them to communicate to users of your API (or your future self) the properties of your collection (which may be subtle or implicit).

This follows exactly the same rules as using different collection types anywhere else in your code. You could use Object or Collection for all your references, yet in most cases you use more concrete types.

For example, when I see a List, I know it comes sorted in some way, and that duplicates are either acceptable or irrelevant for this case. When I see a Set, I usually expect it to have no duplicates and no specific order (unless it's a SortedSet). When I see a Collection, I don't expect anything more from it than to contain some entities.
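To make those expectations concrete, here is a minimal plain-Java sketch (no JPA involved). The class and method names are made up for illustration; the point is only what each declared type promises to the reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of any library.
final class CollectionSemanticsDemo {

    // A List keeps every element, duplicates included, in insertion order.
    static List<String> asList(String... tags) {
        return new ArrayList<>(Arrays.asList(tags));
    }

    // A Set drops duplicates. LinkedHashSet is used here only to make the
    // printed output predictable; the Set interface itself promises no order.
    static Set<String> asSet(String... tags) {
        return new LinkedHashSet<>(Arrays.asList(tags));
    }

    public static void main(String[] args) {
        System.out.println(asList("b", "a", "b")); // [b, a, b]
        System.out.println(asSet("b", "a", "b"));  // [b, a]
    }
}
```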

Regarding list ordering: yes, it can be preserved (JPA 2.0 provides @OrderColumn for this). And even if it's not and you just use @OrderBy, it can still be useful. Think of an event log sorted by timestamp by default. Artificially reordering the list makes little sense, but it is still useful that it comes sorted by default.
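As a rough plain-Java analogy of "comes sorted by default": the sketch below sorts rows by timestamp at load time, which is roughly the effect an ORDER BY clause has on the loaded list. The Event class and the orderByTimestamp method are hypothetical; in real JPA code the ordering would be done by the database, not by this code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not JPA code.
final class DefaultOrderDemo {

    // Minimal stand-in for an event log entry.
    static final class Event {
        final long timestamp;
        final String message;

        Event(long timestamp, String message) {
            this.timestamp = timestamp;
            this.message = message;
        }
    }

    // Returns a new list sorted by timestamp; the input is left untouched.
    static List<Event> orderByTimestamp(List<Event> rowsInArbitraryOrder) {
        List<Event> sorted = new ArrayList<>(rowsInArbitraryOrder);
        sorted.sort(Comparator.comparingLong((Event e) -> e.timestamp));
        return sorted;
    }

    public static void main(String[] args) {
        List<Event> rows = Arrays.asList(
                new Event(30L, "logout"),
                new Event(10L, "login"),
                new Event(20L, "edit"));
        for (Event e : orderByTimestamp(rows)) {
            System.out.println(e.timestamp + " " + e.message);
        }
    }
}
```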


The question of using a Set or a List is much more difficult, I think, at least when you use Hibernate as the JPA implementation. If you use a List in Hibernate (at least when no order column is mapped), it automatically switches to the "bag" paradigm, where duplicates CAN exist.

And that decision has a significant influence on the queries Hibernate executes. Here is a small example:

There are two entities, Employee and Company, in a typical many-to-many relation. To map those entities to each other, a join table (let's call it "employeeCompany") exists.

You choose the data type List on both entities (Company/Employee).

So if you now decide to remove Employee Joe from CompanyXY, Hibernate executes the following queries:

delete from employeeCompany where employeeId = Joe;
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXA);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXB);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXC);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXD);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXE);

And now the question: why the hell doesn't Hibernate execute just this one query?

delete from employeeCompany where employeeId = Joe AND companyId = CompanyXY;

The answer is simple (and thanks a lot to Nirav Assar for his blog post): it can't. In a world of bags, deleting all rows and re-inserting the remaining ones is the only proper way! Read this for more clarification: http://assarconsulting.blogspot.fr/2009/08/why-hibernate-does-delete-all-then-re.html
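The reasoning can be sketched in plain Java. This is a simplified model, not Hibernate's actual implementation: the join-table rows for Joe are represented as a list of company ids, and both methods are hypothetical stand-ins for SQL statements. Since a bag may contain the same row twice and the rows carry no identifier of their own, a delete with a WHERE clause removes every matching row, while removing one element from the in-memory List removes only one occurrence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; models Joe's rows in employeeCompany as company ids.
final class BagDeleteDemo {

    // Stand-in for "delete ... where employeeId = Joe AND companyId = ?":
    // SQL removes every row matching the predicate, duplicates included.
    static List<String> targetedDelete(List<String> rows, String companyId) {
        List<String> result = new ArrayList<>(rows);
        result.removeIf(companyId::equals);
        return result;
    }

    // Stand-in for "delete all of Joe's rows, then re-insert what is still
    // in the in-memory collection".
    static List<String> deleteAllThenReinsert(List<String> collectionInMemory) {
        List<String> rows = new ArrayList<>(); // Joe's rows are gone
        rows.addAll(collectionInMemory);       // one insert per remaining element
        return rows;
    }

    public static void main(String[] args) {
        // A bag may legally contain the same association twice.
        List<String> rows = Arrays.asList("CompanyXA", "CompanyXY", "CompanyXY");

        List<String> inMemory = new ArrayList<>(rows);
        inMemory.remove("CompanyXY"); // removes the first occurrence only

        System.out.println(targetedDelete(rows, "CompanyXY"));  // [CompanyXA]
        System.out.println(deleteAllThenReinsert(inMemory));    // [CompanyXA, CompanyXY]
    }
}
```

Only the second result matches the in-memory collection. With a Set, at most one row can match, so the targeted delete would be safe.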

Now the big conclusion:

If you choose a Set instead of a List in your Employee/Company entities, you don't have that problem and only one query is executed!

And why is that? Because Hibernate is no longer in a world of bags (as you know, Sets allow no duplicates), so executing only one query is now possible.

So the decision between List and Set is not that simple, at least when it comes to queries and performance!