How To Count Associated Entities Without Fetching Them In Entity Framework
I've been wondering about this one for a while now, so I thought it would be worth using my first Stack Overflow post to ask about it.
Imagine I have a discussion category with an associated list of discussions, each of which has its own list of messages:
DiscussionCategory discussionCategory = _repository.GetDiscussionCategory(id);
discussionCategory.Discussions is a list of Discussion entities which is not currently loaded.
What I want is to be able to iterate through the discussions in a discussionCategory and say how many messages are in each discussion without fetching the message data.
When I have tried this before I have had to load the Discussions and the Messages so that I could do something like this:
discussionCategory.Discussions.Attach(Model.Discussions.CreateSourceQuery().Include("Messages").AsEnumerable());
foreach (Discussion discussion in discussionCategory.Discussions)
{
    int messageCount = discussion.Messages.Count;
    Console.WriteLine(messageCount);
}
This seems rather inefficient to me as I am fetching potentially hundreds of message bodies from the database and holding them in memory when all I wish to do is count their number for presentational purposes.
I have seen some questions which touch on this subject but they did not seem to address it directly.
Thanks in advance for any thoughts you may have on this subject.
Update - Some more code as requested:
public ActionResult Details(int id)
{
    Project project = _repository.GetProject(id);
    return View(project);
}
Then in the view (just to test it out):
Model.Discussions.Load();
var items = from d in Model.Discussions
            select new { Id = d.Id, Name = d.Name, MessageCount = d.Messages.Count() };
foreach (var item in items) {
//etc
I hope that makes my problem a bit clearer. Let me know if you need any more code details.
Easy; just project onto a POCO (or anonymous) type:
var q = from d in Model.Discussions
        select new DiscussionPresentation
        {
            Subject = d.Subject,
            MessageCount = d.Messages.Count(),
        };
When you look at the generated SQL, you'll see that the Count() is done by the DB server.
Note that this works in both EF 1 and EF 4.
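For anyone who wants to try the shape of that projection without a database, here is a minimal sketch that runs the same query expression over in-memory lists. It only illustrates the projection itself; it does not show the SQL translation, which happens when the source is an EF query rather than a list. The Discussion, Message and DiscussionPresentation classes and the sample data are made up for illustration and are not the asker's real model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes (assumption, not the real model).
public class Message
{
    public int Id { get; set; }
    public string Body { get; set; }
}

public class Discussion
{
    public string Subject { get; set; }
    public List<Message> Messages { get; set; } = new List<Message>();
}

// Presentation type that carries only what the view needs.
public class DiscussionPresentation
{
    public string Subject { get; set; }
    public int MessageCount { get; set; }
}

public static class ProjectionSketch
{
    // Projects each discussion onto its subject and message count.
    public static List<DiscussionPresentation> Summarize(IEnumerable<Discussion> discussions)
    {
        var q = from d in discussions
                select new DiscussionPresentation
                {
                    Subject = d.Subject,
                    MessageCount = d.Messages.Count(),
                };
        return q.ToList();
    }

    public static void Main()
    {
        var discussions = new List<Discussion>
        {
            new Discussion
            {
                Subject = "EF tips",
                Messages =
                {
                    new Message { Id = 1, Body = "first" },
                    new Message { Id = 2, Body = "second" },
                },
            },
            new Discussion { Subject = "Empty thread" },
        };

        foreach (var item in Summarize(discussions))
        {
            Console.WriteLine(item.Subject + ": " + item.MessageCount);
        }
    }
}
```

The view then binds to DiscussionPresentation and never touches the Messages collection.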
If you are using Entity Framework 4.1 or later, you can use:
var discussion = _repository.GetDiscussionCategory(id);
// Count how many messages the discussion has
var messageCount = context.Entry(discussion)
                          .Collection(d => d.Messages)
                          .Query()
                          .Count();
Source: http://msdn.microsoft.com/en-US/data/jj574232
I know this is an old question but it seems to be an ongoing problem and none of the answers above provide a good way to deal with SQL aggregates in list views.
I am assuming straight POCO models and Code First like in the templates and examples. While the SQL View solution is nice from a DBA point of view, it re-introduces the challenge of maintaining both code and database structures in parallel. For simple SQL aggregate queries, you won't see much speed gain from a View. What you really need to avoid are multiple (n+1) database queries, as in the examples above. If you have 5000 parent entities and you are counting child entities (e.g. messages per discussion), that's 5001 SQL queries.
You can return all those counts in a single SQL query. Here's how.
- Add a placeholder property to your model class using the [NotMapped] data annotation from the System.ComponentModel.DataAnnotations.Schema namespace. This gives you a place to store the calculated data without actually adding a column to your database or projecting to unnecessary View Models.
    ...
    using System.Collections.Generic;
    using System.ComponentModel.DataAnnotations;
    using System.ComponentModel.DataAnnotations.Schema;
    namespace MyProject.Models
    {
        public class Discussion
        {
            [Key]
            public int ID { get; set; }
            ...
            [NotMapped]
            public int MessageCount { get; set; }
            public virtual ICollection<Message> Messages { get; set; }
        }
    }
- In your Controller, get the list of parent objects.
    var discussions = db.Discussions.ToList();
- Capture the counts in a Dictionary. Project each group to its key and count before calling ToDictionary; otherwise the groups themselves, messages included, are materialized on the client. With the projection in place, this generates a single SQL GROUP BY query that returns all parent IDs and child object counts. (Presuming DiscussionID is the FK in Messages.)
    var _counts = db.Messages
        .GroupBy(m => m.DiscussionID)
        .Select(g => new { DiscussionID = g.Key, Count = g.Count() })
        .ToDictionary(x => x.DiscussionID, x => x.Count);
- Loop through the parent objects, look up the count from the dictionary, and store it in the placeholder property.
    foreach (var d in discussions)
    {
        d.MessageCount = (_counts.ContainsKey(d.ID)) ? _counts[d.ID] : 0;
    }
- Return your discussion list.
    return View(discussions);
- Reference the MessageCount property in the View.
    @foreach (var item in Model) {
        ...
        @item.MessageCount
        ...
    }
Yes, you could just stuff that Dictionary into the ViewBag and do the lookup directly in the View, but that muddies your view with code that doesn't need to be there.
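The count-lookup steps above can be sketched end to end with plain in-memory lists standing in for db.Messages and db.Discussions. This is only an illustration under that assumption: the class shapes, the CountLookupSketch name and the sample data are hypothetical, and nothing here talks to a database. The sketch projects each group to a key and a count before building the dictionary, so that the same query shape, run against an EF context, only asks for IDs and counts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down model classes (assumption, not the real entities).
public class Message
{
    public int ID { get; set; }
    public int DiscussionID { get; set; }
}

public class Discussion
{
    public int ID { get; set; }
    public int MessageCount { get; set; } // would carry [NotMapped] in an EF model
}

public static class CountLookupSketch
{
    // Builds a DiscussionID -> message count lookup in a single pass.
    public static Dictionary<int, int> CountByDiscussion(IEnumerable<Message> messages)
    {
        return messages
            .GroupBy(m => m.DiscussionID)
            .Select(g => new { DiscussionID = g.Key, Count = g.Count() })
            .ToDictionary(x => x.DiscussionID, x => x.Count);
    }

    // Copies the counts onto the parents; parents with no messages get 0.
    public static void ApplyCounts(IEnumerable<Discussion> discussions, IDictionary<int, int> counts)
    {
        foreach (var d in discussions)
        {
            int count;
            d.MessageCount = counts.TryGetValue(d.ID, out count) ? count : 0;
        }
    }

    public static void Main()
    {
        var discussions = new List<Discussion>
        {
            new Discussion { ID = 1 },
            new Discussion { ID = 2 },
            new Discussion { ID = 3 },
        };
        var messages = new List<Message>
        {
            new Message { ID = 10, DiscussionID = 1 },
            new Message { ID = 11, DiscussionID = 1 },
            new Message { ID = 12, DiscussionID = 3 },
        };

        ApplyCounts(discussions, CountByDiscussion(messages));

        foreach (var d in discussions)
        {
            Console.WriteLine("Discussion " + d.ID + ": " + d.MessageCount);
        }
    }
}
```

TryGetValue does the existence check and the lookup in one call, which is a small tidy-up over ContainsKey followed by the indexer; the behaviour is the same.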
In the end, I wish EF had a way to do "lazy counting". The problem with both lazy and explicit loading is you're loading the objects. And if you have to load to count, that's a potential performance problem. Lazy counting wouldn't solve the n+1 problem in list views but it sure would be nice to be able to just call @item.Messages.Count from the View without having to worry about potentially loading tons of unwanted object data.
Hope this helps.
If this isn't a one off, and you find yourself needing to count a number of different associated entities, a database view might be a simpler (and potentially more appropriate) choice:
- Create your database view. Assuming you want all of the original entity properties plus the associated message count:
    CREATE VIEW DiscussionCategoryWithStats AS
    SELECT dc.*,
           (SELECT count(1) FROM Messages m WHERE m.DiscussionCategoryId = dc.Id) AS MessageCount
    FROM DiscussionCategory dc
  (If you're using Entity Framework Code First Migrations, see this SO answer on how to create a view.)
- In EF, simply use the view instead of the original entity:
    // You'll need to implement this!
    DiscussionCategoryWithStats dcs = _repository.GetDiscussionCategoryWithStats(id);
    int i = dcs.MessageCount;
    ...