Difference between IEnumerable Count() and Length
What are the key differences between IEnumerable
Count()
and Length
?
By calling Count on IEnumerable<T>
I'm assuming you're referring to the extension method Count
on System.Linq.Enumerable
. Length
is not a method on IEnumerable<T>
but rather a property on array types in .Net such as int[]
.
The difference is performance. TheLength
property is guaranteed to be a O(1) operation. The complexity of the Count
extension method differs based on runtime type of the object. It will attempt to cast to several types which support O(1) length lookup like ICollection<T>
via a Count
property. If none are available then it will enumerate all items and count them which has a complexity of O(N).
For example
int[] list = CreateSomeList();
Console.WriteLine(list.Length); // O(1)
IEnumerable<int> e1 = list;
Console.WriteLine(e1.Count()); // O(1)
IEnumerable<int> e2 = list.Where(x => x <> 42);
Console.WriteLine(e2.Count()); // O(N)
The value e2
is implemented as a C# iterator which does not support O(1) counting and hence the method Count
must enumerate the entire collection to determine how long it is.
Little addition to Jon Skeet's comment.
Here is the source code of the Count()
extension method:
.NET 3:
public static int Count<TSource>(this IEnumerable<TSource> source)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
ICollection<TSource> is2 = source as ICollection<TSource>;
if (is2 != null)
{
return is2.Count;
}
int num = 0;
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
num++;
}
}
return num;
}
.NET 4:
public static int Count<TSource>(this IEnumerable<TSource> source)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
ICollection<TSource> is2 = source as ICollection<TSource>;
if (is2 != null)
{
return is2.Count;
}
ICollection is3 = source as ICollection;
if (is3 != null)
{
return is3.Count;
}
int num = 0;
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
num++;
}
}
return num;
}
Length is a fixed property, e.g. of a single dimensional array or string. So there's never a count operation necessary (multi-dimensional arrays have a size of all dimensions multiplied). O(1) operation here means that retrieval time is always the same, no matter how many elements there are. A linear search would (opposed to this) be O(n).
The Count property on ICollections (List and List<T>, for example) can change, so it has either to be updated on Add/Remove operations, or when Count is requested after the Collection has changed. Depends on the implementation of the object.
The Count() method of LINQ basically iterates EVERY TIME it is called (except when the object is an ICollection type, then the ICollection.Count property is requested).
Note that IEnumerables are often not already defined object collections (like lists, arrays, hashtables etc.), but link to background operations, which generate results whenever they are requested (called deferred execution).
Typically, you have an SQL like LINQ statement like this (the typical application of deferred execution):
IEnumerable<Person> deptLeaders =
from p in persons
join d in departments
on p.ID equals d.LeaderID
orderby p.LastName, p.FirstName
select p;
Then, there's code like this:
if (deptLeaders.Count() > 0)
{
ReportNumberOfDeptLeaders(deptLeaders.Count());
if (deptLeaders.Count() > 20)
WarnTooManyDepartmentLeaders(deptLeaders.Count());
}
So, when a warning for too many Department Leaders is issued, .NET goes FOUR TIMES through the persons, checks them against the department leaders, sorts them by name and then counts the result objects.
And this is only when persons and departments are preset value collections, not queries themselves.