Why start an ArrayList with an initial capacity?

The usual constructor of ArrayList is:

ArrayList<?> list = new ArrayList<>();

But there is also an overloaded constructor with a parameter for its initial capacity:

ArrayList<?> list = new ArrayList<>(20);

Why is it useful to create an ArrayList with an initial capacity when we can append to it as we please?


If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.

The larger the final list, the more time you save by avoiding the reallocations.
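As a minimal sketch of that advice (the class and method names here are made up for illustration), a method that knows its element count up front can pass it to the constructor. The argument is only a capacity hint: the list still starts out empty.

```java
import java.util.ArrayList;
import java.util.List;

class PresizedList {
    // Builds the first n squares. n is known in advance, so the backing
    // array is allocated once and never has to grow during the loop.
    static List<Integer> squares(int n) {
        List<Integer> result = new ArrayList<>(n); // capacity, not size
        for (int i = 0; i < n; i++) {
            result.add(i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squares(5)); // prints [0, 1, 4, 9, 16]
    }
}
```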

That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).
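The amortized bound can be checked with a small simulation. This is only a sketch, not ArrayList's actual code: it assumes a starting capacity of 10 and growth by roughly 50% whenever the array is full, and it counts how many element copies the resizes cause.

```java
class AmortizedGrowth {
    // Counts the element copies caused by resizes when appending n elements
    // to an array that starts at `initial` slots and grows by ~50% when full.
    static long copiesFor(int n, int initial) {
        int capacity = initial;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;                         // every existing element moves
                capacity += Math.max(1, capacity >> 1); // assumed growth rule
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            long copies = copiesFor(n, 10);
            System.out.printf("n=%d copies=%d copies/n=%.2f%n",
                    n, copies, (double) copies / n);
        }
    }
}
```

Because the capacities form a geometric series, the copies-per-element ratio stays below a small constant as n grows, which is what amortized constant time means here. With the capacity set to n up front, the copy count is zero.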


ArrayList is a dynamically resizing array: it is implemented as an array with an initial (default) fixed size. When that array fills up, it is replaced by a larger one (double the size in a textbook implementation) and the existing elements are copied over. This operation is costly, so you want as few of them as possible.

So, if you know your upper bound is 20 items, then creating the list with an initial capacity of 20 is better than starting from a default of, say, 15, resizing to 15*2 = 30, and then using only 20 slots while wasting cycles on the expansion.

P.S. - As AmitG says, the expansion factor is implementation-specific (in this case, (oldCapacity * 3)/2 + 1).


The default capacity of ArrayList is 10.

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this(10);
}   

So if you are going to add 100 or more records, the backing array has to be reallocated several times along the way, and each reallocation copies every element added so far.

ArrayList<?> list = new ArrayList<>();    
// same as  new ArrayList<>(10);      

So if you have any idea how many elements will be stored in the ArrayList, it's better to create it with that capacity instead of starting at 10 and growing it repeatedly.


I actually wrote a blog post on the topic 2 months ago. The article is for C#'s List<T> but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it increases in size on demand. So the reason for the capacity constructor is for optimisation purposes.

When one of these resize operations occurs, the ArrayList copies the contents of the array into a new, larger array (about 1.5 times the capacity of the old one in Java, twice the capacity in C#). This operation runs in O(n) time.

Example

Here is an example of how the ArrayList would increase in size:

10
16
25
38
58
... 19 resizes ...
198578
297868
446803
670205
1005308

So the list starts with a capacity of 10; when the 11th item is added, the capacity is increased by 50% + 1 to 16. On the 17th item the capacity is increased again to 25, and so on. Now consider the case where the desired capacity is already known to be 1000000. Creating the ArrayList without the capacity constructor means calling ArrayList.add 1000000 times, and each call normally takes O(1) but takes O(n) when it triggers a resize.

1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations
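The total above can be reproduced with a short simulation. It is a sketch under this answer's own cost model (one operation per add, plus the new capacity for every resize) and the growth rule quoted in this thread, (oldCapacity * 3) / 2 + 1. The class and method names are hypothetical.

```java
class GrowthSimulation {
    // Growth rule quoted in this thread: 50% + 1.
    static int grow(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // Cost model from the answer: one operation per add, plus the size
    // of the newly allocated array each time a resize is needed.
    static long operations(int adds, int initialCapacity) {
        long ops = adds;
        int capacity = initialCapacity;
        while (capacity < adds) {
            capacity = grow(capacity);
            ops += capacity;
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(operations(1_000_000, 10)); // prints 4015851
    }
}
```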

Compare this with passing the capacity to the constructor and then calling ArrayList.add, which is then guaranteed to run in O(1) because no resize is ever needed.

1000000 + 1000000 = 2000000 operations

Java vs C#

Java is as above, starting at 10 and growing by 50% + 1 at each resize. C# starts at 4 and grows much more aggressively, doubling at each resize. The 1000000 adds example from above uses 3097084 operations in C#.

References

  • My blog post on C#'s List<T>
  • Java's ArrayList source code