Why start an ArrayList with an initial capacity?
The usual constructor of ArrayList is:
ArrayList<?> list = new ArrayList<>();
But there is also an overloaded constructor with a parameter for its initial capacity:
ArrayList<?> list = new ArrayList<>(20);
Why is it useful to create an ArrayList with an initial capacity when we can append to it as we please?
If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.
The larger the final list, the more time you save by avoiding the reallocations.
That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take O(n) time in total. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation grow the array geometrically, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).
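A rough sketch of why the total stays linear, under the simplifying assumptions that every resize multiplies the capacity by exactly the same factor and copies every element currently stored. The symbols c_0 (initial capacity), g (growth factor) and k (index of the last resize) are introduced here for illustration only and do not come from any particular implementation.

```latex
% Resizes are triggered when the size reaches c_0, c_0 g, ..., c_0 g^k,
% where c_0 g^k < n is the last capacity that overflows.
\[
  \text{copies}
  \;=\; \sum_{j=0}^{k} c_0\, g^{j}
  \;=\; c_0\,\frac{g^{k+1}-1}{g-1}
  \;<\; c_0\, g^{k}\cdot\frac{g}{g-1}
  \;<\; n\cdot\frac{g}{g-1}.
\]
% For g = 1.5 this gives fewer than 3n element copies,
% so n appends cost fewer than n + 3n = 4n steps in total, i.e. O(n).
```

The bound depends on g being strictly greater than 1: growing by a fixed number of slots instead would make the sum quadratic in n.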
ArrayList is a dynamically resizing array data structure: it is implemented as an array with an initial (default) fixed size. When that array fills up, it is replaced by a larger one (doubled in size, in the simplest scheme) and the existing elements are copied across. This operation is costly, so you want as few of them as possible.
So, if you know your upper bound is 20 items, then creating the list with an initial capacity of 20 is better than starting from a default of, say, 15, resizing it to 15*2 = 30, and then using only 20 slots while wasting the cycles spent on the expansion.
P.S. - As AmitG says, the expansion factor is implementation-specific (in this case (oldCapacity * 3)/2 + 1).
The default capacity of ArrayList is 10.
/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this(10);
}
So if you are going to add 100 or more records, you can see the overhead of memory reallocation.
ArrayList<?> list = new ArrayList<>();
// same as new ArrayList<>(10);
So if you have any idea how many elements will be stored in the ArrayList, it's better to create it with that capacity instead of starting at 10 and growing it repeatedly.
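A minimal sketch of that advice, assuming the element count is known before the loop starts. The class and method names (PresizeExample, squares, appendRange) are made up for illustration; the capacity constructor and ArrayList.ensureCapacity are the only library features it relies on.

```java
import java.util.ArrayList;
import java.util.List;

public class PresizeExample {
    // Builds a list of the first `count` squares. The count is known up front,
    // so it is passed to the capacity constructor and no resize is needed.
    static List<Integer> squares(int count) {
        List<Integer> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(i * i);
        }
        return result;
    }

    // When the list already exists, ensureCapacity reserves room once
    // before a bulk append instead of growing step by step.
    static void appendRange(ArrayList<Integer> target, int from, int to) {
        target.ensureCapacity(target.size() + (to - from));
        for (int i = from; i < to; i++) {
            target.add(i);
        }
    }

    public static void main(String[] args) {
        List<Integer> s = squares(100);
        System.out.println(s.size()); // 100

        ArrayList<Integer> t = new ArrayList<>();
        appendRange(t, 5, 8);
        System.out.println(t); // [5, 6, 7]
    }
}
```

Capacity only reserves space: a list created with new ArrayList<>(100) still has size() == 0 until elements are added.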
I actually wrote a blog post on the topic 2 months ago. The article is about C#'s List<T>, but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it grows on demand. So the capacity constructor exists for optimisation purposes.
When one of these resizing operations occurs, the list copies the contents of the array into a new, larger array (twice the capacity of the old one in C#; about 1.5 times in Java, as shown below). This operation runs in O(n) time.
Example
Here is an example of how the ArrayList would increase in size:
10
16
25
38
58
... 19 more resizes ...
198578
297868
446803
670205
1005308
So the list starts with a capacity of 10; when the 11th item is added it is increased by 50% + 1 to 16. On the 17th item the ArrayList is increased again to 25, and so on. Now consider the case where we're creating a list whose desired capacity is already known to be 1000000. Creating the ArrayList without the size constructor will call ArrayList.add 1000000 times, each of which takes O(1) normally or O(n) on a resize.
1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations
Compare this with using the capacity constructor and then calling ArrayList.add, which is then guaranteed to run in O(1).
1000000 + 1000000 = 2000000 operations
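The first sum can be reproduced with a small simulation. This is only a sketch of the accounting used above, not of ArrayList's real internals: it assumes the (oldCapacity * 3) / 2 + 1 growth rule quoted in this thread, charges one operation per add, and charges the new capacity for each resize. The names GrowthCost, grow and totalOperations are invented for the example.

```java
public class GrowthCost {
    // Growth rule quoted in this thread; other implementations may differ.
    static int grow(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // One operation per add, plus newCapacity operations per resize,
    // matching the way the sums above are built.
    static long totalOperations(int adds, int initialCapacity) {
        long operations = 0;
        int capacity = initialCapacity;
        int size = 0;
        for (int i = 0; i < adds; i++) {
            if (size == capacity) {
                capacity = grow(capacity);
                operations += capacity;
            }
            size++;
            operations++;
        }
        return operations;
    }

    public static void main(String[] args) {
        // Default capacity of 10: prints 4015851
        System.out.println(totalOperations(1_000_000, 10));
        // Pre-sized: no resizes, so only the 1000000 adds are counted here;
        // the pre-sized sum above also charges 1000000 for the up-front allocation.
        System.out.println(totalOperations(1_000_000, 1_000_000));
    }
}
```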
Java vs C#
Java is as above, starting at 10 and growing by 50% + 1 at each resize. C# starts at 4 and increases much more aggressively, doubling at each resize. The 1000000 adds example from above uses 3097084 operations in C#.
References
- My blog post on C#'s List<T>
- Java's ArrayList source code