What is the ideal growth rate for a dynamically allocated array?
I remember reading many years ago why 1.5 is preferred over two, at least as applied to C++ (this probably doesn't apply to managed languages, where the runtime system can relocate objects at will).
The reasoning is this:
- Say you start with a 16-byte allocation.
- When you need more, you allocate 32 bytes, then free up 16 bytes. This leaves a 16-byte hole in memory.
- When you need more, you allocate 64 bytes, freeing up the 32 bytes. This leaves a 48-byte hole (if the 16 and 32 were adjacent).
- When you need more, you allocate 128 bytes, freeing up the 64 bytes. This leaves a 112-byte hole (assuming all previous allocations are adjacent).
- And so and and so forth.
The idea is that, with a 2x expansion, there is no point in time that the resulting hole is ever going to be large enough to reuse for the next allocation. Using a 1.5x allocation, we have this instead:
- Start with 16 bytes.
- When you need more, allocate 24 bytes, then free up the 16, leaving a 16-byte hole.
- When you need more, allocate 36 bytes, then free up the 24, leaving a 40-byte hole.
- When you need more, allocate 54 bytes, then free up the 36, leaving a 76-byte hole.
- When you need more, allocate 81 bytes, then free up the 54, leaving a 130-byte hole.
- When you need more, use 122 bytes (rounding up) from the 130-byte hole.
In the limit as n → ∞, it would be the golden ratio: ϕ = 1.618...
For finite n, you want something close, like 1.5.
The reason is that you want to be able to reuse older memory blocks, to take advantage of caching and avoid constantly making the OS give you more memory pages. The equation you'd solve to ensure that a subsequent allocation can re-use all prior blocks reduces to xn − 1 − 1 = xn + 1 − xn, whose solution approaches x = ϕ for large n. In practice n is finite and you'll want to be able to reusing the last few blocks every few allocations, and so 1.5 is great for ensuring that.
(See the link for a more detailed explanation.)
It will entirely depend on the use case. Do you care more about the time wasted copying data around (and reallocating arrays) or the extra memory? How long is the array going to last? If it's not going to be around for long, using a bigger buffer may well be a good idea - the penalty is short-lived. If it's going to hang around (e.g. in Java, going into older and older generations) that's obviously more of a penalty.
There's no such thing as an "ideal growth factor." It's not just theoretically application dependent, it's definitely application dependent.
2 is a pretty common growth factor - I'm pretty sure that's what ArrayList
and List<T>
in .NET uses. ArrayList<T>
in Java uses 1.5.
EDIT: As Erich points out, Dictionary<,>
in .NET uses "double the size then increase to the next prime number" so that hash values can be distributed reasonably between buckets. (I'm sure I've recently seen documentation suggesting that primes aren't actually that great for distributing hash buckets, but that's an argument for another answer.)