Is it possible to implement a dynamic array without reallocation?

The default way to implement dynamic arrays is to use realloc. Once len == capacity we use realloc to grow our array. This can cause copying of the whole array to another heap location. I don't want this copying to happen, since I'm designing a dynamic array that should be able to store large amount of elements, and the system that would run this code won't be able to handle such a heavy operation.

Is there a way to achieve that?

I'm fine with loosing some performance - O(logN) for search instead of O(1) is okay. I was thinking that I could use a hashtable for this, but it looks like I'm in a deadlock since in order to implement such a hashtable I would need a dynamic array in the first place.

Thanks!


Not really, not in the general case.

The copy happens when the memory manager can't increase the the current allocation, and needs to move the memory block somewhere else.

One thing you can try is to allocate fixed sized blocks and keep a dynamic array pointing to the blocks. This way the blocks don't need to be reallocated, keeping the large payloads in place. If you need to reallocate, you only reallocate the array of reference which should be much cheaper (move 8 bytes instead 1 or more MB). The ideal case the block size is about sqrt(N), so it's not working in a very general case (any fixed size will be some large or some small for some values).


If you are not against small allocations, you could use a singly linked list of tables, where each new table doubles the capacity and becomes the front of the list. If you want to access the last element, you can just get the value from the last allocated block, which is O(1). If you want to access the first element, you have to run through the list of allocated blocks to get to the correct block. Since the length of each block is two times the previous one, this means the access complexity is O(logN).

This data structures relies on the same principles that dynamic arrays use (doubling the size of the array when expanding), but instead of copying the values after allocating a new block, it keeps track of the previous block, meaning accessing the previous blocks adds overhead but not accessing the last ones.

The index is not a position in a specific block, but in an imaginary concatenation of all the blocks, starting from the first allocated block. Thus, this data structure cannot be implemented as a recursive type because it needs a wrapper keeping track of the total capacity to compute which block is refered to.

For example: There are three blocks, of sizes 100, 200, 400. Accessing 150th value (index 149 if starting from 0) means the 50th value of the second block. The interface needs to know the total length is 700, compare the index to 700 - 400 to determine whether the index refers to the last block (if the index is above 300) or a previous block. Then, the interface compares with the capacity of the previous blocks (300 - 200) and knows 150 is in the second block. This algorithm can have as many iterations as there are blocks, which is O(logN). Again, if you only try to access the last value, the complexity becomes O(1).

If you have concerns about copy times for real time applications or large amounts of data, this data structure could be better than having a contiguous storage and having to copy all of your data in some cases.