C dynamically growing array
I have a program that reads a "raw" list of in-game entities, and I intend to make an array holding an index number (int) of an indeterminate number of entities, for processing various things. I would like to avoid using too much memory or CPU for keeping such indexes...
A quick and dirty solution I use so far is to declare, in the main processing function (local scope), an array sized for the maximum number of game entities, and another integer to keep track of how many have been added to the list. This isn't satisfactory, as every list holds 3000+ entries, which isn't that much, but feels like a waste, since I'll possibly use the solution for 6-7 lists for varying functions.
I haven't found any C (not C++ or C#) specific solutions to achieve this. I can use pointers, but I am a bit afraid of using them (unless it's the only possible way).
The arrays do not leave the local function scope (they are to be passed to a function, then discarded), in case that changes things.
If pointers are the only solution, how can I keep track of them to avoid leaks?
Solution 1:
I can use pointers, but I am a bit afraid of using them.
If you need a dynamic array, you can't escape pointers. Why are you afraid though? They won't bite (as long as you're careful, that is). There's no built-in dynamic array in C; you'll just have to write one yourself. In C++, you can use the built-in std::vector class. C# and just about every other high-level language also have some similar class that manages dynamic arrays for you.
If you do plan to write your own, here's something to get you started: most dynamic array implementations work by starting off with an array of some (small) default size, then whenever you run out of space when adding a new element, double the size of the array. As you can see in the example below, it's not very difficult at all: (I've omitted safety checks for brevity)
#include <stdlib.h>

typedef struct {
    int *array;
    size_t used;   // number of entries currently stored
    size_t size;   // number of entries allocated
} Array;

void initArray(Array *a, size_t initialSize) {
    a->array = malloc(initialSize * sizeof(int));
    a->used = 0;
    a->size = initialSize;
}

void insertArray(Array *a, int element) {
    // a->used is the number of used entries, because a->array[a->used++] updates a->used only *after* the array has been accessed.
    // Therefore a->used can go up to a->size
    if (a->used == a->size) {
        a->size *= 2;
        // NOTE: return value unchecked (safety checks omitted for brevity)
        a->array = realloc(a->array, a->size * sizeof(int));
    }
    a->array[a->used++] = element;
}

void freeArray(Array *a) {
    free(a->array);
    a->array = NULL;
    a->used = a->size = 0;
}
Using it is just as simple:
Array a;
int i;

initArray(&a, 5);  // initially 5 elements
for (i = 0; i < 100; i++)
    insertArray(&a, i);  // automatically resizes as necessary
printf("%d\n", a.array[9]);  // print 10th element
printf("%zu\n", a.used);  // print number of elements (size_t needs %zu)
freeArray(&a);
Solution 2:
One simple solution involves mmap. This is great if you can tolerate a POSIX solution. Just map a whole page and guard against overflows, since realloc would fail for such values anyway. Modern OSes won't commit to the whole lot until you use it, and you can truncate files if you want.
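A minimal sketch of the mmap idea, under a few assumptions: the system provides MAP_ANONYMOUS (common on Linux and the BSDs, but not part of base POSIX), and the helper names reserve_ints and release_ints are hypothetical ones made up for illustration. It reserves address space for the maximum count up front and leaves it to the OS to commit pages as they are touched.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve room for up to max_count ints.
 * Returns NULL on failure. The memory starts out zero-filled. */
int *reserve_ints(size_t max_count) {
    void *p = mmap(NULL, max_count * sizeof(int),
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a region obtained from reserve_ints.
 * Returns 0 on success, -1 on failure (same as munmap). */
int release_ints(int *p, size_t max_count) {
    return munmap(p, max_count * sizeof(int));
}
```

The caller still has to keep its own element count and refuse to write past max_count; that is the "guard against overflows" part.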
Alternatively, there's realloc. As with everything that seems scarier at first than it turns out to be, the best way to get over the initial fear is to immerse yourself in the discomfort of the unknown! It is at times like these that we learn the most, after all.
Unfortunately, there are limitations. While you're still learning to use a function, you shouldn't assume the role of a teacher, for example. I often read answers from those who seemingly don't know how to use realloc (i.e. the currently accepted answer!) telling others how to use it incorrectly, occasionally under the guise that they've omitted error handling, even though this is a common pitfall which needs mention. Here's an answer explaining how to use realloc correctly. Take note that the answer is storing the return value into a different variable in order to perform error checking.
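For illustration, the pattern of storing the return value in a separate variable might look like the sketch below. The helper name grow_ints is hypothetical and not taken from the linked answer. The key point is that if realloc fails, the old block is still allocated, so overwriting the only pointer to it with NULL would leak it.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize *arr to hold new_count ints.
 * Returns 0 on success. On failure returns -1 and leaves *arr
 * unchanged, so the caller can still use or free the old block. */
int grow_ints(int **arr, size_t new_count) {
    if (new_count == 0 || new_count > SIZE_MAX / sizeof **arr)
        return -1;                 /* reject zero and overflowing sizes */

    int *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return -1;                 /* old block is still owned by *arr */

    *arr = tmp;
    return 0;
}
```

Starting from a NULL pointer works too, since realloc then behaves like malloc.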
Every time you call a function, and every time you use an array, you are using a pointer. The conversions are occurring implicitly, which if anything should be even scarier, as it's the things we don't see which often cause the most problems. For example, memory leaks...
Array operators are pointer operators. array[x] is really a shortcut for *(array + x), which can be broken down into: * and (array + x). It's most likely that the * is what confuses you. We can further eliminate the addition from the problem by assuming x to be 0, thus, array[0] becomes *array because adding 0 won't change the value...
... and thus we can see that *array is equivalent to array[0]. You can use one where you want to use the other, and vice versa. Array operators are pointer operators.
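To make the equivalence concrete, here is a small sketch with two hypothetical functions (sum_subscript and sum_pointer are names invented for this example) that do the same thing, one with subscripts and one with explicit pointer arithmetic.

```c
#include <stddef.h>

/* Sum n ints using subscript notation. */
int sum_subscript(const int *array, size_t n) {
    int total = 0;
    for (size_t x = 0; x < n; x++)
        total += array[x];
    return total;
}

/* The same loop written with explicit pointer arithmetic. */
int sum_pointer(const int *array, size_t n) {
    int total = 0;
    for (size_t x = 0; x < n; x++)
        total += *(array + x);
    return total;
}
```

Both accept a pointer, and both can be called with an ordinary array, because the array is implicitly converted to a pointer to its first element at the call.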
malloc, realloc and friends don't invent the concept of a pointer which you've been using all along; they merely use this to implement some other feature, which is a different form of storage duration, most suitable when you desire drastic, dynamic changes in size.
It is a shame that the currently accepted answer also goes against the grain of some other very well-founded advice on StackOverflow, and at the same time, misses an opportunity to introduce a little-known feature which shines for exactly this use case: flexible array members! That's actually a pretty broken answer... :(
When you define your struct, declare your array at the end of the structure, without any upper bound. For example:
struct int_list {
    size_t size;
    int value[];
};
This will allow you to unite your array of int into the same allocation as your count, and having them bound like this can be very handy!
sizeof (struct int_list) will act as though value has a size of 0, so it'll tell you the size of the structure with an empty list. You still need to add to the size passed to realloc to specify the size of your list.
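As a rough sketch of that size calculation, a hypothetical int_list_alloc helper (not part of the original answer) could allocate the header and room for a given number of elements in one go:

```c
#include <stdlib.h>

struct int_list {
    size_t size;
    int value[];   /* flexible array member: contributes 0 to sizeof */
};

/* Hypothetical helper: allocate an empty list with room for
 * `capacity` ints. Returns NULL if the allocation fails. */
struct int_list *int_list_alloc(size_t capacity) {
    struct int_list *list =
        malloc(sizeof *list + capacity * sizeof list->value[0]);
    if (list != NULL)
        list->size = 0;   /* no elements stored yet */
    return list;
}
```

A single free(list) later releases both the count and the elements, since they share one allocation.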
Another handy tip is to remember that realloc(NULL, x) is equivalent to malloc(x), and we can use this to simplify our code. For example:
#include <stdlib.h>

int push_back(struct int_list **fubar, int value) {
    size_t x = *fubar ? fubar[0]->size : 0
         , y = x + 1;

    // Grow whenever the current size is one less than a power of two (0, 1, 3, 7, ...)
    if ((x & y) == 0) {
        void *temp = realloc(*fubar, sizeof **fubar
                                   + (x + y) * sizeof fubar[0]->value[0]);
        if (!temp) { return 1; } // *fubar is untouched and still valid
        *fubar = temp; // or, if you like, `fubar[0] = temp;`
    }

    fubar[0]->value[x] = value;
    fubar[0]->size = y;
    return 0;
}

struct int_list *array = NULL;
The reason I chose to use struct int_list ** as the first argument may not seem immediately obvious, but if you think about the second argument, any changes made to value from within push_back would not be visible to the function we're calling from, right? The same goes for the first argument, and we need to be able to modify our array, not just here but possibly also in any other function/s we pass it to...
array starts off pointing at nothing; it is an empty list. Initialising it is the same as adding to it. For example:
struct int_list *array = NULL;
if (!push_back(&array, 42)) {
    // success!
}
P.S. Remember to free(array); when you're done with it!
Solution 3:
There are a couple of options I can think of.
- Linked List. You can use a linked list to make a dynamically growing array-like structure. But you won't be able to do array[100] without having to walk through 1-99 first. And it might not be that handy for you to use either.
- Large array. Simply create an array with more than enough space for everything.
- Resizing array. Recreate the array once you know the size, and/or create a new array with some margin every time you run out of space and copy all the data to the new array.
- Linked List Array combination. Simply use an array with a fixed size and, once you run out of space, create a new array and link to that (it would be wise to keep track of the array and the link to the next array in a struct).
It is hard to say which option would be best in your situation. Simply creating a large array is of course one of the easiest solutions and shouldn't give you many problems unless it's really large.
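For the linked list and array combination, a rough sketch might look like the following. All names here (CHUNK_CAP, struct chunk, struct chunk_list, chunk_push, chunk_get, chunk_free) are hypothetical, and the chunk size of 64 is an arbitrary assumption.

```c
#include <stdlib.h>

#define CHUNK_CAP 64   /* assumed chunk size; tune for your data */

struct chunk {
    struct chunk *next;      /* link to the next fixed-size array */
    size_t used;             /* entries filled in this chunk */
    int value[CHUNK_CAP];
};

struct chunk_list {
    struct chunk *head, *tail;
    size_t count;            /* total entries across all chunks */
};

/* Append value; returns 0 on success, 1 if allocation failed. */
int chunk_push(struct chunk_list *l, int value) {
    if (l->tail == NULL || l->tail->used == CHUNK_CAP) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL) return 1;
        c->next = NULL;
        c->used = 0;
        if (l->tail) l->tail->next = c; else l->head = c;
        l->tail = c;
    }
    l->tail->value[l->tail->used++] = value;
    l->count++;
    return 0;
}

/* Fetch element i (caller ensures i < l->count); walks i / CHUNK_CAP links. */
int chunk_get(const struct chunk_list *l, size_t i) {
    const struct chunk *c = l->head;
    for (; i >= CHUNK_CAP; i -= CHUNK_CAP)
        c = c->next;
    return c->value[i];
}

/* Free every chunk and reset the list to empty. */
void chunk_free(struct chunk_list *l) {
    struct chunk *c = l->head;
    while (c) {
        struct chunk *next = c->next;
        free(c);
        c = next;
    }
    l->head = l->tail = NULL;
    l->count = 0;
}
```

A list initialised with struct chunk_list l = {0}; is empty and ready for chunk_push. Existing elements are never copied when the list grows, at the cost of slower random access than a single contiguous array.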