How to replicate vector in c?

Vector and list aren't conceptually tied to C++. Similar structures can be implemented in C, just the syntax (and error handling) would look different. For example LodePNG implements a dynamic array with functionality very similar to that of std::vector. A sample usage looks like:

uivector v = {};
uivector_push_back(&v, 1);
uivector_push_back(&v, 42);
for(size_t i = 0; i < v.size; ++i)
    printf("%d\n", v.data[i]);
uivector_cleanup(&v);

As can be seen the usage is somewhat verbose and the code needs to be duplicated to support different types.

nothings/stb gives a simpler implementation that works with any types:

double *v = 0;
arrpush(v, 1.0);
arrpush(v, 42.0);
for(int i = 0; i < arrlen(v); ++i)
    printf("%g\n", v[i]);
arrfree(v);

It also provides hash maps, and the trick it uses for type-safe containers in C can be applied to other generic containers too.

Any of these methods can expand the underlying storage either by a call to realloc (see below), or by allocating new storage with malloc and freeing the old one with free -- which is equivalent to how std::vector grows its memory in C++.


A lot of C code, however, resorts to managing the memory directly with realloc:

void* newMem = realloc(oldMem, newSize);
if(!newMem) {
    // handle error
}
oldMem = newMem;

Note that realloc returns null in case of failure, yet the old memory is still valid. In such a situation this common (and incorrect) usage leaks memory:

oldMem = realloc(oldMem, newSize);
if(!oldMem) {
    // handle error
}

Compared to std::vector and the C equivalents from above, the simple realloc method does not provide O(1) amortized guarantee, even though realloc may sometimes be more efficient if it happens to avoid moving the memory around.


A lot of C projects end up implementing a vector-like API. Dynamic arrays are such a common need, that it's nice to abstract away the memory management as much as possible. A typical C implementation might look something like:

typedef struct dynamic_array_struct
{
  int* data;
  size_t capacity; /* total capacity */
  size_t size; /* number of elements in vector */
} vector;

Then they would have various API function calls which operate on the vector:

int vector_init(vector* v, size_t init_capacity)
{
  v->data = malloc(init_capacity * sizeof(int));
  if (!v->data) return -1;

  v->size = 0;
  v->capacity = init_capacity;

  return 0; /* success */
}

Then of course, you need functions for push_back, insert, resize, etc, which would call realloc if size exceeds capacity.

vector_resize(vector* v, size_t new_size);

vector_push_back(vector* v, int element);

Usually, when a reallocation is needed, capacity is doubled to avoid reallocating all the time. This is usually the same strategy employed internally by std::vector, except typically std::vector won't call realloc because of C++ object construction/destruction. Rather, std::vector might allocate a new buffer, and then copy construct/move construct the objects (using placement new) into the new buffer.

An actual vector implementation in C might use void* pointers as elements rather than int, so the code is more generic. Anyway, this sort of thing is implemented in a lot of C projects. See http://codingrecipes.com/implementation-of-a-vector-data-structure-in-c for an example vector implementation in C.


They would start by hiding the defining a structure that would hold members necessary for the implementation. Then providing a group of functions that would manipulate the contents of the structure.

Something like this:

typedef struct vec
{
    unsigned char* _mem;
    unsigned long _elems;
    unsigned long _elemsize;
    unsigned long _capelems;
    unsigned long _reserve;
};

vec* vec_new(unsigned long elemsize)
{
    vec* pvec = (vec*)malloc(sizeof(vec));
    pvec->_reserve = 10;
    pvec->_capelems = pvec->_reserve;
    pvec->_elemsize = elemsize;
    pvec->_elems = 0;
    pvec->_mem = (unsigned char*)malloc(pvec->_capelems * pvec->_elemsize);
    return pvec;
}

void vec_delete(vec* pvec)
{
    free(pvec->_mem);
    free(pvec);
}

void vec_grow(vec* pvec)
{
    unsigned char* mem = (unsigned char*)malloc((pvec->_capelems + pvec->_reserve) * pvec->_elemsize);
    memcpy(mem, pvec->_mem, pvec->_elems * pvec->_elemsize);
    free(pvec->_mem);
    pvec->_mem = mem;
    pvec->_capelems += pvec->_reserve;
}

void vec_push_back(vec* pvec, void* data, unsigned long elemsize)
{
    assert(elemsize == pvec->_elemsize);
    if (pvec->_elems == pvec->_capelems) {
        vec_grow(pvec);
    }
    memcpy(pvec->_mem + (pvec->_elems * pvec->_elemsize), (unsigned char*)data, pvec->_elemsize);
    pvec->_elems++;    
}

unsigned long vec_length(vec* pvec)
{
    return pvec->_elems;
}

void* vec_get(vec* pvec, unsigned long index)
{
    assert(index < pvec->_elems);
    return (void*)(pvec->_mem + (index * pvec->_elemsize));
}

void vec_copy_item(vec* pvec, void* dest, unsigned long index)
{
    memcpy(dest, vec_get(pvec, index), pvec->_elemsize);
}

void playwithvec()
{
    vec* pvec = vec_new(sizeof(int));

    for (int val = 0; val < 1000; val += 10) {
        vec_push_back(pvec, &val, sizeof(val));
    }

    for (unsigned long index = (int)vec_length(pvec) - 1; (int)index >= 0; index--) {
        int val;
        vec_copy_item(pvec, &val, index);
        printf("vec(%d) = %d\n", index, val);
    }

    vec_delete(pvec);
}

Further to this they would achieve encapsulation by using void* in the place of vec* for the function group, and actually hide the structure definition from the user by defining it within the C module containing the group of functions rather than the header. Also they would hide the functions that you would consider to be private, by leaving them out from the header and simply prototyping them only in the C module.


You can see implementation vc_vector:

struct vc_vector {
  size_t count;
  size_t element_size;
  size_t reserved_size;
  char* data;
  vc_vector_deleter* deleter;
};

...

vc_vector* vc_vector_create_copy(const vc_vector* vector) {
  vc_vector* new_vector = vc_vector_create(vector->reserved_size / vector->count,
                                           vector->element_size,
                                           vector->deleter);
  if (unlikely(!new_vector)) {
    return new_vector;
  }

  if (memcpy(vector->data,
             new_vector->data,
             new_vector->element_size * vector->count) == NULL) {
    vc_vector_release(new_vector);
    new_vector = NULL;
    return new_vector;
  }

  new_vector->count = vector->count;
  return new_vector;
}

To use it:

vc_vector* v1 = vc_vector_create(0, sizeof(int), NULL);
for (int i = 0; i < 10; ++i) {
  vc_vector_push_back(v1, &i);
}

// v1 = 0 1 2 3 4 5 6 7 8 9

vc_vector* v2 = vc_vector_create_copy(v1);

// v2 = 0 1 2 3 4 5 6 7 8 9 (copy of v1)

// to get pointer to int:

const int* v2_data = vc_vector_data(v1);