Why use array size 1 instead of pointer?

In one C++ open source project, I see this.

struct SomeClass {
  ...
  size_t data_length;
  char data[1];
  ...
};

What are the advantages of doing so rather than using a pointer?

struct SomeClass {
  ...
  size_t data_length;
  char* data;
  ...
};

The only thing I can think of is with the size 1 array version, users aren't expected to see NULL. Is there anything else?


With this, you don't have to allocate the memory elsewhere and make the pointer point to that.

  • No extra memory management
  • Accesses to the data are (much) more likely to hit the cache, because it sits right next to the rest of the object

The trick is to allocate more memory than sizeof (SomeClass), and make a SomeClass* point to it. Then the initial memory will be used by your SomeClass object, and the remaining memory can be used by the data. That is, you can say p->data[0] but also p->data[1] and so on up until you hit the end of memory you allocated.

One can argue that this use results in undefined behavior, because the array is declared with only one element but is accessed as if it contained more. In practice, real compilers allow it with the expected meaning, because C++ has no alternative syntax to express this (C99 does: it is called a "flexible array member" there).
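A minimal sketch of the over-allocation trick described above, assuming a cut-down SomeClass with only the two members shown in the question. The helpers make_some_class and byte_at are hypothetical names made up for illustration, not anything from the project in question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the struct in the question (other members omitted).
struct SomeClass {
  std::size_t data_length;
  char data[1];  // really the first byte of a longer buffer
};

// Hypothetical helper: one malloc holds both the header and `len` data bytes.
SomeClass* make_some_class(const char* src, std::size_t len) {
  // sizeof(SomeClass) already counts one data byte, so this is slightly generous.
  void* block = std::malloc(sizeof(SomeClass) + len);
  if (block == nullptr) return nullptr;
  SomeClass* p = static_cast<SomeClass*>(block);
  p->data_length = len;
  std::memcpy(p->data, src, len);  // writes past data[0] into the extra space
  return p;
}

// Reads byte i. For i > 0 this is formally out of bounds of char[1],
// which is exactly the undefined-behavior concern mentioned above.
char byte_at(const SomeClass* p, std::size_t i) {
  assert(i < p->data_length);
  return p->data[i];
}
```

A caller would do something like SomeClass* p = make_some_class("hello", 5); and later release everything with a single std::free(p).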


This is usually a quick (and dirty?) way of avoiding multiple memory allocations and deallocations, though it's more C style than C++.

That is, instead of this:

struct SomeClass *foo = malloc(sizeof *foo);
foo->data = malloc(data_len);
memcpy(foo->data,data,data_len);

....
free(foo->data);
free(foo);

You do something like this:

struct SomeClass *foo = malloc(sizeof *foo + data_len);
memcpy(foo->data,data,data_len);

...
free(foo);

In addition to saving (de)allocation calls, this can also save a bit of memory, as no space is needed for a pointer, and the data can even use space that would otherwise have been struct padding.
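To make the memory argument concrete, here is a rough sketch that compares the bytes requested from the allocator in both layouts. The struct and function names are hypothetical, the structs contain only the two members from the question, and per-allocation allocator overhead (which would favor the single block even more) is ignored.

```cpp
#include <cassert>
#include <cstddef>

struct WithPointer {      // hypothetical: the pointer variant
  std::size_t data_length;
  char* data;
};

struct WithInlineArray {  // hypothetical: the size-1 array variant
  std::size_t data_length;
  char data[1];
};

// Two allocations: the struct itself plus a separate buffer of n bytes.
constexpr std::size_t bytes_with_pointer(std::size_t n) {
  return sizeof(WithPointer) + n;
}

// One allocation. Counting from offsetof rather than sizeof lets the data
// reuse the padding after data[0]; never go below sizeof for the object itself.
constexpr std::size_t bytes_with_inline_array(std::size_t n) {
  const std::size_t needed = offsetof(WithInlineArray, data) + n;
  return needed < sizeof(WithInlineArray) ? sizeof(WithInlineArray) : needed;
}

// Assumption: on mainstream platforms the inline layout never needs more bytes.
std::size_t bytes_saved(std::size_t n) {
  assert(bytes_with_inline_array(n) <= bytes_with_pointer(n));
  return bytes_with_pointer(n) - bytes_with_inline_array(n);
}
```

The exact saving depends on the platform's pointer size and alignment rules, so treat the result of bytes_saved as indicative only.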


They are semantically different in your example.

char data[1] is a valid array of char with one uninitialized element, stored inside the object itself (on the stack, on the heap, or wherever the object lives). You could write data[0] = 'w' and your program would be correct.

char* data; simply declares a pointer that is invalid until initialized to point to a valid address.
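The difference can be checked with a small sketch. The struct names and the lives_inside helper are hypothetical and exist only for this illustration; the address comparison assumes the usual flat address space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ArrayVersion   { std::size_t data_length; char data[1]; };
struct PointerVersion { std::size_t data_length; char* data; };

// True if address q falls within the bytes occupied by obj.
template <typename T>
bool lives_inside(const T& obj, const void* q) {
  const auto begin = reinterpret_cast<std::uintptr_t>(&obj);
  const auto addr = reinterpret_cast<std::uintptr_t>(q);
  return addr >= begin && addr < begin + sizeof(T);
}

bool array_storage_is_inline() {
  ArrayVersion a{};
  a.data[0] = 'w';  // valid right away: the element is part of a
  return lives_inside(a, a.data) && a.data[0] == 'w';
}

bool pointer_storage_is_external() {
  static char buffer[4];
  PointerVersion p{};  // value-initialized, so p.data is null here
  assert(p.data == nullptr);
  p.data = buffer;     // must be pointed at real storage before use
  p.data[0] = 'w';
  return !lives_inside(p, p.data) && buffer[0] == 'w';
}
```

Both functions are expected to return true: the array element is part of the object, while the pointer only refers to storage that lives somewhere else.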


Usually you see this as the final member of a structure. Whoever mallocs the structure then allocates all the data bytes consecutively in memory, as one block that "follows" the structure.

So if you need 16 bytes of data, you'd allocate an instance like this:

SomeClass * pObj = (SomeClass *)malloc(sizeof(SomeClass) + (16 - 1));

Then you can access the data as if it were an array:

pObj->data[12] = 0xAB;

And you can free all the stuff with one call, of course, as well.

The data member is a single-item array by convention because older C compilers (and apparently the current C++ standard) don't allow a zero-sized array. Nice further discussion here: http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html


  1. The structure can be simply allocated as a single block of memory instead of multiple allocations that must be freed.

  2. It actually uses less memory because it doesn't need to store the pointer itself.

  3. There may also be performance advantages with caching due to the memory being contiguous.