C pointers : pointing to an array of fixed size

This question goes out to the C gurus out there:

In C, it is possible to declare a pointer as follows:

char (* p)[10];

.. which basically states that this pointer points to an array of 10 chars. The neat thing about declaring a pointer like this is that you will get a compile time error if you try to assign a pointer of an array of different size to p. It will also give you a compile time error if you try to assign the value of a simple char pointer to p. I tried this with gcc and it seems to work with ANSI, C89 and C99.

It looks to me like declaring a pointer like this would be very useful - particularly, when passing a pointer to a function. Usually, people would write the prototype of such a function like this:

void foo(char * p, int plen);

If you were expecting a buffer of an specific size, you would simply test the value of plen. However, you cannot be guaranteed that the person who passes p to you will really give you plen valid memory locations in that buffer. You have to trust that the person who called this function is doing the right thing. On the other hand:

void foo(char (*p)[10]);

..would force the caller to give you a buffer of the specified size.

This seems very useful but I have never seen a pointer declared like this in any code I have ever ran across.

My question is: Is there any reason why people do not declare pointers like this? Am I not seeing some obvious pitfall?


Solution 1:

What you are saying in your post is absolutely correct. I'd say that every C developer comes to exactly the same discovery and to exactly the same conclusion when (if) they reach certain level of proficiency with C language.

When the specifics of your application area call for an array of specific fixed size (array size is a compile-time constant), the only proper way to pass such an array to a function is by using a pointer-to-array parameter

void foo(char (*p)[10]);

(in C++ language this is also done with references

void foo(char (&p)[10]);

).

This will enable language-level type checking, which will make sure that the array of exactly correct size is supplied as an argument. In fact, in many cases people use this technique implicitly, without even realizing it, hiding the array type behind a typedef name

typedef int Vector3d[3];

void transform(Vector3d *vector);
/* equivalent to `void transform(int (*vector)[3])` */
...
Vector3d vec;
...
transform(&vec);

Note additionally that the above code is invariant with relation to Vector3d type being an array or a struct. You can switch the definition of Vector3d at any time from an array to a struct and back, and you won't have to change the function declaration. In either case the functions will receive an aggregate object "by reference" (there are exceptions to this, but within the context of this discussion this is true).

However, you won't see this method of array passing used explicitly too often, simply because too many people get confused by a rather convoluted syntax and are simply not comfortable enough with such features of C language to use them properly. For this reason, in average real life, passing an array as a pointer to its first element is a more popular approach. It just looks "simpler".

But in reality, using the pointer to the first element for array passing is a very niche technique, a trick, which serves a very specific purpose: its one and only purpose is to facilitate passing arrays of different size (i.e. run-time size). If you really need to be able to process arrays of run-time size, then the proper way to pass such an array is by a pointer to its first element with the concrete size supplied by an additional parameter

void foo(char p[], unsigned plen);

Actually, in many cases it is very useful to be able to process arrays of run-time size, which also contributes to the popularity of the method. Many C developers simply never encounter (or never recognize) the need to process a fixed-size array, thus remaining oblivious to the proper fixed-size technique.

Nevertheless, if the array size is fixed, passing it as a pointer to an element

void foo(char p[])

is a major technique-level error, which unfortunately is rather widespread these days. A pointer-to-array technique is a much better approach in such cases.

Another reason that might hinder the adoption of the fixed-size array passing technique is the dominance of naive approach to typing of dynamically allocated arrays. For example, if the program calls for fixed arrays of type char[10] (as in your example), an average developer will malloc such arrays as

char *p = malloc(10 * sizeof *p);

This array cannot be passed to a function declared as

void foo(char (*p)[10]);

which confuses the average developer and makes them abandon the fixed-size parameter declaration without giving it a further thought. In reality though, the root of the problem lies in the naive malloc approach. The malloc format shown above should be reserved for arrays of run-time size. If the array type has compile-time size, a better way to malloc it would look as follows

char (*p)[10] = malloc(sizeof *p);

This, of course, can be easily passed to the above declared foo

foo(p);

and the compiler will perform the proper type checking. But again, this is overly confusing to an unprepared C developer, which is why you won't see it in too often in the "typical" average everyday code.

Solution 2:

I would like to add to AndreyT's answer (in case anyone stumbles upon this page looking for more info on this topic):

As I begin to play more with these declarations, I realize that there is major handicap associated with them in C (apparently not in C++). It is fairly common to have a situation where you would like to give a caller a const pointer to a buffer you have written into. Unfortunately, this is not possible when declaring a pointer like this in C. In other words, the C standard (6.7.3 - Paragraph 8) is at odds with something like this:


   int array[9];

   const int (* p2)[9] = &array;  /* Not legal unless array is const as well */

This constraint does not seem to be present in C++, making these type of declarations far more useful. But in the case of C, it is necessary to fall back to a regular pointer declaration whenever you want a const pointer to the fixed size buffer (unless the buffer itself was declared const to begin with). You can find more info in this mail thread: link text

This is a severe constraint in my opinion and it could be one of the main reasons why people do not usually declare pointers like this in C. The other being the fact that most people do not even know that you can declare a pointer like this as AndreyT has pointed out.

Solution 3:

The obvious reason is that this code doesn't compile:

extern void foo(char (*p)[10]);
void bar() {
  char p[10];
  foo(p);
}

The default promotion of an array is to an unqualified pointer.

Also see this question, using foo(&p) should work.

Solution 4:

I also want to use this syntax to enable more type checking.

But I also agree that the syntax and mental model of using pointers is simpler, and easier to remember.

Here are some more obstacles I have come across.

  • Accessing the array requires using (*p)[]:

    void foo(char (*p)[10])
    {
        char c = (*p)[3];
        (*p)[0] = 1;
    }
    

    It is tempting to use a local pointer-to-char instead:

    void foo(char (*p)[10])
    {
        char *cp = (char *)p;
        char c = cp[3];
        cp[0] = 1;
    }
    

    But this would partially defeat the purpose of using the correct type.

  • One has to remember to use the address-of operator when assigning an array's address to a pointer-to-array:

    char a[10];
    char (*p)[10] = &a;
    

    The address-of operator gets the address of the whole array in &a, with the correct type to assign it to p. Without the operator, a is automatically converted to the address of the first element of the array, same as in &a[0], which has a different type.

    Since this automatic conversion is already taking place, I am always puzzled that the & is necessary. It is consistent with the use of & on variables of other types, but I have to remember that an array is special and that I need the & to get the correct type of address, even though the address value is the same.

    One reason for my problem may be that I learned K&R C back in the 80s, which did not allow using the & operator on whole arrays yet (although some compilers ignored that or tolerated the syntax). Which, by the way, may be another reason why pointers-to-arrays have a hard time to get adopted: they only work properly since ANSI C, and the & operator limitation may have been another reason to deem them too awkward.

  • When typedef is not used to create a type for the pointer-to-array (in a common header file), then a global pointer-to-array needs a more complicated extern declaration to share it across files:

    fileA:
    char (*p)[10];
    
    fileB:
    extern char (*p)[10];