May I treat a 2D array as a contiguous 1D array?

Consider the following code:

int a[25][80];
a[0][1234] = 56;
int* p = &a[0][0];
p[1234] = 56;

Does the second line invoke undefined behavior? How about the fourth line?


It's up to interpretation. While the contiguity requirements of arrays don't leave much to the imagination in terms of how to layout a multidimensional arrays (this has been pointed out before), notice that when you're doing p[1234] you're indexing the 1234th element of the zeroth row of only 80 columns. Some interpret the only valid indices to be 0..79 (&p[80] being a special case).

Information from the C FAQ which is the collected wisdom of Usenet on matters relevant to C. (I do not think C and C++ differ on that matter and that this is very much relevant.)


Both lines do result in undefined behavior.

Subscripting is interpreted as pointer addition followed by an indirection, that is, a[0][1234]/p[1234] is equivalent to *(a[0] + 1234)/*(p + 1234). According to [expr.add]/4 (here I quote the newest draft, while for the time OP is proposed, you can refer to this comment, and the conclusion is the same):

If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i+j] if 0≤i+j≤n; otherwise, the behavior is undefined.

since a[0](decayed to a pointer to a[0][0])/p points to an element of a[0] (as an array), and a[0] only has size 80, the behavior is undefined.


As Language Lawyer pointed out in the comment, the following program does not compile.

constexpr int f(const int (&a)[2][3])
{
    auto p = &a[0][0];
    return p[3];
}

int main()
{
    constexpr int a[2][3] = { 1, 2, 3, 4, 5, 6, };
    constexpr int i = f(a);
}

The compiler detected such undefined behaviors when it appears in a constant expression.


In the language the Standard was written to describe, there would be no problem with invoking a function like:

void print_array(double *d, int rows, int cols)
{
  int r,c;
  for (r = 0; r < rows; r++)
  {
    printf("%4d: ", r);
    for (c = 0; c < cols; c++)
      printf("%10.4f ", d[r*cols+c]);
    printf("\n");
  }
}

on a double[10][4], or a double[50][40], or any other size, provided that the total number of elements in the array was less than rows*cols. Indeed, the guarantee that the row stride of a T[R][C] would equal C * sizeof (T) was designed among other things to make it possible to write code that could work with arbitrarily-sized multi-dimensional arrays.

On the other hand, the authors of the Standard recognized that when implementations are given something like:

double d[10][10];
double test(int i)
{
  d[1][0] = 1.0;
  d[0][i] = 2.0;
  return d[1][0]; 
}

allowing them to generate code that would assume that d[1][0] would still hold 1.0 when the return executes, or allowing them to generate code that would trap if i is greater than 10, would allow them to be more suitable for some purposes than requiring that they silently return 2.0 if invoked with i==10.

Nothing in the Standard makes any distinction between those scenarios. While it would have been possible for the Standard to have included rules that would say that the second example invokes UB if i >= 10 without affecting the first example (e.g. say that applying [N] to an array doesn't cause it to decay to a pointer, but instead yields the Nth element, which must exist in that array), the Standard instead relies upon the fact that implementations are allowed to behave in useful fashion even when not required to do so, and compiler writers should presumably be capable of recognizing situations like the first example when doing so would benefit their customers.

Since the Standard never sought to fully define everything that programmers would need to do with arrays, it should not be looked to for guidance as to what constructs quality implementations should support.