Why do we need to specify the column size when passing a 2D array as a parameter?
When used as function parameters, array types are always adjusted to pointers to their first element. When you pass an array declared as int Array[3] to the function void foo(int array[]), it decays into a pointer to the beginning of the array, i.e. int *Array. By the way, you can declare the parameter as int array[3], int array[6], or even int *array: all of these are equivalent, and you can pass an integer array of any length without problems.
For arrays of arrays (2D arrays), the argument also decays to a pointer to its first element, which happens to be a one-dimensional array, i.e. we get int (*Array)[3]. Specifying the size here is important. If it were not mandatory, the compiler would have no way of knowing how to handle an expression such as Array[2][1].
To dereference that, the compiler needs to compute the offset of the item within a contiguous block of memory (int Array[3][3] is a contiguous block of integers), which is easy for pointers. If a is a pointer, then a[N] is expanded as start_address_in_a + N * size_of_item_being_pointed_by_a. For the expression Array[2][1] inside a function, Array is a pointer to a one-dimensional array and the same formula applies. The number of elements in the last square bracket is required to find size_of_item_being_pointed_by_a (here, 3 * sizeof(int)). If we had just Array[][], it would be impossible to find that size and hence impossible to locate the array element we need.
Without the size, pointer arithmetic wouldn't work for arrays of arrays. What address would Array + 2 produce: the address in Array advanced by 2 bytes (wrong), or advanced by 2 * 3 * sizeof(int) bytes?
In C/C++, even 2-D arrays are stored sequentially, one row after another in memory. So, when you have (in a single function):
int a[5][3];
int *head;
head = &a[0][0];
a[2][1] = 2; // <--
The element you are actually accessing with a[2][1] is *(head + 2*3 + 1), because, sequentially, that element comes after the 3 elements of row 0 and the 3 elements of row 1, and then one more index further.
If you declare a function like:
void some_function(int array[][]) {...}
the compiler rejects it, because every dimension except the first must be specified. Even if it were accepted, an access such as array[2][3] could not be resolved: nothing tells the compiler which element is meant. On the other hand, when you have:
void some_function(int array[][5]) {...}
the compiler can determine that array[2][3] refers to the element *(&array[0][0] + 2*5 + 3), because the function knows the size of the second dimension.
There is one other option. As previously suggested, you can declare a function like:
void some_function(int *array, int cols) { ... }
because this way you call the function with the same "information" as before: the number of columns. You then access the array elements a bit differently: you have to write *(array + i*cols + j) where you would usually write array[i][j], because array is now a pointer to int (not a pointer to an array).
When you declare a function like this, you have to be careful to call it with the number of columns actually declared for the array, not just the number of columns you use. So, for example:
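#include <stdio.h>

void some_function(int *array, int cols);

int main(void) {
    int a[5][5];
    int i, j;
    /* Only a 3x3 corner is filled, but each row is still 5 ints wide. */
    for (i = 0; i < 3; ++i) {
        for (j = 0; j < 3; ++j) {
            scanf("%d", &a[i][j]);
        }
    }
    some_function(&a[0][0], 5); // <- correct: 5 columns are declared
    some_function(&a[0][0], 3); // <- wrong: 3 are used, but rows are 5 wide
    return 0;
}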