Performance of 2-dimensional array vs 1-dimensional array

arrays c

In C, is there a difference in time and space between an m×n 2-dimensional array vs a 1-dimensional array of length m×n (for large values of m and n)? Will accessing elements be faster with a 1-dimensional array?

In C, 2-dimensional arrays are just a neat indexing scheme for 1-dimensional arrays. Just like with a 1D array, 2D arrays allocate a single block of contiguous memory, and the A[row][col] notation is similar to saying A[row*NCOLS+col].

Usually if you were to implement your own multidimensional arrays using single dimensional arrays, you'd write an indexing function:

int getIndex(int row, int col) { return row*NCOLS+col; }

Assuming your compiler inlines this function, the performance here would be precisely the same as if you used the built in 'indexing function' of 2D arrays.

To illustrate:

#define NROWS 10
#define NCOLS 20

This:

int main(int argc, char *argv[]) {
    int myArr[NROWS*NCOLS];
    for (int i=0; i<NROWS; ++i) {
       for (int j=0; j<NCOLS; ++j) {
          myArr[getIndex(i,j)] = i+j;
       }
    }
    return 0;
}

Should perform the same as this:

int main(int argc, char *argv[]) {
    int myArr[NROWS][NCOLS];
    for (int i=0; i<NROWS; ++i) {
       for (int j=0; j<NCOLS; ++j) {
          myArr[i][j] = i+j;
       }
    }
    return 0;
}

Though as AraK pointed out, if you are jumping around rows a lot, and the rows are very large, you could hit a lot of page faults... in that case the custom indexing function (with rows and cols switched around) could help, but so could simply changing which of the dimensions in a 2-dimensional array you treat as the rows and which you treat as the columns.

Actually, if you use the so-called two-dimensional array in C, the compiler will do the mapping into one-dimensional array for you. If you use one-dimensional array and you like to treat it as a two-dimensional one, then you have to write the mapping yourself.

The only thing that you have to take care of is you should access the array row-wise, because the C compiler will store your two-dimensional array row after row. If you access a 'big' two-dimensional array column-wise then page-faults are likely to happen. Even if you are programming in language that supports only one-dimensional arrays, you could easily write the mapping into any number of dimensions.

Take a look at this Wikipedia article if you want to do the mapping row-wise. Your mapping could be column-wise, like FORTRAN matrices for example.

Robert is correct. Indexing expressions are compiled to pointer arithmetic expressions and so there's no difference.

What can have an impact is access order however, and so you may want to implement things yourself so you can control the access order. For example column first vs row first forms.

On modern processors, accessing large arrays at various strides can have unexpected performance differences. Sequential access is always fastest, and other strides can be up to 30 times slower due to cache interactions. Multi dimensional arrays where the inner dimensions are a power of two often have poor performance because of they way they interact with the cache associativity. To understand these issues there's no real substitute for doing measurement.

I don't think there's any difference. Internally, c treats a two dimensional array like several one-dimensional arrays in sequence.

However, as with all things performance, your mileage may vary. There may be some kind of subtle pointer arithmetic difference. Run timed tests on both scenarios. Whichever one runs faster wins.

As said by other, the difference really is how you access your items: what matters if how your items are layout in the memory, which is linear, at least on common architectures. So all you have really is 1d array, the 2d, etc... is "just" a convenience, and a reasonable compiler should optimize the indexing - but in practice, once you have more than a few variables, compilers often fail on arch like x86 because of the register starvation.

Now, it depends on your application, but I think you should think with a 1d layout by default, especially, if you need to handle multiple dimensions. The first problem with multi-dimensional arrays in C is that you cannot dynamically allocate them - if you allocate on a per-row basis, you will have awful performances because you don't have a contiguous piece of memory. See the FFTW doc for details about this.

Note that you can always describe your single piece of memory with convenient array indexing on top of it (you allocate one big nxm memory block, and then create an array of n pointer to each row).

Performance of 2-dimensional array vs 1-dimensional array

Related

Recent Posts