Large 2D array gives segmentation fault
I am writing some C++ code in Linux where I have declared a few 2D arrays like so:
double x[5000][500], y[5000][500], z[5000][500];
It compiles without errors, but when I execute it I get a "segmentation fault". When I reduce the array size from 5000 to 50, the program runs fine. How can I protect myself against this problem?
If your program looks like this ...
int main(int, char **) {
    double x[5000][500], y[5000][500], z[5000][500];
    // ...
    return 0;
}
... then you are overflowing the stack: three 5000 × 500 arrays of double take 3 × 5000 × 500 × 8 bytes ≈ 60 MB, far more than the typical stack limit of a few megabytes (8 MB by default on most Linux systems). The fastest way to fix this is to add the keyword static:
int main(int, char **) {
    static double x[5000][500], y[5000][500], z[5000][500];
    // ...
    return 0;
}
The second fastest way to fix this is to move the declaration out of the function:
double x[5000][500], y[5000][500], z[5000][500];
int main(int, char **) {
    // ...
    return 0;
}
The third fastest way to fix this is to allocate the memory on the heap:
int main(int, char **) {
    double **x = new double*[5000];
    double **y = new double*[5000];
    double **z = new double*[5000];
    for (size_t i = 0; i < 5000; i++) {
        x[i] = new double[500];
        y[i] = new double[500];
        z[i] = new double[500];
    }
    // ...
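    // Free every row first; the arrays of row pointers themselves are released below.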
    for (size_t i = 5000; i > 0; ) {
        delete[] z[--i];
        delete[] y[i];
        delete[] x[i];
    }
    delete[] z;
    delete[] y;
    delete[] x;
    return 0;
}
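With this approach each of the 5000 rows is a separate allocation, so the rows are not contiguous in memory and every element access pays an extra pointer indirection; the flat one-dimensional layout described further down avoids both costs.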
The fourth fastest way is to allocate them on the heap using std::vector. This is fewer lines in your file (though more in the compilation unit), and you must either think of a meaningful name for your derived vector types or tuck them into an anonymous namespace so they won't pollute the global namespace:
#include <vector>
using std::vector;
namespace {
    struct Y : public vector<double> { Y() : vector<double>(500) {} };
    struct XY : public vector<Y> { XY() : vector<Y>(5000) {} };
}
int main(int, char **) {
    XY x, y, z;
    // ...
    return 0;
}
The fifth fastest way is to allocate them on the heap, but use templates so the dimensions are not so remote from the objects:
#include <vector>
using namespace std;
namespace {
    template <size_t N>
    struct Y : public vector<double> { Y() : vector<double>(N) {} };
    template <size_t N1, size_t N2>
    struct XY : public vector< Y<N2> > { XY() : vector< Y<N2> >(N1) {} };
}
int main(int, char **) {
    XY<5000,500> x, y, z;
    XY<500,50> mini_x, mini_y, mini_z;
    // ...
    return 0;
}
The most performant way is to allocate the two-dimensional arrays as one-dimensional arrays, and then use index arithmetic.
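A minimal sketch of that layout, using std::vector as the backing store (the names and the row-major formula i * cols + j are illustrative, not prescribed by the answer above):
#include <cstddef>
#include <vector>
int main(int, char **) {
    const std::size_t rows = 5000, cols = 500;
    // One contiguous heap block per "2D" array; element (i, j) lives at index i*cols + j.
    std::vector<double> x(rows * cols), y(rows * cols), z(rows * cols);
    std::size_t i = 4999, j = 499;
    x[i * cols + j] = 1.0;                    // conceptually x[i][j]
    y[i * cols + j] = x[i * cols + j] + 2.0;  // conceptually y[i][j]
    z[i * cols + j] = y[i * cols + j];        // conceptually z[i][j]
    return 0;
}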
All the above assumes that you have some reason, a good one or a poor one, for wanting to craft your own multidimensional array mechanism. If you have no reason, and expect to use multidimensional arrays again, strongly consider installing a library:
A plays-nicely-with-STL way is to use the Boost Multidimensional Array.
A speed-oriented way is to use Blitz++; a rough sketch follows.
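A rough sketch of the Blitz++ route (assuming the library is installed and its headers are on the include path; the sizes mirror the question, and element access goes through operator()):
#include <blitz/array.h>
int main(int, char **) {
    // Blitz++ keeps the element data on the heap and releases it automatically.
    blitz::Array<double, 2> x(5000, 500), y(5000, 500), z(5000, 500);
    x = 0.0;              // whole-array scalar assignment
    y(4999, 499) = 1.0;   // element access via operator()
    z = x + y;            // whole-array expressions
    return 0;
}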
These arrays are on the stack, and stack space is quite limited. You are probably running into a ... stack overflow :)
If you want to avoid this, you need to put them on the free store:
double* x = new double[5000*500];
But you had better get into the good habit of using the standard containers, which wrap all this for you:
std::vector< std::vector<double> > x( 5000, std::vector<double>(500) );
Plus: even if the stack fits the arrays, you still need room for functions to put their frames on it.
You may want to try using Boost.MultiArray:
typedef boost::multi_array<double, 2> Double2d;
Double2d x(boost::extents[5000][500]);
Double2d y(boost::extents[5000][500]);
Double2d z(boost::extents[5000][500]);
The actual large memory chunk will be allocated on the heap and automatically deallocated when the arrays go out of scope.
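Elements can then be read and written with the familiar subscript syntax; for example, continuing from the declarations above:
for (int i = 0; i < 5000; ++i)
    for (int j = 0; j < 500; ++j)
        x[i][j] = 0.0;   // boost::multi_array supports chained [i][j] indexing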