MPI_Scatter - sending columns of 2D array
Solution 1:
This is very similar to the question "How to MPI_Gatherv columns from processor, where each process may send different number of columns". The issue is that columns aren't contiguous in memory, so you have to describe their layout to MPI with a derived datatype.
As is always the case in C, which lacks real multidimensional arrays, you have to be a little careful about memory layout. In C, a statically declared array like
float a[nrows][ncols]
is contiguous in memory (stored row by row), so you should be alright for now. However, be aware that this is no longer guaranteed once you go to dynamic allocation with a separate malloc per row; you have to allocate all the data at once to make sure that it is contiguous, e.g.
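#include <stdlib.h>

/* Allocate an n x m array whose data lives in one contiguous block. */
float **floatalloc2d(int n, int m) {
    float *data = (float *)malloc(n*m*sizeof(float));
    float **array = (float **)calloc(n, sizeof(float *));
    for (int i=0; i<n; i++)
        array[i] = &(data[i*m]);   /* row i starts i*m floats into the block */
    return array;
}

void floatfree2d(float **array) {
    free(array[0]);   /* the contiguous data block */
    free(array);      /* the row pointers */
}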
/* ... */
float **a;
nrows = 3;
ncols = 2;
a = floatalloc2d(nrows,ncols);
but I think you're ok for now.
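One way to convince yourself that such an allocation really is contiguous is to compare element addresses against row-major offsets. The following is a minimal sketch in plain C (no MPI); alloc2d_contiguous, free2d_contiguous and is_row_major are hypothetical helper names that mirror the allocation scheme above, not part of any library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: n x m array of floats backed by one contiguous block. */
float **alloc2d_contiguous(int n, int m) {
    assert(n > 0 && m > 0);
    float *data = malloc((size_t)n * m * sizeof(float));
    float **array = malloc((size_t)n * sizeof(float *));
    if (data == NULL || array == NULL) {
        free(data);
        free(array);
        return NULL;
    }
    for (int i = 0; i < n; i++)
        array[i] = &data[(size_t)i * m];   /* row i begins i*m floats in */
    return array;
}

void free2d_contiguous(float **array) {
    if (array != NULL) {
        free(array[0]);   /* the data block */
        free(array);      /* the row pointers */
    }
}

/* Returns 1 if every a[i][j] sits exactly i*m + j floats after a[0][0]. */
int is_row_major(float **a, int n, int m) {
    float *base = &a[0][0];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            if (&a[i][j] != base + (size_t)i * m + j)
                return 0;
    return 1;
}
```

If is_row_major returns 1, then &(a[0][0]) can be handed to MPI as the start of a single buffer of n*m floats.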
Now that you have your 2d array one way or another, you have to create your type. The type you've described is fine if you are just sending one column; but the trick here is that if you're sending multiple columns, each column starts only one float past the start of the previous one, even though the column itself spans almost the whole array! So you need to move the upper bound of the type for this to work:
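MPI_Datatype col, coltype;

/* nrows blocks of 1 float each, ncols floats apart: one column of a */
MPI_Type_vector(nrows, 1, ncols, MPI_FLOAT, &col);
MPI_Type_commit(&col);
/* shrink the extent to one float so consecutive columns start one float apart */
MPI_Type_create_resized(col, 0, 1*sizeof(float), &coltype);
MPI_Type_commit(&coltype);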
will do what you want. NOTE that the receiving processes will have different types than the sending process, because they are storing a smaller number of columns; so the stride between elements is smaller.
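To see why the extent matters, the offsets touched by several consecutive column types can be worked out by hand. The sketch below is plain C, not MPI: column_offsets and vector_extent_floats are hypothetical helpers that mimic how MPI places the k-th element of a buffer at k times the type's extent, with the extent expressed in floats.

```c
#include <assert.h>
#include <stddef.h>

/* Extent, in floats, of MPI_Type_vector(nrows, 1, ncols, MPI_FLOAT) before
   resizing: it runs from the first float of the column to the last one. */
size_t vector_extent_floats(int nrows, int ncols) {
    return (size_t)(nrows - 1) * ncols + 1;
}

/* Hypothetical helper: fill out[] with the float offsets that `count`
   consecutive column types would touch, where each successive type starts
   `extent` floats after the previous one. out must hold count*nrows entries. */
void column_offsets(int nrows, int ncols, int count, size_t extent, size_t *out) {
    assert(nrows > 0 && ncols > 0 && count > 0);
    for (int k = 0; k < count; k++)          /* k-th copy of the type */
        for (int i = 0; i < nrows; i++)      /* i-th row within the column */
            out[(size_t)k * nrows + i] = (size_t)k * extent + (size_t)i * ncols;
}
```

For a 3 x 4 array, two types with an extent of 1 float touch offsets 0, 4, 8 and 1, 5, 9, which are columns 0 and 1. With the unresized extent of 9 floats, the second type would instead start at offset 9 and touch 9, 13, 17, running past the end of the 12-float array.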
Finally, you can now do your scatter,
MPI_Comm_size(MPI_COMM_WORLD,&size);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);

/* Only the root holds the full array; the send buffer is ignored elsewhere. */
if (rank == 0) {
    a = floatalloc2d(nrows,ncols);
    sendptr = &(a[0][0]);
} else {
    sendptr = NULL;
}

int ncolsperproc = ncols/size;  /* we're assuming this divides evenly */
b = floatalloc2d(nrows, ncolsperproc);

MPI_Datatype acol, acoltype, bcol, bcoltype;

/* Send-side column type: the stride is the full row length, ncols. */
if (rank == 0) {
    MPI_Type_vector(nrows, 1, ncols, MPI_FLOAT, &acol);
    MPI_Type_commit(&acol);
    MPI_Type_create_resized(acol, 0, 1*sizeof(float), &acoltype);
    MPI_Type_commit(&acoltype);
}

/* Receive-side column type: the stride is the local row length, ncolsperproc. */
MPI_Type_vector(nrows, 1, ncolsperproc, MPI_FLOAT, &bcol);
MPI_Type_commit(&bcol);
MPI_Type_create_resized(bcol, 0, 1*sizeof(float), &bcoltype);
MPI_Type_commit(&bcoltype);

MPI_Scatter(sendptr, ncolsperproc, acoltype, &(b[0][0]), ncolsperproc, bcoltype, 0, MPI_COMM_WORLD);
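A serial reference can be handy for checking the result of the scatter. The sketch below is plain C rather than MPI; expected_block is a hypothetical helper that, assuming ncols divides evenly by size and both arrays are stored contiguously in row-major order, fills in what a given rank should end up with in b.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the full nrows x ncols array a (row-major, flat),
   compute the nrows x (ncols/size) block that rank `rank` should receive,
   i.e. b[i][k] = a[i][rank*ncolsperproc + k]. */
void expected_block(const float *a, int nrows, int ncols,
                    int size, int rank, float *b) {
    assert(size > 0 && ncols % size == 0);   /* assumption: even division */
    assert(rank >= 0 && rank < size);
    int ncolsperproc = ncols / size;
    for (int i = 0; i < nrows; i++)
        for (int k = 0; k < ncolsperproc; k++)
            b[(size_t)i * ncolsperproc + k] =
                a[(size_t)i * ncols + (size_t)rank * ncolsperproc + k];
}
```

After the scatter, each rank can compare its received &(b[0][0]) element by element against this reference, provided it can regenerate the contents of a (for example, if a was filled deterministically).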
Solution 2:
There are quite a few things wrong with that, but your main problem is memory layout. At the memory location denoted by a, there isn't a single float: there are only float* pointers that point to various arrays of float elsewhere in memory. Since these arrays are not necessarily contiguous, you can't use Scatter on them.
The easiest solution would be to store your matrix in a single array:
float a[100*101];
And fill it in column-major order. Then simply Scatter like so:
MPI_Scatter(a, 10*101, MPI_FLOAT, send, 10*101, MPI_FLOAT, 0, MPI_COMM_WORLD);
This is assuming that you Scatter between 10 processes and send is defined as a float[10*101] in each process. The send count is the number of items sent to each process, not the total, so it matches the receive count. Note that in the code you posted, arguments 4-6 of Scatter are definitely flawed. If send is an array, then you don't need to pass &send (for the same reason you don't need to pass &a in the first argument), and you want to match the number and type of data items you receive to what you send.
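The column-major layout can be sketched as follows. This is plain C with hypothetical helper names (cm_index, chunk_start); it assumes the columns divide evenly among the processes, and the row and column counts are passed in rather than taken from the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: position of element (row, col) in a flat
   column-major array with nrows rows. Each column is contiguous. */
size_t cm_index(int row, int col, int nrows) {
    return (size_t)col * nrows + row;
}

/* Hypothetical helper: offset of the first float that process `rank`
   receives when ncols columns are split evenly among nprocs processes. */
size_t chunk_start(int rank, int nrows, int ncols, int nprocs) {
    assert(nprocs > 0 && ncols % nprocs == 0);  /* assumption: even split */
    return (size_t)rank * (ncols / nprocs) * nrows;
}
```

Assuming 101 rows, 100 columns and 10 processes, the chunk for rank 3 starts at offset 3*10*101 = 3030, which is exactly the index of element (0, 30). Each process therefore receives whole columns as one contiguous run of floats, and no derived datatype is needed.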