How can I send rows of a matrix to all the processes using MPI_Scatterv?

I am working with the MPI interface. I want to split a matrix by rows and distribute the parts among all the processes.

For example, I have this 7x7 square matrix M.

M = [
    0.00    1.00    2.00    3.00    4.00    5.00    6.00    
    7.00    8.00    9.00    10.00   11.00   12.00   13.00
    14.00   15.00   16.00   17.00   18.00   19.00   20.00
    21.00   22.00   23.00   24.00   25.00   26.00   27.00
    28.00   29.00   30.00   31.00   32.00   33.00   34.00
    35.00   36.00   37.00   38.00   39.00   40.00   41.00
    42.00   43.00   44.00   45.00   46.00   47.00   48.00
];

I have 3 processes, and the block distribution computed by the macros in my code splits the rows like this:

  • Process 0 gets rows 0 and 1
  • Process 1 gets rows 2 and 3
  • Process 2 gets rows 4, 5 and 6

After the Scatterv, it should look like this:

Process 0:
M0 = [
    0.00    1.00    2.00    3.00    4.00    5.00    6.00    
    7.00    8.00    9.00    10.00   11.00   12.00   13.00
];

Process 1:
M1 = [
    14.00   15.00   16.00   17.00   18.00   19.00   20.00
    21.00   22.00   23.00   24.00   25.00   26.00   27.00
];

Process 2:
M2 = [
    28.00   29.00   30.00   31.00   32.00   33.00   34.00
    35.00   36.00   37.00   38.00   39.00   40.00   41.00
    42.00   43.00   44.00   45.00   46.00   47.00   48.00
];

I think that makes clear what I want to achieve; feel free to ask if anything needs more explanation.

Now, here is my code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

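/* Block-decomposition helpers: with p processes and n rows, process id
   owns rows BLOCK_LOW(id,p,n) through BLOCK_HIGH(id,p,n), i.e.
   BLOCK_SIZE(id,p,n) rows, and BLOCK_OWNER(index,p,n) maps a row index
   back to the process that owns it. */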
#define BLOCK_LOW(id,p,n) ((id)*(n)/(p))
#define BLOCK_HIGH(id,p,n) ((id+1)*(n)/(p) - 1)
#define BLOCK_SIZE(id,p,n) ((id+1)*(n)/(p) - (id)*(n)/(p))
#define BLOCK_OWNER(index,p,n) (((p)*((index)+1)-1)/(n))

// Allocates an m x n matrix with a single malloc: the m row pointers
// come first, immediately followed by the contiguous data block.
void **matrix_create(size_t m, size_t n, size_t size) {
    size_t i;
    void **p = (void **) malloc(m*n*size + m*sizeof(void *));
    char *c = (char *) (p + m);            // start of the data block
    for (i = 0; i < m; ++i)
        p[i] = (void *) (c + i*n*size);    // row i points into the block
    return p;
}

void matrix_print(double **M, size_t m, size_t n, char *name) {
    size_t i,j;
    printf("%s=[",name);
    for(i=0; i<m; ++i) {
        printf("\n  ");
        for(j=0; j<n; ++j)
            printf("%.2f  ", M[i][j]);
    }
    printf("\n];\n");
}

int main(int argc, char *argv[]) {

    int npes, myrank, root = 0, n = 7, rows, i, j, *sendcounts, *displs;
    double **m, **mParts;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD,&npes);
    MPI_Comm_rank(MPI_COMM_WORLD,&myrank);

    // Matrix M is generated in the root process (process 0)
    if (myrank == root) {
        m = (double**)matrix_create(n, n, sizeof(double));
        for (i = 0; i < n; ++i)
            for (j = 0; j < n; ++j)
                m[i][j] = (double)(n * i + j);
    }

    // Number of rows assigned to each process
    sendcounts = malloc(npes * sizeof(int));
    // Displacement of each process's chunk, in rows
    displs = malloc(npes * sizeof(int));
    // For each process ...
    for (j = 0; j < npes; j++) {
        // Sets each number of rows
        sendcounts[j] = BLOCK_SIZE(j, npes, n);
        // Sets each displacement
        displs[j] = BLOCK_LOW(j, npes, n);
    }
    // Each process looks up how many rows it will receive
    rows = sendcounts[myrank];
    // Creates the empty matrices for the parts of M
    mParts = (double**)matrix_create(rows, n, sizeof(double));
    // Scatters the matrix parts through all the processes
    MPI_Scatterv(m, sendcounts, displs, MPI_DOUBLE, mParts, rows, MPI_DOUBLE, root, MPI_COMM_WORLD);

    // This is where I get the Segmentation Fault
    if (myrank == 1) matrix_print(mParts, rows, n, "mParts");

    MPI_Finalize();
    return 0;
}

I get a segmentation fault when I try to read the scattered data, which suggests that the scatter did not work. I already did this with one-dimensional arrays and it worked, but with two-dimensional arrays things get a little trickier.

Could you help me find the bug, please?

Thanks


MPI_Scatterv needs a pointer to the send data, and that data must be contiguous in memory. Your program already meets the second requirement, because matrix_create allocates the row pointers and the payload in a single block. But m itself is a pointer to an array of row pointers, not a pointer to the data, so you should pass the address of the first element instead. Note also that m is only allocated on the root; the other ranks can safely pass NULL, since the send arguments are ignored everywhere except on the root:

MPI_Scatterv(myrank == root ? &m[0][0] : NULL, sendcounts, displs, MPI_DOUBLE, &mParts[0][0], sendcounts[myrank], MPI_DOUBLE, root, MPI_COMM_WORLD);

There are also a couple of things to change in sendcounts and displs: to go 2D, both must be expressed in elements rather than rows, i.e. multiplied by n. And the receive count in MPI_Scatterv is then no longer rows but sendcounts[myrank].
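
For instance, with n = 7 and npes = 3, the block macros assign 2, 2 and 3 rows starting at rows 0, 2 and 4, so in elements this gives sendcounts = {14, 14, 21} and displs = {0, 14, 28}; process 1 therefore receives the 14 doubles of rows 2 and 3.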

Here is the final code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define BLOCK_LOW(id,p,n) ((id)*(n)/(p))
#define BLOCK_HIGH(id,p,n) ((id+1)*(n)/(p) - 1)
#define BLOCK_SIZE(id,p,n) ((id+1)*(n)/(p) - (id)*(n)/(p))
#define BLOCK_OWNER(index,p,n) (((p)*((index)+1)-1)/(n))

// Allocates an m x n matrix with a single malloc: the m row pointers
// come first, immediately followed by the contiguous data block.
void **matrix_create(size_t m, size_t n, size_t size) {
    size_t i;
    void **p = (void **) malloc(m*n*size + m*sizeof(void *));
    char *c = (char *) (p + m);            // start of the data block
    for (i = 0; i < m; ++i)
        p[i] = (void *) (c + i*n*size);    // row i points into the block
    return p;
}

void matrix_print(double **M, size_t m, size_t n, char *name) {
    size_t i,j;
    printf("%s=[",name);
    for(i=0; i<m; ++i) {
        printf("\n  ");
        for(j=0; j<n; ++j)
            printf("%.2f  ", M[i][j]);
    }
    printf("\n];\n");
}

int main(int argc, char *argv[]) {

    int npes, myrank, root = 0, n = 7, rows, i, j, *sendcounts, *displs;
    double **m = NULL, **mParts;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD,&npes);
    MPI_Comm_rank(MPI_COMM_WORLD,&myrank);

    // Matrix M is generated in the root process (process 0)
    if (myrank == root) {
        m = (double**)matrix_create(n, n, sizeof(double));
        for (i = 0; i < n; ++i)
            for (j = 0; j < n; ++j)
                m[i][j] = (double)(n * i + j);
    }

    // Number of elements (rows * n) sent to each process
    sendcounts = malloc(npes * sizeof(int));
    // Displacement of each process's chunk, in elements
    displs = malloc(npes * sizeof(int));
    // For each process ...
    for (j = 0; j < npes; j++) {
        // Sets each number of rows
        sendcounts[j] = BLOCK_SIZE(j, npes, n)*n;
        // Sets each displacement
        displs[j] = BLOCK_LOW(j, npes, n)*n;
    }
    // Each process derives its row count (sendcounts is in elements)
    rows = sendcounts[myrank] / n;
    // Creates the empty matrices for the parts of M
    mParts = (double**)matrix_create(rows, n, sizeof(double));
    // Scatters the matrix parts to all processes; the send arguments
    // are only read on the root, so the other ranks pass NULL
    MPI_Scatterv(myrank == root ? &m[0][0] : NULL, sendcounts, displs, MPI_DOUBLE, &mParts[0][0], sendcounts[myrank], MPI_DOUBLE, root, MPI_COMM_WORLD);

    // Prints the block received by process 1 (rows 2 and 3 of M)
    if (myrank == 1) matrix_print(mParts, rows, n, "mParts");

    // matrix_create uses a single allocation, so one free per matrix suffices
    free(mParts);
    free(sendcounts);
    free(displs);
    if (myrank == root) free(m);

    MPI_Finalize();
    return 0;
}
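
To test it, compile and run with 3 processes. Assuming a standard MPI installation that provides the usual mpicc and mpirun wrappers, and that the file is saved as scatterv.c (both assumptions; the exact names can differ between MPI distributions):

mpicc scatterv.c -o scatterv
mpirun -np 3 ./scatterv

Process 1 should then print rows 2 and 3 of M.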

If you want to know more about 2D arrays and MPI, the key idea to remember is the one used above: allocate the data contiguously and hand MPI a pointer to its first element.

Also take a look at the DMDA structures of the PETSc library, which manage distributed 2D arrays for you.