Algorithm works in parallel for a small network but not for a large network (OpenMP)

Your code stops with an exception, because of a stack overflow:

These are the relevant lines:

#define L 15
#define N  L*L 
...
int Rede[N][N];
...
#pragma omp parallel firstprivate(Rede)...

When L=528, N is 528*528=278784, Rede requires N*N*sizeof(int) bytes which is about 290 Gbyte (not to mention that each thread creates a local copy of Rede).

UPDATE: Based on inicia_malha function most probably Rede should be declared as:

int Rede[L][L];

and you get rid of the overflow.

UPDATE2: As pointed out by @Gilles you have to create a random number generator for each thread. E.g:

#define MAX_THREADS 256
gsl_rng* r[MAX_THREADS];
for(int i=0;i<omp_get_max_threads();i++) {
   r[i]= gsl_rng_alloc(gsl_rng_mt19937);
   gsl_rng_set(r[i], seed);
   //inicia a rede aleatóriamente
   inicia_malha(r[i], Rede);
}

and you try to use nested parallelism, but most probably nested parallelism is disabled in your OpenMP implementation, so as a result your for loop will be executed omp_get_num_threads() times. That is the reason why you have to divide your values by number_T, but it is an error.

Another comment you should use your variable at their minimum required scope, so putting it together your code should look something like this:

#define MAX_THREADS 256   
gsl_rng* r[MAX_THREADS];
for(int i=0;i<omp_get_max_threads();i++)
{
    r[i]= gsl_rng_alloc(gsl_rng_mt19937);        //aloca espaço para o número aleatório de testagem
    gsl_rng_set(r[i], seed);
    //inicia a rede aleatóriamente
    inicia_malha(r[i], Rede);
}

//loop da temperatura
for(;T>=Tmin;T-=dT)
{
    //termalização
    equilibra(r[0], J, B, k, T, Rede);
    
    //I think it is correct and better if you set initial values here         
    double E_total=energia_total(J, B, Rede);
    double E2_total=E_total*E_total;
    double M_total=magnetizacao_total(Rede);
    double M2_total=M_total*M_total;
    double Mabs_total=abs(M_total);

    #pragma omp parallel for default(none) shared(Msteps, J, B, k, T,r) \
    firstprivate(Rede) reduction(+: E_total, E2_total, M_total, M2_total, Mabs_total)      
    //loop do Monte Carlo
    for(unsigned int i=1;i<=Msteps;i++){
        int  pos[2], dE=0;
        int thread_num=omp_get_thread_num();
        double E=0, M=0;
        //loop de Metropolis
        for(unsigned int j=1;j<=N;j++){
            escolhe_pos(pos, r[thread_num]);                
            if(testa_flip(pos, &dE, r[thread_num], J, B, k, T, Rede)){
                //ajusta os observáveis
                E+=2*dE;
                M+=2*Rede[pos[0]][pos[1]];
            }
        }            
        //soma dos observavéis
        E_total+=E;
        E2_total+= E*E;
        M_total+=M;
        M2_total+= M*M;
        Mabs_total+=abs(M);
    }
    //..... more code here
}

Algorithm works in parallel for a small network but not for a large network (OpenMP)

Related

Recent Posts