Algorithm works in parallel for a small network but not for a large network (OpenMP)
Your code stops with an exception, because of a stack overflow:
These are the relevant lines:
#define L 15
#define N L*L
...
int Rede[N][N];
...
#pragma omp parallel firstprivate(Rede)...
When L=528
, N
is 528*528=278784
, Rede
requires N*N*sizeof(int)
bytes which is about 290 Gbyte (not to mention that each thread creates a local copy of Rede
).
UPDATE: Based on inicia_malha
function most probably Rede
should be declared as:
int Rede[L][L];
and you get rid of the overflow.
UPDATE2: As pointed out by @Gilles you have to create a random number generator for each thread. E.g:
#define MAX_THREADS 256
gsl_rng* r[MAX_THREADS];
for(int i=0;i<omp_get_max_threads();i++) {
r[i]= gsl_rng_alloc(gsl_rng_mt19937);
gsl_rng_set(r[i], seed);
//inicia a rede aleatóriamente
inicia_malha(r[i], Rede);
}
and you try to use nested parallelism, but most probably nested parallelism is disabled in your OpenMP implementation, so as a result your for
loop will be executed omp_get_num_threads()
times. That is the reason why you have to divide your values by number_T
, but it is an error.
Another comment you should use your variable at their minimum required scope, so putting it together your code should look something like this:
#define MAX_THREADS 256
gsl_rng* r[MAX_THREADS];
for(int i=0;i<omp_get_max_threads();i++)
{
r[i]= gsl_rng_alloc(gsl_rng_mt19937); //aloca espaço para o número aleatório de testagem
gsl_rng_set(r[i], seed);
//inicia a rede aleatóriamente
inicia_malha(r[i], Rede);
}
//loop da temperatura
for(;T>=Tmin;T-=dT)
{
//termalização
equilibra(r[0], J, B, k, T, Rede);
//I think it is correct and better if you set initial values here
double E_total=energia_total(J, B, Rede);
double E2_total=E_total*E_total;
double M_total=magnetizacao_total(Rede);
double M2_total=M_total*M_total;
double Mabs_total=abs(M_total);
#pragma omp parallel for default(none) shared(Msteps, J, B, k, T,r) \
firstprivate(Rede) reduction(+: E_total, E2_total, M_total, M2_total, Mabs_total)
//loop do Monte Carlo
for(unsigned int i=1;i<=Msteps;i++){
int pos[2], dE=0;
int thread_num=omp_get_thread_num();
double E=0, M=0;
//loop de Metropolis
for(unsigned int j=1;j<=N;j++){
escolhe_pos(pos, r[thread_num]);
if(testa_flip(pos, &dE, r[thread_num], J, B, k, T, Rede)){
//ajusta os observáveis
E+=2*dE;
M+=2*Rede[pos[0]][pos[1]];
}
}
//soma dos observavéis
E_total+=E;
E2_total+= E*E;
M_total+=M;
M2_total+= M*M;
Mabs_total+=abs(M);
}
//..... more code here
}