Fortran OpenMP with subroutines and functions
Disclaimer: I'm quite certain that this has been answered somewhere, but another person and I have been searching quite hard to no avail.
I've got a code that looks something like this:
PROGRAM main
!$omp parallel do &
!$omp private(somestuff) shared(otherstuff)
DO i=1,n
...
CALL mysubroutine(args)
...
a=myfunction(moreargs)
...
ENDDO
!$omp end parallel do
END PROGRAM
SUBROUTINE mysubroutine(things)
...
END SUBROUTINE
FUNCTION myfunction(morethings)
...
END FUNCTION
I cannot determine where or how to handle the private, shared, reduction, etc. clauses for the variables in the subroutine and function. I suspect there may be some nuances to the answer, as there are many, many ways variables might have been declared and shared amongst them. So, let's say all variables that the main program is concerned with were defined in it or in shared modules, and that any OMP operations on those variables can be handled in the main code. The subroutines and functions use some of those variables, and have some variables of their own. So, I think the question boils down to how to handle clauses for their local variables.
Solution 1:
OK, this is about the difference between the lexical and dynamic extent of OpenMP directives and the interaction with variable scoping. The lexical extent of a directive is the text between the beginning and the end of the structured block following a directive. The dynamic extent is the lexical extent plus statements executed as part of any subprogram executed as a result of statements in the lexical extent. So in something like
Program stuff
Implicit None
Real, Dimension( 1:100 ) :: a
Call Random_number( a )
!$omp parallel default( none ) shared( a )
Call sub( a )
!$omp end parallel
Contains
Subroutine sub( a )
Real, Dimension( : ), Intent( InOut ) :: a
Integer :: i
!$omp do
Do i = 1, Size( a )
a( i ) = 2.0 * a( i )
End Do
End Subroutine Sub
End Program stuff
(totally untested, written direct in here) the lexical extent of the parallel region initiated by !$omp parallel is just
Call sub( a )
while the dynamic extent is the call and the contents of the subroutine. And for completeness of terminology, the !$omp do is an example of an orphaned directive: a directive that is not in the lexical extent of the parallel region, but is in its dynamic extent. See
https://computing.llnl.gov/tutorials/openMP/#Scoping
for another example.
Why does this matter? Well, you can only explicitly scope entities that are visible in the lexical extent, just the array a in this case. For entities that only come into existence because the dynamic extent is being executed, you can't do this! Instead OpenMP has a number of rules, which in simple terms are
- The scope of subprogram arguments is inherited from the calling point, i.e. if you scoped them as private at the start of the lexical extent they stay private, and if shared they stay shared.
- Variables local to subprograms are by default private (so i in the above is private, as you want) unless they have the SAVE attribute, either explicitly or implicitly (for example by being initialised in their declaration), in which case they are shared.
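To tie these two rules back to the shape of the loop in the question, here is a minimal sketch. The names scoping_sketch, scaled and tmp are made up for illustration, and like the example above it is written directly in here, so treat it as a sketch rather than tested code. The clauses live only on the directive in the caller; the function itself needs no OpenMP at all.

```fortran
Program scoping_sketch
  Implicit None
  Integer, Parameter :: n = 1000
  Real, Dimension( 1:n ) :: x
  Real :: total
  Integer :: i

  Do i = 1, n
     x( i ) = Real( i )
  End Do

  total = 0.0
  ! Only entities visible here, in the lexical extent, are scoped explicitly
  !$omp parallel do default( none ) shared( x ) private( i ) reduction( +:total )
  Do i = 1, n
     total = total + scaled( x( i ) )
  End Do
  !$omp end parallel do

  ! Both values should agree: the sum of 2*i + 1 for i = 1..n is n * ( n + 2 )
  Write( *, * ) total, Real( n * ( n + 2 ) )

Contains

  Function scaled( v ) Result( w )
    Real, Intent( In ) :: v   ! associated with the shared x( i ); only read here
    Real :: w
    Real :: tmp               ! no SAVE: every call, on every thread, gets its own copy
    ! Real :: tmp = 0.0       ! initialisation implies SAVE: one shared copy, i.e. a race
    tmp = 2.0 * v
    w = tmp + 1.0
  End Function scaled

End Program scoping_sketch
```

The commented-out declaration is the trap to watch for: initialising a local in its declaration quietly gives it the SAVE attribute, which makes it shared between threads.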
And in my experience that gets you most of the way! Use of the dynamic extent combined with orphaned directives is a great way to keep an OpenMP program under control, and I have to say I disagree with the above comment: I find orphaned worksharing directives very useful indeed! So you can combine all of the above to do things like
Program dot_test
Implicit None
Real, Dimension( 1:100 ) :: a, b
Real :: r
a = 1.0
b = 2.0
!$omp parallel default( none ) shared( a, b, r )
Call dot( a, b, r )
Write( *, * ) r
!$omp end parallel
Contains
Subroutine dot( a, b, r )
Real, Dimension( : ), Intent( In ) :: a, b
Real, Intent( Out ) :: r
Real, Save :: s
Integer :: i
!$omp single
s = 0.0
!$omp end single
!$omp do reduction( +:s )
Do i = 1, Size( a )
s = s + a( i ) * b( i )
End Do
!$omp end do
!$omp single
r = s
!$omp end single
End Subroutine dot
End Program dot_test
Wot now? gfortran -std=f95 -fopenmp -O -Wall -Wextra dot.f90
Wot now? export OMP_NUM_THREADS=3
Wot now? ./a.out
200.000000
200.000000
200.000000
This simple picture is complicated a bit by module variables and common blocks, so don't use global variables ... but if you must, they are shared by default unless declared as threadprivate. See
https://computing.llnl.gov/tutorials/openMP/#THREADPRIVATE
for an example.
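For what it's worth, a minimal sketch of the threadprivate case might look like the following. The module work_data, the counter calls and the subroutine tick are hypothetical names invented for this illustration, and again this is written directly in here.

```fortran
Module work_data
  Implicit None
  Integer, Save :: calls = 0     ! module variable: shared by default
  !$omp threadprivate( calls )   ! with this, every thread keeps its own copy
Contains
  Subroutine tick( )
    calls = calls + 1            ! no race: updates the calling thread's copy
  End Subroutine tick
End Module work_data

Program threadprivate_sketch
  Use work_data, Only : calls, tick
  Implicit None
  Integer :: i, total
  total = 0
  !$omp parallel default( none ) private( i ) reduction( +:total )
  calls = 0                      ! each thread resets its own counter
  !$omp do
  Do i = 1, 100
     Call tick( )
  End Do
  !$omp end do
  total = total + calls          ! per-thread counts are summed by the reduction
  !$omp end parallel
  Write( *, * ) total            ! 100 iterations in all, however many threads ran
End Program threadprivate_sketch
```

Without the threadprivate line, calls would be a single shared variable and the increment in tick would be a data race, which is exactly why global state is best avoided in the first place.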