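OpenMP: use of parallel do with omp_get_thread_num()

【Question title】: Openmp: use of parallel do with omp_get_thread_num()
【Posted】: 2021-06-10 17:49:08
【Question】:

I am splitting a do loop using parallel do with the private clause. Inside this loop I add a variable to itself. Why do I get an error, given that no critical block or atomic statement should be needed in this case?
How can I fix it?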

program trap
use omp_lib 
implicit none
double precision::suma=0.d0 ! sum is a scalar
double precision:: h,x,lima,limb
integer::n,i, istart, iend, thread_num=4, total_threads, ppt
integer(kind=8):: tic, toc, rate
double precision:: time
double precision, dimension(4):: pi= 0.d0

call system_clock(count_rate = rate)
call system_clock(tic)

lima=0.0d0; limb=1.0d0; suma=0.0d0; n=100000000
h=(limb-lima)/n

suma=h*(f(lima)+f(limb))*0.5d0 !first and last points

ppt= n/total_threads
!$ call omp_set_num_threads(total_threads)

!$omp parallel do private (istart, iend, thread_num, i)
  thread_num = omp_get_thread_num()
  !$ istart = thread_num*ppt +1
  !$ iend = min(thread_num*ppt + ppt, n)
do i=istart,iend ! this will control the loop in different threads
  x=lima+i*h
  suma=suma+f(x) 
  pi(thread_num+1)=suma
enddo
!$omp end parallel do 

suma=sum(pi) 
suma=suma*h

print *,"The value of pi is= ",suma ! print once from the first image

call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, 'Time ', time, 's'

contains

double precision function f(y)
double precision:: y
f=4.0d0/(1.0d0+y*y)
end function f

end program trap

I get the following errors:

test.f90:23:35:

23 | thread_num = omp_get_thread_num()

Error: Unexpected ASSIGNMENT statement at (1)

test.f90:24:31:

24 | !$ istart = thread_num*ppt +1

Error: Unexpected ASSIGNMENT statement at (1)

test.f90:25:40:

25 | !$ iend = min(thread_num*ppt + ppt, n)

Error: Unexpected ASSIGNMENT statement at (1)

Compilation:

gfortran -fopenmp -Wall -Wextra -O2 -Wall -o prog.exe test.f90 

./prog.exe

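【Comments】:

Why set the loop bounds manually instead of using a worksharing directive? That can be done for you automatically.
Do you mean the schedule clause? If I use the workshare directive, I need to split the do loop manually (into 4 parts in this case).
@Isaac No, the worksharing directives are all of: omp do, omp sections, omp workshare, and maybe a few more I don't remember. That is, all the directives that distribute work among the threads of a team.

Tags: multithreading parallel-processing fortran openmp gfortran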

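【Solution 1】:

I don't understand why you are splitting the loop manually when the worksharing constructs in OpenMP (e.g. !$omp do) can do this for you automatically. Here is how I would do it: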

ian@eris:~/work/stack$ cat thread.f90
program trap
  Use, Intrinsic :: iso_fortran_env, Only :  wp => real64, li => int64
  use omp_lib 
  implicit none
  Real( wp ) ::suma=0.0_wp ! sum is a scalar
  Real( wp ) :: h,x,lima,limb
  integer(li):: tic, toc, rate
  Real( wp ) :: time
  Real( wp ) :: pi
  Integer :: i, n

  call system_clock(count_rate = rate)
  call system_clock(tic)

  lima=0.0_wp; limb=1.0_wp; suma=0.0_wp; n=100000000
  h=(limb-lima)/n

  suma=h*(f(lima)+f(limb))*0.5_wp !first and last points

  pi = 0.0_wp
  ! lima is only read inside the region, so it is shared; a private copy would be undefined on entry
  !$omp parallel default( None ) private( i, x ) &
  !$omp                          shared( pi, n, h, lima )
  !$omp do reduction( +:pi )
  do i= 1, n
     x  = lima + i * h
     pi = pi + f( x ) 
  enddo
  !$omp end do
  !$omp end parallel

  print *,"The value of pi is= ", pi / n

  call system_clock(toc)
  time = real(toc-tic)/real(rate)
  print*, 'Time ', time, 's on ', omp_get_max_threads(), ' threads'

contains

  function f(y)
    Real( wp ) :: f
    Real( wp ) :: y
    f=4.0_wp/(1.0_wp+y*y)
  end function f

end program trap
ian@eris:~/work/stack$ gfortran --version
GNU Fortran (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ian@eris:~/work/stack$ export OMP_NUM_THREADS=1
ian@eris:~/work/stack$ ./a.out
 The value of pi is=    3.1415926435902248     
 Time    1.8548842668533325      s on            1  threads
ian@eris:~/work/stack$ export OMP_NUM_THREADS=2
ian@eris:~/work/stack$ ./a.out
 The value of pi is=    3.1415926435902120     
 Time   0.86763000488281250      s on            2  threads
ian@eris:~/work/stack$ export OMP_NUM_THREADS=4
ian@eris:~/work/stack$ ./a.out
 The value of pi is=    3.1415926435898771     
 Time   0.54704123735427856      s on            4  threads
ian@eris:~/work/stack$

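【Comments】:

Especially in the context of the follow-up question.
Yes, actually this answer would fit better there.
Apart from the bug... pi = suma should be set before the loop to include the end points, and maybe the loop should only go to n-1; I need to think about it quickly. Will fix when I have time.
@Isaac You save one iteration, I would say. By the way, I hope you are not upset, I was just trying to help :)
@dreamcrash Oops, I didn't consider the loop starting at 0. Thanks. Why would I be upset? Don't worry, everything is fine :)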
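
The correction discussed in the comments above could look roughly like the following. This is only a sketch under the assumption that the intended method is the composite trapezoidal rule from the question: the two end points are added once with weight 1/2, and the loop runs over the interior points 1 to n-1. The program name trap_fixed is made up for illustration, and the scaling by h is moved to after the loop.

```fortran
program trap_fixed
  use, intrinsic :: iso_fortran_env, only: wp => real64
  implicit none
  real(wp) :: h, x, lima, limb, pi
  integer  :: i, n

  lima = 0.0_wp; limb = 1.0_wp; n = 100000000
  h = (limb - lima) / n

  ! Start from the two end points, each weighted by 1/2 (trapezoidal rule).
  pi = 0.5_wp * (f(lima) + f(limb))

  ! Interior points only: i = 1 .. n-1. The reduction clause gives every
  ! thread its own partial sum and adds them to the initial value of pi
  ! at the end of the loop.
  !$omp parallel do default(none) private(i, x) shared(n, h, lima) reduction(+:pi)
  do i = 1, n - 1
     x  = lima + i * h
     pi = pi + f(x)
  end do
  !$omp end parallel do

  ! Scale the accumulated sum by the step width once, after the loop.
  pi = pi * h
  print *, "The value of pi is= ", pi

contains

  function f(y)
    real(wp) :: f
    real(wp), intent(in) :: y
    f = 4.0_wp / (1.0_wp + y * y)
  end function f

end program trap_fixed
```

Since the directives are ordinary comments to a compiler without OpenMP enabled, the same source should also build and run serially without -fopenmp.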

【Solution 2】:

Instead of

!$omp parallel do private (istart, iend, thread_num, i)
  thread_num = omp_get_thread_num()
  !$ istart = thread_num*ppt +1
  !$ iend = min(thread_num*ppt + ppt, n)

try the following:

!$omp parallel private (istart, iend, thread_num, i)
  thread_num = omp_get_thread_num()
  !$ istart = thread_num*ppt +1
  !$ iend = min(thread_num*ppt + ppt, n)
....
!$omp end parallel
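
The reason this helps is that !$omp parallel do must be followed immediately by the do loop, which is why gfortran rejects the assignments; a plain !$omp parallel region may contain arbitrary statements. A complete manual split along these lines might look like the sketch below. It is not the asker's final code: the names partial, local_sum and nthreads are introduced here for illustration, the chunk size is computed inside the region from omp_get_num_threads(), and x and the per-thread accumulator are made private (in the question they are shared, which would be a data race). It assumes compilation with gfortran -fopenmp.

```fortran
program trap_manual
  use, intrinsic :: iso_fortran_env, only: wp => real64
  use omp_lib
  implicit none
  real(wp) :: h, x, lima, limb, suma, local_sum
  real(wp), allocatable :: partial(:)
  integer :: n, i, istart, iend, thread_num, nthreads, ppt

  lima = 0.0_wp; limb = 1.0_wp; n = 100000000
  h = (limb - lima) / n

  ! One slot per possible thread; slots of unused threads stay at zero.
  allocate(partial(omp_get_max_threads()))
  partial = 0.0_wp

  !$omp parallel default(none) &
  !$omp   private(istart, iend, thread_num, nthreads, ppt, i, x, local_sum) &
  !$omp   shared(partial, n, h, lima)
  thread_num = omp_get_thread_num()
  nthreads   = omp_get_num_threads()
  ! Points per thread, rounded up so the chunks cover 1 .. n-1.
  ppt    = (n - 1 + nthreads - 1) / nthreads
  istart = thread_num * ppt + 1
  iend   = min(istart + ppt - 1, n - 1)

  ! Each thread sums its own chunk into a private accumulator.
  local_sum = 0.0_wp
  do i = istart, iend
     x = lima + i * h
     local_sum = local_sum + f(x)
  end do
  partial(thread_num + 1) = local_sum
  !$omp end parallel

  ! End points with weight 1/2 plus all interior contributions.
  suma = h * (0.5_wp * (f(lima) + f(limb)) + sum(partial))
  print *, "The value of pi is= ", suma

contains

  function f(y)
    real(wp) :: f
    real(wp), intent(in) :: y
    f = 4.0_wp / (1.0_wp + y * y)
  end function f

end program trap_manual
```

Accumulating into a private local_sum and writing to the shared array only once per thread avoids both the race on the accumulator and repeated writes to neighbouring array elements from different threads.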

【Comments】: