Question about parallel loop in MPI: answers

【Question title】: Question about parallel loop in MPI
【Posted】: 2011-04-10 17:16:09
【Question description】:

Hello, I have a short question about Open MPI in Fortran. I have code like this:

I) definitions of vars & linear code, setting up some vars for later usage
II) a while loop which works like that in pseudocode:

nr=1
while(true)
{
  filename='result'//nr//'.bin' (nr converted to string)
  if(!file_exists(filename))
    goto 100

  // file exists... so do something with it
  // calculations, read/write...
  nr=nr+1
}
100 continue
III) some more linear code...

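Now I want to parallelize this with Open MPI. The linear code in I) and III) should be computed only once, and the while loop should run on several processors. What is the best way to implement this? My problem is how the while loop works: for example, while processor 1 is computing result1.bin, how do I tell processor 2 to go straight to result2.bin? How would it work if there are 30 files and I use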

mpirun -n 10 my_program

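? How does MPI "know", after finishing one file, that more files are "waiting" to be processed? As soon as a processor finishes a file, it should immediately start on the next file in the queue.

Thanks so far!

Edit:

Hey, it's me again... I also wanted to give OpenMP a try, so I used a piece of your code to count the existing files and then loop over them (and process them):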

nfiles = 0
do
  write(filename,FMT='(A,I0,A)') prefix, nfiles+1, suffix
  inquire(file=trim(filename),exist=exists)
  if (.not. exists) exit
  nfiles = nfiles + 1
enddo

Now I tried the following code:

call omp_set_num_threads(2)
!$OMP PARALLEL
!$OMP DO 
do i=startnum, endnum
  write(filename,FMT='(A,I0,A)') prefix, i, suffix
  ...CODE DIRECTLY HERE TO PROCESS THE FILE...
enddo
!$OMP END DO
!$OMP END PARALLEL

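But it always gives me an error like this: "Branching out of a DO loop associated with an Open MP DO or PARALLEL DO directive is illegal."

It always points at lines of code like this one: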

read (F_RESULT,*,ERR=1) variable

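where F_RESULT is a file handle... What is wrong with it? variable is defined outside the loop block, and I have already tried adding this to the OpenMP directive: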

private(variable)

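so that each thread has its own copy, but that did not work! Thanks for your help so far!

【Discussion】:

Edited to replace openmpi with mpi; this isn't specific to openmpi.

Tags: loops parallel-processing fortran openmp mpi

【Solution 1】:

Probably the most sensible approach is to have one of the processes count the total number of files up front, broadcast that number, and then have every process handle "its" files: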

program processfiles
    use mpi
    implicit none

    integer :: rank, comsize, ierr
    integer :: nfiles
    character(len=6) :: prefix="result"
    character(len=4) :: suffix=".bin"
    character(len=50) :: filename
    integer :: i
    integer :: locnumfiles, startnum, endnum
    logical :: exists

    call MPI_Init(ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, comsize, ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

    ! rank zero finds number of files
    if (rank == 0) then
       nfiles = 0
       do
           write(filename,FMT='(A,I0,A)') prefix, nfiles+1, suffix
           inquire(file=trim(filename),exist=exists)
           if (.not. exists) exit
           nfiles = nfiles + 1
       enddo
    endif
    ! make sure everyone knows
    call MPI_Bcast(nfiles, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

    if (nfiles /= 0) then
        ! calculate who gets what file: contiguous blocks of
        ! ceiling(nfiles/comsize) files per rank
        locnumfiles = nfiles/comsize
        if (locnumfiles * comsize /= nfiles) locnumfiles = locnumfiles + 1
        startnum = locnumfiles * rank + 1
        ! clamp so no rank runs past the last file; trailing ranks may
        ! end up with an empty range (startnum > endnum)
        endnum = min(startnum + locnumfiles - 1, nfiles)
        do i=startnum, endnum
           write(filename,FMT='(A,I0,A)') prefix, i, suffix
           call processfile(rank,filename)
        enddo
    else
        if (rank == 0) then
            print *,'No files found; exiting.'
        endif
    endif
    call MPI_Finalize(ierr)

    contains
        subroutine processfile(rank,filename)
            implicit none
            integer, intent(in) :: rank
            character(len=*), intent(in) :: filename
            integer :: unitno
            open(newunit=unitno, file=trim(filename))
            print '(I4,A,A)',rank,': Processing file ', filename
            close(unitno)
        end subroutine processfile
end program processfiles

Then a simple test:

$ seq 1 33 | xargs -I num touch "result"num".bin"
$ mpirun -np 2 ./processfiles

   0: Processing file result1.bin                                       
   0: Processing file result2.bin                                       
   0: Processing file result3.bin                                       
   0: Processing file result4.bin                                       
   0: Processing file result5.bin                                       
   0: Processing file result6.bin                                       
   1: Processing file result18.bin                                      
   0: Processing file result7.bin                                       
   0: Processing file result8.bin                                       
   1: Processing file result19.bin                                      
   0: Processing file result9.bin                                       
   1: Processing file result20.bin                                      
   0: Processing file result10.bin                                      
   1: Processing file result21.bin                                      
   1: Processing file result22.bin                                      
   0: Processing file result11.bin                                      
   1: Processing file result23.bin                                      
   0: Processing file result12.bin                                      
   1: Processing file result24.bin                                      
   1: Processing file result25.bin                                      
   0: Processing file result13.bin                                      
   0: Processing file result14.bin                                      
   1: Processing file result26.bin                                      
   1: Processing file result27.bin                                      
   0: Processing file result15.bin                                      
   0: Processing file result16.bin                                      
   1: Processing file result28.bin                                      
   1: Processing file result29.bin                                      
   1: Processing file result30.bin                                      
   0: Processing file result17.bin                                      
   1: Processing file result31.bin                                      
   1: Processing file result32.bin                                      
   1: Processing file result33.bin
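
To make the division of work concrete for the case raised in the question (30 files, mpirun -n 10), here is a minimal serial sketch of the block-partition arithmetic. It needs no MPI: it loops over the rank numbers and prints the range each rank would receive. The program name partition_demo and the hard-coded nfiles and comsize are illustrative assumptions, and the upper bound is clamped with min() so that no rank is handed a file number beyond nfiles.

```fortran
program partition_demo
    implicit none
    ! Assumed values, taken from the scenario in the question
    integer, parameter :: nfiles = 30, comsize = 10
    integer :: rank, locnumfiles, startnum, endnum

    ! Block size: ceiling(nfiles / comsize) using integer arithmetic
    locnumfiles = (nfiles + comsize - 1) / comsize

    do rank = 0, comsize - 1
        startnum = locnumfiles * rank + 1
        ! Clamp the upper bound so no rank runs past the last file
        endnum = min(startnum + locnumfiles - 1, nfiles)
        if (startnum > endnum) then
            print '(A,I0,A)', 'rank ', rank, ': no files'
        else
            print '(A,I0,A,I0,A,I0)', 'rank ', rank, ': files ', &
                  startnum, ' to ', endnum
        endif
    enddo
end program partition_demo
```

With these values every rank gets three consecutive files (rank 0 gets 1 to 3, rank 9 gets 28 to 30). Note that this is a static split: a rank that finishes early does not pick up files from another rank's block.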

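Update, to address the follow-up OpenMP question:

So the first loop is where the files are counted, before the parallel processing of the files begins. The counting has to be finished before the parallel processing happens, because otherwise the work cannot be divided among the processors; you need to know how many "work units" there are before you can divide them up. (This is by no means the only way to do things, but it is the most straightforward.)

Similarly, OMP DO loops need to be very structured: there has to be a simple loop like do i=1,n that can easily be split up among threads. n doesn't need to be compiled in, and the increment doesn't even need to be one, but it has to be the kind of thing that can be determined before the loop actually executes. So, for example, you can't exit the loop for some external reason (such as a file not existing).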
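
As an illustration of that restriction, the sketch below handles a failed read inside an OpenMP loop without leaving the loop. Instead of ERR=label (a branch to a label outside the loop, which is what the compiler rejected in the question), it checks iostat= and uses cycle, which stays inside the loop and is allowed. The file names demo1.txt to demo4.txt, the program name, and the deliberately unreadable third file are assumptions made up for the demonstration; it also assumes the Fortran runtime's I/O is safe to call from several threads, with each thread getting its own unit via newunit=.

```fortran
program read_in_omp_loop
    implicit none
    integer, parameter :: nfiles = 4      ! hypothetical file count
    integer :: i, unitno, ios
    real :: variable
    character(len=50) :: filename

    ! Serial setup: create small test files; file 3 holds text that
    ! cannot be read as a number, to provoke a read error.
    do i = 1, nfiles
        write(filename, '(A,I0,A)') 'demo', i, '.txt'
        open(newunit=unitno, file=trim(filename), status='replace', &
             action='write')
        if (i == 3) then
            write(unitno, '(A)') 'abc'
        else
            write(unitno, *) real(i)
        endif
        close(unitno)
    enddo

    ! Parallel part: the trip count is known up front, and no statement
    ! branches out of the loop.
    !$OMP PARALLEL DO PRIVATE(i, unitno, ios, variable, filename)
    do i = 1, nfiles
        write(filename, '(A,I0,A)') 'demo', i, '.txt'
        open(newunit=unitno, file=trim(filename), status='old', &
             action='read', iostat=ios)
        if (ios /= 0) cycle               ! skip this file, stay in the loop
        read(unitno, *, iostat=ios) variable
        close(unitno)
        if (ios /= 0) then
            print '(A,A)', 'Could not read ', trim(filename)
            cycle
        endif
        print '(A,A,F6.2)', trim(filename), ' -> ', variable
    enddo
    !$OMP END PARALLEL DO
end program read_in_omp_loop
```

The directives are ordinary comments when OpenMP is not enabled, so the same source also builds and runs serially.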

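So with OpenMP you want to do the same file counting and leave that part alone, but then use the parallel do construct in the processing loop. After stripping out the MPI stuff, you end up with something like this: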

    nfiles = 0
    do
        write(filename,FMT='(A,I0,A)') prefix, nfiles+1, suffix
        inquire(file=trim(filename),exist=exists)
        if (.not.exists) exit
        nfiles = nfiles + 1
    enddo

    if (nfiles /= 0) then
        ! omp_get_thread_num() comes from "use omp_lib"; thread is an integer
        !$OMP PARALLEL SHARED(nfiles,prefix,suffix) PRIVATE(i,thread,filename)
        thread = omp_get_thread_num()
        !$OMP DO
        do i=1, nfiles
           write(filename,FMT='(A,I0,A)') prefix, i, suffix
           call processfile(thread,filename)
        enddo
        !$OMP END DO
        !$OMP END PARALLEL 
    else
        print *,'No files found; exiting.'
    endif

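But everything else is the same. Again, if you want to process the files "inline" (that is, not in a subroutine), you can put the file-processing code where the "call processfile()" line is.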
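
The question also asked for queue-like behaviour, where a worker that finishes one file immediately starts on the next one. This is not part of the answer's code, but in OpenMP one way to approximate it is a SCHEDULE(DYNAMIC,1) clause, which hands out loop iterations one at a time as threads become free. The sketch below only prints which thread would take which file; the program name and the fixed nfiles are hypothetical, and the lines starting with the `!$` sentinel are compiled only when OpenMP is enabled.

```fortran
program dynamic_files
    !$ use omp_lib
    implicit none
    integer, parameter :: nfiles = 8      ! hypothetical file count
    integer :: i, thread
    character(len=50) :: filename

    ! DYNAMIC,1: each thread takes the next unprocessed iteration as soon
    ! as it finishes its current one, instead of a fixed block.
    !$OMP PARALLEL DO SCHEDULE(DYNAMIC,1) PRIVATE(i, thread, filename)
    do i = 1, nfiles
        thread = 0                        ! value used in a serial build
        !$ thread = omp_get_thread_num()
        write(filename, '(A,I0,A)') 'result', i, '.bin'
        print '(I4,A,A)', thread, ': would process ', trim(filename)
    enddo
    !$OMP END PARALLEL DO
end program dynamic_files
```

Dynamic scheduling tends to pay off when files take very different amounts of time to process; for uniform files the default static split has less overhead.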

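【Comments】:

Oh cool, that looks great... The only problem is that it is hard for me to split my existing code into subroutines, because all the variables are defined/declared before the while loop and used inside it... Is there any "easier" way? Thanks anyway, I will definitely try it and think about splitting my existing code into pieces!
Well, sure, you can put the processing right where the subroutine is called above. In the long run, though, you'll find that if you do break your software up into functions and subroutines, it becomes much easier to maintain, and easier to see what is going on.
Hey, I edited my original post because I couldn't post enough code in a comment here...
Thanks. As you can see, I already "knew" that, since I had already written code like the above... My question was about the error I get when I try to compile the code. I just couldn't find what was wrong with it... Edit: ah, now I get it: the "ERR=1" is probably the culprit, because an error would make the code jump out of the loop, which is not allowed... I'll try it without the ERR=1 clause :)
Yes, you can make use of at most as many processors as you have threads; with OpenMP you don't need to set the number of threads explicitly in the code; if you do nothing, OpenMP will by default typically use as many threads as you have processors. If you want to change that, you can set the OMP_NUM_THREADS environment variable before running the program, without changing a line of code.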
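
A small sketch for checking the point about OMP_NUM_THREADS: the program below reports how many threads a parallel region would get, so the effect of the environment variable can be seen without touching the code. The program name show_threads is made up for the example; omp_get_max_threads() is part of the standard omp_lib module, and the `!$` sentinel lines are compiled only when OpenMP is enabled.

```fortran
program show_threads
    !$ use omp_lib
    implicit none
    integer :: nthreads

    nthreads = 1                          ! value used in a serial build
    !$ nthreads = omp_get_max_threads()
    print '(A,I0)', 'Threads available to a parallel region: ', nthreads
end program show_threads
```

Built with OpenMP enabled (for example with gfortran -fopenmp), running it as OMP_NUM_THREADS=4 ./show_threads should report 4, while running it with the variable unset reports the implementation's default.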