【Title】:How to Send/Receive in MPI using all processors
【Posted】:2017-02-19 10:37:06
【Question】:

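This program is written in C using MPI. I am new to MPI and want to use all processors, including process 0, to do some calculations. To learn the concept, I wrote the following simple program. However, the program hangs at the bottom after the other processes receive the input from process 0, and the results are never sent back to process 0.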

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {    
    MPI_Init(&argc, &argv);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int number;
    int result;
    if (world_rank == 0) 
    {
        number = -2;
        int i;
        for(i = 0; i < 4; i++)
        {
            MPI_Send(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
        }
        for(i = 0; i < 4; i++)
        {           /* Error: can't get the results sent by the other processes below */
            MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Process 0 received number %d from i:%d\n", number, i);
        }
    } 
    /*I want to do this without using an else statement here, so that I can use process 0 to do some calculations as well*/

    MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
    printf("*Process %d received number %d from process 0\n",world_rank, number);
    result = world_rank + 1;
    MPI_Send(&result, 1, MPI_INT, 0, 99, MPI_COMM_WORLD);  /* problem happens here when trying to send result back to process 0*/

    MPI_Finalize();
}

Running it gives:

:$ mpicc test.c -o test
:$ mpirun -np 4 test

*Process 1 received number -2 from process 0
*Process 2 received number -2 from process 0
*Process 3 received number -2 from process 0
/* hangs here and will not continue */

If possible, please give me an example or edit the code above.

【Comments】:

    Tags: c parallel-processing mpi


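    【Answer 1】:

    I don't really see what the problem would be with putting 2 if statements around the working section. But anyway, here is an example of what can be done.

    I modified your code to use collective communications, since they make much more sense than the series of sends/receives you used. Because the initial communication sends the same value to everyone, I use MPI_Bcast(), which does the same thing in a single call.
    Conversely, since the result values are all different, a call to MPI_Gather() is perfectly appropriate.
    I also introduced a call to sleep(), just to simulate the processes working for a while before sending back their results.

    The code now looks like this: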

    #include <mpi.h>
    #include <stdlib.h>   // for malloc and free
    #include <stdio.h>    // for printf
    #include <unistd.h>   // for sleep
    
    int main( int argc, char *argv[] ) {
    
        MPI_Init( &argc, &argv );
        int world_rank;
        MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );
        int world_size;
        MPI_Comm_size( MPI_COMM_WORLD, &world_size );
    
        // sending the same number to all processes via broadcast from process 0
        int number = world_rank == 0 ? -2 : 0;
        MPI_Bcast( &number, 1, MPI_INT, 0, MPI_COMM_WORLD );
        printf( "Process %d received %d from process 0\n", world_rank, number );
    
        // Do something useful here
        sleep( 1 );
        int my_result = world_rank + 1;
    
        // Now collecting individual results on process 0
        int *results = world_rank == 0 ? malloc( world_size * sizeof( int ) ) : NULL;
        MPI_Gather( &my_result, 1, MPI_INT, results, 1, MPI_INT, 0, MPI_COMM_WORLD );
    
        // Process 0 prints what it collected
        if ( world_rank == 0 ) {
            for ( int i = 0; i < world_size; i++ ) {
                printf( "Process 0 received result %d from process %d\n", results[i], i );
            }
            free( results );
        }
    
        MPI_Finalize();
    
        return 0;
    }
    

    Compile it as follows:

    $ mpicc -std=c99 simple_mpi.c -o simple_mpi
    

    Running it gives:

    $ mpiexec -n 4 ./simple_mpi
    Process 0 received -2 from process 0
    Process 1 received -2 from process 0
    Process 3 received -2 from process 0
    Process 2 received -2 from process 0
    Process 0 received result 1 from process 0
    Process 0 received result 2 from process 1
    Process 0 received result 3 from process 2
    Process 0 received result 4 from process 3
    

    【Comments】:

      【Answer 2】:

      In fact, processes 1-3 do send their results back to process 0. However, process 0 is stuck in the first iteration of this loop:

      for(i=0; i<4; i++)
      {      
          MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          printf("Process 0 received number %d from i:%d\n", number, i);
      }
      

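      In the first MPI_Recv call, process 0 blocks waiting to receive a message with tag 99 from itself, a message that process 0 has not yet sent.

      In general, it is a bad idea for a process to send/receive messages to itself, especially with blocking calls. Process 0 already has the value in memory; it does not need to send it to itself.

      One workaround, however, is to start the receive loop from i=1: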

      for(i=1; i<4; i++)
      {           
          MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          printf("Process 0 received number %d from i:%d\n", number, i);
      }
      

      Running the code now gives you:

      Process 1 received number -2 from process 0
      Process 2 received number -2 from process 0
      Process 3 received number -2 from process 0
      Process 0 received number 2 from i:1
      Process 0 received number 3 from i:2
      Process 0 received number 4 from i:3
      Process 0 received number -2 from process 0
      

      Note that using MPI_Bcast and MPI_Gather, as Gilles mentioned, is a more efficient and more standard way to distribute/collect data.

      【Comments】:
