使用 MPI_Isend 时出现分段错误答案

【问题标题】：Segmentation Fault when using MPI_Isend使用 MPI_Isend 时出现分段错误
【发布时间】：2019-06-10 16:19:12
【问题描述】：

我的程序的目的是计算内导体和外导体之间的静电势，方法是将其分成网格，然后分成网格切片。每个处理器获得一个切片并在每个切片上运行计算。我使用 MPI_Isend 和 MPI_Irecv 在处理器之间发送数据。测试代码时出现分段错误：

[physnode5:81440] *** Process received signal ***
[physnode5:81440] Signal: Segmentation fault (11)
[physnode5:81440] Signal code: Address not mapped (1)
[physnode5:81440] Failing at address: 0x58
[physnode5:81440] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x2ab8069df5d0]
[physnode5:81440] [ 1] /opt/yarcc/libraries/openmpi/2.1.0/1/default/lib/libmpi.so.20(ompi_request_default_wait+0xd)[0x2ab8066495ed]
[physnode5:81440] [ 2] /opt/yarcc/libraries/openmpi/2.1.0/1/default/lib/libmpi.so.20(MPI_Wait+0x5d)[0x2ab80667a00d]
[physnode5:81440] [ 3] ./mpi_tezt.exe[0x400ffc]
[physnode5:81440] [ 4] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2ab806c0e3d5]
[physnode5:81440] [ 5] ./mpi_tezt.exe[0x4009b9]
[physnode5:81440] *** End of error message ***

当这段代码被执行时。请不要我已经 ssh 到一个集群。文件名是 mpi_tezt.exe（是的，我拼错了）。我已经检查了我要发送的数组是否正确分配，并且 send 和 recv 没有发送或接收不存在的数据（即发送数组范围之外的数据。我的 MPI_Isend 和 MPI_Irecv 代码如下：

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
  /*MPI Specific Variables*/
  int my_size, my_rank, up, down;
  MPI_Request reqU, reqD, sreqU, sreqD;
  MPI_Status rUstatus, rDstatus, sUstatus, sDstatus;

   /*Physical Dimensions*/
  double Linner = 5.0;/*mm*/
  double Rinner = 1.0;/*mm*/
  double phi_0 = 1000.0;/*V*/

  /*Other Variables*/
  int grid_size = 100;
  int slice;
  int x,y;
  double grid_res_y = 0.2;
  double grid_res_x = 0.1;
  int xboundary, yboundary;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &my_size);

  /*Determining neighbours*/
  if (my_rank != 0) /*if statemets used to stop highest and lowest rank neighbours arent outside 0 - my_size-1 range of ranks*/
    {
      up = my_rank-1;
    }
  else
    {
      up = 0;
    }

  if(my_rank != my_size-1)
    {
      down = my_rank+1;
    }
  else
    {
      down = my_size-1;
    }

  /*cross-check: presumed my_size is a factor of gridsize else there are odd sized slices and this is not coded for*/
  if (grid_size%my_size != 0)
    {
      printf("ERROR - number of procs =  %d, this is not a factor of grid_size %d\n", my_size, grid_size);
      exit(0);
    }

  /*Set Up Distributed Data Approach*/
  slice = grid_size/my_size;

  yboundary = Linner/grid_res_y; /*y grid index of inner conductor wall*/ 
  xboundary = Rinner/grid_res_x; /*x grid and individual array index of inner conductor wall*/


  double phi[slice+2][grid_size]; /*extra 2 rows to allow for halo data*/

  for (y=0; y < slice+2; y++)
    {
      for (x=0; x < grid_size; x++)
        { 
          phi[y][x] = 0.0;
        }
    }

  if(my_rank == 0) /*Boundary Containing rank does 2 loops. One over part with inner conductor and one over part without inner conductor*/
    {
      for(y=0; y < slice+1; y++)
        {
          for(x=xboundary; x < grid_size; x++)
            {
              phi[y][x] = phi_0;
            }
        }   
    }


  if (my_rank < my_size-1)
    {
      /*send top most strip up one node to be recieved as bottom halo*/
      MPI_Isend(&phi[1][0], grid_size  , MPI_DOUBLE, down, 1, MPI_COMM_WORLD, &sreqU);  
      /*recv top halo from up one node*/
      MPI_Irecv(&phi[slice+1][0], grid_size, MPI_DOUBLE, down, 2, MPI_COMM_WORLD, &reqU);
    }

  if (my_rank > 0)
    {
      /*recv top halo from down one node*/
      MPI_Irecv(&phi[0][0], grid_size , MPI_DOUBLE, up, 2, MPI_COMM_WORLD, &reqD);
      /*send bottom most strip down one node to be recieved as top halo*/
      MPI_Isend(&phi[slice][0], grid_size , MPI_DOUBLE, up, 1, MPI_COMM_WORLD, &sreqD);
    }

  if (my_rank<my_size-1)
    {
      /*Wait for send to down one rank to complete*/
      MPI_Wait(&sreqD, &sDstatus);
      /*Wait for recieve from up one rank to complete*/
      MPI_Wait(&reqD, &rDstatus);
    }

  if (my_rank>0)
    {
      /*Wait for send to up down one rank to complete*/
      MPI_Wait(&sreqU, &sUstatus);
      /*Wait for recieve from down one rank to complete*/
      MPI_Wait(&reqU, &rUstatus);
    }


  MPI_Finalize();

  return 0;
}

我一直在测试 2 个处理器（排名 0 和 1），希望能扩展到更多。

有什么想法可能出在哪里吗？

【问题讨论】：

哇，代码太多了。
你试过在调试器下运行吗？它通常会将您带到发生分段错误的位置，并提供当时程序状态的完整快照。如果由于某种原因不能这样做，您可以设置ulimit -c 以允许核心转储，并改为调试核心文件。
这个错误的地方太多了。但是，您是否更改了 slice 和 grid_size 是合理的。即使在那之后检查有很多可能误入歧途的数组引用。这些行： if (my_rank = A_proc) 不会做你想让他们做的事情
注意minimal reproducible example中的minimal这个词。尝试将您发布的代码集中在您所询问的特定问题上，即使您必须将其分割一些来说明。
@brothir 我在每个等待语句之前添加了打印语句。未打印等待 sreqD 之间的一处。可能是 MPI_Wait 的问题？

标签： c mpi

【解决方案1】：

你在 first MPI_Wait （排名 0）中出错了。这是下面示例代码中的第 7 步。

使用mpirun -np 2 ./whatever：

sReqD 似乎没有设置正确。这是在第 5 步按等级 1 设置的。

但是，第 0 步正在执行第 7 步，这没有设置sReqD。

因此，您需要调整您的 if 语句以正确匹配 MPI_Wait 等的哪个等级。

这是带有一些调试 printf 语句的代码：

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include <mpi.h>

int
main(int argc, char *argv[])
{
    /* MPI Specific Variables */
    int my_size,
     my_rank,
     up,
     down;
    MPI_Request reqU,
     reqD,
     sreqU,
     sreqD;
    MPI_Status rUstatus,
     rDstatus,
     sUstatus,
     sDstatus;

    /* Physical Dimensions */
    double Linner = 5.0;                /* mm */
    double Rinner = 1.0;                /* mm */
    double phi_0 = 1000.0;

    /*V*/
        /* Other Variables */
    int grid_size = 100;
    int slice;
    int x,
     y;
    double grid_res_y = 0.2;
    double grid_res_x = 0.1;

    int xboundary,
     yboundary;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &my_size);

    /* Determining neighbours */
    /* if statemets used to stop highest and lowest rank neighbours arent
    outside 0 - my_size-1 range of ranks */
    if (my_rank != 0) {
        up = my_rank - 1;
    }
    else {
        up = 0;
    }

    if (my_rank != my_size - 1) {
        down = my_rank + 1;
    }
    else {
        down = my_size - 1;
    }

    printf("my_rank=%d my_size=%d up=%d down=%d\n",my_rank,my_size,up,down);

    /* cross-check: presumed my_size is a factor of gridsize else there are
    odd sized slices and this is not coded for */
    if (grid_size % my_size != 0) {
        printf("ERROR - number of procs =  %d, this is not a factor of grid_size %d\n", my_size, grid_size);
        exit(0);
    }

    /* Set Up Distributed Data Approach */
    slice = grid_size / my_size;

    /* y grid index of inner conductor wall */
    yboundary = Linner / grid_res_y;
    /* x grid and individual array index of inner conductor wall */
    xboundary = Rinner / grid_res_x;

    if (my_rank == 0) {
        printf("Linner=%g grid_res_y=%g yboundary=%d\n",
            Linner,grid_res_y,yboundary);
        printf("Rinner=%g grid_res_x=%g xboundary=%d\n",
            Rinner,grid_res_x,xboundary);
        printf("slice=%d grid_size=%d phi=%ld\n",
            slice,grid_size,sizeof(double) * (slice + 2) * grid_size);
    }

    /* extra 2 rows to allow for halo data */
    double phi[slice + 2][grid_size];

    for (y = 0; y < slice + 2; y++) {
        for (x = 0; x < grid_size; x++) {
            phi[y][x] = 0.0;
        }
    }

    /* Boundary Containing rank does 2 loops. One over part with inner
    conductor and one over part without inner conductor */
    if (my_rank == 0) {
        for (y = 0; y < slice + 1; y++) {
            for (x = xboundary; x < grid_size; x++) {
                phi[y][x] = phi_0;
            }
        }
    }

    if (my_rank < my_size - 1) {
        /* send top most strip up one node to be recieved as bottom halo */
        printf("1: my_rank=%d MPI_Isend\n",my_rank);
        MPI_Isend(&phi[1][0], grid_size, MPI_DOUBLE, down, 1, MPI_COMM_WORLD,
            &sreqU);

        /* recv top halo from up one node */
        printf("2: my_rank=%d MPI_Irecv\n",my_rank);
        MPI_Irecv(&phi[slice + 1][0], grid_size, MPI_DOUBLE, down, 2,
            MPI_COMM_WORLD, &reqU);

        printf("3: my_rank=%d\n",my_rank);
    }

    if (my_rank > 0) {
        /* recv top halo from down one node */
        printf("4: my_rank=%d MPI_Irecv\n",my_rank);
        MPI_Irecv(&phi[0][0], grid_size, MPI_DOUBLE, up, 2, MPI_COMM_WORLD,
            &reqD);

        /* send bottom most strip down one node to be recieved as top halo */
        printf("5: my_rank=%d MPI_Isend\n",my_rank);
        MPI_Isend(&phi[slice][0], grid_size, MPI_DOUBLE, up, 1, MPI_COMM_WORLD,
            &sreqD);

        printf("6: my_rank=%d\n",my_rank);
    }

    if (my_rank < my_size - 1) {
        /* Wait for send to down one rank to complete */
        printf("7: my_rank=%d\n",my_rank);
        MPI_Wait(&sreqD, &sDstatus);
        printf("8: my_rank=%d\n",my_rank);

        /* Wait for recieve from up one rank to complete */
        printf("9: my_rank=%d\n",my_rank);
        MPI_Wait(&reqD, &rDstatus);
        printf("10: my_rank=%d\n",my_rank);
    }

    if (my_rank > 0) {
        /* Wait for send to up down one rank to complete */
        printf("11: my_rank=%d\n",my_rank);
        MPI_Wait(&sreqU, &sUstatus);
        printf("12: my_rank=%d\n",my_rank);

        /* Wait for recieve from down one rank to complete */
        printf("12: my_rank=%d\n",my_rank);
        MPI_Wait(&reqU, &rUstatus);
        printf("13: my_rank=%d\n",my_rank);
    }

    MPI_Finalize();

    return 0;
}

这是输出。请注意，第 7 步打印（这是在第 0 级的第一个 MPI_Wait 之前）。但是，排名 0 永远不会到达第 8 步（printf after 调用）

my_rank=0 my_size=2 up=0 down=1
Linner=5 grid_res_y=0.2 yboundary=25
Rinner=1 grid_res_x=0.1 xboundary=10
slice=50 grid_size=100 phi=41600
1: my_rank=0 MPI_Isend
2: my_rank=0 MPI_Irecv
3: my_rank=0
7: my_rank=0
my_rank=1 my_size=2 up=0 down=1
4: my_rank=1 MPI_Irecv
5: my_rank=1 MPI_Isend
6: my_rank=1
11: my_rank=1
[manderly:230404] *** Process received signal ***
[manderly:230403] *** Process received signal ***
[manderly:230403] Signal: Segmentation fault (11)
[manderly:230403] Signal code: Address not mapped (1)
[manderly:230403] Failing at address: 0x58
[manderly:230404] Signal: Segmentation fault (11)
[manderly:230404] Signal code: Address not mapped (1)
[manderly:230404] Failing at address: 0x58
[manderly:230403] [ 0] [manderly:230404] [ 0] /lib64/libpthread.so.0(+0x121c0)/lib64/libpthread.so.0(+0x121c0)[0x7fa5478341c0]
[0x7fa0ebe951c0]
[manderly:230404] [ 1] [manderly:230403] [ 1] /usr/lib64/openmpi/lib/libmpi.so.20(ompi_request_default_wait+0x31)[0x7fa0ec0e9a81]
[manderly:230404] [ 2] /usr/lib64/openmpi/lib/libmpi.so.20(ompi_request_default_wait+0x31)[0x7fa547a88a81]
[manderly:230403] [ 2] /usr/lib64/openmpi/lib/libmpi.so.20(PMPI_Wait+0x60)[0x7fa0ec12c350]
[manderly:230404] [ 3] ./fix2[0x400f93]
[manderly:230404] [ 4] /usr/lib64/openmpi/lib/libmpi.so.20(PMPI_Wait+0x60)[0x7fa547acb350]
[manderly:230403] [ 3] ./fix2[0x400ef7]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7fa0ebaedfea]
[manderly:230404] [ 5] ./fix2[0x40081a[manderly:230403] [ 4] ]
[manderly:230404] *** End of error message ***
/lib64/libc.so.6(__libc_start_main+0xea)[0x7fa54748cfea]
[manderly:230403] [ 5] ./fix2[0x40081a]
[manderly:230403] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node manderly exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

【讨论】：

作为风格问题，您可以在每次迭代开始时声明一个包含 4 个请求的数组all 设置为 MPI_REQUEST_NULL，然后设置为 MPI_Wait all(4, ...)。或者，您可以将up 和down 设置为MPI_PROC_NULL 在第一个和最后一个行列上，并且始终设置MPI_Isend() 和MPI_Irecv()。