Gather a split 2D array with MPI in C: answers

【Question title】: Gather a split 2D array with MPI in C
【Posted】: 2021-01-07 18:30:30
【Question】:

I need to adapt this part of a very long program to MPI in C.

for (i = 0; i < total; i++) {
   sum = A[next][0][0]*B[i][0] + A[next][0][1]*B[i][1] + A[next][0][2]*B[i][2];
   next++;
   while (next < last) {
      col = column[next];
      sum += A[next][0][0]*B[col][0] + A[next][0][1]*B[col][1] + A[next][0][2]*B[col][2];
      final[col][0] += A[next][0][0]*B[i][0] + A[next][1][0]*B[i][1] + A[next][2][0]*B[i][2];
      next++;
   }
   final[i][0] += sum;
}

I am thinking of code like this:

for (i = 0; i < num_threads; i++) {
   for (j = 0; j < total; j++) {
      check_thread[i][j] = false;
   }
}
part = total / num_threads;
for (i = thread_id * part; i < ((thread_id + 1) * part); i++) {
   sum = A[next][0][0]*B[i][0] + A[next][0][1]*B[i][1] + A[next][0][2]*B[i][2];
   next++;
   while (next < last) {
     col = column[next];
     sum += A[next][0][0]*B[col][0] + A[next][0][1]*B[col][1] + A[next][0][2]*B[col][2];
     if (!check_thread[thread_id][col]) {
        check_thread[thread_id][col] = true;
        temp[thread_id][col] = 0.0;
     }      
     temp[thread_id][col] += A[next][0][0]*B[i][0] + A[next][1][0]*B[i][1] + A[next][2][0]*B[i][2];
     next++;
   }
   if (!check_thread[thread_id][i]) {
      check_thread[thread_id][i] = true;
      temp[thread_id][i] = 0.0;
   }
   temp[thread_id][i] += sum;
}
*
for (i = 0; i < total; i++) {
   for (j = 0; j < num_threads; j++) {
     if (check_thread[j][i]) {
        final[i][0] += temp[j][i];
     }
   }
}

Then I need to gather all the temp parts together. I was thinking of MPI_Allgather, with something like the following just before the last two for loops (at the *):

  MPI_Allgather(temp, (part*sizeof(double)), MPI_DOUBLE, temp, sizeof(**temp), MPI_DOUBLE, MPI_COMM_WORLD);

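But I get a runtime error. Is it possible to send and receive in the same variable? If not, what other solution is there in this case?

【Comments】:

Tags: c performance parallel-processing mpi openmpi

【Solution 1】:

You are calling MPI_Allgather with the wrong arguments: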

 MPI_Allgather(temp, (part*sizeof(double)), MPI_DOUBLE, temp, sizeof(**temp), MPI_DOUBLE, MPI_COMM_WORLD);

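Instead, you should have (source):

MPI_Allgather

Gathers data from all tasks and distributes the combined data to all tasks

Input Parameters
sendbuf    starting address of send buffer (choice)
sendcount  number of elements in send buffer (integer)
sendtype   data type of send buffer elements (handle)
recvcount  number of elements received from any process (integer)
recvtype   data type of receive buffer elements (handle)
comm       communicator (handle)

Both your sendcount and recvcount arguments are wrong. Counts are numbers of elements of the given datatype, not bytes, so instead of (part*sizeof(double)) and sizeof(**temp) you should pass the number of elements of the matrix temp that each process contributes to the gather.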

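If the matrix is allocated contiguously in memory, it can be gathered in a single call. If it was created as an array of pointers, you have to call MPI_Allgather once for each row of the matrix, or use a different routine instead.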
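One common way to get a contiguous matrix while keeping the m[i][j] syntax is to allocate the data as a single block and point a row table into it. The sketch below is plain C with no MPI calls; alloc_matrix and free_matrix are hypothetical names used for illustration only.

```c
#include <stdlib.h>

/* Allocate a rows x cols matrix of doubles as ONE contiguous, zeroed block,
 * plus a table of row pointers so that m[i][j] indexing still works.
 * Returns NULL if rows or cols is 0 or if allocation fails. */
double **alloc_matrix(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return NULL;

    double **m = malloc(rows * sizeof *m);
    double *data = calloc(rows * cols, sizeof *data);
    if (m == NULL || data == NULL) {
        free(m);
        free(data);
        return NULL;
    }
    for (size_t i = 0; i < rows; i++)
        m[i] = data + i * cols;   /* row i starts right after row i - 1 */
    return m;
}

/* Release a matrix obtained from alloc_matrix(). */
void free_matrix(double **m)
{
    if (m != NULL) {
        free(m[0]);   /* the contiguous data block */
        free(m);      /* the row-pointer table */
    }
}
```

With this layout, &m[0][0] is the start of rows * cols consecutive doubles, so the whole matrix can be handed to a collective as one buffer. A matrix built from one malloc per row does not have that property.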

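Is it possible to send and receive in the same variable?

Yes, by using the in-place option:

When the communicator is an intracommunicator, you can perform an all-gather operation in-place (the output buffer is used as the input buffer). Use the variable MPI_IN_PLACE as the value of sendbuf. In this case, sendcount and sendtype are ignored. The input data of each process is assumed to be in the area where that process would receive its own contribution to the receive buffer. Specifically, the outcome of a call to MPI_Allgather that used the in-place option is identical to the case in which all processes executed n calls to

MPI_GATHER (MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, recvbuf, recvcount, recvtype, root, comm)

for root = 0, ..., n - 1.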

【Comments】: