MPI Bcast or Scatter to specific ranks

【Question title】: MPI Bcast or Scatter to specific ranks
【Posted】: 2015-09-03 22:37:49
【Question】:

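I have an array of data. What I was trying to do is this:

Use rank 0 to bcast the data to 50 nodes. Each node runs 1 MPI process with 16 cores available to it. Each MPI process then calls Python multiprocessing. Some calculations are done, and the MPI process saves the data that was computed with multiprocessing. The MPI process then changes some variable and runs multiprocessing again, and so on.

So the nodes do not need to communicate with each other, apart from the initial startup in which they all receive some data.

The multiprocessing is not working out so well, so now I want to use MPI for everything.

How can I (or is it not possible to) use an array of integers that refers to MPI ranks for a bcast or scatter? For example, with ranks 1-1000 and 12 cores per node, I want to bcast the data to every 12th rank. Then, on every 12th rank, I want it to scatter data to ranks 12+1 through 12+12.

This requires the first bcast to communicate with totalrank/12 ranks. Each of those ranks is then responsible for sending data to the ranks on its own node, gathering the results, saving them, and then sending more data to the ranks on the same node.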

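【Comments】:

You can use MPI_Comm_split_type to get a communicator per node. Then work out the rank 0 of each of those and create a communicator for them. Then scatter/bcast on that communicator.
@Jeff Do you have a link to the Python docs for that? I can't find anything on it, and trying to use the combination available for C++ says it does not exist.
You want the C interface, because this is MPI-3. I assume mpi4py supports it. There is no official Python interface to MPI, but mpi4py is by far the most popular one. mpich.org/static/docs/v3.1/www3/MPI_Comm_split_type.html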
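
For reference, the approach suggested in the comments above could look roughly like this in mpi4py. This is only a sketch and has not been run on a cluster. The names `split_evenly` and `two_level_distribute` are made up for illustration, and the code assumes the work is a plain Python list and that the MPI library supports MPI-3 (needed for `Split_type`).

```python
def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


def two_level_distribute(work):
    """Run under mpirun; `work` only needs to be meaningful on world rank 0."""
    # Imported here so that split_evenly stays usable without MPI installed.
    from mpi4py import MPI

    world = MPI.COMM_WORLD
    world_rank = world.Get_rank()

    # One communicator per shared-memory node (MPI-3 feature).
    node = world.Split_type(MPI.COMM_TYPE_SHARED, key=world_rank)
    node_rank = node.Get_rank()

    # Communicator that contains only the node leaders (node rank 0).
    # Every other process passes MPI.UNDEFINED and gets MPI.COMM_NULL back.
    color = 0 if node_rank == 0 else MPI.UNDEFINED
    leaders = world.Split(color, key=world_rank)

    # Level 1: world rank 0 scatters one chunk per node to the leaders.
    node_work = None
    if leaders != MPI.COMM_NULL:
        chunks = None
        if leaders.Get_rank() == 0:
            chunks = split_evenly(work, leaders.Get_size())
        node_work = leaders.scatter(chunks, root=0)

    # Level 2: each leader scatters its chunk across its own node.
    chunks = split_evenly(node_work, node.Get_size()) if node_rank == 0 else None
    my_work = node.scatter(chunks, root=0)

    if leaders != MPI.COMM_NULL:
        leaders.Free()
    node.Free()
    return my_work
```

Every rank would call `two_level_distribute(data)` from a script started with something like `mpirun -n 1000 python script.py`. Replacing the first `scatter` with `bcast` gives every node leader the full data set instead of a slice.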

Tags: python mpi mpi4py

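【Solution 1】:

I don't know mpi4py well enough to give you a code sample with it, but here is what a solution could look like in C++. I'm sure you can easily derive Python code from it.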

#include <mpi.h>
#include <iostream>
#include <zlib.h>  /// for crc32

using namespace std;

int main( int argc, char *argv[] ) {

    MPI_Init( &argc, &argv );
    // get size and rank
    int rank, size;
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    // get the compute node name
    char name[MPI_MAX_PROCESSOR_NAME];
    int len;
    MPI_Get_processor_name( name, &len );

    // get a non-negative int from each node name
    // using crc32 from zlib (just a possible solution;
    // two different names could in principle hash to the same value)
    uLong crc = crc32( 0L, Z_NULL, 0 );
    crc = crc32( crc, ( const unsigned char* )name, len );
    // MPI_Comm_split needs color >= 0; masking the sign bit
    // avoids the undefined abs( INT_MIN ) case
    int color = ( int )( crc & 0x7fffffff );

    // split the communicator into processes of the same node
    MPI_Comm nodeComm;
    MPI_Comm_split( MPI_COMM_WORLD, color, rank, &nodeComm );

    // get the rank on the node
    int nodeRank;
    MPI_Comm_rank( nodeComm, &nodeRank );

    // create comms of processes of the same local ranks
    MPI_Comm peersComm;
    MPI_Comm_split( MPI_COMM_WORLD, nodeRank, rank, &peersComm );

    // now, masters are all the processes of nodeRank 0
    // they can communicate among them with the peersComm
    // and with their local slaves with the nodeComm
    int worktoDo = 0;
    if ( rank == 0 ) worktoDo = 1000;
    cout << "Initially [" << rank << "] on node "
         << name << " has " << worktoDo << endl;
    MPI_Bcast( &worktoDo, 1, MPI_INT, 0, peersComm );
    cout << "After first Bcast [" << rank << "] on node "
         << name << " has " << worktoDo << endl;
    if ( nodeRank == 0 ) worktoDo += rank;
    MPI_Bcast( &worktoDo, 1, MPI_INT, 0, nodeComm );
    cout << "After second Bcast [" << rank << "] on node "
         << name << " has " << worktoDo << endl;

    // cleaning up
    MPI_Comm_free( &peersComm );
    MPI_Comm_free( &nodeComm );

    MPI_Finalize();
    return 0;
}

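As you can see, you first create communicators that group the processes on the same node. Then you create peer communicators that group all processes with the same local rank across the nodes. From then on, the master process with global rank 0 sends data to the local masters (the processes with local rank 0 on each node), and they distribute the work on the node they are responsible for.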
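
To see which process ends up where after the two MPI_Comm_split calls, the splitting rule can be modeled in plain Python without any MPI installed. This is a toy model, not a replacement for the real calls: `comm_split` and the six-process, two-node layout are hypothetical, and the model covers only the documented ordering rule (group by color, order by key, break ties by the rank in the old communicator).

```python
from collections import defaultdict


def comm_split(colors, keys):
    """Model MPI_Comm_split: return, per old rank, its rank in the new communicator."""
    groups = defaultdict(list)
    for old_rank, color in enumerate(colors):
        groups[color].append(old_rank)

    new_rank = [None] * len(colors)
    for members in groups.values():
        # MPI orders by key and breaks ties by rank in the old communicator.
        members.sort(key=lambda r: (keys[r], r))
        for position, old_rank in enumerate(members):
            new_rank[old_rank] = position
    return new_rank


# Hypothetical layout: 6 processes placed on two nodes, 3 per node.
hosts = ["n0", "n0", "n0", "n1", "n1", "n1"]
world_ranks = list(range(len(hosts)))

node_rank = comm_split(hosts, world_ranks)       # first split: color = node
peers_rank = comm_split(node_rank, world_ranks)  # second split: color = node rank

for r in world_ranks:
    print(f"world {r} on {hosts[r]}: nodeRank={node_rank[r]}, peersRank={peers_rank[r]}")
```

With this layout, world ranks 0 and 3 both get nodeRank 0, and they are ranks 0 and 1 of the same peers communicator. So the first Bcast in the C++ code carries the value from world rank 0 to world rank 3, and the second Bcast spreads it from each of them to the other processes on its node.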

【Comments】: