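【Posted】: 2020-06-30 13:59:41
【Question】:
I am writing a Monte Carlo code that generates data. More specifically, each generated data point falls into one of num_data_domains separate data domains. In the end, every domain should contain at least min_sample_size data points. This is what the non-parallel code looks like: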
int num_data_domains = 10;
std::vector<unsigned long int> counters(num_data_domains, 0);
std::vector<std::vector<double>> data_sets(num_data_domains);
unsigned int min_sample_size = 100;
unsigned int smallest_sample_size = 0;
while(smallest_sample_size < min_sample_size)
{
    double data_point = Generate_Data_Point();
    int data_domain = Identify_Data_Domain(data_point); // returns a number between 0 and num_data_domains-1
    data_sets[data_domain].push_back(data_point);
    counters[data_domain]++;
    smallest_sample_size = *std::min_element(std::begin(counters), std::end(counters));
}
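For anyone who wants to run the serial loop, here is a self-contained sketch. The two generator functions are hypothetical stand-ins, not my actual code: Generate_Data_Point is assumed to draw a uniform random number in [0, 1), Identify_Data_Domain is assumed to split that interval into equal-width domains, and the wrapper Generate_Data_Sets with its seed parameter exists only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in: draws a uniform random number in [0, 1).
double Generate_Data_Point(std::mt19937& rng)
{
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    return dist(rng);
}

// Hypothetical stand-in: maps [0, 1) onto equal-width domains 0 .. num_data_domains-1.
int Identify_Data_Domain(double data_point, int num_data_domains)
{
    int domain = static_cast<int>(data_point * num_data_domains);
    return std::min(domain, num_data_domains - 1); // guard against landing on the upper edge
}

// Fills the domains until the smallest one holds min_sample_size points.
std::vector<std::vector<double>> Generate_Data_Sets(int num_data_domains,
                                                    unsigned int min_sample_size,
                                                    unsigned int seed)
{
    std::mt19937 rng(seed);
    std::vector<unsigned long int> counters(num_data_domains, 0);
    std::vector<std::vector<double>> data_sets(num_data_domains);
    unsigned long int smallest_sample_size = 0;
    while (smallest_sample_size < min_sample_size)
    {
        double data_point = Generate_Data_Point(rng);
        int data_domain = Identify_Data_Domain(data_point, num_data_domains);
        data_sets[data_domain].push_back(data_point);
        counters[data_domain]++;
        smallest_sample_size = *std::min_element(counters.begin(), counters.end());
    }
    return data_sets;
}
```

Calling Generate_Data_Sets(10, 100, 42u) returns ten vectors. Because each iteration adds exactly one point, the smallest vector ends up with exactly min_sample_size entries, while the others may hold more.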
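Following the answer to my previous question, I want to parallelize this process with MPI using RMA functions, but I cannot get it to work.
Here is my parallel version of the code above.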
int num_data_domains = 10;
std::vector<unsigned long int> counters(num_data_domains, 0);
std::vector<std::vector<double>> data_sets(num_data_domains);
MPI_Win mpi_window;
MPI_Win_create(&counters, num_data_domains * sizeof(unsigned long int), sizeof(unsigned long int), MPI_INFO_NULL, MPI_COMM_WORLD, &mpi_window);
int mpi_target_rank = 0;
unsigned long int increment = 1;
unsigned int min_sample_size = 100;
unsigned int smallest_sample_size = 0;
while(smallest_sample_size < min_sample_size)
{
    double data_point = Generate_Data_Point();
    int data_domain = Identify_Data_Domain(data_point); // returns a number between 0 and num_data_domains-1
    data_sets[data_domain].push_back(data_point);
    // Increment the counter of this domain on rank 0.
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, mpi_window);
    MPI_Accumulate(&increment, 1, MPI_UNSIGNED_LONG, mpi_target_rank, data_domain, 1, MPI_UNSIGNED_LONG, MPI_SUM, mpi_window);
    MPI_Win_unlock(0, mpi_window);
    // Fetch all counters from rank 0 into the local copy.
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, mpi_window);
    MPI_Get(&counters, num_data_domains, MPI_UNSIGNED_LONG, mpi_target_rank, 0, num_data_domains, MPI_UNSIGNED_LONG, mpi_window);
    MPI_Win_unlock(0, mpi_window);
    smallest_sample_size = *std::min_element(std::begin(counters), std::end(counters));
}
MPI_Win_free(&mpi_window);
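Here, the counters of MPI process 0 (the "master") should be updated via MPI_Accumulate(). The fifth argument, data_domain, is the displacement into the target buffer, which should ensure that the counter of the correct domain is incremented. Afterwards, each worker updates its own counters to match the remote ones.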
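To spell out the displacement rule: MPI computes the target location of an RMA operation as window base + target displacement * disp_unit. With disp_unit = sizeof(unsigned long int), displacement d therefore addresses element d of the exposed array. The following is a plain C++ sketch of that arithmetic without any MPI calls. Target_Address and Accumulate_Sum are hypothetical helper names introduced only for this illustration, and the sketch assumes the window base is the address of the first counter element (counters.data()).

```cpp
#include <cstddef>
#include <vector>

// Mimics how an RMA target location is computed:
//   target address = window base + target_disp * disp_unit
// Hypothetical helper; no MPI is involved.
unsigned long int* Target_Address(void* window_base, std::size_t disp_unit, std::size_t target_disp)
{
    char* bytes = static_cast<char*>(window_base);
    return reinterpret_cast<unsigned long int*>(bytes + target_disp * disp_unit);
}

// Local emulation of a one-element accumulate with a sum operation.
// Hypothetical helper; it only illustrates which element gets touched.
void Accumulate_Sum(void* window_base, std::size_t disp_unit, std::size_t target_disp,
                    unsigned long int increment)
{
    *Target_Address(window_base, disp_unit, target_disp) += increment;
}
```

For example, with std::vector<unsigned long int> counters(10, 0), calling Accumulate_Sum(counters.data(), sizeof(unsigned long int), 3, 1) increments counters[3] and leaves the other entries untouched.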
However, running the parallel code above produces a segmentation fault:
[MacBook-Pro:84733] *** Process received signal ***
[MacBook-Pro:84733] Signal: Segmentation fault: 11 (11)
[MacBook-Pro:84733] Signal code: Address not mapped (1)
[MacBook-Pro:84733] Failing at address: 0x9
[MacBook-Pro:84733] [ 0] 0 libsystem_platform.dylib 0x00007fff6fde65fd _sigtramp + 29
[MacBook-Pro:84733] [ 1] 0 ??? 0x0000000000000000 0x0 + 0
[MacBook-Pro:84733] [ 2] 0 executable 0x000000010e53c14b main + 1083
[MacBook-Pro:84733] [ 3] 0 libdyld.dylib 0x00007fff6fbedcc9 start + 1
[MacBook-Pro:84733] [ 4] 0 ??? 0x0000000000000002 0x0 + 2
[MacBook-Pro:84733] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node MacBook-Pro exited on signal 11 (Segmentation fault: 11).
--------------------------------------------------------------------------
I am fairly sure that MPI_Accumulate() causes this error. What am I doing wrong?
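【Comments】:
- Is a domain allowed to hold more than min_sample_size entries? I would guess so, since no condition prevents it?
- Absolutely. All domains keep being filled with data until the smallest one satisfies the min_sample_size condition.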
Tags: c++ parallel-processing mpi openmpi