【Question title】: Python with embedded call to mpirun
【Posted】: 2015-08-03 18:21:58
【Question】:

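I'm trying to run some parallel optimization using PyOpt. The tricky part is that, inside my objective function, I also want to run a C++ code using MPI.

My Python script is the following: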

#!/usr/bin/env python    
# Standard Python modules
import os, sys, time, math
import subprocess


# External Python modules
try:
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    myrank = comm.Get_rank()
except ImportError:
    raise ImportError('mpi4py is required for parallelization')

# Extension modules
from pyOpt import Optimization
from pyOpt import ALPSO

# Predefine the BashCommand
RunCprogram = "mpirun -np 2 CProgram" # Parallel C++ program


######################### 
def objfunc(x):

    f = -(((math.sin(2*math.pi*x[0])**3)*math.sin(2*math.pi*x[1]))/((x[0]**3)*(x[0]+x[1])))

    # Run CProgram 
    os.system(RunCprogram) #where the mpirun call occurs

    g = [0.0]*2
    g[0] = x[0]**2 - x[1] + 1
    g[1] = 1 - x[0] + (x[1]-4)**2

    time.sleep(0.01)
    fail = 0
    return f,g, fail

# Instantiate Optimization Problem 
opt_prob = Optimization('Thermal Conductivity Optimization',objfunc)
opt_prob.addVar('x1','c',lower=5.0,upper=1e-6,value=10.0)
opt_prob.addVar('x2','c',lower=5.0,upper=1e-6,value=10.0)
opt_prob.addObj('f')
opt_prob.addCon('g1','i')
opt_prob.addCon('g2','i')

# Solve Problem (DPM-Parallelization)
alpso_dpm = ALPSO(pll_type='DPM')
alpso_dpm.setOption('fileout',0)
alpso_dpm(opt_prob)
print(opt_prob.solution(0))

I run the script with:

mpirun -np 20 python Script.py

However, I get the following error:

[user:28323] *** Process received signal ***
[user:28323] Signal: Segmentation fault (11)
[user:28323] Signal code: Address not mapped (1)
[user:28323] Failing at address: (nil)
[user:28323] [ 0] /lib64/libpthread.so.0() [0x3ccfc0f500]
[user:28323] *** End of error message ***

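I think the two different mpirun calls (the one launching the Python script and the one inside the script) are conflicting with each other. Any clue on how to resolve this?

Thanks!!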

【Comments】:

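  • Are you using MPI communication to exchange data between the Python processes, or is mpi4py just a way of running multiple isolated instances? If it is the latter, you could consider using the subprocess module in Python to spawn multiple threads, each of which can call an instance of mpirun (with subprocess.Popen). I do this regularly with no problem. If you're running Script.py over multiple machines, this may not be possible...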
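
The approach described in the comment above could look roughly like the following sketch. It is written for Python 3 and launches child processes rather than threads. `run_all` is a hypothetical helper name, and the demo uses the Python interpreter as a stand-in for the real `mpirun -np 2 CProgram` command so that it runs on any machine.

```python
import subprocess
import sys


def run_all(commands):
    """Start every command at once, then collect (returncode, stdout) for each."""
    # Popen returns immediately, so all children run concurrently.
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
        for cmd in commands
    ]
    results = []
    for p in procs:
        # communicate() drains the pipe while waiting, which avoids a full-pipe deadlock.
        out, _ = p.communicate()
        results.append((p.returncode, out))
    return results


if __name__ == "__main__":
    # Stand-in commands; the real case would use ["mpirun", "-np", "2", "CProgram"].
    demo = [[sys.executable, "-c", "print('job %d done')" % i] for i in range(3)]
    for code, out in run_all(demo):
        print(code, out.strip())
```

Results come back in the order the commands were given, whichever child finishes first. This only works when Script.py itself is not started under mpirun, which matches the commenter's caveat about isolated instances.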

Tags: python c++ mpi


【Solution 1】:

See "Calling mpi binary in serial as subprocess of mpi application": the safest way to go is to use MPI_Comm_spawn(). Take a look at this manager-worker example, for instance.
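
On the Python side, the MPI_Comm_spawn() suggestion above might be sketched with mpi4py as follows. This is not the linked manager-worker example. `spawn_workers` is a hypothetical helper name, and the sketch assumes the worker program is adapted to call MPI_Comm_get_parent() and to disconnect from the parent intercommunicator before MPI_Finalize(), which the worker shown further down does not do.

```python
def spawn_workers(command, args=(), nprocs=2, comm=None):
    """Launch nprocs copies of command through MPI_Comm_spawn, then disconnect.

    Assumption: the spawned program calls MPI_Comm_get_parent() and
    MPI_Comm_disconnect() on the parent intercommunicator before finalizing.
    """
    if comm is None:
        # Imported lazily so the function can be exercised without mpi4py installed.
        from mpi4py import MPI
        comm = MPI.COMM_SELF  # each Python rank spawns its own group of workers
    # Spawn returns an intercommunicator linking this rank to the new processes.
    child = comm.Spawn(command, args=list(args), maxprocs=nprocs)
    # Data could be exchanged over `child` here (for example with child.bcast).
    # Disconnect is collective over both sides of the intercommunicator.
    child.Disconnect()
```

Inside objfunc, a call such as `spawn_workers("./main", nprocs=2)` would then take the place of the os.system(RunCprogram) line, so no second mpirun is started. Whether enough slots are available for the spawned processes depends on how the outer job was launched.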

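A quick workaround is to use subprocess.Popen, as pointed out by @EdSmith. Note, however, that by default subprocess.Popen passes the parent's environment to the child. My guess is that the same is true for os.system(). Unfortunately, mpirun adds some environment variables, depending on the MPI implementation, such as OMPI_COMM_WORLD_RANK or OMPI_MCA_orte_ess_num_procs. To see these environment variables, type import os ; print(os.environ) both in an mpi4py code and in a basic python shell. These environment variables may cause the subprocess to fail. So I had to add a line to get rid of them... which is rather dirty... It boils down to: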

    args = shlex.split(RunCprogram)  # list form, usable with Popen(args, shell=False)
    # remove all environment variables with "MPI" in their name...rather dirty...
    new_env = {k: v for k, v in os.environ.items() if "MPI" not in k}

    #print(new_env)
    # shell=True : watch for security issues...
    p = subprocess.Popen(RunCprogram,shell=True, env=new_env,stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    # communicate() reads stdout while waiting, so a full pipe cannot block the child
    out, _ = p.communicate()
    result="process myrank "+str(myrank)+" got "+out.decode()
    print(result)
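
A slightly more selective variant of the filter above is sketched below, written for Python 3. `strip_mpi_env` and `mpi_vars` are hypothetical helper names, and the prefix list is an assumption: `OMPI_` covers the two Open MPI variables named in this answer, while `PMIX_` and `PMI_` are prefixes used by some launchers and should be checked against what os.environ actually shows on the machine in question.

```python
import os


def strip_mpi_env(env=None, prefixes=("OMPI_", "PMIX_", "PMI_")):
    """Return a copy of env without variables whose names start with a launcher prefix."""
    source = os.environ if env is None else env
    return {k: v for k, v in source.items() if not k.startswith(tuple(prefixes))}


def mpi_vars(env=None, prefixes=("OMPI_", "PMIX_", "PMI_")):
    """List the variable names that strip_mpi_env() would drop, for inspection."""
    source = os.environ if env is None else env
    return sorted(k for k in source if k.startswith(tuple(prefixes)))


if __name__ == "__main__":
    # Hypothetical sample environment, used only to show the behaviour.
    sample = {"PATH": "/usr/bin", "OMPI_COMM_WORLD_RANK": "3", "COMPILER": "gcc"}
    print(mpi_vars(sample))
    print(strip_mpi_env(sample))
```

Matching on prefixes keeps a variable such as `COMPILER`, whose name happens to contain the substring "MPI" and would therefore be dropped by the `"MPI" not in k` test. The result can be passed as `env=strip_mpi_env()` to subprocess.Popen.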

The complete test code, run with mpirun -np 2 python opti.py:

#!/usr/bin/env python    
# Standard Python modules
import os, sys, time, math
import subprocess
import shlex


# External Python modules
try:
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    myrank = comm.Get_rank()
except ImportError:
    raise ImportError('mpi4py is required for parallelization')

# Predefine the BashCommand
RunCprogram = "mpirun -np 2 main" # Parallel C++ program


######################### 
def objfunc(x):

    f = -(((math.sin(2*math.pi*x[0])**3)*math.sin(2*math.pi*x[1]))/((x[0]**3)*(x[0]+x[1])))

    # Run CProgram 
    #os.system(RunCprogram) #where the mpirun call occurs
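    args = shlex.split(RunCprogram)  # list form, usable with Popen(args, shell=False)
    # Drop the variables set by the outer mpirun so the nested one starts clean
    new_env = {k: v for k, v in os.environ.items() if "MPI" not in k}

    #print(new_env)
    p = subprocess.Popen(RunCprogram,shell=True, env=new_env,stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    # communicate() reads stdout while waiting, so a full pipe cannot block the child
    out, _ = p.communicate()
    result="process myrank "+str(myrank)+" got "+out.decode()
    print(result)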



    g = [0.0]*2
    g[0] = x[0]**2 - x[1] + 1
    g[1] = 1 - x[0] + (x[1]-4)**2

    time.sleep(0.01)
    fail = 0
    return f,g, fail

print(objfunc([1.0,0.0]))

The basic worker, compiled with mpiCC main.cpp -o main:

#include <iostream>
#include "mpi.h"

int main(int argc, char* argv[]) { 
    int rank, size;

    MPI_Init (&argc, &argv);    
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);  
    MPI_Comm_size (MPI_COMM_WORLD, &size);  

    if(rank==0){
        std::cout<<" size "<<size<<std::endl;
    }
    MPI_Finalize();

    return 0;

}

【Comments】:
