Using ray to parallelize simulator python: answers

【Question title】: Using ray to parallelize simulator python
【Posted】: 2020-11-08 13:02:09
【Question】:

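I am new to ray and I am trying to parallelize a simulator I developed. Here is a simplified example of my simulator; the real one is obviously more complex.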

import some_library
import sim_library_with_global_object

class Model(object):
    def __init__(self,init_vals):
        #initialize object using some of the global_object from sim_library.
        #the Model object has its own variables, which are not global
        pass

    def do_step(self,time):
        #calculate Model step using the global_object from sim_library
        #edit the Model variables with respect to the step
        pass


class ManyModel(object):
    def __init__(self):
        self.models=[]

    def add_model(self,init_vals):
        model = Model(init_vals)
        self.models.append(model)

    def step(self,time):
        for model in self.models:
            model.do_step(time)

    def get_data_step(self):
        data=[]
        for model in self.models:
            data.append(model.myvalues)
        return data



sim=ManyModel()
inits=[] #####list of init_vals
times=[] ####list of times to simulate
for init in inits:
    sim.add_model(init)

for time in times:
    sim.step(time)
    step_data=sim.get_data_step()

So far I have tried using ray with the @ray.remote decorator in the following two ways: on the Model class (1) and on the ManyModel class (2):

(1)

############################## (1) ###############
import ray
import some_library
import sim_library_with_global_object

@ray.remote
class Model(object):
    def __init__(self,init_vals):
        #initialize object using some of the global_object from sim_library.
        #the Model object has its own variables, which are not global
        pass

    def do_step(self,time):
        #calculate Model step using the global_object from sim_library
        #edit the Model variables with respect to the step
        pass


class ManyModel(object):
    def __init__(self):
        self.models=[]

    def add_model(self,init_vals):
        #each Model is created as a separate Ray actor
        model = Model.remote(init_vals)
        self.models.append(model)

    def step(self,time):
        futures=[]
        for model in self.models:
            futures.append(model.do_step.remote(time))
        return futures

    def get_data_step(self,futures):
        data=[]
        while len(futures)>0:
            #ray.wait returns (ready, not_ready); keep waiting on the rest
            ready, futures = ray.wait(futures)
            data.extend(ray.get(ready))
        return data

ray.init()
sim=ManyModel()
inits=[] #####list of init_vals
times=[] ####list of times to simulate
for init in inits:
    sim.add_model(init)

for time in times:
    futures=sim.step(time)
    step_data=sim.get_data_step(futures)

and (2)

########################## (2) #################

import ray
import some_library
import sim_library_with_global_object

class Model(object):
    def __init__(self,init_vals):
        #initialize object using some of the global_object from sim_library.
        #the Model object has its own variables, which are not global
        pass

    def do_step(self,time):
        #calculate Model step using the global_object from sim_library
        #edit the Model variables with respect to the step
        pass

@ray.remote
class ManyModel(object):
    def __init__(self):
        self.models=[]
        self.data=[]

    def add_model(self,init_vals):
        model = Model(init_vals)
        self.models.append(model)

    def step(self,time):
        for model in self.models:
            model.do_step(time)

    def get_data_step(self):
        self.data=[]
        for model in self.models:
            self.data.append(model.myvalues)
        return self.data


ray.init()
sim=ManyModel.remote()
inits=[] #####list of init_vals
times=[] ####list of times to simulate
for init in inits:
    sim.add_model.remote(init)

for time in times:
    sim.step.remote(time)
    future=sim.get_data_step.remote()
    step_data=ray.get(future)

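With neither approach do I get any benefit from using the ray library. Can you help me use it properly?

Update on approach (1): the problem with the first approach is that I get this warning message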

2020-11-09 11:33:20,517 WARNING worker.py:1779 -- WARNING: 12 PYTHON workers have been started. This could be a result of using a large number of actors, or it could be a consequence of using nested tasks (see https://github.com/ray-project/ray/issues/3644) for some a discussion of workarounds.

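With 10 x Model, these are the performance results:
without ray: 10 x Model -> do_step 0.11 [s]
with ray (1): 10 x Model -> do_step 0.22 [s]

Moreover, every time I create an actor with approach (1), it copies all the global_objects of the imported libraries, and RAM consumption goes crazy. I need to launch simulations with more than 100k Model objects.

In general, I don't understand whether creating many actors in ray is a good idea.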

【Question comments】:

Tags: python simulator ray concurrent-processing

【Solution 1】:

Zooming in on some of the core elements:

ray.init()
sim=ManyModel.remote()

for time in times:
    sim.step.remote(time)
    future=sim.get_data_step.remote()
    step_data=ray.get(future)

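The most important point is that you create only one Ray actor (in the line sim=ManyModel.remote()). Ray actors execute the tasks submitted to them sequentially (by default), so creating a single actor creates no opportunity for parallelism. To get parallelism with Ray actors, you need to create and use multiple actors.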

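The second point is that you call ray.get inside the for loop. This means that in every iteration of the loop you submit a task and then call ray.get, which waits for it to finish and retrieves the result. Instead, you want to submit multiple tasks (possibly inside a loop) and then call ray.get outside the loop.

【Discussion】:

My intention is to retrieve the results of every time step for all the models in ManyModel. I have tried configuration (1), which creates many actors, and configuration (2), but in neither case did I get any performance improvement.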