【Posted】:2017-09-25 02:27:47
【Question】:
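I have some Python code that uses multiprocessing.pool.Pool.imap_unordered to create a bunch of temporary files in parallel in a CPU-bound step. I then read the file names from the resulting iterator, process each one in a second, disk-bound step, and delete it. Normally the disk-bound step is the faster of the two, so each temporary file is processed and deleted before the next one is created. When running on a network file system, however, the disk-bound step can become the slow one. In that case the CPU-bound step running in parallel generates temporary files faster than the disk-bound step can process and delete them, so a large number of temporary files accumulates. To avoid this, I would like the parallel iteration to pause whenever it gets more than 10 items ahead of the consumer. Is there an alternative to multiprocessing.pool.Pool.imap_unordered that can do this?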
Here is some example code that simulates the problem:
import os
from time import sleep
from multiprocessing.pool import Pool

input_values = list(range(10))

def fast_step(x):
    print("Running fast step for {x}".format(x=x))
    return x

def slow_step(x):
    print("Starting slow step for {x}".format(x=x))
    sleep(1)
    print("Finishing slow step for {x}".format(x=x))
    return x

mypool = Pool(2)
step1_results = mypool.imap(fast_step, input_values)
for i in step1_results:
    slow_step(i)
Running it produces output like this:
$ python temp.py
Running fast step for 0
Running fast step for 1
Running fast step for 2
Running fast step for 3
Running fast step for 4
Starting slow step for 0
Running fast step for 5
Running fast step for 6
Running fast step for 7
Running fast step for 8
Running fast step for 9
Finishing slow step for 0
Starting slow step for 1
Finishing slow step for 1
Starting slow step for 2
Finishing slow step for 2
Starting slow step for 3
Finishing slow step for 3
Starting slow step for 4
Finishing slow step for 4
Starting slow step for 5
Finishing slow step for 5
Starting slow step for 6
Finishing slow step for 6
Starting slow step for 7
Finishing slow step for 7
Starting slow step for 8
Finishing slow step for 8
Starting slow step for 9
Finishing slow step for 9
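For reference, the behavior being asked for (the fast step never running more than a fixed number of items ahead of the slow step) could be sketched roughly as follows. This is only one possible approach, not a built-in feature of multiprocessing: `bounded_imap`, its `max_ahead` parameter, and `demo` are hypothetical names made up for this illustration, and the sketch relies only on `Pool.apply_async` and `AsyncResult.get`. It assumes that yielding results in input order (like `imap`) is acceptable.

```python
from collections import deque
from itertools import islice
from multiprocessing.pool import Pool
from time import sleep


def bounded_imap(pool, func, iterable, max_ahead=10):
    """Yield func(x) for each x in iterable, in input order.

    Hypothetical helper (not part of multiprocessing): at most max_ahead
    tasks are submitted to the pool without having been fully consumed.
    """
    if max_ahead < 1:
        raise ValueError("max_ahead must be at least 1")
    inputs = iter(iterable)
    # Prime the window with up to max_ahead tasks.
    pending = deque(
        pool.apply_async(func, (x,)) for x in islice(inputs, max_ahead)
    )
    while pending:
        # Block until the oldest task finishes, then hand its result over.
        yield pending.popleft().get()
        # Execution resumes here only when the consumer asks for the next
        # item, i.e. after it has finished with the previous one, so this
        # is the point where one more task may be submitted.
        for x in islice(inputs, 1):
            pending.append(pool.apply_async(func, (x,)))


def fast_step(x):
    print("Running fast step for {x}".format(x=x))
    return x


def slow_step(x):
    print("Starting slow step for {x}".format(x=x))
    sleep(0.2)  # shortened from the 1 second used in the question
    print("Finishing slow step for {x}".format(x=x))
    return x


def demo():
    # Same two-step pipeline as in the question, but the fast step is
    # allowed to run at most 3 items ahead of the slow step.
    with Pool(2) as mypool:
        for i in bounded_imap(mypool, fast_step, range(10), max_ahead=3):
            slow_step(i)
```

To try it, call `demo()` from under an `if __name__ == "__main__":` guard. Note the trade-off in this sketch: results come back in input order rather than completion order, so one slow task at the head of the window delays the results queued behind it, which differs from imap_unordered.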
【Discussion】:
Tags: python python-multiprocessing