【Question title】: What do batch_size and pre_dispatch in joblib exactly mean?
【Posted】: 2016-02-16 07:54:26
【Question】:

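From the documentation here, https://pythonhosted.org/joblib/parallel.html#parallel-reference-documentation, it is not clear to me what exactly batch_size and pre_dispatch mean.

Let's consider the case where we use the 'multiprocessing' backend with 2 jobs (2 processes) and have 10 tasks to compute.

As I understand it:

batch_size controls how many tasks are pickled at a time. If you set batch_size=5, joblib pickles 5 tasks and sends them to a process in one go; once they arrive, the process solves them sequentially, one after another. With batch_size=1, joblib pickles and sends one task at a time, if and only if that process has finished its previous task.

To show what I mean: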

def solve_one_task(task):
    # Solves one task at a time
    result = ...  # placeholder: the actual computation is elided in the question
    return result

def solve_list(list_of_tasks):
    # Solves batch of tasks sequentially
    return [solve_one_task(task) for task in list_of_tasks]

So this code:

Parallel(n_jobs=2, backend = 'multiprocessing', batch_size=5)(
        delayed(solve_one_task)(task) for task in tasks)

is equivalent (in terms of performance) to this code:

slices = [(0, 5), (5, 10)]
Parallel(n_jobs=2, backend='multiprocessing', batch_size=1)(
        delayed(solve_list)(tasks[start:stop]) for start, stop in slices)

Am I right? And what does pre_dispatch mean, then?
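
The manual batching in the second snippet can be generalized so that the slices are not hard-coded. The following is a minimal stdlib-only sketch, not joblib code: `make_slices` is a hypothetical helper, and `solve_one_task` is given a stand-in body (squaring its input) purely for illustration. It runs the batches sequentially to show that chunking does not change the results.

```python
def make_slices(n_tasks, batch_size):
    """Return (start, stop) index pairs covering range(n_tasks) in chunks."""
    return [(start, min(start + batch_size, n_tasks))
            for start in range(0, n_tasks, batch_size)]


def solve_one_task(task):
    # Hypothetical stand-in for the real work: square the input.
    return task * task


def solve_list(list_of_tasks):
    # Solves a batch of tasks sequentially, as in the question.
    return [solve_one_task(task) for task in list_of_tasks]


tasks = list(range(10))
slices = make_slices(len(tasks), 5)

# Each batch yields its own list, so flatten them back into one list.
batched = [r for start, stop in slices for r in solve_list(tasks[start:stop])]
```

One difference to keep in mind when passing each `tasks[start:stop]` to `delayed(solve_list)`: the call then returns one list per batch rather than one result per task, so the output has to be flattened as in the last line above.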

【Comments】:

    Tags: python multithreading python-3.x multiprocessing joblib


    【Solution 1】:

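    It turns out I was right: the two pieces of code perform very similarly, so batch_size works as I expected in the question. pre_dispatch (as the documentation states) controls the number of instantiated tasks in the task queue.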

    from joblib import Parallel, delayed  # sklearn.externals.joblib was removed from scikit-learn
    from time import sleep, time
    
    def solve_one_task(task):
        # Solves one task at a time
        print("%d. Task #%d is being solved"%(time(), task))
        sleep(5)
        return task
    
    def task_gen(max_task):
        current_task = 0
        while current_task < max_task:
            print("%d. Task #%d was dispatched"%(time(), current_task))
            yield current_task
            current_task += 1
    
    Parallel(n_jobs=2, backend = 'multiprocessing', batch_size=1, pre_dispatch=3)(
            delayed(solve_one_task)(task) for task in task_gen(10))
    

    Output:

    1450105367. Task #0 was dispatched
    1450105367. Task #1 was dispatched
    1450105367. Task #2 was dispatched
    1450105367. Task #0 is being solved
    1450105367. Task #1 is being solved
    1450105372. Task #2 is being solved
    1450105372. Task #3 was dispatched
    1450105372. Task #4 was dispatched
    1450105372. Task #3 is being solved
    1450105377. Task #4 is being solved
    1450105377. Task #5 was dispatched
    1450105377. Task #5 is being solved
    1450105377. Task #6 was dispatched
    1450105382. Task #7 was dispatched
    1450105382. Task #6 is being solved
    1450105382. Task #7 is being solved
    1450105382. Task #8 was dispatched
    1450105387. Task #9 was dispatched
    1450105387. Task #8 is being solved
    1450105387. Task #9 is being solved
    Out[1]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
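
The log is consistent with a simple picture: pre_dispatch tasks are pulled from the generator up front, and a further task is pulled only when an earlier one finishes. The sketch below is a toy, single-worker, stdlib-only model of that behavior under this assumption; it is not joblib's implementation, and `simulate_pre_dispatch` is a hypothetical name introduced here.

```python
from collections import deque


def simulate_pre_dispatch(n_tasks, pre_dispatch):
    """Toy single-worker model; returns the ordered list of events."""
    log = []

    def task_gen():
        # Lazily produces tasks and records when each one is pulled.
        for i in range(n_tasks):
            log.append(("dispatched", i))
            yield i

    tasks = task_gen()
    queue = deque()

    # Up front, pull at most `pre_dispatch` tasks from the generator.
    for _ in range(pre_dispatch):
        task = next(tasks, None)
        if task is None:
            break
        queue.append(task)

    # Every finished task triggers pulling one more task, if any remain.
    while queue:
        log.append(("solved", queue.popleft()))
        task = next(tasks, None)
        if task is not None:
            queue.append(task)
    return log


events = simulate_pre_dispatch(n_tasks=5, pre_dispatch=3)
```

In this model the number of tasks that have been pulled but not yet solved never exceeds pre_dispatch, which is why a generator that is expensive to consume (or that holds large inputs) is not exhausted all at once.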
    

    【Discussion】:
