How to use multithreading to optimize face detection? Answers

【Question title】: How to use multi-threading for optimizing face detection?
【Posted】: 2018-07-30 14:42:36
【Question description】:

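I have code that takes a list of image URLs from a CSV file, runs face detection on those images, then loads some models and makes predictions on them.

I did some load testing and found that the get_face function in the code takes up most of the time needed to produce results, with the remaining time taken by the pickle files created for prediction.

Question: Is it possible to reduce the time by running these processes in threads, and how can this be done in a multi-threaded way?

Here is the code sample: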

from __future__ import division
import numpy as np

from multiprocessing import Process, Queue, Pool
import os
import pickle
import pandas as pd
import dlib
from skimage import io
from skimage.transform import resize

df = pd.read_csv('/home/instaurls.csv')
detector = dlib.get_frontal_face_detector()
img_width, img_height = 139, 139
confidence = 0.8

def get_face():
    output = None
    data1 = []
    for row in df.itertuples():
        img = io.imread(row[1])
        dets = detector(img, 1)
        for i, d in enumerate(dets):
            img = img[d.top():d.bottom(), d.left():d.right()]
            img = resize(img, (img_width, img_height))
            output = np.expand_dims(img, axis=0)
            break
        data1.append(output)
    data1 = np.concatenate(data1)
    return data1

get_face()

CSV sample

data
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/17883193_940000882769400_8455736118338387968_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/22427207_1737576603205281_7879421442167668736_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/12976287_1720757518213286_1180118177_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/16788491_748497378632253_566270225134125056_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/21819738_128551217878233_9151523109507956736_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/14295447_318848895135407_524281974_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/18160229_445050155844926_2783054824017494016_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/17883193_940000882769400_8455736118338387968_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/22427207_1737576603205281_7879421442167668736_n.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/12976287_1720757518213286_1180118177_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/16788491_748497378632253_566270225134125056_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/21819738_128551217878233_9151523109507956736_n.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/14295447_318848895135407_524281974_a.jpg
https://scontent-frx5-1.cdninstagram.com/t51.2885-19/s320x320/18160229_445050155844926_2783054824017494016_a.jpg
https://scontent-frt3-2.cdninstagram.com/t51.2885-19/s320x320/23101834_1502115223199537_1230866541029883904_n.jpg

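【Comments】:

You shouldn't use pandas to read a CSV of URLs ... let alone to store the images back. Pandas is not a database. As you have it now, it is (in my opinion) impossible to do any multithreading, because the get_face function loads everything into the dataframe.
From the looks of it, get_face is CPU-bound, in which case Python threads won't be of much use. You should focus on the multiprocessing module and create a pool of processes. That way you can use multiple CPU cores.
@IgnacioVergaraKausel What do you suggest? I use pandas to avoid a for loop over the CSV rows, and in the function I load everything into a numpy array, not a dataframe.
@Rehan It might be a little slower, given that pandas is better at reading CSV ... but I don't think that is your bottleneck. You did say that get_face is the function that takes up the most time, and that is the function you should try to parallelize. In fact, the function should be called get_faces, since it processes all of them. You should write a real get_face that takes just one image as its argument. That is the first step toward parallelizing and using a queue to control the flow from the URL reader to the tasks.
@Rehan I can't give you specific instructions, because you have many function calls and I don't know how they work. I can't even run this code, since it isn't an MCVE.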
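
The comments above disagree about where the bottleneck is, and get_face does two different kinds of work: io.imread downloads each URL (I/O) and the detector call is computation (CPU). A minimal sketch for timing each stage separately is shown here. The helper name profile_stages and the stage labels are hypothetical and not part of the original code.

```python
import time
from collections import defaultdict


def profile_stages(items, stages):
    """Run each (name, func) stage in order on every item.

    The output of one stage is the input of the next. Returns a dict
    mapping each stage name to its total wall-clock time in seconds.
    """
    totals = defaultdict(float)
    for item in items:
        value = item
        for name, func in stages:
            start = time.perf_counter()
            value = func(value)
            totals[name] += time.perf_counter() - start
    return dict(totals)
```

With the question's names, the stages could be passed as [("download", io.imread), ("detect", lambda img: detector(img, 1))] together with a list of URLs. If the download stage dominates, threads are a reasonable choice; if detection dominates, a process pool is more likely to help.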

Tags: python multithreading python-multithreading

【Solution 1】:

Here is how you could try running it in parallel:

from __future__ import division, print_function
import numpy as np

from multiprocessing import Pool
import dlib
from skimage import io
from skimage.transform import resize
from csv import DictReader

detector = dlib.get_frontal_face_detector()
img_width, img_height = 139, 139

def get_face(row):
    """
    Here row is a dictionary where keys are CSV header names
    and values are values from the current CSV row.
    Returns the first detected face as an array with a leading
    batch axis, or None if no face was detected.
    """
    output = None

    img = io.imread(row['data'])  # 'data' is the header of the URL column in the sample CSV
    dets = detector(img, 1)
    for d in dets:
        # Crop the first detection only; clamp at 0 because a dlib
        # rectangle can extend past the image border
        face = img[max(d.top(), 0):d.bottom(), max(d.left(), 0):d.right()]
        face = resize(face, (img_width, img_height))
        output = np.expand_dims(face, axis=0)
        break

    return output

if __name__ == '__main__':
    with open('/home/instaurls.csv') as f:
        rows = list(DictReader(f))  # plain dicts, which can be pickled and sent to workers
    pool = Pool()  # defaults to the number of CPU cores
    data = list(pool.imap(get_face, rows))  # list() consumes the iterator returned by imap
    pool.close()
    pool.join()
    # Drop images without a detected face; np.concatenate cannot handle None
    faces = [face for face in data if face is not None]
    print(np.concatenate(faces))

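Note get_face and its argument, and also what it returns. This is what I meant by small chunks of work. get_face now processes a single row of the CSV.

When you run this script, pool becomes a reference to a Pool instance, and get_face is then called once for each row read from the CSV by DictReader.

Once everything has finished, data holds the processed results, and you then call np.concatenate on it.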

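【Comments】:

I guess that since each call to get_face is independent, using threads rather than multiprocessing would be a better idea. Using concurrent.futures, which makes it easy to switch between processes and threads, might also make sense here.
I believe this function is CPU-intensive, i.e. CPU-bound. In that case, because of the GIL, only one thread runs at any given time. Since the GIL and switching between threads introduce extra overhead, you would get worse performance than with no parallelization at all. It would be different if get_face were I/O-bound. Then you would get better performance with threads, since each of them would mostly be waiting on I/O.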
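
The threads-versus-processes trade-off in the two comments above can be tried out with concurrent.futures, where switching executors is a one-line change. This is a minimal sketch, not the answerer's code: run_parallel and fake_get_face are hypothetical names, and fake_get_face only stands in for the real get_face.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_parallel(func, items, use_processes=False, max_workers=None):
    """Apply func to every item and return the results in input order.

    use_processes=False uses threads (suited to I/O-bound work);
    use_processes=True uses processes (suited to CPU-bound work).
    """
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the inputs
        return list(executor.map(func, items))


def fake_get_face(url):
    # Hypothetical stand-in for get_face: returns the URL length
    return len(url)
```

For example, run_parallel(fake_get_face, urls) runs with threads. With use_processes=True, the function must be defined at module top level so it can be pickled, and the call should sit under an if __name__ == '__main__': guard, as in the solution above.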
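@ikac Thank you very much for the solution. I integrated the code, but I still have some problems. The concatenate statement raises an error saying that zero-dimensional arrays cannot be concatenated. I tried just printing the data object right below the imap call, without any concatenation, and it returns only a single imap iterator object, although it should return multiple objects because of the recursive calls.
Oops, a bug slipped in. Make sure to update your code; I have revised the solution. Hint: data = list(pool....) rather than data = [pool...]. The reason is that imap returns an iterator and it has to be "unrolled".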
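
The difference between the two spellings in the hint can be seen in a small sketch. It uses multiprocessing.pool.ThreadPool only to keep the example lightweight; its imap returns an iterator in the same way as Pool.imap. The names double and compare_imap_collection are hypothetical.

```python
from multiprocessing.pool import ThreadPool


def double(x):
    return 2 * x


def compare_imap_collection(values):
    """Return (wrapped, unrolled) to contrast [imap(...)] with list(imap(...))."""
    pool = ThreadPool(2)
    try:
        # A list literal does not consume the iterator: this is a
        # one-element list whose only element is the iterator itself
        wrapped = [pool.imap(double, values)]
        # list() consumes the iterator and collects every result
        unrolled = list(pool.imap(double, values))
    finally:
        pool.close()
        pool.join()
    return wrapped, unrolled
```

Passing the one-element list to np.concatenate would explain the "zero-dimensional arrays cannot be concatenated" error, since NumPy treats the iterator as a single scalar object.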
@ikac Now it gives me a strange pickling error on the same line: _pickle.PicklingError: Can't pickle <class 'pandas.core.frame.Pandas'>: attribute lookup Pandas on pandas.core.frame failed
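
A likely cause of this error, assuming the asker kept passing rows from df.itertuples() to the pool: multiprocessing pickles every argument it sends to a worker, and itertuples yields namedtuples whose class (named Pandas) is generated at run time and cannot be looked up by name. The sketch below reproduces the same mechanism without pandas; make_row and can_pickle are hypothetical helpers.

```python
import pickle
from collections import namedtuple


def make_row(url):
    # The class is created at call time and is never bound to a
    # module-level name "Pandas", so pickle cannot find it by name
    row_cls = namedtuple("Pandas", ["Index", "data"])
    return row_cls(0, url)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, ImportError):
        return False


row = make_row("https://example.com/face.jpg")  # placeholder URL
plain_tuple = tuple(row)    # built-in tuple, picklable
plain_dict = row._asdict()  # plain mapping, picklable
```

Under that assumption, one workaround is to hand the pool plain values, for example the URL strings from df['data'].tolist() with get_face changed to accept a URL, or the dictionaries produced by csv.DictReader as in the solution.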