Python multiprocessing: synchronizing file-like object (answers)

【Question title】: Python multiprocessing: synchronizing file-like object
【Posted】: 2011-08-14 21:06:36
【Question description】:

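I'm trying to make a file-like object that is meant to be assigned to sys.stdout/sys.stderr during testing to provide deterministic output. It's not meant to be fast, just reliable. What I have so far almost works, but I need some help getting rid of the last few edge-case errors.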

Here is my current implementation.

try:
    from cStringIO import StringIO
except ImportError:
    from StringIO import StringIO

from os import getpid
class MultiProcessFile(object):
    """
    helper for testing multiprocessing

    multiprocessing poses a problem for doctests, since the strategy
    of replacing sys.stdout/stderr with file-like objects then
    inspecting the results won't work: the child processes will
    write to the objects, but the data will not be reflected
    in the parent doctest-ing process.

    The solution is to create file-like objects which will interact with
    multiprocessing in a more desirable way.

    All processes can write to this object, but only the creator can read.
    This allows the testing system to see a unified picture of I/O.
    """
    def __init__(self):
        # per advice at:
        #    http://docs.python.org/library/multiprocessing.html#all-platforms
        from multiprocessing import Queue
        self.__master = getpid()
        self.__queue = Queue()
        self.__buffer = StringIO()
        self.softspace = 0

    def buffer(self):
        if getpid() != self.__master:
            return

        from Queue import Empty
        from collections import defaultdict
        cache = defaultdict(str)
        while True:
            try:
                pid, data = self.__queue.get_nowait()
            except Empty:
                break
            cache[pid] += data
        for pid in sorted(cache):
            self.__buffer.write( '%s wrote: %r\n' % (pid, cache[pid]) )
    def write(self, data):
        self.__queue.put((getpid(), data))
    def __iter__(self):
        "getattr doesn't work for iter()"
        self.buffer()
        return self.__buffer
    def getvalue(self):
        self.buffer()
        return self.__buffer.getvalue()
    def flush(self):
        "meaningless"
        pass

...and a quick test script:

#!/usr/bin/python2.6

from multiprocessing import Process
from mpfile import MultiProcessFile

def printer(msg):
    print msg

processes = []
for i in range(20):
    processes.append( Process(target=printer, args=(i,), name='printer') )

print 'START'
import sys
buffer = MultiProcessFile()
sys.stdout = buffer

for p in processes:
    p.start()
for p in processes:
    p.join()

for i in range(20):
    print i,
print

sys.stdout = sys.__stdout__
sys.stderr = sys.__stderr__
print 
print 'DONE'
print
buffer.buffer()
print buffer.getvalue()

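This works perfectly 95% of the time, but it has three edge-case problems. I have to run the test script in a fast while loop to reproduce them.

3% of the time, the parent process's output isn't completely reflected. I assume this is because the data is consumed before the Queue's flushing thread can catch up. I haven't thought of a way to wait for the thread without deadlocking.
0.5% of the time, there's a traceback from the multiprocessing.Queue implementation.
0.01% of the time, the PIDs wrap around, so sorting by PID gives the wrong ordering.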
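
The first problem can be reproduced in isolation. The sketch below is illustrative only and assumes Python 3 (the question targets 2.6); the helper name `put_then_drain` is hypothetical. `multiprocessing.Queue.put()` hands the item to a background feeder thread, so a `get_nowait()` issued right afterwards in the same process may raise `queue.Empty`, which is what `buffer()` above runs into. A blocking `get()` with a timeout avoids that, but only when the number of expected items is known in advance.

```python
import multiprocessing
import queue  # only for the queue.Empty exception


def put_then_drain(items, wait):
    """Put items on a multiprocessing.Queue and read them back in the same process.

    wait=False mimics buffer() above: get_nowait() may raise Empty before the
    feeder thread has written the data to the underlying pipe.
    wait=True blocks (with a timeout) until each expected item has arrived.
    """
    q = multiprocessing.Queue()
    for item in items:
        q.put(item)  # returns immediately; a feeder thread does the real write
    received = []
    try:
        for _ in items:
            received.append(q.get(timeout=5) if wait else q.get_nowait())
    except queue.Empty:
        pass  # the feeder thread had not caught up yet (or the timeout expired)
    return received
```

With `wait=False` the returned list may be shorter than `items`, depending on timing; with `wait=True` it should contain every item in order.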

In the very worst case (odds: one in 70 million), the output would look like this:

START

DONE

302 wrote: '19\n'
32731 wrote: '0 1 2 3 4 5 6 7 8 '
32732 wrote: '0\n'
32734 wrote: '1\n'
32735 wrote: '2\n'
32736 wrote: '3\n'
32737 wrote: '4\n'
32738 wrote: '5\n'
32743 wrote: '6\n'
32744 wrote: '7\n'
32745 wrote: '8\n'
32749 wrote: '9\n'
32751 wrote: '10\n'
32752 wrote: '11\n'
32753 wrote: '12\n'
32754 wrote: '13\n'
32756 wrote: '14\n'
32757 wrote: '15\n'
32759 wrote: '16\n'
32760 wrote: '17\n'
32761 wrote: '18\n'

Exception in thread QueueFeederThread (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
  File "/usr/lib/python2.6/threading.py", line 484, in run
      File "/usr/lib/python2.6/multiprocessing/queues.py", line 233, in _feed
<type 'exceptions.TypeError'>: 'NoneType' object is not callable

In Python 2.7 the exception is slightly different:

Exception in thread QueueFeederThread (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 505, in run
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
<type 'exceptions.IOError'>: [Errno 32] Broken pipe

How do I get rid of these edge cases?

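【Question comments】:

What is the actual question you're asking? Why are you getting these exceptions? Why does each of the edge cases happen?
@Daniel: How to get rid of these three problems. I think I've made myself clearer by adding a sentence to the introduction. Does that help?

Tags: python multithreading multiprocessing python-2.6 python-multithreading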

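【Solution 1】:

The solution came in two parts. I've successfully run the test program 200,000 times without any change in output.

The easy part was to use multiprocessing.current_process()._identity to sort the messages. This is not part of the published API, but it is a unique, deterministic identifier for each process. This fixed the problem of PIDs wrapping around and giving a bad ordering of the output.

The other part of the solution was to use multiprocessing.Manager().Queue() rather than multiprocessing.Queue. This fixes problem #2 above because the manager lives in a separate process, which avoids some of the bad special cases that arise when using a Queue from the owning process. #3 is fixed because the Queue is fully exhausted and the feeder thread dies naturally before Python starts shutting down and closes stdin.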
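
The two changes described above could be combined roughly as follows. This is a minimal sketch rather than the answerer's actual code: it assumes Python 3 and the POSIX-only "fork" start method (so that children inherit the object), and the names `ManagedMultiProcessFile` and `run_demo` are hypothetical. It also relies on `_identity`, which is private and may change between Python versions.

```python
import multiprocessing
import queue  # only for the queue.Empty exception
from collections import defaultdict
from io import StringIO

# Assumption: the POSIX-only "fork" start method, so children inherit the object.
_ctx = multiprocessing.get_context("fork")


class ManagedMultiProcessFile:
    """Every process may write; only the creating process can read."""

    def __init__(self):
        self._manager = _ctx.Manager()  # the queue lives in the manager's process
        self._queue = self._manager.Queue()
        self._creator = multiprocessing.current_process()._identity
        self._buffer = StringIO()

    def write(self, data):
        # _identity is private API: () in the main process, then one counter
        # value per Process object, assigned in creation order.
        ident = multiprocessing.current_process()._identity
        self._queue.put((ident, data))  # returns once the manager holds the item

    def flush(self):
        pass  # nothing to flush: put() above is synchronous

    def getvalue(self):
        if multiprocessing.current_process()._identity == self._creator:
            cache = defaultdict(str)
            while True:
                try:
                    ident, data = self._queue.get_nowait()
                except queue.Empty:
                    break
                cache[ident] += data
            for ident in sorted(cache):  # creation order, unaffected by PID reuse
                self._buffer.write("%r wrote: %r\n" % (ident, cache[ident]))
        return self._buffer.getvalue()

    def close(self):
        self._manager.shutdown()  # stop the manager process explicitly


def run_demo(n=4):
    """Start n children that print into the shared object; return the merged text."""
    out = ManagedMultiProcessFile()
    try:
        procs = [_ctx.Process(target=print, args=(i,), kwargs={"file": out})
                 for i in range(n)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print("parent done", file=out)
        return out.getvalue()
    finally:
        out.close()
```

Because each `put()` on a manager queue is a synchronous call to the manager process, everything a child wrote is already in the queue by the time `join()` returns, so the non-blocking drain in `getvalue()` should not miss data. The price is one extra process and a round trip per write.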

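【Comments】:

Using multiprocessing.Manager().Queue() instead of multiprocessing.Queue got rid of the "[Errno 32] Broken pipe" error in Python 2.7.
@JoshuaRichardson Using multiprocessing.Manager().Queue() solved this for me too, but my tests now take about 7 times as long as with multiprocessing.queues.Queue().
@Bengt: I hope you're not instantiating a manager for each queue. You only need one. Could you show us a minimal benchmark?
@JoshuaRichardson I think it is better to use a new manager object in each test, because that rules out side effects and makes the cause more obvious when a test fails. The cost is acceptable for me, but others might find it prohibitively expensive, depending on the ratio of queue instantiations to other code.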

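【Solution 2】:

I have run into far fewer multiprocessing bugs with Python 2.7 than with Python 2.6. That said, the solution I used to avoid the "Exception in thread QueueFeederThread" problem is to sleep momentarily, perhaps for 0.01 s, in each process that uses the Queue. It is true that using sleep is undesirable and not even reliable, but that duration worked well enough for me in practice. You can also try 0.1 s.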
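
A minimal sketch of that workaround, assuming Python 3 and the POSIX-only "fork" start method; the names `_sleepy_worker` and `collect` are hypothetical, and the pause length is the answer's own guess rather than a guaranteed bound.

```python
import multiprocessing
import time

# Assumption: the POSIX-only "fork" start method, as in the question's setup.
_fork = multiprocessing.get_context("fork")


def _sleepy_worker(q, value, pause):
    q.put(value)       # handed to the queue's background feeder thread
    time.sleep(pause)  # the workaround: give the feeder thread time to flush


def collect(n, pause=0.01):
    """Run n workers that each put one value, then sleep briefly before exiting."""
    q = _fork.Queue()
    procs = [_fork.Process(target=_sleepy_worker, args=(q, i, pause))
             for i in range(n)]
    for p in procs:
        p.start()
    # Read before joining, so no child blocks on a full pipe.
    results = sorted(q.get(timeout=10) for _ in range(n))
    for p in procs:
        p.join()
    return results
```

The sleep only makes the race less likely; it does not remove it, which is the objection raised in the comment below.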

【Comments】:

Sleeping is never a reliable solution.