【Question title】: Reading stdout from a subprocess in real time
【Posted】: 2018-03-17 11:10:56
【Question】:

Consider this snippet:

from subprocess import Popen, PIPE, CalledProcessError


def execute(cmd):
    with Popen(cmd, shell=True, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
        for line in p.stdout:
            print(line, end='')

    if p.returncode != 0:
        raise CalledProcessError(p.returncode, p.args)

base_cmd = [
    "cmd", "/c", "d:\\virtual_envs\\py362_32\\Scripts\\activate",
    "&&"
]
cmd1 = " ".join(base_cmd + ['python -c "import sys; print(sys.version)"'])
cmd2 = " ".join(base_cmd + ["python -m http.server"])

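If I run execute(cmd1), the output is printed without any problem.

However, if I run execute(cmd2) instead, nothing is printed. Why is that, and how can I fix it so that I can see http.server's output in real time?

Also, how is for line in p.stdout evaluated internally? Is it some kind of endless loop that runs until stdout reaches EOF, or something else?

This topic has already been discussed several times on SO, but I haven't found a Windows solution. The snippet above is actually code from an existing answer, and I'm trying to run http.server from a virtualenv (python3.6.2-32bits on win7).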

【Comments】:

Tags: python windows subprocess sublimetext3 popen

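【Answer 1】:

If you want to read continuously from a running subprocess, you have to make that process's output unbuffered. Your subprocess is a Python program, so this can be done by passing -u to the interpreter: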

python -u -m http.server

This is how it looks on a Windows box (screenshot in the original answer).
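
As a minimal sketch of the same idea driven from Python, the child below is started with -u and its lines are consumed as they arrive. The helper name stream_lines and the inline child script are illustrative assumptions, not part of the original answer; stderr is merged into stdout because http.server writes its request log to stderr.

```python
import subprocess
import sys


def stream_lines(cmd):
    """Yield the child's output line by line as it is produced."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe
        bufsize=1,                 # line-buffered on the parent side (text mode)
        universal_newlines=True,
    ) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the with-block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


if __name__ == "__main__":
    # -u makes the child's stdout/stderr unbuffered, so every line is
    # written to the pipe as soon as it is printed.
    child = [sys.executable, "-u", "-c", "print('one'); print('two')"]
    for line in stream_lines(child):
        print(line)
```

For the question's case, the command list would presumably end in "-u", "-m", "http.server" instead of the inline script.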

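【Comments】:

Works for me, although I didn't want to boot up a Windows box for this, so I removed your cmd /c ... stuff.
Streams and buffering work more or less the same on Windows and other operating systems. See the screenshot in my updated answer. Can you reproduce this?
@fpbhb You can remove the " ".join, since it does nothing when applied to a list with a single element.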

【Answer 2】:

With this code, you can't see the output in real time because of buffering:

for line in p.stdout:
    print(line, end='')

But if you use p.stdout.readline(), it should work:

while True:
  line = p.stdout.readline()
  if not line: break
  print(line, end='')

For details, see the corresponding Python bug discussion.

UPD: here on Stack Overflow you can find an almost identical problem with various solutions.

【Comments】:

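【Answer 3】:

I think the main problem is that http.server somehow logs its output to stderr. Here I have an example with asyncio that reads data from either stdout or stderr.

My first attempt was to use asyncio, a nice API that has existed since Python 3.4. Later I found a simpler solution, so you can choose; both should work.

asyncio as a solution

In the background asyncio is using IOCP, a Windows API for asynchronous I/O.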

# inspired by https://pymotw.com/3/asyncio/subprocesses.html

import asyncio
import sys
import time

if sys.platform == 'win32':
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)

async def run_webserver():
    buffer = bytearray()

    # start the webserver without buffering (-u), capturing stdout and stderr through pipes
    print('launching process')
    proc = await asyncio.create_subprocess_exec(
        sys.executable, '-u', '-mhttp.server',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
    )

    print('process started {}'.format(proc.pid))
    while 1:
        # wait either for stderr or stdout and loop over the results
        for line in asyncio.as_completed([proc.stderr.readline(), proc.stdout.readline()]):
            print('read {!r}'.format(await line))

event_loop = asyncio.get_event_loop()
try:
    event_loop.run_until_complete(run_webserver())
finally:
    event_loop.close()
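
A variant sketch of the asyncio approach, offered as an alternative rather than as the answer's own code: each pipe gets its own reader coroutine, so a quiet stream cannot stall the other, and both readers stop at EOF. The names pump, run_child and on_line are assumptions made for this example. It uses asyncio.run, which needs Python 3.7 or later; on Windows before Python 3.8 the ProactorEventLoop would still have to be installed as shown above.

```python
import asyncio
import sys


async def pump(stream, label, on_line):
    """Read one pipe until EOF and hand every line to on_line."""
    while True:
        line = await stream.readline()
        if not line:  # b'' signals EOF
            break
        on_line(label, line.decode().rstrip())


async def run_child(args, on_line):
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # One reader per pipe, running concurrently.
    await asyncio.gather(
        pump(proc.stdout, "stdout", on_line),
        pump(proc.stderr, "stderr", on_line),
    )
    return await proc.wait()


if __name__ == "__main__":
    script = "import sys; print('out'); print('err', file=sys.stderr)"
    asyncio.run(run_child(
        [sys.executable, "-u", "-c", script],
        lambda label, text: print(label, text),
    ))
```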

Redirecting stderr to stdout

Based on your example, this is a really simple solution. It just redirects stderr to stdout, and only stdout is read.

from subprocess import Popen, PIPE, CalledProcessError, run, STDOUT
import os

def execute(cmd):
    with Popen(cmd, stdout=PIPE, stderr=STDOUT, bufsize=1) as p:
        while 1:
            print('waiting for a line')
            line = p.stdout.readline()
            if not line:  # b'' means the child closed its output (EOF)
                break
            print(line)

cmd2 = ["python", "-u", "-m", "http.server"]

execute(cmd2)

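【Comments】:

【Answer 4】:

How is for line in p.stdout evaluated internally? Is it some kind of endless loop until it reaches stdout EOF?

p.stdout is a (blocking) buffer. When you read from an empty buffer, you are blocked until something is written to it. Once something is in there, you get the data and the loop body runs.

Think of how tail -f works on Linux: it waits until something is written to the file, and then echoes the new data to the screen. What happens when there is no data? It waits. So when your program reaches this line, it waits for data and then processes it.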
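
A small sketch of that blocking behavior: the child (an inline script chosen here only for illustration) sleeps between lines, and the parent's loop waits in each iteration until the next line arrives, ending once the child closes its stdout. The names CHILD and timed_lines are assumptions made for this example.

```python
import subprocess
import sys
import time

CHILD = (
    "import time\n"
    "for i in range(3):\n"
    "    print('tick', i, flush=True)\n"
    "    time.sleep(0.2)\n"
)


def timed_lines():
    """Return (seconds_since_start, line) pairs for each line of the child."""
    start = time.monotonic()
    result = []
    with subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ) as proc:
        # Each iteration blocks until a full line is available;
        # the loop ends when the child closes stdout (EOF).
        for line in proc.stdout:
            result.append((time.monotonic() - start, line.rstrip("\n")))
    return result


if __name__ == "__main__":
    for elapsed, line in timed_lines():
        print("{:.2f}s {}".format(elapsed, line))
```

The child flushes explicitly here, so the timestamps should spread out over the run; without the flush (and without -u) the lines would typically all show up together when the child exits.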

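Since your code works, but not when http.server is run as a module, it has to be related to this somehow. The http.server module probably buffers its output. Try adding the -u parameter to Python to run the process unbuffered: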

-u : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
     see man page for details on internal buffering relating to '-u'
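
The help text also mentions PYTHONUNBUFFERED. As a sketch, the same effect can be requested through the child's environment instead of its command line, which may be convenient when the python invocation sits behind a wrapper such as the activate && ... chain in the question. The helper name unbuffered_env is an assumption made for this example.

```python
import os
import subprocess
import sys


def unbuffered_env(base=None):
    """Return a copy of the environment with PYTHONUNBUFFERED set."""
    env = dict(os.environ if base is None else base)
    env["PYTHONUNBUFFERED"] = "1"  # any non-empty value acts like -u
    return env


if __name__ == "__main__":
    with subprocess.Popen(
        [sys.executable, "-c", "print('hello from the child')"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        env=unbuffered_env(),
    ) as proc:
        for line in proc.stdout:
            print(line, end="")
```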

Also, you might want to try changing your loop to for line in iter(lambda: p.stdout.read(1), ''):, as this reads 1 byte at a time before processing.

Update: the full loop code is

for line in iter(lambda: p.stdout.read(1), ''):
    sys.stdout.write(line)
    sys.stdout.flush()

Also, you are passing your command as a string. Try passing it as a list, with each element in its own slot:

cmd = ['python', '-m', 'http.server', ..]

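【Comments】:

@BPL I've updated my answer to include the loop code, plus another suggestion you can try.

【Answer 5】:

You can get unbuffered behavior at the OS level.

On Linux, you can wrap your existing command line with stdbuf: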

stdbuf -i0 -o0 -e0 YOURCOMMAND

Or on Windows, you can wrap your existing command line with winpty:

winpty.exe -Xallow-non-tty -Xplain YOURCOMMAND

I'm not aware of an OS-neutral tool for this.
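
One way to approximate a portable wrapper from Python is to prepend stdbuf only when it is found on PATH. This is a sketch under assumptions: the helper name with_stdbuf is hypothetical, and it presumes the wrapped program relies on C stdio buffering. A Python 3 child manages its own buffers and is generally unaffected by stdbuf, so -u or PYTHONUNBUFFERED stays the better fit there.

```python
import shutil


def with_stdbuf(cmd):
    """Prefix cmd with stdbuf when the tool is available; otherwise return a copy unchanged."""
    stdbuf = shutil.which("stdbuf")
    if stdbuf is None:
        return list(cmd)
    return [stdbuf, "-i0", "-o0", "-e0"] + list(cmd)


if __name__ == "__main__":
    print(with_stdbuf(["YOURCOMMAND"]))
```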

【Comments】: