Do you have to check exit_status_ready if you are going to check recv_ready()? (Answers)

[Question title]: Do you have to check exit_status_ready if you are going to check recv_ready()?
[Posted]: 2014-05-06 20:16:45
[Question description]:

I am running a remote command with:

ssh = paramiko.SSHClient()
ssh.connect(host)
stdin, stdout, stderr = ssh.exec_command(cmd)

Now I want to get the output. I have seen things like this:

# Wait for the command to finish
while not stdout.channel.exit_status_ready():
    if stdout.channel.recv_ready():
        stdoutLines = stdout.readlines()

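But that sometimes seems to never run readlines() (even when there should be data on stdout). To me this seems to mean that stdout.channel.recv_ready() is not necessarily ready (True) by the time stdout.channel.exit_status_ready() is True.

Is something like this appropriate?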

# Wait until the data is available
while not stdout.channel.recv_ready():
    pass

stdoutLines = stdout.readlines()

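That is, do I really have to check the exit status first, before waiting for recv_ready() to say the data is ready?

How would I know whether there is supposed to be data on stdout before waiting in an infinite loop for stdout.channel.recv_ready() to become True (which it never does if there is not supposed to be any stdout output)?

[Question discussion]:

I tried to do the same thing: stackoverflow.com/questions/14643861/…. Check it out.

Tags: python paramiko

[Solution 1]:

That is, do I really have to check the exit status first, before waiting for recv_ready() to say the data is ready?

No. It is perfectly fine to receive data (e.g. stdout/stderr) from the remote process even though it has not finished yet. Also, some sshd implementations do not even provide the exit status of the remote proc, in which case you will run into problems; see paramiko doc: exit_status_ready.

The problem with waiting on exit_status_ready() for short-lived remote commands is that your local thread may receive the exit code faster than you check your loop condition. In that case you never enter the loop, and readlines() is never called. Here is an example: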

# spawns new thread to communicate with remote
# executes whoami which exits pretty fast
stdin, stdout, stderr = ssh.exec_command("whoami") 
time.sleep(5)  # main thread waits 5 seconds
# command already finished, exit code already received
#  and set by the exec_command thread.
# therefore the loop condition is not met 
#  as exit_status_ready() already returns True 
#  (remember, remote command already exited and was handled by a different thread)
while not stdout.channel.exit_status_ready():
    if stdout.channel.recv_ready():
        stdoutLines = stdout.readlines()
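
The race described above can be reproduced without a network connection. The sketch below is only an illustration: FakeChannel is a hypothetical stand-in (not part of paramiko) that mimics just exit_status_ready(), recv_ready() and recv(), and the two helper names are made up. It suggests that draining the buffer once more after the loop picks up output that arrived before the first loop check; it does not handle stderr or timeouts.

```python
class FakeChannel:
    """Hypothetical stand-in for a paramiko Channel (illustration only)."""

    def __init__(self, buffered, exited):
        self._buffer = buffered      # bytes already received locally
        self._exited = exited        # True once the exit status has arrived

    def exit_status_ready(self):
        return self._exited

    def recv_ready(self):
        return len(self._buffer) > 0

    def recv(self, nbytes):
        data, self._buffer = self._buffer[:nbytes], self._buffer[nbytes:]
        return data


def read_loop_only(channel):
    """The pattern from the question: read only while the command is running."""
    chunks = []
    while not channel.exit_status_ready():
        if channel.recv_ready():
            chunks.append(channel.recv(4096))
    return b"".join(chunks)


def read_loop_then_drain(channel):
    """Same loop, plus a final drain of whatever is still buffered."""
    chunks = []
    while not channel.exit_status_ready():
        if channel.recv_ready():
            chunks.append(channel.recv(4096))
    while channel.recv_ready():
        chunks.append(channel.recv(4096))
    return b"".join(chunks)


# The command has already exited, but its output is still sitting in the buffer.
lost = read_loop_only(FakeChannel(b"root\n", exited=True))
kept = read_loop_then_drain(FakeChannel(b"root\n", exited=True))
print(repr(lost), repr(kept))
```

With the loop alone the buffered output is never read; with the extra drain it is returned.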

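How would I know whether there is supposed to be data on stdout before waiting in an infinite loop for stdout.channel.recv_ready() to become True (which it never does if there is not supposed to be any stdout output)?

channel.recv_ready() only indicates that there is unread data in the buffer.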

def recv_ready(self):
    """
    Returns true if data is buffered and ready to be read from this
    channel.  A ``False`` result does not mean that the channel has closed;
    it means you may need to wait before more data arrives.
    """

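This means that recv_ready() may be False simply because of the network (delayed packets, retransmissions, ...) or because your remote process does not write to stdout/stderr on a regular basis. Using recv_ready() as the loop condition may therefore make your code return prematurely: it is perfectly normal for it to yield True at one moment (when the remote process wrote to stdout and your local channel thread received that output) and False at another (e.g. your remote proc is sleeping during one iteration and not writing to stdout).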
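
Because a False from recv_ready() does not mean the output has ended, one alternative is to loop on recv() itself until it returns an empty bytes object, which paramiko documents as the sign that the channel stream has closed. The sketch below is a minimal illustration under that assumption; ScriptedChannel is a hypothetical stand-in (not part of paramiko) and read_until_eof is a made-up helper name.

```python
from collections import deque


class ScriptedChannel:
    """Hypothetical stand-in for a paramiko Channel (illustration only).

    It replays a fixed list of chunks; an empty bytes object marks EOF,
    which is how paramiko's Channel.recv() signals a closed stream.
    """

    def __init__(self, chunks):
        self._chunks = deque(chunks)

    def recv(self, nbytes):
        if not self._chunks:
            return b""
        chunk = self._chunks.popleft()
        if len(chunk) > nbytes:
            # Hand back only nbytes and keep the remainder for the next call.
            self._chunks.appendleft(chunk[nbytes:])
            chunk = chunk[:nbytes]
        return chunk


def read_until_eof(channel, chunk_size=4096):
    """Collect output until recv() reports EOF, not until recv_ready() is False."""
    chunks = []
    while True:
        data = channel.recv(chunk_size)
        if not data:          # b"" means the remote side closed the stream
            break
        chunks.append(data)
    return b"".join(chunks)


output = read_until_eof(ScriptedChannel([b"line 1\n", b"line 2\n"]), chunk_size=4)
print(output)
```

On a real channel recv() blocks while it waits, and this loop does not drain stderr, so it is not a replacement for handling both streams.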

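Besides that, people occasionally experience paramiko hangs that might be related to the stdout/stderr buffers filling up (possibly the same kind of problem as with Popen and hanging procs, when you never read from stdout/stderr and the internal buffers fill up).

The code below implements a chunked solution that reads from stdout/stderr to empty the buffers while the channel is open.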

import select

def myexec(ssh, cmd, timeout, want_exitcode=False):
  # one channel per command
  stdin, stdout, stderr = ssh.exec_command(cmd) 
  # get the shared channel for stdout/stderr/stdin
  channel = stdout.channel

  # we do not need stdin.
  stdin.close()                 
  # indicate that we're not going to write to that channel anymore
  channel.shutdown_write()      

  # read stdout/stderr in order to prevent read block hangs
  stdout_chunks = []
  stdout_chunks.append(stdout.channel.recv(len(stdout.channel.in_buffer)))
  # chunked read to prevent stalls
  while not channel.closed or channel.recv_ready() or channel.recv_stderr_ready(): 
      # stop if channel was closed prematurely, and there is no data in the buffers.
      got_chunk = False
      readq, _, _ = select.select([stdout.channel], [], [], timeout)
      for c in readq:
          if c.recv_ready(): 
              stdout_chunks.append(stdout.channel.recv(len(c.in_buffer)))
              got_chunk = True
          if c.recv_stderr_ready(): 
              # make sure to read stderr to prevent stall    
              stderr.channel.recv_stderr(len(c.in_stderr_buffer))  
              got_chunk = True  
      # 1) make sure that there are at least 2 cycles with no data in the input
      #    buffers in order to not exit too early (i.e. cat on a >200k file).
      # 2) if no data arrived in the last loop, check if we already received the exit code
      # 3) check if input buffers are empty
      # 4) exit the loop
      if not got_chunk \
          and stdout.channel.exit_status_ready() \
          and not stderr.channel.recv_stderr_ready() \
          and not stdout.channel.recv_ready(): 
          # indicate that we're not going to read from this channel anymore
          stdout.channel.shutdown_read()  
          # close the channel
          stdout.channel.close()
          break    # exit as remote side is finished and our buffers are empty

  # close all the pseudofiles
  stdout.close()
  stderr.close()

  if want_exitcode:
      # exit code is always ready at this point
      # recv() returns bytes on Python 3, so join with a bytes separator
      return (b''.join(stdout_chunks), stdout.channel.recv_exit_status())
  return b''.join(stdout_chunks)

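channel.closed is just the ultimate exit condition in case the channel closes prematurely. Right after a chunk has been read, the code checks whether the exit status has already been received and whether no new data was buffered in the meantime. If new data arrived or no exit status was received, the code keeps trying to read chunks. Once the remote proc has exited and there is no new data in the buffers, we assume we have read everything and start closing the channel. Note that if you want the exit status, you should always wait until it has been received, otherwise paramiko might block forever.

This guarantees that the buffers do not fill up and make your proc hang. The myexec function above only returns once the remote command has exited and no data is left in our local buffers. By using select() instead of polling in a busy loop the code is also a bit more CPU friendly, but it might be a bit slower for short-lived commands.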
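
To illustrate the difference between select() and a busy loop, here is a minimal sketch that uses a local socket pair in place of an SSH channel (an assumption made so the example runs without a server; a paramiko Channel can be passed to select.select() in the same way, as the code above does). The helper name wait_readable is made up.

```python
import select
import socket


def wait_readable(sock, timeout):
    """Block until sock has data or timeout seconds pass; no busy loop."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


# socketpair() gives two connected local sockets; `remote` plays the server.
local, remote = socket.socketpair()
try:
    nothing_yet = wait_readable(local, timeout=0.1)   # no data was sent yet
    remote.sendall(b"hello")
    has_data = wait_readable(local, timeout=1.0)      # data is waiting now
    received = local.recv(1024)
finally:
    local.close()
    remote.close()

print(nothing_yet, has_data, received)
```

The first call sleeps inside select() for the whole timeout instead of spinning, which is where the CPU savings come from; the price is up to one timeout of extra latency when nothing arrives.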

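Just for reference: to safeguard against some infinite loops, you can set a channel timeout that fires when no data arrives for a period of time: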

 chan.settimeout(timeout)
 chan.exec_command(command)
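
Once a timeout is set, paramiko documents that blocking channel operations such as recv() raise socket.timeout when it expires, so the caller has to catch that exception. The sketch below shows the pattern with a plain local socket standing in for the channel (an assumption, so that it runs without an SSH server).

```python
import socket

# A local socket pair stands in for the SSH channel in this sketch.
local, remote = socket.socketpair()
local.settimeout(0.2)          # same idea as chan.settimeout(timeout)
try:
    try:
        local.recv(1024)       # nothing was sent, so this times out
        timed_out = False
    except socket.timeout:
        timed_out = True       # give up instead of waiting forever
finally:
    local.close()
    remote.close()

print(timed_out)
```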

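[Discussion]:

Thanks for the detailed explanation. It seems I have the same problem. For some reason I get a situation (on the first iteration) where the exit code is ready but stdout/stderr are not, so it never even enters the loop. That is strange: how is it possible? Could you explain the code? Why repeat the checks in both the while and the if? Isn't the while check enough? Also, why do you read before the while loop? It seems the same thing would happen automatically, since recv_ready() would be true on the first iteration, wouldn't it? Also, channel.close is undocumented, right?
Great answer. It helped me a lot, thanks! Where did you find out that Paramiko's Channel object has an in_buffer data member? I can't find it anywhere in the documentation.
Afaik it is not documented and probably not meant to be used directly. Channel.__repr__ also uses it to get the current size of the buffer (github.com/paramiko/paramiko/blob/master/paramiko/…). That said, we had major problems with stalling ssh sessions when using the built-in exec_command in our test automation system (lots of parallel ssh sessions), and this trick solved all of them.
@tintin In your example, if the user runs a command containing something like 'sudo -S -p "" ls -l', then because of the -p "" in the command it gets stuck waiting for a chunk in the for c in readq loop. Any idea why that happens and how to handle it?
Why do you use channel and stdout.channel interchangeably? Is there a reason, or is it just a leftover in the code?

[Solution 2]:

Add the following lines after ssh.exec_command(cmd). The loop keeps running as long as the shell script is running and exits as soon as it completes.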

while int(stdout.channel.recv_exit_status()) != 0:
    time.sleep(1)
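
According to the paramiko documentation, recv_exit_status() itself waits until the command has finished (or the channel is closed), so a single call may already be enough, and a loop that compares the result with 0 would keep spinning for a command that exits with a non-zero status. The sketch below illustrates the blocking behaviour with a hypothetical BlockingExitChannel stand-in (not part of paramiko). The same documentation warns that the call can hang if the command produces more output than the window size and nothing reads it first.

```python
import threading


class BlockingExitChannel:
    """Hypothetical stand-in for a paramiko Channel (illustration only)."""

    def __init__(self):
        self._done = threading.Event()
        self._status = -1

    def finish(self, status):
        # Called by the "remote" side when the command exits.
        self._status = status
        self._done.set()

    def recv_exit_status(self):
        # Like paramiko's method, this waits until the status is known.
        self._done.wait()
        return self._status


channel = BlockingExitChannel()
# Simulate a command that exits with status 3 after a short delay.
timer = threading.Timer(0.1, channel.finish, args=(3,))
timer.start()
status = channel.recv_exit_status()   # returns only after finish() ran
timer.join()
print(status)
```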

[Discussion]: