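That is, do I really have to check the exit status first before waiting for recv_ready() to say the data is ready?
No. It is perfectly fine to receive data (e.g. stdout/stderr) from the remote process even though it has not finished yet. Also, some sshd implementations do not even provide the exit status of the remote proc, in which case you will run into problems; see the paramiko doc: exit_status_ready.
The problem with waiting for the exit status of a short-lived remote command is that your local thread may receive the exit code faster than you check your loop condition. In that case you never enter the loop, and readlines() is never called. Here's an example: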
import time

# ssh is assumed to be an already connected paramiko.SSHClient.
# exec_command() hands communication with the remote side to a separate thread;
# whoami exits almost immediately.
stdin, stdout, stderr = ssh.exec_command("whoami")
time.sleep(5)  # main thread waits 5 seconds
# The command has already finished; its exit code was received and set
# by that other thread while we were sleeping.
# The loop condition is therefore never met, because
# exit_status_ready() already returns True.
while not stdout.channel.exit_status_ready():
    if stdout.channel.recv_ready():
        stdoutLines = stdout.readlines()
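To make the race visible without an SSH server, here is a minimal sketch that uses a hypothetical FakeChannel (not part of paramiko) which only mimics exit_status_ready(), recv_ready() and recv(). It assumes the command has already exited by the time the loop starts. The second function drains whatever is still buffered after the loop, which is one possible way to avoid losing the output.

```python
class FakeChannel:
    """Hypothetical stand-in for a channel whose command has already exited."""

    def __init__(self, data):
        self._data = data

    def exit_status_ready(self):
        return True  # the remote command finished before we looked

    def recv_ready(self):
        return bool(self._data)

    def recv(self, nbytes):
        chunk, self._data = self._data[:nbytes], self._data[nbytes:]
        return chunk


def read_racy(channel):
    # Same shape as the loop above: the body never runs if the
    # exit status is already available.
    chunks = []
    while not channel.exit_status_ready():
        if channel.recv_ready():
            chunks.append(channel.recv(4096))
    return b"".join(chunks)


def read_then_drain(channel):
    chunks = []
    while not channel.exit_status_ready():
        if channel.recv_ready():
            chunks.append(channel.recv(4096))
    # Drain what is still sitting in the buffer after the command exited.
    while channel.recv_ready():
        chunks.append(channel.recv(4096))
    return b"".join(chunks)


lost = read_racy(FakeChannel(b"root\n"))
kept = read_then_drain(FakeChannel(b"root\n"))
```

With the fake channel, read_racy() comes back empty even though output was buffered, while read_then_drain() returns it.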
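How would I know if there is supposed to be data on stdout before waiting in an infinite loop for stdout.channel.recv_ready() to become True (which it does not if there isn't supposed to be any stdout output)?
channel.recv_ready() just indicates that there is unread data in the buffer.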
def recv_ready(self):
    """
    Returns true if data is buffered and ready to be read from this
    channel. A ``False`` result does not mean that the channel has closed;
    it means you may need to wait before more data arrives.
    ...
    """
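This means that recv_ready may be False simply because of networking (delayed packets, retransmissions, ...) or because your remote process does not write to stdout/stderr on a regular basis. Therefore, using recv_ready() as the loop condition may cause your code to return prematurely: it is perfectly normal for it to yield True at some point (when the remote process wrote to stdout and your local channel thread received that output) and False at another (e.g. when your remote proc is sleeping instead of writing to stdout during one iteration).
Besides that, people occasionally experience paramiko hangs that might be related to stdout/stderr buffers filling up (possibly the same kind of problem as with Popen and hanging procs when you never read from stdout/stderr and the internal buffers fill up).
The code below implements a chunked solution that reads from stdout/stderr to empty the buffers while the channel is open.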
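A small sketch of that premature return, again with a made-up stand-in rather than a real paramiko channel. BurstyChannel is hypothetical: every call to recv_ready() advances a scripted timeline in which bytes mean "data arrived" and None means "the remote side was silent".

```python
from collections import deque


class BurstyChannel:
    """Hypothetical fake channel with gaps between bursts of output."""

    def __init__(self, ticks):
        self._ticks = deque(ticks)  # bytes = data arrives, None = silence
        self._buf = b""

    def recv_ready(self):
        # Each poll advances the scripted timeline by one step.
        if self._ticks:
            item = self._ticks.popleft()
            if item:
                self._buf += item
        return bool(self._buf)

    def exit_status_ready(self):
        # The fake "process" exits once the timeline is used up.
        return not self._ticks

    def recv(self, nbytes):
        chunk, self._buf = self._buf[:nbytes], self._buf[nbytes:]
        return chunk


def read_while_ready(channel):
    # Stops at the first moment of silence, even if more output follows.
    chunks = []
    while channel.recv_ready():
        chunks.append(channel.recv(4096))
    return b"".join(chunks)


def read_until_exit(channel):
    # Keeps polling through silence until the process has exited.
    chunks = []
    while True:
        if channel.recv_ready():
            chunks.append(channel.recv(4096))
        elif channel.exit_status_ready():
            break
    return b"".join(chunks)


timeline = [b"a", None, b"b"]
partial = read_while_ready(BurstyChannel(timeline))
complete = read_until_exit(BurstyChannel(timeline))
```

The first reader gives up during the gap and only sees the first burst; the second one also collects what arrives after the gap. A real loop would sleep or use select() instead of spinning.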
import select


def myexec(ssh, cmd, timeout, want_exitcode=False):
    # one channel per command
    stdin, stdout, stderr = ssh.exec_command(cmd)
    # get the shared channel for stdout/stderr/stdin
    channel = stdout.channel
    # we do not need stdin
    stdin.close()
    # indicate that we're not going to write to that channel anymore
    channel.shutdown_write()
    # read stdout/stderr in order to prevent read block hangs;
    # only read if something is buffered, so this first read cannot block
    stdout_chunks = []
    if channel.recv_ready():
        stdout_chunks.append(channel.recv(len(channel.in_buffer)))
    # chunked read to prevent stalls
    while not channel.closed or channel.recv_ready() or channel.recv_stderr_ready():
        # stop if channel was closed prematurely, and there is no data in the buffers
        got_chunk = False
        readq, _, _ = select.select([channel], [], [], timeout)
        for c in readq:
            if c.recv_ready():
                stdout_chunks.append(c.recv(len(c.in_buffer)))
                got_chunk = True
            if c.recv_stderr_ready():
                # make sure to read stderr to prevent stall
                c.recv_stderr(len(c.in_stderr_buffer))
                got_chunk = True
        # 1) make sure that there are at least 2 cycles with no data in the input
        #    buffers in order to not exit too early (i.e. cat on a >200k file)
        # 2) if no data arrived in the last loop, check if we already received the exit code
        # 3) check if input buffers are empty
        # 4) exit the loop
        if (not got_chunk
                and channel.exit_status_ready()
                and not channel.recv_stderr_ready()
                and not channel.recv_ready()):
            # indicate that we're not going to read from this channel anymore
            channel.shutdown_read()
            # close the channel
            channel.close()
            break  # exit as remote side is finished and our buffers are empty
    # close all the pseudofiles
    stdout.close()
    stderr.close()
    # recv() returns bytes, so the chunks are joined as bytes; decode if you need str
    if want_exitcode:
        # exit code is always ready at this point
        return (b''.join(stdout_chunks), channel.recv_exit_status())
    return b''.join(stdout_chunks)
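channel.closed is just the ultimate exit condition for a prematurely closing channel. Right after a chunk has been read, the code checks whether the exit_status has already been received and no new data was buffered in the meantime. If new data arrived or no exit_status was received, the code keeps trying to read chunks. Once the remote proc has exited and there is no new data in the buffers, we assume we have read everything and start closing the channel. Note that if you want to receive the exit status, you should always wait until it has been received, otherwise paramiko might block forever.
This way it is guaranteed that the buffers do not fill up and make your proc hang. myexec only returns once the remote command has exited and there is no data left in our local buffers. The code is also a bit more CPU friendly because it uses select() instead of polling in a busy loop, but it might be a bit slower for short-lived commands.
Just for reference, to safeguard against some infinite loops you can set a channel timeout that fires when no data arrives for a period of time: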
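The stop condition described here can be written down as a small pure function, which makes it easy to check case by case. The name should_stop and its parameters are made up for this illustration; they just mirror the four checks in the loop of myexec.

```python
def should_stop(got_chunk, exit_status_ready, stdout_ready, stderr_ready):
    """Return True only when it is safe to assume everything was read.

    got_chunk          -- data was read during the last loop iteration
    exit_status_ready  -- the remote exit status has been received
    stdout_ready       -- unread stdout data is buffered
    stderr_ready       -- unread stderr data is buffered
    """
    return (not got_chunk
            and exit_status_ready
            and not stdout_ready
            and not stderr_ready)


# Remote proc exited, nothing read last round, buffers empty: stop.
done = should_stop(False, True, False, False)
# Exit status not received yet: keep reading, even with empty buffers.
still_running = should_stop(False, False, False, False)
# Exit status received but a chunk just arrived: go around once more.
late_data = should_stop(True, True, False, False)
```

Only the first case ends the loop; the other two keep it running for another round.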
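To illustrate the select() behaviour the loop relies on, here is a sketch that uses a local socket pair from the standard library as a stand-in for the channel (an assumption made for the example; it is not an SSH connection). select() sleeps until the object becomes readable or the timeout expires, so nothing spins in the meantime.

```python
import select
import socket

# Stand-in for the channel: two connected local sockets.
reader, writer = socket.socketpair()

# Nothing has been written yet: select() waits up to 0.1 s and
# then returns an empty list of readable objects.
idle, _, _ = select.select([reader], [], [], 0.1)

# After the other side writes, select() reports the socket as readable.
writer.sendall(b"x")
ready, _, _ = select.select([reader], [], [], 0.1)

data = reader.recv(1)
reader.close()
writer.close()
```

The price is the one mentioned above: a round with no data costs up to one full timeout before the exit check runs.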
chan.settimeout(timeout)
chan.exec_command(command)
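When such a timeout fires, a blocking recv() on a paramiko channel raises socket.timeout, just like a plain socket does. The sketch below shows one way to handle that. It uses a local socket pair as a stand-in for the channel, and recv_or_none is a hypothetical helper name.

```python
import socket


def recv_or_none(chan, nbytes=4096):
    """Return the received bytes, or None if the timeout expired first."""
    try:
        return chan.recv(nbytes)
    except socket.timeout:
        return None


# Stand-in for the channel: two connected local sockets.
chan, remote = socket.socketpair()
chan.settimeout(0.2)

remote.sendall(b"hello")
first = recv_or_none(chan)   # data is already there
second = recv_or_none(chan)  # nothing else arrives, so the timeout fires

chan.close()
remote.close()
```

Returning None on timeout lets the caller decide whether to retry, check the exit status, or give up.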