Binary "tail" a file: answers

【Question title】: Binary "tail" a file
【Posted】: 2010-09-18 22:24:49
【Question】:

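I'm guessing most people on this site are familiar with tail. If not: it provides a "follow" mode in which, as text is appended to a file, tail dumps those characters to the terminal.

What I'm looking for (and may write myself if necessary) is a version of tail that works on binary files. Basically, I have a wireless link across which I would like to trickle a file as it comes down from another network link. Looking over the tail source code, it wouldn't be too hard to rewrite, but I would rather not reinvent the wheel! Strictly speaking this wouldn't be a "tail", since I want the entire file copied, but it would watch for new bytes being added and stream those as well.

Ideas?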

【Question discussion】:

Tags: c tail gnu-coreutils gnu-fileutils

【Solution 1】:

Pipe it to hexdump:

tail -f somefile | hexdump -C

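【Discussion】:

I wasn't 100% sure myself, but I tried it and it works fine.
Won't tail -f only output new data when it sees a newline in the binary file? I doubt it leaves its stdout unbuffered.
Good point, Chris, I hadn't thought of that. I just tested it on Debian and yes, it still works when there are no newlines in the stream, although this behavior may vary between platforms.
Using hexdump is a red herring, isn't it? Or maybe it is just an illustration of somewhere to send the binary data. I don't see anything in the question that asks for hexdump, that's all...
Note that hexdump only writes to the screen in 16-byte increments.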

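【Solution 2】:

There is also the bintail application, which appears to be more robust than the script posted in another answer here.

The bintail package contains a single application, bintail. The program reads a normal file from disk and pipes the output to stdout, byte by byte, with no translation, similar to what tail(1) does for text files. This is useful for "tailing" binary files (such as WAV files) while they are being written in real time. The application is a work in progress, but it already does what it was designed to do for me.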

【Discussion】:

Thanks, this is exactly what I needed to redirect the output of "tcpflow" into a nodejs stream :) It did not work with "tail -f".
With Linux coreutils, tail -c +1 -f somefile works fine too.

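【Solution 3】:

Linux coreutils tail(1) works just fine on binary files. For most applications you only need to avoid its line orientation, so that the output does not begin at some random point in the middle of a data structure. You can do that by simply starting at the beginning of the file, which is also exactly what you asked for: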

tail -c +1 -f somefile

works just fine.
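
To make the `-c +1` part concrete: with GNU tail, `-c +N` means "start output at byte N", counting from 1, so `+1` selects the whole file. Below is a minimal Python sketch of that offset arithmetic only. The helper name `bytes_from` is hypothetical, and rejecting values below 1 is an assumption of this sketch rather than a statement about what tail does with them.

```python
def bytes_from(data: bytes, n: int) -> bytes:
    """Return data starting at 1-based byte position n, like `tail -c +n`."""
    if n < 1:
        # Assumption of this sketch: positions below 1 are treated as errors.
        raise ValueError("n is 1-based and must be >= 1")
    return data[n - 1:]


if __name__ == "__main__":
    sample = b"\x00\x01\x02\x03"
    print(bytes_from(sample, 1))  # the whole buffer
    print(bytes_from(sample, 3))  # skips the first two bytes
```

Because the start position is a byte offset rather than a line count, nothing in this step depends on where newlines happen to fall in the binary data.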

【Discussion】:

【Solution 4】:

This hastily coded Python script for Windows may help:

# bintail.py -- reads a binary file, writes initial contents to stdout,
# and writes new data to stdout as it is appended to the file.
# Written for Python 3: bytes go through sys.stdout.buffer.

import os
import sys
import time

# On Windows, make sure stdout is in binary mode so bytes pass through untranslated.
if os.name == 'nt':
    import msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

# Time to sleep between file polling (seconds)
sleep_int = 1

def copy_from(binfile, offset):
    # Write everything from offset to the current end of file to stdout
    # and return the offset actually reached, so no byte is sent twice.
    out = sys.stdout.buffer
    with open(binfile, 'rb') as h_file:
        h_file.seek(offset)
        h_bytes = h_file.read(128)
        while h_bytes:
            out.write(h_bytes)
            h_bytes = h_file.read(128)
        out.flush()
        return h_file.tell()

def main():
    # File is the first argument given to the script (bintail.py file)
    binfile = sys.argv[1]

    # Write the initial contents of the file
    fsize = copy_from(binfile, 0)

    # Loop forever, checking for new content and writing new content to stdout
    while True:
        if os.stat(binfile).st_size > fsize:
            fsize = copy_from(binfile, fsize)
        time.sleep(sleep_int)

if __name__ == '__main__':
    if len(sys.argv) == 2:
        main()
    else:
        sys.stderr.write("No file specified.\n")

【Discussion】:

【Solution 5】:

less somefile

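Then press Shift+F.

【Discussion】:

I don't quite see how you would redirect the output to a file when using less and pressing Shift+F...

【Solution 6】:

Strictly speaking, you need to write a program to do this, because tail is not specified to work on binary files. There are also buffering issues you will probably want to avoid if you want to receive the newly "trickled" data as soon as possible.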
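
As an illustration of what such a program might look like, here is a minimal Python sketch that polls the file, reads it unbuffered, and flushes every chunk as soon as it is written. The function name `follow` and its parameters (`poll`, `chunk`, `max_idle_polls`) are hypothetical names chosen for this sketch, and polling is only one possible way to notice new data.

```python
import sys
import time
from typing import Iterator, Optional


def follow(path: str, poll: float = 0.2, chunk: int = 4096,
           max_idle_polls: Optional[int] = None) -> Iterator[bytes]:
    """Yield the file's existing bytes, then new bytes as they are appended.

    max_idle_polls is a hypothetical knob: stop after that many consecutive
    polls without new data (None means follow forever).
    """
    idle = 0
    # buffering=0 gives a raw file object, so each read() asks the OS again
    # instead of serving bytes from a userspace buffer.
    with open(path, "rb", buffering=0) as f:
        while max_idle_polls is None or idle < max_idle_polls:
            data = f.read(chunk)
            if data:
                idle = 0
                yield data
            else:
                idle += 1
                time.sleep(poll)


if __name__ == "__main__" and len(sys.argv) == 2:
    out = sys.stdout.buffer
    for piece in follow(sys.argv[1]):
        out.write(piece)
        out.flush()  # push each chunk downstream immediately
```

Keeping the file open and relying on the read position avoids re-reading bytes that were already sent. The explicit `flush()` after each chunk is what addresses the output-side buffering concern.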

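【Discussion】:

OK, looking again, I see you tagged your question gnu-coreutils. So if you know you will be using the GNU implementation of tail, it is probably binary-safe and probably free of buffering issues (check and see).

【Solution 7】:

This is not tail; this is copying a file progressively. Look at rsync.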

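【Discussion】:

I wonder why this answer was accepted when two other answers fit the question better: stackoverflow.com/a/6173419/1353930 stackoverflow.com/a/6171491/1353930. rsync is no use here because it cannot stream data. It is limited to (relatively static) files on disk.
@Daniel Alder: rsync can be re-run, and it will send only the new data.
Streaming means you are running a cgi script or piping to netcat, etc. But that is more a question for @Goyuix.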
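
The "re-run and send only the new data" idea from the comments can be sketched in Python for the simple append-only case. This is not rsync's delta algorithm; it assumes the source file only ever grows by appending, and the helper name `sync_appended` is hypothetical.

```python
import os


def sync_appended(src: str, dst: str, chunk: int = 65536) -> int:
    """Append to dst whatever src has beyond dst's current length.

    Assumes src only ever grows by appending; returns the bytes copied.
    """
    done = os.path.getsize(dst) if os.path.exists(dst) else 0
    copied = 0
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        fin.seek(done)  # skip everything a previous run already copied
        while True:
            data = fin.read(chunk)
            if not data:
                break
            fout.write(data)
            copied += len(data)
    return copied
```

Each call copies only the bytes added since the previous call, so running it repeatedly approximates a progressive copy. It still works on files on disk rather than on a stream, which is the limitation the first comment points out.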

【Solution 8】:

I also use this, since it works on live streams too:

cat ./some_file_or_dev | hexdump -C

Example of dumping my key presses (and releases):

[user@localhost input]$ sudo cat /dev/input/event2 | hexdump -C
00000000  81 32 b1 5a 00 00 00 00  e2 13 02 00 00 00 00 00  |.2.Z............|
00000010  04 00 04 00 36 00 00 00  81 32 b1 5a 00 00 00 00  |....6....2.Z....|
00000020  e2 13 02 00 00 00 00 00  01 00 36 00 01 00 00 00  |..........6.....|
00000030  81 32 b1 5a 00 00 00 00  e2 13 02 00 00 00 00 00  |.2.Z............|
00000040  00 00 00 00 00 00 00 00  81 32 b1 5a 00 00 00 00  |.........2.Z....|
00000050  a3 af 02 00 00 00 00 00  04 00 04 00 36 00 00 00  |............6...|
00000060  81 32 b1 5a 00 00 00 00  a3 af 02 00 00 00 00 00  |.2.Z............|
^C

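【Discussion】:

The problem with hexdump is that it waits for a multiple of 16 bytes before printing each line, but as long as you are aware of that, it is fine.
That may be because of the -C switch (the ASCII column).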
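
The row-at-a-time behavior described in these comments can be modeled with a small Python sketch: a formatter that prints 16 bytes per row can only emit a row once all 16 bytes have arrived, so a shorter tail is held back until more data comes in or the input ends. The names `format_row` and `RowDumper` are hypothetical, and this only mimics the layout of the dump above, not hexdump's actual implementation.

```python
from typing import List

ROW = 16  # bytes per output row, matching the dump shown above


def format_row(offset: int, row: bytes) -> str:
    """Format up to 16 bytes in a `hexdump -C`-like layout."""
    left = " ".join(f"{b:02x}" for b in row[:8])
    right = " ".join(f"{b:02x}" for b in row[8:])
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
    return f"{offset:08x}  {left:<23}  {right:<23}  |{text}|"


class RowDumper:
    """Hypothetical helper: emits a row only once 16 bytes have arrived."""

    def __init__(self) -> None:
        self.pending = b""
        self.offset = 0

    def feed(self, data: bytes) -> List[str]:
        """Accept new bytes and return whatever complete rows are ready."""
        self.pending += data
        lines = []
        while len(self.pending) >= ROW:
            lines.append(format_row(self.offset, self.pending[:ROW]))
            self.pending = self.pending[ROW:]
            self.offset += ROW
        return lines

    def flush(self) -> List[str]:
        """Emit the held-back partial row, if any (e.g. at end of input)."""
        if not self.pending:
            return []
        line = format_row(self.offset, self.pending)
        self.offset += len(self.pending)
        self.pending = b""
        return [line]
```

Feeding the dumper 10 bytes returns nothing; feeding 6 more returns the first full row. That is the delay to keep in mind when watching a slow stream this way.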

【Solution 9】:

I use this command (the 1 is the number of bytes to interpret per unit): tail -f | od -x1

【Discussion】: