Reading a specific line of a file: answers

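【Question title】: Reading a specific line of a file
【Posted】: 2010-09-22 00:59:09
【Question description】:

What is the best way (in terms of performance) to read a specific line of a file? Currently, I am using the following command line: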

head -line_number file_name | tail -1

PS: shell tools are preferred.

【Discussion】:

Tags: shell command-line text

【Solution 1】:

You can use sed.

# print line number 10
$ sed -n '10p' file_name
$ sed '10!d' file_name
$ sed '10q;d' file_name

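【Discussion】:

On my system, your last example is generally faster than the awk, head/tail, or ruby versions, except when the line is near the end of the file. Only then does the head/tail version start to become faster.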
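
The difference is presumably down to how much of the file each variant reads: '10q;d' quits as soon as line 10 has been printed, while -n '10p' and '10!d' keep scanning to the end of the file. A minimal sketch of a reusable wrapper around the quitting form (the function name print_line is made up for illustration):

```shell
# print_line N FILE: print line N of FILE, then stop reading.
# (print_line is a hypothetical helper name, not a standard tool.)
print_line() {
    # On line N, q prints the line and quits; on every earlier line,
    # d deletes it without printing, so nothing else is output.
    sed "${1}q;d" "$2"
}

# Demo: prints "three"
demo=$(mktemp)
printf '%s\n' one two three four five > "$demo"
print_line 3 "$demo"
rm -f "$demo"
```

If the file has fewer than N lines, this prints nothing.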

【Solution 2】:

ruby -ne '$.==10 and (print; exit)' file

【Discussion】:

ruby -ne '$.==10 and (print; exit)' would be faster.

【Solution 3】:

#print 10th line
awk NR==10 file_name

【Discussion】:

【Solution 4】:

awk -v linenum=10 'NR == linenum {print; exit}' file

【Discussion】:

You can just use awk NR==10 file_name, as in my answer.
If the file is large, you don't want to read the rest of it, hence the exit.
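
Building on the point about exit, here is a rough sketch of the same awk command wrapped in a function that also reports, through its exit status, whether the requested line existed. The function name nth_line and the found flag are illustrative additions, not part of the answer above:

```shell
# nth_line N FILE: print line N of FILE and stop reading at that point.
# Returns non-zero if FILE has fewer than N lines.
# (nth_line and the "found" flag are illustrative names.)
nth_line() {
    awk -v linenum="$1" '
        NR == linenum { print; found = 1; exit }   # exit skips the remaining input
        END { exit !found }                        # status 0 only if a line was printed
    ' "$2"
}

demo=$(mktemp)
printf '%s\n' a b c > "$demo"
nth_line 2 "$demo"                          # prints "b"
nth_line 9 "$demo" || echo "no such line"   # prints "no such line"
rm -f "$demo"
```

The END rule still runs after exit in the main rule, which is why the status can be set there.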

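【Solution 5】:

If you know that all lines have the same length, a program can index directly to that line without reading all the preceding ones: something like od might be able to do that, or you can code it up in half a dozen lines in almost any language. Look for a function called seek() or fseek().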
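
As a shell-only illustration of the fixed-length idea, the sketch below uses dd rather than od or seek() (a substitution on my part, not what the answer names). It assumes every line, including its trailing newline, occupies exactly RECLEN bytes; on a regular file dd can typically seek past the skipped records instead of reading them. The function name fixed_line is hypothetical:

```shell
# fixed_line N RECLEN FILE: print line N of FILE, assuming every line is
# exactly RECLEN bytes long, counting its trailing newline.
# (fixed_line is a hypothetical helper name.)
fixed_line() {
    # Skip N-1 records of RECLEN bytes, then copy exactly one record.
    dd if="$3" bs="$2" skip="$(( $1 - 1 ))" count=1 2>/dev/null
}

# Demo: three 4-byte records; prints "bbb"
demo=$(mktemp)
printf 'aaa\nbbb\nccc\n' > "$demo"
fixed_line 2 4 "$demo"
rm -f "$demo"
```

If the lines are not all the same length, the output will be cut at the wrong byte offsets, so this only fits genuinely fixed-width data.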

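Otherwise, maybe...

tail -n +N file_name | head -n 1

...because this asks tail to skip ahead to the Nth line, so fewer lines are pushed needlessly through the pipe than with the head-to-tail solution.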

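【Discussion】:

Should be head 1 not head -1.
@Dennis: Are you sure? For every head implementation I have seen, head 1 would try to open a file called "1". I have double-checked on GNU/Linux, and it is definitely head -1. Which version of head are you using?
Oh, sorry, GNU head insists on -n, so head -n 1. I mistyped my comment. With -n -1, GNU head outputs everything except the last line, rather than just the first line. Version: head (GNU coreutils) 7.4
@DennisWilliamson Ah... I see the GNU documentation refers to the old -1 syntax as "obsolete option syntax -countoptions"; will update to -n 1. Cheers.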
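
For anyone following the -n 1 versus -n -1 exchange, a small sketch of the two forms side by side. The negative count is a GNU extension, so the second function is assumed to run against GNU head; both function names are made up for this example:

```shell
# first_line FILE: portable way to print only the first line.
first_line() {
    head -n 1 "$1"
}

# all_but_last FILE: a negative count is a GNU extension and prints
# everything except the last line.
all_but_last() {
    head -n -1 "$1"
}

demo=$(mktemp)
printf '%s\n' 1 2 3 > "$demo"
first_line "$demo"      # prints 1
all_but_last "$demo"    # prints 1 and 2 with GNU head
rm -f "$demo"
```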

【Solution 6】:

Trying to avoid file-cache effects, I ran each command several times and found that head + tail is fast, but ruby is the fastest:

$ wc -l myfile.txt
920391 myfile.txt

$  time awk NR==334227 myfile.txt
my_searched_line

real    0m14.963s
user    0m1.235s
sys     0m0.126s

$  time head -334227 myfile.txt |tail -1
my_searched_line

real    0m5.524s
user    0m0.569s
sys     0m0.725s

$ time  sed '334227!d' myfile
my_searched_line

real    0m12.565s
user    0m0.814s
sys     0m0.398s

$ time ruby -ne '$.==334227 and (print; exit)' myfile
my_searched_line

real    0m0.750s
user    0m0.568s
sys     0m0.179s
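
To repeat this kind of comparison on your own machine, it helps to have a file with known contents and a check that the tools agree before timing them. The sketch below is one way to set that up; make_sample and same_line are hypothetical helper names, and the file size and target line are arbitrary choices, not the ones used in the timings above:

```shell
# make_sample FILE COUNT: write COUNT lines "line 1" .. "line COUNT" to FILE.
make_sample() {
    awk -v count="$2" 'BEGIN { for (i = 1; i <= count; i++) print "line " i }' > "$1"
}

# same_line N FILE: succeed only if awk, sed and head/tail agree on line N.
# If N is past the end of FILE they disagree: awk and sed print nothing,
# while head | tail prints the last line.
same_line() {
    a=$(awk -v n="$1" 'NR == n { print; exit }' "$2")
    s=$(sed "${1}q;d" "$2")
    h=$(head -n "$1" "$2" | tail -n 1)
    [ "$a" = "$s" ] && [ "$s" = "$h" ]
}

sample=$(mktemp)
make_sample "$sample" 200000
same_line 150000 "$sample" && echo "all three agree"
rm -f "$sample"
```

Once the outputs match, each command can be run under time on the same file, several times, to compare them as above.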

【Discussion】: