Read and find a line in a 1 GB file [closed]: answers

【Question title】: Read and find the line in the file size of 1 GB [closed]
【Posted】: 2011-12-10 19:55:28
【Question description】:

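I need to read a text file larger than 1 GB to find a particular line. This should be written in Perl, PHP, or Java. The method should not put a heavy load on the server.

What are the ways to do this?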

【Question comments】:

See download.oracle.com/javase/tutorial/essential/io/file.html for information on reading files in Java.
Is this an exam question or something?

Tags: java php perl text-files

【Solution 1】:

There isn't much to it: create a BufferedReader, read one line at a time, and check whether each line is the one you are looking for.
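
The approach described above could be sketched in Java roughly as follows. This is only a minimal illustration, not the answerer's code: the class name LineFinder, the method findLine, and the choice of UTF-8 are assumptions made for the example. Because only one line is held in memory at a time, memory use stays roughly constant regardless of file size (assuming no single line is itself enormous).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; the name is an assumption for this sketch.
class LineFinder {

    // Returns the 1-based number of the first line equal to target, or -1 if none.
    // Assumes the file is valid UTF-8; adjust the charset for other encodings.
    static long findLine(Path file, String target) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            long lineNumber = 0;
            // readLine() strips the line terminator and returns null at end of file.
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (line.equals(target)) {
                    return lineNumber; // stop at the first match
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java LineFinder <file> <line-to-find>");
            System.exit(1);
        }
        long n = findLine(Paths.get(args[0]), args[1]);
        System.out.println(n < 0 ? "not found" : "found at line " + n);
    }
}
```

Stopping at the first match means the whole file is read only when the line is near the end or missing.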

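【Comments】:

【Solution 2】:

If you have a "right tool for the right job" attitude and are willing to learn new tools, perl, awk, or even sed are well suited to this kind of work. Otherwise any full-featured language will do, and Java can handle the job too. Just be sure to use a buffered class such as BufferedReader, or it will be very slow.

An example in perl: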

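use strict;
use warnings;

# Stream the file one line at a time so memory use stays constant.
open my $in,  '<', 'infile'  or die "Cannot open infile: $!";
open my $out, '>', 'outfile' or die "Cannot open outfile: $!";
while (<$in>) {
    s/source-regex/replace-with/g;   # substitute on $_ in place
    print {$out} $_;
}
close $out or die "Cannot close outfile: $!";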

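A one-liner could do this as well, but it is a bit more complicated.

【Comments】:

Complicated? perl -nwe 'print if /source-regex/' input.txt > output.txt
Oops, I knew there would be some kind of alternative. That one-liner will do. Thanks.
A substitution is not complicated either. Just change m// to s/// and print. Same potatoes.

【Solution 3】:

In perl: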

use strict;
use warnings;

my $line = 'what to be searched';
open my $fh, '<', '/path/to/the/file' or die "unable to open file: $!";
while(<$fh>) {
    chomp;
    if ($_ eq $line) {
        print "found $line at line $.\n";
        last;
    }
}

【Comments】:

【Solution 4】:

As a one-liner:

perl -nwe 'print if /source-regex/' input.txt > output.txt

As a script:

use strict;
use warnings;

while (<>) {
    print if /source-regex/;
}

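Usage: perl script.pl input.txt > output.txt

There are many ways to optimize this, but with the information you have provided there is not much more that can be done. The search will take some time and may be slow, depending on your regex.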
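
For readers who need the Java option from the question, the same streaming regex filter might look roughly like the sketch below. It is an illustration under assumptions, not part of the original answer: the class name RegexFilter, the method printMatches, and the UTF-8 charset are made up for the example. The pattern is compiled once outside the loop, since recompiling it per line would add avoidable cost on a file this size.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for this sketch.
class RegexFilter {

    // Writes every line containing a match of pattern to out and returns the match count.
    // Assumes the file is valid UTF-8; adjust the charset for other encodings.
    static long printMatches(Path file, Pattern pattern, PrintStream out) throws IOException {
        long matches = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // find() matches anywhere in the line, like an unanchored /regex/ in Perl.
                if (pattern.matcher(line).find()) {
                    out.println(line);
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java RegexFilter <file> <regex>");
            System.exit(1);
        }
        Pattern pattern = Pattern.compile(args[1]); // compile once, reuse for every line
        printMatches(Paths.get(args[0]), pattern, System.out);
    }
}
```

Usage would mirror the Perl script: java RegexFilter input.txt 'source-regex' > output.txt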

If you have security concerns, opening the file explicitly is safer:

open my $input, '<', shift or die $!;
while (<$input>) { 
...

【Comments】: