Comparing lines in a file with Perl: answers

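【Question title】: Comparing lines in a file with perl
【Posted】: 2014-10-08 09:18:38
【Question】:

I have been trying to compare the lines of two files and find the lines that match.

For some reason, the code below only goes through the first line of "text1.txt" and prints the "if" statement regardless of whether the two variables match.

Thanks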

use strict;
open( <FILE1>, "<text1.txt" );
open( <FILE2>, "<text2.txt" );
foreach my $first_file (<FILE1>) {
    foreach my $second_file (<FILE2>) {
        if ( $second_file == $first_file ) {
            print "Got a match - $second_file + $first_file";
        }
    }
}
close(FILE1);
close(FILE2);

【Comments】:

Tags: perl file match

【Answer 1】:

If you are comparing strings, use the eq operator. "==" compares its arguments numerically.
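
To see why the original "if" always fires, here is a small sketch. The two sample strings are made up for illustration. A string that does not start with a number evaluates to 0 in numeric context, so "==" ends up comparing 0 with 0.

```perl
use strict;
use warnings;

# Hypothetical sample lines, as they would look when read from a file.
my $first  = "apple\n";
my $second = "banana\n";

# In numeric context both strings evaluate to 0, so == reports a match.
# The numeric warning is silenced here only to keep the output clean.
my $numeric_match = do { no warnings 'numeric'; $first == $second };

# eq compares the strings character by character.
my $string_match = $first eq $second;

print "==: ", ($numeric_match ? "match" : "no match"), "\n";
print "eq: ", ($string_match  ? "match" : "no match"), "\n";
```

With "use warnings" enabled and the "no warnings" line removed, Perl would also print an "isn't numeric" warning for each operand, which is usually the quickest hint that the wrong operator is in use.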

【Comments】:

【Answer 2】:

If your files are not too large, here is one way to get the job done.

#!/usr/bin/perl
use Modern::Perl;
use File::Slurp qw(slurp);
use Array::Utils qw(:all);
use Data::Dumper;

# read entire files into arrays
my @file1 = slurp('file1');
my @file2 = slurp('file2');

# get the common lines from the 2 files
my @intersect = intersect(@file1, @file2);

say Dumper \@intersect;

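【Comments】:

【Answer 3】:

A better and faster (but less memory-efficient) approach is to read one file into a hash and then look up each line of the other file in that hash. That way each file is read only once.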

# This will find matching lines in two files,
# print the matching line and its line number in each file.

use strict;

open (FILE1, "<text1.txt") or die "can't open file text1.txt\n";
my %file_1_hash;
my $line;
my $line_counter = 0;

#read the 1st file into a hash 
while ($line=<FILE1>){
  chomp ($line); #-only if you want to get rid of 'endl' sign
  $line_counter++;
  if (!($line =~ m/^\s*$/)){
    $file_1_hash{$line}=$line_counter;
  }
}
close (FILE1);

#read and compare the second file
open (FILE2,"<text2.txt") or die "can't open file text2.txt\n";
$line_counter = 0;
while ($line=<FILE2>){
  $line_counter++;
  chomp ($line);
  if (defined $file_1_hash{$line}){
    print "Got a match: \"$line\"
in line #$line_counter in text2.txt and line #$file_1_hash{$line} at text1.txt\n";
  }
}
close (FILE2);

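【Comments】:

【Answer 4】:

You have to reopen file 2 or reset its file pointer. Move the open and close commands inside the loop.

Depending on the size of the files and of the lines, a more efficient approach is to loop through each file only once and store every line that occurs in file 1 in a hash. Then check, for each line of file 2, whether it exists in that hash.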
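
As a rough sketch of the first suggestion, the nested loop can rewind file 2 with seek instead of reopening it. The temporary files and their contents below are invented so the example runs on its own; they stand in for text1.txt and text2.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data so the sketch is self-contained.
my ($out1, $path1) = tempfile(UNLINK => 1);
my ($out2, $path2) = tempfile(UNLINK => 1);
print {$out1} "alpha\nbeta\ngamma\n";
print {$out2} "gamma\ndelta\nalpha\n";
close $out1;
close $out2;

open my $fh1, '<', $path1 or die "Can't open $path1: $!";
open my $fh2, '<', $path2 or die "Can't open $path2: $!";

my @matches;
while (my $line1 = <$fh1>) {
    chomp $line1;

    # Rewind file 2 so the inner loop sees every line again.
    seek($fh2, 0, 0) or die "Can't rewind $path2: $!";

    while (my $line2 = <$fh2>) {
        chomp $line2;
        push @matches, $line1 if $line1 eq $line2;
    }
}
close $fh1;
close $fh2;

print "Got a match - $_\n" for @matches;
```

This still reads file 2 once for every line of file 1, so it is only reasonable for small inputs; the hash approach avoids the repeated reads.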

【Comments】:

【Answer 5】:

If you want the number of matching lines,

my $count=`grep -f [FILE1PATH] -c [FILE2PATH]`;

If you want the lines that match,

my @lines=`grep -f [FILE1PATH]  [FILE2PATH]`;

If you want the lines that do not match,

my @lines = `grep -f [FILE1PATH] -v [FILE2PATH]`;
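
One caveat: with plain -f, grep treats every line of the first file as a regular expression and matches it anywhere in a line. Assuming the goal is exact whole-line matches, the standard -F (fixed strings) and -x (whole line) options can be added. The sketch below uses made-up temporary files and the list form of a pipe open rather than backticks, so the paths need no shell quoting.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample files standing in for FILE1PATH and FILE2PATH.
my ($out1, $path1) = tempfile(UNLINK => 1);
my ($out2, $path2) = tempfile(UNLINK => 1);
print {$out1} "a.c\nfoo\n";
print {$out2} "abc\nfoo\nfoobar\n";
close $out1;
close $out2;

# Plain -f: "a.c" is a regex that matches "abc", and "foo" matches "foobar".
open my $plain, '-|', 'grep', '-f', $path1, $path2
    or die "Can't run grep: $!";
my @regex_matches = <$plain>;
close $plain;

# -F -x: patterns are literal strings that must equal the whole line.
open my $exact, '-|', 'grep', '-F', '-x', '-f', $path1, $path2
    or die "Can't run grep: $!";
my @exact_matches = <$exact>;
close $exact;

print "regex matches: ", scalar(@regex_matches), "\n";
print "exact matches: ", scalar(@exact_matches), "\n";
```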

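【Comments】:

【Answer 6】:

Here is a script I wrote that checks whether two files are identical, although it can easily be adapted by switching the comparison to eq. As Tim suggested, using a hash would probably be more efficient, although you cannot guarantee that the lines are compared in file order without a CPAN module (and as you can see, this approach should really use two loops, but it was good enough for my purposes). It is not the greatest script ever written, but it may give you a place to start.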


use warnings;

open (FILE, "orig.txt") or die "Unable to open first file.\n";
@data1 = <FILE>;
close(FILE);

open (FILE, "2.txt") or die "Unable to open second file.\n";
@data2 = <FILE>;
close(FILE);

for($i = 0; $i < @data1; $i++){
    $data1[$i] =~ s/\s+$//;
    $data2[$i] =~ s/\s+$//;
    if ($data1[$i] ne $data2[$i]){
        print "Failure to match at line ". ($i + 1) . "\n";
        print $data1[$i];
        print "Doesn't match:\n";
        print $data2[$i];
        print "\nProgram Aborted!\n";
        exit;
    }
}

print "\nThe files are identical. \n";

【Comments】:

【Answer 7】:

Taking the code you posted and turning it into actual Perl code, this is what I came up with.

use strict;
use warnings;
use autodie;

open my $fh1, '<', 'text1.txt';
open my $fh2, '<', 'text2.txt';

while(
  defined( my $line1 = <$fh1> )
  and
  defined( my $line2 = <$fh2> )
){
  chomp $line1;
  chomp $line2;

  if( $line1 eq $line2 ){
    print "Got a match - $line1\n";
  }else{
    print "Lines don't match $line1 $line2\n";
  }
}

close $fh1;
close $fh2;

Now, what you probably really want is a diff of the two files, which is best left to Text::Diff.

use strict;
use warnings;

use Text::Diff;

print diff 'text1.txt', 'text2.txt';

【Comments】: