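I was intrigued by your program, so I wrote a highly dynamic Perl program that finds the lines of any user-chosen file that match, or do not match, a set of words. It then prints the requested matching or non-matching lines both to the screen and to a new output file whose name the user also chooses.
We will parse this file, iris_dataset.csv: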
"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"
5.1,3.5,1.4,0.2,"setosa"
4.9,3,1.4,0.2,"setosa"
4.8,3,1.4,0.3,"setosa"
5.1,3.8,1.6,0.2,"setosa"
4.6,3.2,1.4,0.2,"setosa"
7,3.2,4.7,1.4,"versicolor"
6.4,3.2,4.5,1.5,"versicolor"
6.9,3.1,4.9,1.5,"versicolor"
6.6,3,4.4,1.4,"versicolor"
5.5,2.4,3.7,1,"versicolor"
6.3,3.3,6,2.5,"virginica"
5.8,2.7,5.1,1.9,"virginica"
7.1,3,5.9,2.1,"virginica"
6.3,2.9,5.6,1.8,"virginica"
5.9,3,5.1,1.8,"virginica"
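This is a comma-separated values file, meaning its columns are separated by commas.
If you open the file in a spreadsheet you get a clearer view of each column. What we will be looking for is the Species column, so the possible items to match are "setosa", "versicolor" and "virginica".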
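To make the column layout concrete, here is a minimal sketch of pulling the Species value out of one row. The helper name `species_of` is made up for illustration and is not part of my script. It uses a naive `split` on commas, which only works because no field in this file contains an embedded comma; a real CSV parser would be the safer choice for arbitrary files.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# Assumes no field contains an embedded comma or newline.
sub species_of {
    my ($line) = @_;
    chomp $line;                     # drop the trailing newline, if any
    my @fields = split /,/, $line;   # naive split on commas
    my $species = $fields[-1];       # Species is the last column in this file
    $species =~ s/^"|"$//g;          # strip the surrounding double quotes
    return $species;
}

print species_of(qq{5.1,3.5,1.4,0.2,"setosa"\n}), "\n";
```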
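My program first asks you for the file you want to read from.
In this case it is iris_dataset.csv, although it can be any file. Then you type the name of the file you want to write to. I called it new_iris.csv, but you can call it anything.
Then we tell the program how many items we are going to look for. With 3 items I can enter setosa, versicolor and virginica, in any order. With two I enter only two of them, and with one I enter just setosa, versicolor or virginica for this example file.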
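Then we are asked whether we want to keep the lines that match our items, or remove the lines of the file that match our items. If we keep the matches, the lines that match those items are printed to the screen and to our output file. If we choose to remove, the lines that do not match those items are printed to the screen and to our file. If we choose neither KEEP nor REMOVE, we get an error message and the new, empty output file is deleted, since it contains nothing.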
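The keep/remove decision can be sketched on its own, without any file or keyboard input. The helper `select_lines` below is hypothetical and is not how the full script is organized (the script uses given/when). It also anchors the choice with `^...$`, which is stricter than the unanchored match in the script.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# $mode is the user's choice, $pattern the combined regex,
# $lines a reference to the array of input lines.
sub select_lines {
    my ($mode, $pattern, $lines) = @_;
    if ($mode =~ /^keep$/i) {
        return grep { $_ =~ /$pattern/ } @$lines;   # lines that match
    }
    elsif ($mode =~ /^remove$/i) {
        return grep { $_ !~ /$pattern/ } @$lines;   # lines that do not match
    }
    die "Choice must be KEEP or REMOVE, got '$mode'\n";
}

my @lines = (
    qq{5.1,3.5,1.4,0.2,"setosa"\n},
    qq{7,3.2,4.7,1.4,"versicolor"\n},
    qq{6.3,3.3,6,2.5,"virginica"\n},
);

# Prints only the virginica row.
print select_lines('Remove', 'setosa|versicolor', \@lines);
```

The complete interactive script follows.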
#!/usr/bin/env perl
# Program: perl_matching.pl
use strict; # Means that we have to explicitly declare our variables with "my", "our" or "local", depending on the scope we want.
use warnings; # We want to know whether, and where, problems are showing up in our program.
use feature 'say'; # Like print, but with automatic ending newline.
use feature 'switch'; # Perl given:when switch statement.
no warnings 'experimental'; # given/when is an experimental feature (deprecated since Perl 5.38), so we silence the experimental warning.
########### This block of code is basically equivalent to a Unix ls command ##############
opendir(my $dir, ".") or die "Cannot open the current working directory because $!"; # Opens the current working directory
my @files = readdir($dir); # Reads all entries in the current working directory into an array @files.
closedir($dir); # Now that we have the array of files, we can close our current working directory.
say "Here are the list of files in your current working directory";
foreach(@files){print "$_\t";} # $_ is the default variable for each item in an array.
########### The directory listing above is not critical to running the program ####################
say "\nGive me your filename to read from, extensions and all ..."; # It is a good idea to have your file in your working directory.
chomp(my $file_read = <STDIN>); # This makes the filename dynamic from user input.
say "Give me your filename to write to, extensions and all ...";
chomp(my $file_write = <STDIN>); # results will be printed to this file, and standard output. # chomp removes newlines from standard input.
# ' < ' to read from, and '>', to write to ...
# Opening your file to read from:
open(my $filehandle_read, '<', $file_read) or die "Problem reading file $file_read because $!";
# Open your file to write to.
open(my $filehandle_write, '>', $file_write) or die "Problem writing file $file_write because $!";
say "How many matches are you going to give me?";
chomp(my $match_num = <STDIN>); # The number of matches to read; chomp removes the trailing newline.
say "Okay give me the matches now, pressing Enter key between each match.";
my $i = 1; # This is our incrementer between matches.
my $matches; # This is each match presented line by line.
my @match_list; # This is our array (list) of $matches
while($i <= $match_num)
{
$matches = <STDIN>; # One match at a time from standard input.
push @match_list, $matches; # Pushes all individual $matches into a list @match_list
$i = $i + 1; # Increase the incrementer by one so this loop doesn't last forever.
}
chomp(@match_list);
undef($matches); # I am clearing each match, so that I can redefine this variable.
$matches = join('|', @match_list); # " | " is part of a regular expression which means "or" for each item in this scalar matches.
say "This is what your redefined matches variable looks like: $matches";
say "Now you get a choice for your matches";
say "KEEP or REMOVE?"; # if you type Keep (case insensitive) you print only the matches to the new file. If you type Remove (case insensitive) you print only the lines to the newfile which do not contain the matches.
chomp(my $choice = <STDIN>);
my @lines_all = <$filehandle_read>; # The filehandle contains everything in the file, so we can pull all lines of the file to read into an array, where each item in the array is each line of the file opened for reading.
close $filehandle_read; # we can now close the filehandle for the file for reading since we just pulled all the information from it.
# We grep for the matching " =~ " lines of our file to read.
my @lines_matching = grep{$_ =~ m/$matches/} @lines_all;
# We grep for the non-matching " !~ " lines of our file to read.
# Note: $_ is a default variable for every item in the array.
my @lines_not_matching = grep{$_ !~ m/$matches/} @lines_all;
# This is a Perl style switch statement.
# Note: A given::when::when::default switch statement
# is basically equivalent to an
# if::elsif::else chain.
# In this switch statement only one choice is performed;
# which one depends on whether you said "Keep" or "Remove" in your choice.
given($choice)
{
when($choice =~ m/Keep/i) # "i" is for case-insensitive, so Keep, KEEP, kEeP, etc are valid.
{
say @lines_matching; # Print the matching lines to the screen.
print $filehandle_write @lines_matching; # Print the matching lines to the file.
close $filehandle_write; # Close the file now that we are done with it.
}
when($choice =~ m/Remove/i)
{
say @lines_not_matching; # Print the lines that do not match to the screen.
print $filehandle_write @lines_not_matching; # Print the lines that do not match to the file.
close $filehandle_write; # Close the file now that we are done with it.
}
default
{
say "You must have selected a choice other than Keep or Remove. Don't do that!";
close $filehandle_write; # Close the file now that we are done with it.
unlink($file_write) or warn "Could not unlink file $file_write"; # If you selected neither keep nor remove, we delete the new file to write to as it contains nothing.
}
}
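One caveat about the script above: the items are joined with `|` and used as a regular expression as typed, so a character such as `.` in an item acts as a regex wildcard. If you want the items treated as plain words, a small sketch like the following could be used instead of the bare `join`. The helper `literal_alternation` and the sample row are made up for illustration; `quotemeta` and `qr//` are standard Perl.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: build one alternation
# regex in which every search term is matched literally.
sub literal_alternation {
    my @terms  = @_;
    my $joined = join '|', map { quotemeta($_) } @terms;
    return qr/$joined/;
}

my $raw     = join '|', ('1.4');            # here "." matches any character
my $literal = literal_alternation('1.4');   # here "." matches only a dot

my $line = qq{5.1,3.5,114,0.2,"setosa"\n};  # made-up row, not from the data set

print "raw pattern matches\n"     if $line =~ /$raw/;
print "literal pattern matches\n" if $line =~ $literal;
```

Only the raw pattern matches the made-up row, because its unescaped `.` happily matches the middle `1` of `114`.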
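Here is the script in action:
I asked to remove the lines containing versicolor and setosa, so only the lines containing virginica (plus the header row, which matches neither item) are printed to the screen and to the output file, which I called new_iris.csv. Again, I asked for 2 items. Note: in my program you can type the word Keep or Remove in any mix of upper and lower case.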
>perl perl_matching.pl
Here are the list of files in your current working directory
. .. iris_dataset.csv perl_matching.pl
Give me your filename to read from, extensions and all ...
iris_dataset.csv
Give me your filename to write to, extensions and all ...
new_iris.csv
How many matches are you going to give me?
2
Okay give me the matches now, pressing Enter key between each match.
setosa
versicolor
This is what your redefined matches variable looks like: setosa|versicolor
Now you get a choice for your matches
KEEP or REMOVE?
Remove
"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"
6.3,3.3,6,2.5,"virginica"
5.8,2.7,5.1,1.9,"virginica"
7.1,3,5.9,2.1,"virginica"
6.3,2.9,5.6,1.8,"virginica"
5.9,3,5.1,1.8,"virginica"
So only the lines that contain neither setosa nor versicolor are printed to our file, new_iris.csv:
"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"
6.3,3.3,6,2.5,"virginica"
5.8,2.7,5.1,1.9,"virginica"
7.1,3,5.9,2.1,"virginica"
6.3,2.9,5.6,1.8,"virginica"
5.9,3,5.1,1.8,"virginica"
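Notice that the header row survived only because it contains none of the removed items; with KEEP it would be dropped, since it matches no species name. If you want the header in the output either way, one possible approach is to set the first line aside and filter only the rest. The helper `keep_with_header` below is hypothetical and assumes the first line of the file is always a header.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
# Assumes the first line is a header that should always be kept.
sub keep_with_header {
    my ($pattern, @lines) = @_;
    return () unless @lines;                 # nothing to do for an empty file
    my ($header, @rows) = @lines;            # split off the header row
    return ($header, grep { $_ =~ /$pattern/ } @rows);
}

my @lines = (
    qq{"Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species"\n},
    qq{5.1,3.5,1.4,0.2,"setosa"\n},
    qq{6.3,3.3,6,2.5,"virginica"\n},
);

# Prints the header row followed by the setosa row.
print keep_with_header('setosa', @lines);
```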
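I really like using standard input in Perl.
You can also use my script to print only the lines of the file that contain setosa (in that case you ask for just 1 match).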