Is there a better way to count occurrences of a char in a string? Answers

【Question title】: Is there a better way to count occurrence of char in a string?
【Posted】: 2016-03-29 23:32:53
【Question description】:

I feel there must be a better way to count occurrences than writing a sub in Perl or a shell script on Linux.

#!/usr/bin/perl -w
use strict;
return 1 unless $0 eq __FILE__;
main() if $0 eq __FILE__;
sub main{
    my $str = "ru8xysyyyyyyysss6s5s";
    my $char = "y";
    my $count = count_occurrence($str, $char);
    print "count<$count> of <$char> in <$str>\n";
}
sub count_occurrence{
    my ($str, $char) = @_;
    my $len = length($str);
    $str =~ s/$char//g;
    my $len_new = length($str);
    my $count = $len - $len_new;
    return $count;
}

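【Question discussion】:

Tags: perl sh

【Solution 1】:

Counting the occurrences of a character in a string can be done in one line of Perl (compared to your 4 lines). There is no need for a sub (although there is nothing wrong with encapsulating functionality in a sub). From perlfaq4, "How can I count the number of occurrences of a substring within a string?":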

use warnings;
use strict;

my $str = "ru8xysyyyyyyysss6s5s";
my $char = "y";
my $count = () = $str =~ /\Q$char/g;
print "count<$count> of <$char> in <$str>\n";

【Discussion】:

【Solution 2】:

If the character is constant, the following is best:

my $count = $str =~ tr/y//;

If the character is variable, I'd use the following:

my $count = length( $str =~ s/[^\Q$char\E]//rg );

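I'd use the following only if I needed compatibility with Perl versions older than 5.14, which lack the /r flag used above (this version is slower and uses more memory):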

my $count = () = $str =~ /\Q$char/g;

The following uses no extra memory, but might be a bit slow:

my $count = 0;
++$count while $str =~ /\Q$char/g;
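One difference between these variants only shows up when `$char` holds more than one character: the `s/[^...]//rg` form treats it as a set of characters, while the `m//g` forms treat it as a substring. A small sketch of that difference (the sub names and the sample string are made up for illustration, and `s///r` needs Perl 5.14 or later):

```perl
use strict;
use warnings;

# Counts every character of $str that appears anywhere in $set
# (character-class semantics).
sub count_in_set {
    my ($str, $set) = @_;
    return length( $str =~ s/[^\Q$set\E]//rg );
}

# Counts non-overlapping occurrences of $sub as a whole substring.
sub count_substring {
    my ($str, $sub) = @_;
    my $count = 0;
    ++$count while $str =~ /\Q$sub/g;
    return $count;
}

print count_in_set("banana", "an"), "\n";     # 5: a, n, a, n, a
print count_substring("banana", "an"), "\n";  # 2: "an" at offsets 1 and 3
```

For a single character the two agree, so the choice only matters if the input may be longer than one character.
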

【Discussion】:

【Solution 3】:

Beautiful* Bash/Coreutils/Grep one-liner:

$ str=ru8xysyyyyyyysss6s5s
$ char=y
$ fold -w 1 <<< "$str" | grep -c "$char"
8

Or maybe

$ grep -o "$char" <<< "$str" | wc -l
8

The first one works only if the substring is just one character long; the second one works only if the substrings are non-overlapping.

* Not really.
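The non-overlap caveat applies to the Perl `m//g` counts in the other answers as well. A sketch of one way around it in Perl, assuming a zero-width lookahead is acceptable for the pattern (the sample string is made up):

```perl
use strict;
use warnings;

my $str = "aaaa";
my $sub = "aa";

# Plain /g resumes after the end of each match, so matches cannot overlap.
my $non_overlapping = () = $str =~ /\Q$sub\E/g;

# A lookahead consumes no characters, so the engine advances one position
# at a time and reports every start position of $sub.
my $overlapping = () = $str =~ /(?=\Q$sub\E)/g;

print "non-overlapping<$non_overlapping> overlapping<$overlapping>\n";  # non-overlapping<2> overlapping<3>
```
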

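【Discussion】:

Both are great, I added them to my toolbox. First time I've heard of the fold command, much appreciated!
@gliang: It's kind of like split // for Bash; I only discovered it recently myself. Glad you find them useful!

【Solution 4】:

toolic has given the correct answer, but you might consider not hard-coding your values, to make the program reusable.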

use strict;
use warnings;

die "Usage: $0 <text> <characters>" if @ARGV < 1;
my $search = shift;                    # the string you are looking for
my $str;                               # the input string
if (@ARGV && -e $ARGV[0] || !@ARGV) {  # if str is file, or there is no str
    local $/;                          # slurp input
    $str = <>;                         # use diamond operator
} else {                               # else just use the string
    $str = shift;
}
my $count = () = $str =~ /\Q$search\E/gms;
print "Found $count of '$search' in '$str'\n";

This lets you use the program to count occurrences of a character, or a string, inside a string, a file, or standard input. For example:

count.pl needles haystack.txt
some_process | count.pl foo
count.pl x xyzzy

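【Discussion】:

What is gms for? g is for all occurrences, m for match, s for ?
@gliang: "Treat string as single line", i.e. . matches newline, which it normally doesn't; see perldoc.perl.org/perlre.html#Modifiers
Benjamin, thanks, it's not what I thought. I'll read the docs again.
@gliang No, m is multiline, which means ^ and $ (and equivalents) match at embedded newlines, and s, as Benjamin says, makes . match newline as well. They are not strictly needed in your example, and \Q rules out the . metacharacter anyway. But since we do not know what your input is, this is the safer way.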
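To make the `m` and `s` modifiers discussed in this thread concrete, here is a small sketch with a made-up two-line string:

```perl
use strict;
use warnings;

my $text = "foo\nbar";

# Without /m, ^ only matches at the very start of the string.
my $plain_caret = $text =~ /^bar/  ? 1 : 0;    # 0
# With /m, ^ also matches right after each embedded newline.
my $multi_caret = $text =~ /^bar/m ? 1 : 0;    # 1

# Without /s, . does not match a newline.
my $plain_dot  = $text =~ /foo.bar/  ? 1 : 0;  # 0
# With /s, . matches any character, newline included.
my $single_dot = $text =~ /foo.bar/s ? 1 : 0;  # 1

print "$plain_caret $multi_caret $plain_dot $single_dot\n";  # 0 1 0 1
```
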