You might be able to mix and match regex and code -
$line =~ /(?{($cnt,@ary)=(0,)})^(?:([^,]+)(?{push @ary,$cnt; push @ary,$^N})|,(?{$cnt++}))+/x
and print join( ',', @ary);
Expanded -
$line =~ /
    (?{($cnt,@ary)=(0,)})
    ^(?:
        ([^,]+) (?{push @ary,$cnt; push @ary,$^N})
      | ,       (?{$cnt++})
    )+
/x
and print join( ',', @ary);
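To see what the embedded code blocks actually collect, here is a minimal, self-contained sketch of the same regex run against the sample string used in the benchmarks. The wrapper name `indexed_fields` is made up for illustration, and package variables (`our`) are assumed so the `(?{ })` blocks behave the same on older perls.

```perl
use strict;
use warnings;

# Package variables: the embedded code blocks update these while the regex runs.
our ($cnt, @ary);

# Hypothetical helper (the name is illustrative): returns index,value pairs
# for every non-empty comma-separated field.
sub indexed_fields {
    my ($line) = @_;
    $line =~ /
        (?{ ($cnt, @ary) = (0) })                 # reset the counter and the result list
        ^(?:
            ([^,]+) (?{ push @ary, $cnt, $^N })   # non-empty field: record index and text
          | ,       (?{ $cnt++ })                 # comma: advance the field index
        )+
    /x;
    return @ary;
}

print join(',', indexed_fields("x,,10.3,,q,,5.2,3.1,,,ghy,g,,l,p")), "\n";
# prints 0,x,2,10.3,4,q,6,5.2,7,3.1,10,ghy,11,g,13,l,14,p
```

Note that side effects made inside `(?{ })` are not undone if the engine backtracks. This pattern gets away with it because each alternative either consumes a whole field or a single comma, so nothing is re-tried after a code block has run.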
Some benchmarks
With a little tweaking of flesk's and sln's solutions (look for fleskNew and slnNew),
the winner is fleskNew, once the substitution operator is removed.
Code -
use Benchmark qw( cmpthese ) ;

$samp = "x,,10.3,,q,,5.2,3.1,,,ghy,g,,l,p";
$line = $samp;

cmpthese( -5, {
    flesk1 => sub{
        $index = 0;
        join ",",
            map {join ",", @$_}
            # keeps truthy fields only, so a literal "0" field would be dropped too
            grep $_->[1],
            map {[$index++, $_]}
            split ",", $line;
    },
    flesk2 => sub{
        ($i, @vars) = (0,);
        while ($line =~ s/^(,*)([^,]+)//) {
            push @vars, $i += length($1), $2;
        }
        $line = $samp;    # restore the string consumed by s///
    },
    fleskNew => sub{
        ($i, @vars) = (0,);
        while ($line =~ /(,*)([^,]+)/g) {
            push @vars, $i += length($1), $2;
        }
    },
    sln1 => sub{
        $line =~ /
            (?{($cnt,@ary)=(0,)})
            ^(?:
                ([^,]+) (?{push @ary,$cnt; push @ary,$^N})
              | ,       (?{$cnt++})
            )+
        /x
    },
    slnNew => sub{
        $line =~ /
            (?{($cnt,@ary)=(0,)})
            (?:
                (,*)    (?{$cnt += length($^N)})
                ([^,]+) (?{push @ary, $cnt,$^N})
            )+
        /x
    },
} );
The numbers -
            Rate flesk1 sln1 flesk2 slnNew fleskNew
flesk1   20325/s     -- -51%   -52%   -56%     -60%
sln1     41312/s   103%   --    -1%   -10%     -19%
flesk2   41916/s   106%   1%     --    -9%     -17%
slnNew   45978/s   126%  11%    10%     --      -9%
fleskNew 50792/s   150%  23%    21%    10%       --
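For anyone unsure how to read the table: each cell is the row's rate relative to the column's rate. The sketch below recomputes two cells from the rates above. It assumes the percentage is 100 * (row rate / column rate - 1), rounded to a whole number, which is consistent with the printed figures; `relative_pct` is a made-up name, not part of Benchmark.

```perl
use strict;
use warnings;

# Rates copied from the table above (iterations per second).
my %rate = (
    flesk1   => 20325,
    sln1     => 41312,
    flesk2   => 41916,
    slnNew   => 45978,
    fleskNew => 50792,
);

# Hypothetical helper: how much faster (+) or slower (-) the row entry is
# than the column entry, as a rounded percentage.
sub relative_pct {
    my ($row, $col) = @_;
    return sprintf "%.0f", 100 * ($rate{$row} / $rate{$col} - 1);
}

print "fleskNew vs flesk1: ", relative_pct('fleskNew', 'flesk1'), "%\n";   # prints 150%
print "flesk1 vs fleskNew: ", relative_pct('flesk1', 'fleskNew'), "%\n";   # prints -60%
```

So "150%" in the fleskNew row means fleskNew runs about 2.5 times as many iterations per second as flesk1, not 1.5 times.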
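Some benchmarks 2
Added Birei's in-line replace and trim (all-in-one) solution.
Caveats:
Modified flesk1 (now flesk1a) to remove the final 'join', since it is not included in
the other regex solutions. This gives it a chance to bench better.
Birei's solution deviates in the bench because it modifies the original string into the final result.
That aspect can't be taken out. The difference between Birei1 and BireiNew is that
the new one removes the final ','.
flesk2, Birei1 and BireiNew carry the additional overhead of restoring the original string,
because they use the substitution operator.
The winner still looks like fleskNew ..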
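The restore overhead comes from `s///` eating the string as it goes, whereas a scalar-context `m//g` loop only moves the match position. A minimal sketch of that difference follows, with the loop bodies taken from flesk2 and fleskNew; the sub names `pairs_destructive` and `pairs_nondestructive` are invented for this illustration and each works on its own copy of the input.

```perl
use strict;
use warnings;

my $samp = "x,,10.3,,q,,5.2,3.1,,,ghy,g,,l,p";

# Destructive variant (loop as in flesk2): s/// removes each matched chunk.
sub pairs_destructive {
    my ($line) = @_;
    my ($i, @vars) = (0);
    while ($line =~ s/^(,*)([^,]+)//) {
        push @vars, $i += length($1), $2;
    }
    return (\@vars, $line);    # also return what is left of the string
}

# Non-destructive variant (loop as in fleskNew): m//g just advances pos().
sub pairs_nondestructive {
    my ($line) = @_;
    my ($i, @vars) = (0);
    while ($line =~ /(,*)([^,]+)/g) {
        push @vars, $i += length($1), $2;
    }
    return (\@vars, $line);
}

my ($d, $left_d) = pairs_destructive($samp);
my ($n, $left_n) = pairs_nondestructive($samp);

print "same pairs\n" if "@$d" eq "@$n";
print "after s///:  '$left_d'\n";    # the string has been consumed
print "after m//g:  '$left_n'\n";    # the string is untouched
```

In the benchmark the subs share one global `$line`, so every `s///`-based entry has to copy `$samp` back on each iteration, and that copy is counted in its time.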
Code -
use Benchmark qw( cmpthese ) ;

$samp = "x,,10.3,,q,,5.2,3.1,,,ghy,g,,l,p";
$line = $samp;

cmpthese( -5, {
    flesk1a => sub{
        $index = 0;
        map {join ",", @$_}
        grep $_->[1],
        map {[$index++, $_]}
        split ",", $line;
    },
    flesk2 => sub{
        ($i, @vars) = (0,);
        while ($line =~ s/^(,*)([^,]+)//) {
            push @vars, $i += length($1), $2;
        }
        $line = $samp;    # restore the string consumed by s///
    },
    fleskNew => sub{
        ($i, @vars) = (0,);
        while ($line =~ /(,*)([^,]+)/g) {
            push @vars, $i += length($1), $2;
        }
    },
    sln1 => sub{
        $line =~ /
            (?{($cnt,@ary)=(0,)})
            ^(?:
                ([^,]+) (?{push @ary,$cnt; push @ary,$^N})
              | ,       (?{$cnt++})
            )+
        /x
    },
    slnNew => sub{
        $line =~ /
            (?{($cnt,@ary)=(0,)})
            (?:
                (,*)    (?{$cnt += length($^N)})
                ([^,]+) (?{push @ary, $cnt,$^N})
            )+
        /x
    },
    Birei1 => sub{
        $i = -1;
        $line =~
            s/
                (?(?=,+)
                    ( (?: , (?{ ++$i }) )+ )
                  | (?<no_comma> [^,]+ ,? ) (?{ ++$i })
                )
            /
                defined $+{no_comma} ? $i . qq[,] . $+{no_comma} : qq[]
            /xge;
        $line = $samp;    # restore the string modified by s///
    },
    BireiNew => sub{
        $i = 0;
        $line =~
            s/
                (?: , (?{++$i}) )*
                (?<data> [^,]* )
                (?: ,*$ )?
                (?= (?<trailing_comma> ,?) )
            /
                length $+{data} ? "$i,$+{data}$+{trailing_comma}" : ""
            /xeg;
        $line = $samp;    # restore the string modified by s///
    },
} );
Results -
            Rate BireiNew Birei1 flesk1a flesk2 sln1 slnNew fleskNew
BireiNew  6030/s       --   -18%    -74%   -85% -86%   -87%     -88%
Birei1    7389/s      23%     --    -68%   -82% -82%   -84%     -85%
flesk1a  22931/s     280%   210%      --   -44% -45%   -51%     -54%
flesk2   40933/s     579%   454%     79%     --  -2%   -13%     -17%
sln1     41752/s     592%   465%     82%     2%   --   -11%     -16%
slnNew   47088/s     681%   537%    105%    15%  13%     --      -5%
fleskNew 49563/s     722%   571%    116%    21%  19%     5%       --