【Question title】: Compute numerical values from a space separated text file, within a range of lines
【Posted】: 2009-04-22 05:41:55
【Question】:

I have a file containing the following values:

for 3 threads:
Average time taken for API1 is: 19097.7 nanoseconds.
Average time taken for API2 is: 19173.1 nanoseconds.
Average time taken for API2 is: 19777.7 nanoseconds.
Average time taken for API2 is: 19243.1 nanoseconds.
Average time taken for API1 is: 19737.7 nanoseconds.
Average time taken for API2 is: 19128.1 nanoseconds.
for 5 threads:
Average time taken for API1 is: 19097.7 nanoseconds.
Average time taken for API2 is: 19173.1 nanoseconds.
Average time taken for API2 is: 19777.7 nanoseconds.
...

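I want to calculate the sum of the API1 lines and the sum of the API2 lines, and then add the two together. Another requirement is that I also want these computed separately for each thread count. Is there a way to do this using perl, sed, awk, or just a shell script?

What I have so far is: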

cat result | grep API1 | awk {'print $7'}

【Comments】:

    Tags: perl math shell text


    【Solution 1】:

    You can use a combination of grep and awk. grep selects only the lines that contain data (the ones with API in them) and awk does the summing.

    grep API file | awk '{ arr[$5]+=$7 } END {for (i in arr) {print i,arr[i]}   } ' -
    

    (Replace "file" with the name of your file, or drop it to read from standard input.)
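
The question also asks for the API1 and API2 sums to be added together. Here is a minimal sketch of that, assuming the file layout shown in the question (API name in field 5, time in field 7). The function name sum_by_api is made up for illustration. It uses printf rather than print because awk's default number format (OFMT, %.6g) would print a sum such as 116272.8 as 116273.

```shell
# Sum field 7 per API name (field 5), then print a combined total.
# sum_by_api is a hypothetical helper name; pass the data file as $1.
sum_by_api() {
    awk '/^Average time taken for API/ {
             sum[$5] += $7      # running sum per API
             total   += $7      # combined sum over all APIs
         }
         END {
             for (api in sum) printf "%s %.1f\n", api, sum[api]
             printf "total %.1f\n", total
         }' "$1" | sort         # for-in order is unspecified, so sort the lines
}
```

Call it as `sum_by_api result`, where result is the file name used in the question. The pattern at the start of the awk program means the "for N threads:" headers and any blank lines are skipped without a separate grep.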

    If you want separate sums for each thread count, you can do this:

    awk '{ if($1 == "for") id = $2; else arr[id $5]+=$7 } END {for (i in arr) {print i,arr[i]}   } ' testfile
    

    Output:

    5API1 19097.7
    5API2 38950.8
    3API1 38835.4
    3API2 77322
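
The one-liner above prints its keys in whatever order awk's for-in loop yields them, and any line that is neither a header nor a data line (a blank line, for instance) is summed under a bare thread key. Here is a hedged sketch of a variant that matches only the data lines, adds a per-thread total, and sorts the result. It assumes the same field positions as above, and sum_by_thread is an illustrative name.

```shell
# Per-thread, per-API sums plus a per-thread total.
# sum_by_thread is a hypothetical helper name; pass the data file as $1.
sum_by_thread() {
    awk '$1 == "for" && $3 == "threads:" { id = $2; next }   # remember the current thread count
         /^Average time taken for API/ {
             sum[id " " $5]   += $7     # key looks like "3 API1"
             sum[id " total"] += $7     # key looks like "3 total"
         }
         END {
             for (k in sum) printf "%s %.1f\n", k, sum[k]
         }' "$1" | sort -k1,1n -k2,2    # numeric by thread count, then by name
}
```

Run on the sample data from the question, `sum_by_thread result` should give one line per thread count and API, for example "3 API1 38835.4", followed by a "3 total" line for that thread count.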
    

    【Comments】:

      【Solution 2】:

      Short and unreadable:

      perl -lane 'END{&h}sub h{print"\t$_ => $h{$_}"for keys%h;%h=()}&h,print,next if/^for/;$h{$F[4]}+=$F[6]' data
      

      Readable, but it has to be a script:

      #!/usr/bin/perl
      
      use strict;
      use warnings;
      
      my %counts;
      my $thread = "undefined";
      while (<>) {
          if (/^for ([0-9]+)/) {
              $thread = $1;
              next;
          }
          # Skip blank or unrelated lines so that no undefined values get summed.
          my ($item, $time) = /for (\S+) is: (\S+) nano/ or next;
          $counts{$thread}{$item} += $time;
      }
      
      for my $thread (sort { $a <=> $b } keys %counts) {
          print "for $thread threads:\n";
          for my $item (sort keys %{$counts{$thread}}) {
              print "\t$item => $counts{$thread}{$item}\n";
          }
      }
      

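      【Comments】:

      • Hi, this doesn't meet the requirement of having separate sums for each thread count.
      • @Alnitak It looks fine to me; I tested the code on the data provided. What do you think is wrong?
      【Solution 3】:

      I don't understand your last requirement (the threads are not specified), but I'll give you a setup that holds that information and meets the requirements I can understand. The data is broken down so you can get at it. Although I don't see how you want to use the 'for x threads:' lines, they are at least recognized, so you can use them.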

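      use List::Util qw<sum>;
      use FileHandle;    # needed for FileHandle->new below
      
      # PATH_TO_DATAFILE is a placeholder: substitute the path to your data file.
      # The '### ...' lines further down look like Smart::Comments debugging dumps;
      # without that module loaded they are ordinary comments.
      my $fh = FileHandle->new( PATH_TO_DATAFILE );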
      my $data 
          = { trial_times => []
            , totals      => {}
            };
      my $precision = 0;
      
      while ( <$fh> ) { 
          if ( m/^for (\d+) threads:/ ) { 
              push @{$data->{trial_times}}, {};
          }
          elsif ( m/^Average time taken for (API\w+) is: (\d+\.(\d+)) nanoseconds./ ) {
              push @{$data->{trial_times}[-1]{$1}}, $2;
              push @{ $data->{totals}->{$1} }, $2;
              $precision = length $3 if length $3 > $precision;
          }
      }
      
      ### $data
      
      foreach my $api ( keys %{ $data->{totals} } ) { 
          my @list = @{ $data->{totals}{$api} };
          my $sum  = sum @list;
      
          printf "Sum for %d runs of API $api: %0.${precision}f (Average: %0.${precision}f)\n"
               , scalar @list, $sum, $sum / scalar @list
               ;
      }
      
      my @combined = map { @$_ } values %{$data->{totals}};
      ### @combined
      my $sum      = sum @combined;
      printf "Combined %d runs for %0.${precision}f total (Average: %0.${precision}f)\n"
          , scalar @combined, $sum, $sum / scalar @combined
          ;
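
For comparison, the per-API run count, sum and average that the Perl script prints can also be sketched in awk. This is an approximation under stated assumptions, not a port: it assumes the field positions from the question, fixes the output at one decimal place instead of deriving the precision from the input, and api_stats is an illustrative name.

```shell
# Per-API run count, sum and average, plus a combined line.
# api_stats is a hypothetical helper name; pass the data file as $1.
api_stats() {
    awk '/^Average time taken for API/ {
             n[$5]++;  sum[$5] += $7     # per-API count and sum
             runs++;   total   += $7     # combined count and sum
         }
         END {
             for (api in sum)
                 printf "%s runs=%d sum=%.1f avg=%.1f\n", api, n[api], sum[api], sum[api] / n[api]
             if (runs > 0)               # avoid dividing by zero on empty input
                 printf "combined runs=%d sum=%.1f avg=%.1f\n", runs, total, total / runs
         }' "$1" | sort
}
```

Call it as `api_stats result`. Like the Perl version, it ignores the thread headers and aggregates across all thread counts.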
      

      【Comments】:
