【Question title】: How do I do this file manipulation in Perl?
【Posted】: 2011-12-15 16:54:30
【Question】:

So my file looks like this:

--some comments--
--a couple of lines of header info--
 comp:
  name: some_name_A
  type: some_type
  id:   an id_1
  owner: who owns it
  path:  path_A to more data
 end_comp

 comp:
  name: some_name_B
  type: some_type
  id:   an id_2
  owner: who owns it
  path:  path_B to more data
 end_comp  

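What I want to do: get the name from the name field and check whether it matches one of the names we are searching for (already available in an array); then get the path, go to that path, run some Perforce commands to obtain a new id, and replace the current id with the new one only if the two differ.

What I have done (just pseudocode):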

@filedata = <read_file> #read file in an array
$names_to_search = join("|", @some_names);

while(lines=@filedata)
{
 if( $line =~ /comp:/ )
 {
   $line = <next line>;
   if( $line =~ /name:\s*(?:$names_to_search)\s*$/ ) #group the alternation so "name:" applies to every alternative
   {
    #loop until we find the id
    #remember this index since we need to change this id

    #loop until we find the path field
    #get the path, go to that path, do some perforce commands and obtain new id
    if( id is same as current id ) no action required
    else replace current id with new id
   }
  }
}

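Problem: my current implementation has three while loops! Is there a better, more efficient, or more elegant way to do this?

【Comments】:

  • Can the file contain two blocks with the same name value?

Tags: performance perl file-io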


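【Answer 1】:

You have written a configuration file in a custom format and are now trying to parse it by hand. Why not write the file in an established format such as YAML or INI instead, and parse it with an existing module?

For example, using YAML: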

use YAML::Any;
my @data = YAML::Any::LoadFile($filename) or die "Could not read from $filename: $!";

# now you have your data structure in @data; parse it using while/for/map loops.

You can read INI files using Config::INI or Config::INI::Simple.

【Comments】:

  • As many have suggested, I will write it in XML format and then use a Perl XML parser. :)
【Answer 2】:

Here is some pseudocode:

index = 0;

index_of_id = 0; // this is the index of the line that contains the current company id

have_company = false; // track whether we are processing a company

while (line in @filedata)
{
  if (!have_company)
  {
    if (line is not "comp:") 
    {
      ++index;
      continue;
    }
    else
    {
      index_of_id = 0;
      have_company = true;
    }
  }
  else
  {
    if (line is "end_comp")
    {
      have_company = false; // force to start looking for new company
      ++index;
      continue;
    }

    if (line is "id")
      index_of_id = index;  // save the index

    if (line is "path")
    {
      // do your stuff then replace the string at the index given by index_of_id
    }
  }
  // line index
  ++index; 
}

// Now write the modified array to file
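The pseudocode above can be sketched in Perl as a single pass over the array of lines. This is only an illustration under stated assumptions: `update_ids` and `lookup_new_id` are hypothetical names, the Perforce step is stubbed out because the question does not show the actual commands, and the id line is assumed to come before the path line inside each block (as in the sample file).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Perforce lookup; the real commands are not
# given in the question, so this only derives a value from the path.
sub lookup_new_id {
    my ($path) = @_;
    return "id_for_$path";
}

# Update the id line of every wanted block in place.
# Returns the number of ids that were actually changed.
sub update_ids {
    my ($lines, $wanted, $lookup) = @_;
    my %is_wanted = map { $_ => 1 } @$wanted;   # exact-match lookup by name
    my ($in_block, $name, $id_index) = (0, undef, undef);
    my $changed = 0;

    for my $i (0 .. $#$lines) {
        my $line = $lines->[$i];
        if (!$in_block) {
            # Outside a block only a "comp:" line matters.
            if ($line =~ /^\s*comp:\s*$/) {
                ($in_block, $name, $id_index) = (1, undef, undef);
            }
            next;
        }
        if ($line =~ /^\s*end_comp\s*$/) {
            $in_block = 0;                      # start looking for the next block
        }
        elsif ($line =~ /^\s*name:\s*(.+?)\s*$/) {
            $name = $1;
        }
        elsif ($line =~ /^\s*id:/) {
            $id_index = $i;                     # remember where the id lives
        }
        elsif ($line =~ /^\s*path:\s*(.+?)\s*$/) {
            my $path = $1;
            next unless defined $name && $is_wanted{$name} && defined $id_index;
            my $new_id = $lookup->($path);
            my ($prefix, $old_id) = $lines->[$id_index] =~ /^(\s*id:[ \t]*)(.*?)\s*$/;
            if ($old_id ne $new_id) {
                $lines->[$id_index] = "$prefix$new_id\n";
                $changed++;
            }
        }
    }
    return $changed;
}

# Example run on one block from the question.
my @filedata = (
    " comp:\n",
    "  name: some_name_A\n",
    "  id:   an id_1\n",
    "  path:  path_A to more data\n",
    " end_comp\n",
);
my $count = update_ids(\@filedata, ['some_name_A'], \&lookup_new_id);
print @filedata;
```

Because the array is modified in place, writing it back is just a matter of opening the file for writing and printing `@filedata`. Passing the lookup as a code reference keeps the Perforce call separate from the parsing loop, so it can be swapped for the real command later.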

【Comments】:

【Answer 3】:

Since no two blocks can have the same name value, you can use a hash reference of hash references:

    {
      "name1"=>{type=>"type1",id=>"id1",owner=>"owner1",path=>"path1"},
      "name2"=>{type=>"type2",id=>"id2",owner=>"owner2",path=>"path2"},
      #etc
    }
    

Something like this should work (warning: untested):

    use strict;
    use warnings;
    
    open(my $read,"<","input_file.txt") or die $!;
    
    my $data={};
    my $current_name=""; #Placeholder for the name that we're currently using.
    
    while(<$read>)
    {
      chomp; #get rid of trailing newline character.
    
      if(/^\s*name:\s*([\w]+)\s*$/) #If we hit a line specifying a name, 
                                    #then this is the name we're working with
      {
        $current_name=$1;
      }
      elsif(/^\s*(type|id|owner|path):\s*(.+?)\s*$/) #If it's data to go with the name, 
                                                     #then assign it (values may contain spaces).
      {
        $data->{$current_name}->{$1}=$2;
      }
    }
    
    close($read);
    
    #Now you can search your given array for each of the names and do what you want from there.
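As a rough sketch of that last step, the hash of hashes can be walked once per wanted name to find the ids that are out of date. The helper names `parse_blocks` and `stale_ids` are hypothetical, the lookup callback stands in for the Perforce commands (which the question does not show), and the parser here reads from a string rather than a file so that the example is self-contained.

```perl
use strict;
use warnings;

# Parse block text into a hash of hashes keyed by name.
# Values are captured up to the end of the line, so they may contain spaces.
sub parse_blocks {
    my ($text) = @_;
    my (%data, $current_name);
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*name:\s*(.+?)\s*$/) {
            $current_name = $1;
        }
        elsif (defined $current_name
               && $line =~ /^\s*(type|id|owner|path):\s*(.+?)\s*$/) {
            $data{$current_name}{$1} = $2;
        }
    }
    return \%data;
}

# Return a hash of name => new id for wanted names whose id has changed.
sub stale_ids {
    my ($data, $wanted, $lookup) = @_;
    my %stale;
    for my $name (@$wanted) {
        my $block = $data->{$name} or next;      # name not present in the file
        next unless defined $block->{id} && defined $block->{path};
        my $new_id = $lookup->($block->{path});  # hypothetical Perforce lookup
        $stale{$name} = $new_id if $new_id ne $block->{id};
    }
    return \%stale;
}

# Example run on one block from the question.
my $text = join "\n",
    ' comp:',
    '  name: some_name_A',
    '  type: some_type',
    '  id:   an id_1',
    '  owner: who owns it',
    '  path:  path_A to more data',
    ' end_comp',
    '';
my $data  = parse_blocks($text);
my $stale = stale_ids($data, ['some_name_A'], sub { 'an id_9' });
print "$_ => $stale->{$_}\n" for sort keys %$stale;
```

A hash does not remember line positions or block order, so this approach suits reporting which ids changed; rewriting the original file would additionally need the blocks to be written back out in full.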
    

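However, if you can, I would really recommend storing the data in your file in some standardized format (YAML, INI, JSON, XML, etc.) and then parsing it properly. I should also add that this code depends on each name appearing before its corresponding type, id, owner, and path.

【Comments】: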
