【Question title】: Dynamically parse BibTeX and create a hash of hashes
【Posted】: 2012-11-25 16:54:00
【Question】:

I am trying to parse the following BibTeX file (bibliography.bib):

@book{Lee2000a,
abstract = {Abstract goes here},
author = {Lee, Wenke and Stolfo, Salvatore J},
title = {{Data mining approaches for intrusion detection}},
year = {2000}
}
@article{Forrest1996,
abstract = {Abstract goes here},
author = {Forrest, Stephanie and Hofmeyr, Steven A. and Anil, Somayaji},
title = {{Computer immunology}},
year = {1996}
}

I am using the BibTeX::Parser package, which works as expected. The problem is building the hash-of-hashes structure. Here is my code:

#!/usr/bin/perl
# http://search.cpan.org/~gerhard/BibTeX-Parser-0.62/lib/BibTeX/Parser.pm
use BibTeX::Parser;
use IO::File;
use Data::Dumper;
use strict;
use warnings;

my $filename="bibliography.bib";
my (%bibliography, %article);
my $i;
my ($entry, @entries, $type, $key);
my (my $hkey, my $hvalue);

# open BibTeX
my $fh = IO::File->new("$filename") or die "could not open $filename: $!\n";

# create parser object ...
my $parser = BibTeX::Parser->new($fh);

# ... and iterate over entries
while ($entry = $parser->next ) {
  if ($entry->parse_ok) {

    # return BibTeX elements like abstract, author, title ...
    @entries = $entry->fieldlist();

    # create %article as a hash array e.g. year -> 1996; isbn -> 1581138709 etc.
    foreach (@entries) {
      $article{"$_"} = $entry->field("$_");
    }

    # return article's key (Lee2000a, Forrest1996)
    $key = $entry->key;

    # append %article into %bibliography with appropriate key
    $bibliography{"$key"} = \%article;

    #Debug
    #print $entry->key, "\n";
    #print Dumper (\%article);

    # removes all elements of %article (prepare for next iteration)
    %article = ();

    #Debug
    #print "================================\n";
  }

  else {
    warn "Error parsing file: " . $entry->error;
 }
}

    #Debug
    #print Dumper (\%bibliography);

Current output of Dumper (\%bibliography):

$VAR1 = {
          'Lee2000a' => {},
          'Forrest1996' => $VAR1->{'Lee2000a'}
        };

Desired output of Dumper (\%bibliography):

$VAR1 = {
          'Lee2000a' => {
                'abstract' => 'Abstract goes here',
                'author' => 'Lee, Wenke and Stolfo, Salvatore J',
                'title' => 'Data mining approaches for intrusion detection',
                'year' => '2000'
              },
          'Forrest1996' => {
                'abstract' => 'Abstract goes here',
                'author' => 'Forrest, Stephanie and Hofmeyr, Steven A. and Anil, Somayaji',
                'title' => 'Computer immunology',
                'year' => '1996'
              }
        };

What am I doing wrong? Many thanks.

【Comments】:

    Tags: perl hashtable dynamic-data bibtex


    【Answer 1】:

    Try your code without this line:

    # removes all elements of %article (prepare for next iteration)
    %article = ();
    

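    You set $bibliography{$key} to a reference to %article and then empty that same hash, so the stored reference ends up pointing at an empty hash.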

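    Also, move the declaration of %article into the loop (probably right after if ($entry->parse_ok) {), so that it is scoped to where you use it and there is no need to reinitialize it. Each iteration then gets a fresh hash, so every key in %bibliography refers to its own data.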
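To make the aliasing concrete, here is a minimal sketch that reproduces the symptom without BibTeX::Parser. The names %outer and %shared are made up for illustration; they stand in for %bibliography and %article.

```perl
use strict;
use warnings;

my %outer;                        # hypothetical stand-in for %bibliography
my %shared;                       # hypothetical stand-in for %article, declared once outside the loop

for my $name (qw(Lee2000a Forrest1996)) {
    $shared{key}  = $name;
    $outer{$name} = \%shared;     # stores a reference to %shared, not a copy
    %shared       = ();           # empties the very hash the reference points to
}

# Comparing references numerically compares their addresses.
my $same  = $outer{Lee2000a} == $outer{Forrest1996} ? 1 : 0;
my $count = scalar keys %{ $outer{Lee2000a} };
print "same hash: $same, fields left: $count\n";
```

Both entries refer to one hash, and that hash is empty after the loop, which matches the Dumper output in the question.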

    Hope that helps...

    Update to cover the sorting question: this should print your hash sorted by the outer keys and then by the inner keys:

    foreach my $bib_key ( sort keys %bibliography ) {
      print "$bib_key\n";
    
      foreach my $article_key (sort keys %{ $bibliography{$bib_key} }) {
        print "\t $article_key: $bibliography{$bib_key}{$article_key}\n";
      }
    }
    

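    【Comments】:

    • Thanks, guys, you helped a lot. Could you also suggest how to sort this structure first by the "%bibliography" hash keys (Forrest1996, Lee2000a) and then by the "%article" hash keys (e.g. author, abstract, title, year)? I tried something like this, but it did not help: for $i (sort keys(%bibliography)){ print "$i", "\n"; for $j (sort keys ($i)){ print "$j\n"; } }
    【Answer 2】: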

    The Dumper output

    $VAR1 = { 'Lee2000a' => {}, 'Forrest1996' => $VAR1->{'Lee2000a'} };

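    shows that your hash is a shared structure: $bibliography{Lee2000a} and $bibliography{Forrest1996} are references to the same article hash. Your code has my %article in the outer scope, and every iteration of the loop clears and refills this one shared hash.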
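As a small illustration of how to read that notation, the sketch below dumps one structure whose values share a hash and one whose values are separate hashes. The variable names are invented for the example, and Sortkeys is set only so the key order is stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;      # stable key order, only for this illustration

my %one_hash;                     # hypothetical shared inner hash
my %shared   = (Lee2000a => \%one_hash, Forrest1996 => \%one_hash);
my %separate = (Lee2000a => {},         Forrest1996 => {});

my $shared_dump   = Dumper(\%shared);
my $separate_dump = Dumper(\%separate);

print $shared_dump;     # second entry is printed as a back-reference into $VAR1
print $separate_dump;   # each entry is printed as its own {}
```

Whenever Dumper prints a value as $VAR1->{...} rather than as a literal, it has already seen that reference earlier in the same structure.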

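    Instead, you want each iteration to create a new inner article hash. Remove the outer %article and declare it inside the loop, on the line marked (+) below. Also delete the %article = () line, which destroys the data you just collected.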

    while ($entry = $parser->next) {
      if ($entry->parse_ok) {
        # return BibTeX elements like abstract, author, title ...
        @entries = $entry->fieldlist();
    
        # create %article as a hash array e.g. year -> 1996; isbn -> 1581138709 etc.
        my %article;  # (+)
        foreach (@entries) {
          $article{$_} = $entry->field($_);
        }
    
        # return article's key (Lee2000a, Forrest1996)
        $key = $entry->key;
    
        # insert %article into %bibliography with appropriate key
        $bibliography{$key} = \%article;
      }
      else {
        warn "Error parsing file: " . $entry->error;
      }
    }
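An alternative that neither answer proposes, shown here only as a sketch: keep a single reusable %article but store a shallow copy of it with { %article }. The @records array is a hypothetical stand-in for the parsed entries so that the example runs without BibTeX::Parser.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parsed entries (not the BibTeX::Parser API).
my @records = (
    { key => 'Lee2000a',    year => '2000', title => 'Data mining approaches for intrusion detection' },
    { key => 'Forrest1996', year => '1996', title => 'Computer immunology' },
);

my (%bibliography, %article);     # %article is reused across iterations

for my $record (@records) {
    %article = ();                                    # clearing is safe here ...
    $article{$_} = $record->{$_} for qw(year title);
    $bibliography{ $record->{key} } = { %article };   # ... because a new anonymous hash is stored
}

print "$_: $bibliography{$_}{year}\n" for sort keys %bibliography;
```

The copy is shallow, which is enough when every field value is a plain string. Declaring my %article inside the loop, as in the answer above, avoids the copy altogether.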
    
