How to convert an array into a hash, with variable names mapped as keys in Perl? Answers

【Question title】: How to convert an array into a hash, with variable names mapped as keys in Perl?
【Posted】: 2012-04-12 10:07:47
【Question description】:

I find myself using this pattern a lot in Perl:

sub fun {
    my $line = $_[0];
    my ( $this, $that, $the_other_thing ) = split /\t/, $line;
    return { 'this' => $this, 'that' => $that, 'the_other_thing' => $the_other_thing};
}

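Obviously, I could simplify this pattern by returning the output of a function that converts a given list of variables into a map whose keys are the same as the variable names, e.g.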

sub fun {
    my $line = $_[0];
    my ( $this, $that, $the_other_thing ) = split /\t/, $line;
    return &to_hash( $this, $that, $the_other_thing );
}

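This helps more as the number of elements grows. How do I do this? It looks like I could combine PadWalker and closures, but I would like a way to do it using only the core language.

EDIT: thb provided a clever solution to this problem, but I haven't marked it as accepted because it sidesteps much of the hard part (tm). How would you do this if you wanted to rely on the destructuring semantics of the core language and drive your reflection off the actual variables?

EDIT2: Here is the solution I hinted at, using PadWalker and closures: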

use PadWalker qw( var_name );

# Given two arrays, we build a hash by treating the first set as keys and
# the second as values
sub to_hash {
    my $keys = $_[0];
    my $vals = $_[1];
    my %hash;
    @hash{@$keys} = @$vals;
    return \%hash;
}

# Given a list of variables, and a callback function, retrieves the
# symbols for the variables in the list.  It calls the function with
# the generated syms, followed by the original variables, and returns
# that output.
# Input is: Function, var1, var2, var3, etc....
sub with_syms {
    my $fun = shift @_;
    my @syms = map substr( var_name(1, \$_), 1 ), @_;
    $fun->(\@syms, \@_);
}

sub fun {
    my $line = $_[0];
    my ( $this, $that, $other) = split /\t/, $line;
    return &with_syms(\&to_hash, $this, $that, $other);
}

【Question discussion】:

And your question is...?
Oops, sorry. You caught me mid-edit. :)

Tags: perl stringify

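【Solution 1】:

Short of parsing the Perl code yourself, a to_hash function is not feasible using only the core language. The called function does not know whether its args are variables, return values from other functions, string literals, or what have you... much less what their names are. It doesn't care, and it shouldn't.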
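To illustrate the point, here is a minimal sketch using only core Perl. The names `describe_args` and `one` are hypothetical helpers invented for this demo. It shows that a variable, a function's return value and a string literal all reach the callee as the same value, with no name attached.

```perl
use strict;
use warnings;

# Hypothetical helper: reports everything a callee can see of its arguments.
sub describe_args {
    return join ', ', map { defined $_ ? "'$_'" : 'undef' } @_;
}

sub one { return 'one' }    # hypothetical function, used only for this demo

my $this = 'one';

# A variable, a function's return value and a string literal all arrive
# as the same value; nothing in @_ records where each one came from.
my $from_variable = describe_args($this);
my $from_function = describe_args( one() );
my $from_literal  = describe_args('one');

print "$from_variable | $from_function | $from_literal\n";
```

The elements of `@_` are aliases to the caller's arguments, so a reference to one of them can be compared against the caller's lexical pad. That is the lookup the PadWalker-based code in this thread relies on, and PadWalker is a CPAN module rather than part of core Perl.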

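【Discussion】:

Agreed, although it is debatable whether that information should be exposed to the programmer through reflection. Lisp macros exploit this all the time to provide new defining forms. I suspect most OO languages also expose it to users through reflection libraries.

【Solution 2】:

Do it like this: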

my @part_label = qw( part1 part2 part3 );

sub fun {
    my $line = $_[0];
    my @part = split /\t/, $line;
    my $no_part = $#part_label <= $#part ? $#part_label : $#part;
    return map { $part_label[$_] => $part[$_] } (0 .. $no_part);
}

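Of course, your code has to name the parts somewhere. The code above does so with qw(), but you could have your code generate the names automatically if you prefer.

[If you expect the list in @part_label to be extremely large, you should probably avoid the (0 .. $no_part) idiom, but for lists of moderate size it works fine.]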
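For very long label lists, one possible way to avoid handing `map` a full list of indices is a C-style loop. The sketch below is an assumption about what such a variant might look like, not the answerer's code. `fun_loop` is a hypothetical name, and it returns a hash reference rather than a flat list.

```perl
use strict;
use warnings;

my @part_label = qw( part1 part2 part3 );

# Hypothetical variant of fun() that walks the indices with a C-style loop,
# so no temporary list of indices is built.
sub fun_loop {
    my ($line) = @_;
    my @part = split /\t/, $line;

    # Stop at the shorter of the two lists, as the original does.
    my $last = $#part_label <= $#part ? $#part_label : $#part;

    my %hash;
    for ( my $i = 0; $i <= $last; $i++ ) {
        $hash{ $part_label[$i] } = $part[$i];
    }
    return \%hash;
}

my $record = fun_loop("a\tb\tc\td");
print join( ', ', map { "$_=$record->{$_}" } sort keys %$record ), "\n";
```

With three labels and four fields, the extra field is ignored and only part1 through part3 are filled in.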

Update in response to the OP's comment below: you pose an interesting challenge, and I like it. How close does the following come to what you want?

sub to_hash ($$) {
    my @var_name = @{shift()};
    my @value    = @{shift()};
    $#var_name == $#value or die "$0: wrong number of elements in to_hash()\n";
    return map { $var_name[$_] => $value[$_] } (0 .. $#var_name);
}

sub fun {
    my $line = $_[0];
    return to_hash [qw( this that the_other_thing )], [split /\t/, $line];
}

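【Discussion】:

Nice! But this abuses the idea that every variable name is name_N. I can't assume that. I'll edit the question to make that clearer.
Wait... I spoke too soon. This does work. Why did you let me get away with that?
I don't know what you mean by "get away", but see whether the alternative I added above suits you.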

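【Solution 3】:

If I understand you correctly, you want to build a hash by assigning a given sequence of keys to the values split out of a data record.

This code seems to do the trick. If I have misunderstood you, please explain.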

use strict;
use warnings;

use Data::Dumper;
$Data::Dumper::Terse++;

my $line = "1111 2222 3333 4444 5555 6666 7777 8888 9999\n";

print Dumper to_hash($line, qw/ class division grade group kind level rank section tier  /);

sub to_hash {
  my @fields = split ' ', shift;
  my %fields = map {$_ => shift @fields} @_;
  return \%fields;
}

Output

{
  'division' => '2222',
  'grade' => '3333',
  'section' => '8888',
  'tier' => '9999',
  'group' => '4444',
  'kind' => '5555',
  'level' => '6666',
  'class' => '1111',
  'rank' => '7777'
}

For a more general solution that builds a hash from any two lists, I suggest the zip_by function from List::UtilsBy:

use strict;
use warnings;

use List::UtilsBy qw/zip_by/;
use Data::Dumper;
$Data::Dumper::Terse++;

my $line = "1111 2222 3333 4444 5555 6666 7777 8888 9999\n";

my %fields = zip_by { $_[0] => $_[1] }
    [qw/ class division grade group kind level rank section tier  /],
    [split ' ', $line];

print Dumper \%fields;

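The output is the same as for my initial solution.

See also the pairwise function from List::MoreUtils, which takes a pair of arrays instead of a list of array references.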
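As a rough sketch of the pairwise approach, assuming List::MoreUtils is installed from CPAN (it is not part of core Perl), the same kind of record could be handled as follows. The three-field sample line and the variable names are made up for illustration.

```perl
use strict;
use warnings;

use List::MoreUtils qw( pairwise );    # CPAN module, not part of core Perl

my @names  = qw( class division grade );
my @values = split ' ', "1111 2222 3333\n";

# pairwise takes the two arrays themselves (not references) and sets
# $a and $b to the corresponding elements on each call of the block.
my %fields = pairwise { ( $a => $b ) } @names, @values;

print "$_ => $fields{$_}\n" for sort keys %fields;
```

Because the block returns a key/value pair on each call, assigning the result to a hash gives one entry per name.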

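【Discussion】:

Well, a lot of the time this part of the pattern: my ( $this, $that, $the_other_thing ) = split /\t/, $line; is interchangeable with a match expression. (prog21.dadgum.com/131.html)
@user787747 I have added to my answer to abstract the solution further.

【Solution 4】:

You could use PadWalker to try to get the names of the variables, but that is really not something you should do. It is fragile and/or limiting.

Instead, you can use a hash slice: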

sub fun {
   my ($line) = @_;
   my %hash;
   @hash{qw( this that the_other_thing )} = split /\t/, $line;
   return \%hash;
}

If you prefer, you can hide the slice inside a function, to_hash.

sub to_hash {
   my $var_names = shift;
   return { map { $_ => shift } @$var_names };
}

sub fun_long {
   my ($line) = @_;
   my @fields = split /\t/, $line;
   return to_hash [qw( this that the_other_thing )], @fields;
}

sub fun_short {
   my ($line) = @_;
   return to_hash [qw( this that the_other_thing )], split /\t/, $line;
}
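The following sketch shows how the helper above behaves when a line has fewer or more fields than names. It repeats the same to_hash so that it runs on its own. The sample lines and the names `$exact`, `$short` and `$long` are made up for this check.

```perl
use strict;
use warnings;

# Same helper as above: pairs each name with the next remaining argument.
sub to_hash {
    my $var_names = shift;
    return { map { $_ => shift } @$var_names };
}

my $names = [qw( this that the_other_thing )];

my $exact = to_hash( $names, split /\t/, "a\tb\tc" );
my $short = to_hash( $names, split /\t/, "a\tb" );        # one field missing
my $long  = to_hash( $names, split /\t/, "a\tb\tc\td" );  # one field extra

printf "exact: %d keys\n", scalar keys %$exact;
print "short: the_other_thing is ",
    ( defined $short->{the_other_thing} ? 'defined' : 'undef' ), "\n";
printf "long:  %d keys\n", scalar keys %$long;
```

A missing field leaves its key present with an undefined value, and surplus fields are silently dropped. Whether that is acceptable depends on the data, so a length check may be worth adding.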

But if you insist, here is the PadWalker version:

use Carp      qw( croak );
use PadWalker qw( var_name );

sub to_hash {
   my %hash;
   for (0..$#_) {
      my $var_name = var_name(1, \$_[$_])
         or croak("Can't determine name of \$_[$_]");
      $hash{ substr($var_name, 1) } = $_[$_];
   }
   return \%hash;
}

sub fun {
   my ($line) = @_;
   my ($this, $that, $the_other_thing) = split /\t/, $line;
   return to_hash($this, $that, $the_other_thing);
}

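【Discussion】:

@user787747, fixed a bug and added a version that does exactly what you asked for, ill-advised as it is.
I'm starting to think the core language does not provide the tools needed to do this, or they would have surfaced by now. If I choose to use PadWalker, why would it be considered ill-advised in this very limited case?
...well, apart from the author's own warnings.
Even in this limited case, it needlessly uses complex code that reaches into areas it shouldn't and that could break in a new version of Perl. It also needlessly complicates the code by introducing unnecessary duplication and unintuitive results.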