More efficient way of running find on a Ruby hash? Answers

【Question title】: More efficient way of running find on a Ruby hash?
【Posted】: 2019-06-20 21:34:18
【Question description】:

I have an array of hashes that looks like this:

rules = [{
  "id" => "artist_name",
  "type" => "string",
  "field" => "artist_name",
  "input" => "text",
  "value" => "Underoath",
  "operator" => "contains"
},
{
  "id" => "bpm",
  "type" => "integer",
  "field" => "bpm",
  "input" => "number",
  "value" => 100,
  "operator" => "greater"
}]

I'm running find to select the hash I want:

rules.find {|h| h['id'] == 'artist_name'}

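But I need to do this for dozens of different "rules", and it feels a bit verbose.

Maybe it really is the most efficient way to fetch a specific "rule", but my gut says there is a Ruby method that would do this better.

If I just have to write my own method, that's fine, but I wanted to see whether there is an approach I'm not familiar with.

So, is there a more compact/efficient way to write rules.find {|h| h['id'] == 'artist_name'}?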

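【Question discussion】:

Are you asking for a faster algorithm to find the element? If you call this many times, it may be worth building a new data structure to speed up the search.
You could use index_by, but it only improves performance when you look up multiple values for the same key, e.g. rules.index_by { |x| x["id"] }.values_at "bpm", "artist_name" performs better than ["bpm", "artist_name"].map { |id| rules.find { |x| x["id"] == id } }
If you are going to search the records regularly, you should consider using a database.
@max, my people tell me index_by is a Rails method. (There is no Rails tag.)
I always forget that.

Tags: ruby hash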

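【Solution 1】:

If you want to find the hash for a single value of "id", you obviously have to do a linear search. If, on the other hand, you want to do multiple lookups for different values of "id", it is sensible to build a hash with the appropriate keys.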

rules = [{
  "id"       => "artist_name",
  "type"     => "string",
  "field"    => "artist_name",
  "input"    => "text",
  "value"    => "Underoath",
  "operator" => "contains"
},
{
  "id"       => "bpm",
  "type"     => "integer",
  "field"    => "bpm",
  "input"    => "number",
  "value"    => 100,
  "operator" => "greater"
}]

h = rules.each_with_index.with_object({}) { |(g,i),h| h[g["id"]] = i }
  #=> {"artist_name"=>0, "bpm"=>1}

The values of h are the indices of the elements (hashes) of rules. The hash for which "id"=>"bpm" is therefore

rules[h["bpm"]]
  #=> {"id"=>"bpm", "type"=>"integer", "field"=>"bpm", "input"=>"number",
  #    "value"=>100, "operator"=>"greater"}

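I could instead have made the values of h the elements of rules themselves, but that would require more storage without significantly improving lookup times.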
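For comparison, the alternative mentioned above, an index whose values are the elements of rules themselves, could look like the sketch below. It assumes Ruby 2.6+ for Array#to_h with a block; the helper name index_rules_by_id is hypothetical, and the rules are trimmed to a few keys for brevity. This is roughly the plain-Ruby counterpart of the Rails index_by call from the comments. Ruby stores references to the existing hashes here, not copies.

```ruby
# Hypothetical helper (not from the original answer): maps each rule's "id"
# to the rule hash itself. Array#to_h with a block requires Ruby 2.6+.
def index_rules_by_id(rules)
  rules.to_h { |rule| [rule["id"], rule] }
end

# The question's rules, trimmed to a few keys for brevity.
rules = [
  { "id" => "artist_name", "value" => "Underoath", "operator" => "contains" },
  { "id" => "bpm",         "value" => 100,         "operator" => "greater"  }
]

rules_by_id = index_rules_by_id(rules)

rules_by_id["bpm"]
  #=> {"id"=>"bpm", "value"=>100, "operator"=>"greater"}
rules_by_id.values_at("bpm", "artist_name").map { |rule| rule["value"] }
  #=> [100, "Underoath"]
rules_by_id["bpm"].equal?(rules[1])
  #=> true (the index refers to the same object, not a copy)
```

The index is built once in a single pass; each later lookup is an ordinary hash access instead of a scan of the array.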

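It is not clear from the question whether the value of "id" could be the same for two or more elements of rules. If that is the case, h will have fewer keys than rules has elements, but the wording of the question suggests that any hash among those having the same value of "id" could be selected.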
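To make the duplicate case concrete: with the index-building code above, a later element with a repeated "id" overwrites the earlier index, whereas find returns the first match. The sketch below uses made-up duplicate data and hypothetical helper names, and shows a variant with ||= that keeps the first occurrence instead.

```ruby
# Hypothetical helpers for illustration; the names are not from the answer.

# Same behaviour as the answer's code: a repeated "id" keeps the LAST index.
def last_index_by_id(rules)
  rules.each_with_index.with_object({}) { |(rule, i), idx| idx[rule["id"]] = i }
end

# Variant using ||= so a repeated "id" keeps the FIRST index, matching find.
# (0 is truthy in Ruby, so an index of 0 is never overwritten.)
def first_index_by_id(rules)
  rules.each_with_index.with_object({}) { |(rule, i), idx| idx[rule["id"]] ||= i }
end

# Made-up data with a duplicated "id".
dup_rules = [
  { "id" => "bpm", "value" => 100 },
  { "id" => "bpm", "value" => 140 }
]

last_index_by_id(dup_rules)
  #=> {"bpm"=>1}
first_index_by_id(dup_rules)
  #=> {"bpm"=>0}
dup_rules.find { |h| h["id"] == "bpm" }["value"]
  #=> 100
```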


【Solution 2】:

You could try grouping by id and using it as the hash key to look up. For example:

grouped_rules = rules.group_by { |r| r["id"] }
# => {"artist_name"=>[{"id"=>"artist_name", "type"=>"string", "field"=>"artist_name", "input"=>"text", "value"=>"Underoath", "operator"=>"contains"}], "bpm"=>[{"id"=>"bpm", "type"=>"integer", "field"=>"bpm", "input"=>"number", "value"=>100, "operator"=>"greater"}]}

grouped_rules["artist_name"]
# => [{"id"=>"artist_name", "type"=>"string", "field"=>"artist_name", "input"=>"text", "value"=>"Underoath", "operator"=>"contains"}]

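This is definitely more compact and readable, but whether it is "efficient" depends on how you intend to measure/evaluate it. It is not more memory-efficient, but it should be faster if you were previously calling find a lot.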
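Note that each value of grouped_rules is an array, so a single rule still has to be pulled out with first. A minimal sketch, assuming only the first rule per "id" is of interest; the helper name rule_for is hypothetical and the rules are trimmed for brevity:

```ruby
# Hypothetical helper: returns the first rule with the given id, or nil.
def rule_for(grouped_rules, id)
  grouped_rules[id]&.first   # &. guards against an unknown id (Ruby 2.3+)
end

rules = [
  { "id" => "artist_name", "value" => "Underoath" },
  { "id" => "bpm",         "value" => 100 }
]
grouped_rules = rules.group_by { |r| r["id"] }

rule_for(grouped_rules, "bpm")
  #=> {"id"=>"bpm", "value"=>100}
rule_for(grouped_rules, "missing")
  #=> nil
```

If ids are known to be unique, grouped_rules.transform_values(&:first) (Ruby 2.4+) would give a hash that maps each id directly to its rule.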
