【Question title】:How to modify arrays of hashes in ruby 2.1.2?
【Posted】:2017-08-16 16:22:41
【Question】:

I have an array of hashes called array_of_hash:

array_of_hash = [
 {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
 {:name=>"5", :address=>"UK", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
 {:name=>"6", :address=>"CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"},
 {:name=>"29", :address=>"GERMANY", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
 {:name=>"30", :address=>"CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}
]

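I want to group these hashes by consecutive values of the key :name. The first group would be "1" on its own, because no hash has :name => "1".succ #=> "2". The second group would contain the hashes whose :name values are "5" and "6". The third group is the last two hashes in the array, with :name=>"29" and :name=>"30".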

The array of hashes I want should look like this:

[
   {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
   {:name=>"5-6", :address=>"UK,CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC,CD"},
   {:name=>"29-30", :address=>"GERMANY,CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"},
]

Use case II

array_of_hash = [
 {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
 {:name=>"2", :address=>"UK", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
 {:name=>"3", :address=>"CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"},
 {:name=>"29", :address=>"GERMANY", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
 {:name=>"30", :address=>"CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}
]

Desired result for use case II

[
   {:name=>"1-3", :address=>"USA,UK,CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB,BC,CD"},
   {:name=>"29-30", :address=>"GERMANY,CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"},
]

What I have done so far:

new_array_of_hashes = []
new_array_of_hashes << { name: array_of_hash.map {|h| h[:name].to_i}} << {address: array_of_hash.map {|h| h[:address]}} << {collection: array_of_hash.map {|h| h[:collection]}} << {sequence: array_of_hash.map {|h| h[:sequence]}}

[{:name=>[1, 5, 6, 29, 30]},
 {:address=>["USA", "UK", "CANADA", "GERMANY", "CHINA"]},
 {:collection=>
[["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAPTOP", "SHIP", "MOUNTAIN"],
["LAPTOP", "SHIP", "MOUNTAIN"]]},
 {:sequence=>["AB", "BC", "CD", "DE", "FG"]}]

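All I managed to do was combine everything.

【Comments】:

  • How do you determine which elements to combine and which to keep separate?
  • @moveson They are combined if the :collection values are the same.
  • In that case, shouldn't the first three elements all be combined?
  • That is the hardest part. If they are in sequence they are combined; otherwise they should stay separate.
  • Please read "minimal reproducible example". Your explanation of what should be joined doesn't make sense. 1, 5 and 6 have matching :collection arrays, so they should all be combined, but your example contradicts that.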

Tags: arrays ruby hash ruby-2.1


【Solution 1】:

Code

def aggregate(array_of_hash)
  array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }.
    flat_map { |a| a.chunk { |g| g[:collection] }.map { |_c,b| combine(b) } }
end

def combine(arr)
  names     = values_for_key(arr, :name)
  addresses = values_for_key(arr, :address)
  sequences = values_for_key(arr, :sequence)
  # Parentheses are required here: `merge {` would be parsed as a block.
  arr.first.merge(
    name: names.size==1 ? names.first : "%s-%s" % [names.first, names[-1]],
    address:  addresses.join(','),
    sequence: sequences.join(',')
  )
end

def values_for_key(arr, key)
  arr.map { |h| h[key] }
end

Example

aggregate(array_of_hash)
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
  #    {:name=>"5-6", :address=>"UK,CANADA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC,CD"},
  #    {:name=>"29-30", :address=>"GERMANY,CHINA",
  #     :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"}]   

Here is a second example.

array_of_hash[2][:collection] = ['dog', 'cat', 'pig']
aggregate(array_of_hash)
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
  #    {:name=>"5", :address=>"UK",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
  #    {:name=>"6", :address=>"CANADA",
  #     :collection=>["dog", "cat", "pig"], :sequence=>"CD"},
  #    {:name=>"29-30", :address=>"GERMANY,CHINA",
  #     :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"}]

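In this example the hashes with :name=>"5" and :name=>"6" cannot be grouped because their :collection values differ. The question does not say whether this can happen. If it cannot, the code is still correct but can be simplified to the following.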

def aggregate(array_of_hash)
  array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }.
    map { |a| combine(a) }
end

Explanation

For the first example above, the steps are as follows.

e0 = array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }
  #=> #<Enumerator: #<Enumerator::Generator:0x007fa25e022f30>:each>

See Enumerable#chunk_while, which made its debut in Ruby v2.3, so it is not available in the asker's Ruby 2.1.2.
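
Since the question targets Ruby 2.1.2, the grouping step could be approximated without chunk_while. The following is only a minimal sketch and not part of the original answer: the helper name chunk_consecutive is hypothetical, and it relies on nothing newer than each_with_object. It returns an Array rather than an Enumerator.

```ruby
# Hypothetical stand-in for chunk_while on Rubies older than 2.3.
# A new group is started whenever the block returns false for
# the pair (previous element, current element).
def chunk_consecutive(array)
  array.each_with_object([]) do |elem, groups|
    if groups.empty? || !yield(groups.last.last, elem)
      groups << [elem]      # start a new group
    else
      groups.last << elem   # extend the current group
    end
  end
end

# Shortened data: only the :name key matters for this step.
hashes = %w[1 5 6 29 30].map { |n| { name: n } }
groups = chunk_consecutive(hashes) { |g, h| h[:name] == g[:name].succ }
names_by_group = groups.map { |grp| grp.map { |h| h[:name] } }
p names_by_group
  #=> [["1"], ["5", "6"], ["29", "30"]]
```

With such a helper, aggregate could call chunk_consecutive(array_of_hash) { ... } in place of array_of_hash.chunk_while { ... }, leaving the rest of the method unchanged.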

This enumerator generates the following elements, which are passed to Enumerable#flat_map.

e0.to_a
  #=> [[{:name=>"1", :address=>"USA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}],
  #    [{:name=>"5", :address=>"UK",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
  #     {:name=>"6", :address=>"CANADA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"}],
  #    [{:name=>"29", :address=>"GERMANY",
  #      :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
  #     {:name=>"30", :address=>"CHINA",
  #      :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}]
  #   ] 

e0.flat_map { |a| a.chunk { |g| g[:collection] }.map { |_,b| combine(b) } }

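This returns the array of hashes obtained in the example. Consider the first element generated by e0, which flat_map passes to the block and assigns to the block variable a.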

a = e0.next
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}] 

The block calculation is therefore

e1 = a.chunk { |g| g[:collection] }
  #=> #<Enumerator: #<Enumerator::Generator:0x007fa25c857158>:each> 
e1.to_a
  #=> [[["LAND", "WATER", "OIL", "TREE", "SAND"],
  #     [{:name=>"1", :address=>"USA",
  #       :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}]
  #    ]
  #   ] 

_c,b = e1.next
  #=> [["LAND", "WATER", "OIL", "TREE", "SAND"],
  #    [{:name=>"1", :address=>"USA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}]
  #   ] 
  # _c
  #   #=> ["LAND", "WATER", "OIL", "TREE", "SAND"] 
  # b #=> [{:name=>"1", :address=>"USA",
  #         :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}] 
combine(b)
  #=> {:name=>"1", :address=>"USA",
  #    :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}

The remaining calculations are similar.

【Comments】:

    【Solution 2】:
    def slice_when(array)
      big = []
      small = []
      last_index = array.size - 1
      (0..last_index).each do |i|
        small << array[i]
        if last_index == i || yield(array[i], array[i + 1])
          big << small
          small = []
        end
      end
      big
    end
    

    You can try this if you don't want to use slice_before. Keep in mind that it returns an Array directly, not an Enumerator.
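
    As a rough usage sketch (not from the original answer), the helper can be called with a block that says where to cut. The helper is repeated here only so that the snippet runs on its own, and the shortened names array is an assumption made for illustration.

```ruby
# Helper repeated from the answer above so the snippet is self-contained.
def slice_when(array)
  big = []
  small = []
  last_index = array.size - 1
  (0..last_index).each do |i|
    small << array[i]
    # Close the current group at the last element, or when the block
    # says the neighbours array[i] and array[i + 1] must be separated.
    if last_index == i || yield(array[i], array[i + 1])
      big << small
      small = []
    end
  end
  big
end

names = %w[1 5 6 29 30]
# Cut wherever the next name is not the successor of the current one.
groups = slice_when(names) { |a, b| a.succ != b }
p groups
  #=> [["1"], ["5", "6"], ["29", "30"]]
```

    Because the block is only called while a next element exists, it never receives nil as its second argument.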

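    【Comments】:

      【Solution 3】:

      First, let's build an array of the groups we ultimately want. We will use Ruby's Enumerable#slice_when method (added in Ruby 2.2), which iterates over the array with the current and next elements so that we can compare the two. Our condition tells Ruby to slice the array wherever the names (converted to integers) are not consecutive or the collections are not identical.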

      >> groups = array_of_hash.slice_when { |i, j| i[:name].to_i + 1 != j[:name].to_i || i[:collection] != j[:collection] }.to_a
      

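      But because you are on Ruby 2.1, you will need to use slice_before together with a local variable that keeps track of the previous element. Following the documentation, we can do this by first initializing a local variable: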

      >> prev = array_of_hash[0]
      

      and then resetting it, together with a second local variable, as we iterate over the array:

      >> groups = array_of_hash.slice_before { |e| prev, prev2 = e, prev; prev2[:name].to_i + 1 != prev[:name].to_i || prev2[:collection] != prev[:collection] }.to_a
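
      To make the state tracking easier to follow, here is a small sketch of the same slice_before technique on shortened data. The records array and its one-letter collections are assumptions chosen for illustration, not the asker's data.

```ruby
# Illustrative data: names 2 and 3 are consecutive but have different collections.
records = [
  { name: "1", collection: ["A"] },
  { name: "2", collection: ["A"] },
  { name: "3", collection: ["B"] },
  { name: "7", collection: ["B"] },
  { name: "8", collection: ["B"] }
]

prev = records[0]  # seeds the "previous element" before iteration starts
groups = records.slice_before { |e|
  # The right-hand side is evaluated first, so prev2 receives the old prev
  # (the preceding element) and prev becomes the current element.
  prev, prev2 = e, prev
  prev2[:name].to_i + 1 != prev[:name].to_i || prev2[:collection] != prev[:collection]
}.to_a

names_by_group = groups.map { |g| g.map { |h| h[:name] } }
p names_by_group
  #=> [["1", "2"], ["3"], ["7", "8"]]
```

      For the very first element the block returns true, which is harmless: slice_before simply starts the first group there.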
      

      In either case, groups should now look like this:

      => [[{:name=>"1",
         :address=>"USA",
         :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
         :sequence=>"AB"}],
       [{:name=>"5",
         :address=>"UK",
         :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
         :sequence=>"BC"},
        {:name=>"6",
         :address=>"CANADA",
         :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
         :sequence=>"CD"}],
       [{:name=>"29",
         :address=>"GERMANY",
         :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
         :sequence=>"DE"},
        {:name=>"30",
         :address=>"CHINA",
         :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
         :sequence=>"FG"}]]
      

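      Now we take the resulting array and map its elements to new hashes, formatted as you specified.

      For :name, we take the names of the first and last elements of the group, call .uniq to eliminate a duplicate, and join them with a hyphen. (If only one element is present, join returns that single element unchanged.)

      For :collection, we simply use the collection found in the first element of the group.

      For :sequence, we join the sequences of every element in the group with commas. (Again, a single element is returned unchanged.)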

      >> groups.map { |group| {name: [group.first[:name], group.last[:name]].uniq.join('-'), 
                               collection: group.first[:collection], 
                               sequence: group.map { |e| e[:sequence] }.join(',') } }
      
      => [{:name=>"1",
        :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
        :sequence=>"AB"},
       {:name=>"5-6",
        :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
        :sequence=>"BC,CD"},
       {:name=>"29-30",
        :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
        :sequence=>"DE,FG"}]
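
      The mapping above leaves out :address, which the question's desired output joins with commas. A possible extension in the same style is sketched below; it is an addition to this answer, and the shortened groups array is an assumption that merely mimics the shape shown earlier.

```ruby
# Shortened stand-in for the groups array produced by the slicing step.
groups = [
  [{ name: "1", address: "USA", collection: %w[LAND WATER], sequence: "AB" }],
  [{ name: "5", address: "UK", collection: %w[LAND WATER], sequence: "BC" },
   { name: "6", address: "CANADA", collection: %w[LAND WATER], sequence: "CD" }]
]

result = groups.map do |group|
  {
    # "5-6" for a range, or just "1" when first and last are the same element
    name:       [group.first[:name], group.last[:name]].uniq.join('-'),
    # joined exactly like :sequence
    address:    group.map { |e| e[:address] }.join(','),
    collection: group.first[:collection],
    sequence:   group.map { |e| e[:sequence] }.join(',')
  }
end
p result
```

      For the second group this yields :name=>"5-6", :address=>"UK,CANADA" and :sequence=>"BC,CD".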
      

      【Comments】:

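      • Thanks, but my Ruby version is 2.1.2. I guess slice_when only works from 2.2 onward.
      • Is there no way for you to upgrade? Ruby 2.1 is over 3 years old now. Without slice_when the code will be more complicated.
      • Yes, that is the problem: I cannot upgrade. Is there any method similar to slice_when?
      • It is messy, but you can do it with slice_before. See the updated answer above.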