【Question title】: Delete duplicated elements in an array that's a value in a hash and its corresponding ids
【Posted】: 2019-09-20 08:56:21
【Question description】:

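I have a hash in which one of the values is an array. How do I delete the repeated elements of that array, along with their corresponding ids, in the most performant way?

Here's an example of my hash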

hash = { 
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

My idea was something like this

hash["options"].each_with_index { |value, index |
  h = {}
  if h.key?(value)
    delete(value)
    delete hash["option_ids"].delete_at(index)
  else 
    h[value] = index
  end
}

The result should be

hash = { 
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Answer"],
  "option_ids" => ["12345", "23456", "45678"]
}

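I know I have to take into account that when I delete values from options and option_ids, the indexes of the remaining values change. But I'm not sure how to do that.

【Discussion】: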
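A minimal sketch of one way to sidestep the shifting-index problem described above: instead of deleting from the arrays while iterating over them, build new arrays and record which values have already been seen. The names seen, kept_options, kept_ids and deduped are illustrative and not part of the original question.

```ruby
require "set"

hash = {
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

seen = Set.new      # option values encountered so far
kept_options = []
kept_ids = []

hash["options"].each_with_index do |value, index|
  # Set#add? returns nil when the value is already present, so duplicates
  # are skipped and nothing is deleted while iterating.
  next unless seen.add?(value)

  kept_options << value
  kept_ids << hash["option_ids"][index]
end

# Build a new hash; the original hash and its arrays are left unchanged.
deduped = hash.merge("options" => kept_options, "option_ids" => kept_ids)
deduped["options"]    #=> ["Language", "Question", "Answer"]
deduped["option_ids"] #=> ["12345", "23456", "45678"]
```

Because nothing is removed from the arrays being iterated, the indexes stay valid for the whole loop.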

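  • Is there a reason why {"Language" => "12345", "Question" => "23456", "Answer" => "45678"} would not be desirable?
  • Yes, that would make more sense, but this is the problem I was given.
  • What do you mean by "repeated elements"? Is 2 (as well as 1) a repeated element in [1,2,2,3,1]?
  • C., technically, in answering @engineersmnky's question, I think you actually meant "no" (there is no reason). :-) You wrote, "The result should be...hash = {...". That is a bit confusing. Had you written hash #=> {..., that would mean you want to modify the existing hash hash. Had you written just {..., that would mean (unless you say otherwise) that you want to create a new hash and leave the existing hash unchanged. When asking questions, the general rule is that input objects are not modified (also called mutated) unless the asker explicitly states that they should be.
  • @CarySwoveland Yes, duplicated would have been better wording :) Thanks for your comments and help!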

Tags: ruby algorithm hash


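【Solution 1】:

My first idea was to zip the values and call uniq, then find a way to get back to the initial form (h below refers to the hash from the question):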

h['options'].zip(h['option_ids']).uniq(&:first).transpose
#=> [["Language", "Question", "Answer"], ["12345", "23456", "45678"]]


Then, using parallel assignment:
h['options'], h['option_ids'] = h['options'].zip(h['option_ids']).uniq(&:first).transpose

h #=> {"id"=>"sjfdkjfd", "name"=>"Field Name", "type"=>"field", "options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}

These are the steps:

h['options'].zip(h['option_ids'])
#=> [["Language", "12345"], ["Question", "23456"], ["Question", "34567"], ["Answer", "45678"], ["Answer", "56789"]]

h['options'].zip(h['option_ids']).uniq(&:first)
#=> [["Language", "12345"], ["Question", "23456"], ["Answer", "45678"]]
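The parallel assignment above modifies h in place. If the original hash should stay untouched, the same zip/uniq/transpose chain can be wrapped in a method that returns a new hash. This is only a sketch: the method name dedupe_options is hypothetical, and it assumes the two arrays have the same length.

```ruby
# Hypothetical helper (the name is not from the answer): returns a new hash
# and leaves the argument unchanged.
def dedupe_options(h)
  options, option_ids =
    h["options"].zip(h["option_ids"]).uniq(&:first).transpose
  # For empty input, transpose returns [], so both variables are nil;
  # fall back to empty arrays in that case.
  h.merge("options" => options || [], "option_ids" => option_ids || [])
end

h = {
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

result = dedupe_options(h)
result["options"]    #=> ["Language", "Question", "Answer"]
result["option_ids"] #=> ["12345", "23456", "45678"]
h["options"].size    #=> 5 (the original is unchanged)
```

Hash#merge builds a new hash, and zip, uniq and transpose all return new arrays, so nothing reachable from h is mutated.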

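【Discussion】:

  • Wow, from now on I really like zip :) Also, the transpose afterwards is really nice! Great explanation.
  • This is awesome. Thank you!
  • @CarySwoveland, thanks! I took the sage advice.
【Solution 2】: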
hash = { 
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["L", "Q", "Q", "Q", "A", "A", "Q"],
  "option_ids" => ["12345", "23456", "34567", "dog", "45678", "56789", "cat"]
}

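I have assumed that "repeated elements" refers to contiguous equal elements (only 2 in [1,2,2,1]) rather than "duplicated elements" (both 1 and 2 in the previous example). I also show how the code would be changed (simplified, actually) if the second interpretation applies.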

idx = hash["options"].
  each_with_index.
  chunk_while { |(a,_),(b,_)| a==b }.
  map { |(_,i),*| i }
  #=> [0, 1, 4, 6]

hash.merge(
  ["options", "option_ids"].each_with_object({}) { |k,h| h[k] = hash[k].values_at(*idx) }
)
  #=> {"id"=>"sjfdkjfd",
  #    "name"=>"Field Name",
  #    "type"=>"field",
  #    "options"=>["L", "Q", "A", "Q"],
  #    "option_ids"=>["12345", "23456", "45678", "cat"]}

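If "repeated elements" is interpreted to mean that the values of "options" and "option_ids" should contain only the first three elements shown above, compute idx as follows: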

idx = hash["options"].
  each_with_index.
  uniq { |s,_| s }.
  map(&:last)
    #=> [0, 1, 4]
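Under this second interpretation the remaining steps stay the same. A minimal sketch that feeds this idx into Array#values_at, spelled out per key rather than via each_with_object (a stylistic choice only); the variable name result is illustrative:

```ruby
hash = {
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["L", "Q", "Q", "Q", "A", "A", "Q"],
  "option_ids" => ["12345", "23456", "34567", "dog", "45678", "56789", "cat"]
}

# Index of the first occurrence of each distinct option.
idx = hash["options"].
  each_with_index.
  uniq { |s, _| s }.
  map(&:last)

# Pick the same positions out of both arrays and merge into a new hash.
result = hash.merge(
  "options"    => hash["options"].values_at(*idx),
  "option_ids" => hash["option_ids"].values_at(*idx)
)
result["options"]    #=> ["L", "Q", "A"]
result["option_ids"] #=> ["12345", "23456", "45678"]
```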

See Enumerable#chunk_while (Enumerable#slice_when could be used instead) and Array#values_at. The steps are as follows.

a = hash["options"]
  #=> ["L", "Q", "Q", "Q", "A", "A", "Q"] 
e0 = a.each_with_index
  #=> #<Enumerator: ["L", "Q", "Q", "Q", "A", "A", "Q"]:each_with_index> 
e1 = e0.chunk_while { |(a,_),(b,_)| a==b }
  #=> #<Enumerator: #<Enumerator::Generator:0x000055e4bcf17740>:each> 

We can see the values that the enumerator e1 will generate and pass to map by converting it to an array:

e1.to_a
  #=> [[["L", 0]],
  #    [["Q", 1], ["Q", 2], ["Q", 3]],
  #    [["A", 4], ["A", 5]], [["Q", 6]]] 

Continuing,

idx = e1.map { |(_,i),*| i }
  #=> [0, 1, 4, 6] 

c = ["options", "option_ids"].
      each_with_object({}) { |k,h| h[k] = hash[k].values_at(*idx) } 
  #=> {"options"=>["L", "Q", "A", "Q"],
  #    "option_ids"=>["12345", "23456", "45678", "cat"]} 
hash.merge(c)
  #=> {"id"=>"sjfdkjfd",
  #    "name"=>"Field Name",
  #    "type"=>"field",
  #    "options"=>["L", "Q", "A", "Q"],
  #    "option_ids"=>["12345", "23456", "45678", "cat"]}
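As noted above, Enumerable#slice_when could be used in place of chunk_while. A minimal sketch of that variant: slice_when starts a new slice where its block returns true, so the condition is the negation of the one passed to chunk_while. The local variable options is used here only to keep the example short.

```ruby
options = ["L", "Q", "Q", "Q", "A", "A", "Q"]

# Split into runs of equal neighbours, then take the index of the first
# element of each run.
idx = options.
  each_with_index.
  slice_when { |(a, _), (b, _)| a != b }.
  map { |(_, i), *| i }
  #=> [0, 1, 4, 6]
```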

【Discussion】:

【Solution 3】:

Using Array#transpose:

hash = {
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

hash.values.transpose.uniq(&:first).transpose.map.with_index {|v,i| [hash.keys[i], v]}.to_h
#=> {"options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}


After the OP's edit:

hash = {
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

hash_array = hash.to_a.select {|v| v.last.is_a?(Array)}.transpose
hash.merge([hash_array.first].push(hash_array.last.transpose.uniq(&:first).transpose).transpose.to_h)
#=> {"id"=>"sjfdkjfd", "name"=>"Field Name", "type"=>"field", "options"=>["Language", "Question", "Answer"], "option_ids"=>["12345", "23456", "45678"]}
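The one-liner above can be hard to follow. A sketch of the same transpose/uniq/transpose idea broken into named steps, with the keys passed explicitly instead of being detected via is_a?(Array). The method name dedupe_parallel is hypothetical, and the arrays under the given keys are assumed to have equal length (Array#transpose raises IndexError otherwise).

```ruby
# Hypothetical helper: de-duplicates the parallel arrays stored under `keys`,
# keeping the first occurrence of each value of the array under keys.first.
def dedupe_parallel(hash, *keys)
  columns = hash.values_at(*keys)                    # e.g. [options, option_ids]
  deduped = columns.transpose.uniq(&:first).transpose
  deduped = keys.map { [] } if deduped.empty?        # empty input: restore the columns
  hash.merge(keys.zip(deduped).to_h)                 # new hash; input is not mutated
end

hash = {
  "id" => "sjfdkjfd",
  "name" => "Field Name",
  "type" => "field",
  "options" => ["Language", "Question", "Question", "Answer", "Answer"],
  "option_ids" => ["12345", "23456", "34567", "45678", "56789"]
}

result = dedupe_parallel(hash, "options", "option_ids")
result["options"]    #=> ["Language", "Question", "Answer"]
result["option_ids"] #=> ["12345", "23456", "45678"]
```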
    

【Discussion】:
