How do I check if all the elements in my array occur no more than twice? Answers

【Question title】: How do I check if all the elements in my array occur no more than twice?
【Posted】: 2017-01-14 00:49:36
【Question description】:

I'm using Ruby 2.4. Say I have an array of strings (they are all just string-i-fied (is that a word?) integers) ...

["1", "2", "5", "25", "5"]

How would I write a function that tells me whether every element in the array occurs no more than twice? For example, this array

["1", "3", "3", "55", "3", "2"]

would return false, because "3" occurs three times, but this array

["20", "10", "20", "10"]

would return true, because no element occurs more than twice.

【Discussion】:

Tags: arrays ruby

【Solution 1】:

Enumerable#group_by will do the heavy lifting here:

def no_element_present_more_than_twice?(a)   
  a.group_by(&:itself).none? do |_key, values|
    values.count > 2
  end
end

p no_element_present_more_than_twice?(["1", "3", "3", "55", "3", "2"])
# => false
p no_element_present_more_than_twice?(["20", "10", "20", "10"])
# => true
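
To see why this works, it may help to look at the intermediate hash that group_by builds. This is only an illustrative sketch using the failing example from the question; the variable names are not part of the answer.

```ruby
a = ["1", "3", "3", "55", "3", "2"]

# group_by(&:itself) collects equal elements under one key.
groups = a.group_by(&:itself)
# groups == { "1" => ["1"], "3" => ["3", "3", "3"], "55" => ["55"], "2" => ["2"] }

# none? yields each [key, values] pair; it is false here because
# the group for "3" holds three entries.
result = groups.none? { |_key, values| values.count > 2 }
# result == false
```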

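【Discussion】:

_key or just _?
@sagarpandya82 A plain _ would be standard. I prefer to keep the name so that I know what I'm ignoring, but add the _ prefix to make it clear that it is ignored.

【Solution 2】:

You can determine the frequencies like this: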

frequency = array.reduce(Hash.new(0)) do |counts, value|
  counts[value] += 1
  counts
end
# => { "1" => 1, "3" => 3, "55" => 1, "2" => 1 }

You can then check whether any of them occurs more than twice:

frequency.values.max > 2

If you want to wrap it up nicely, you can add it to Enumerable:

module Enumerable
  def frequency
    f = Hash.new(0)
    each { |v| f[v] += 1 }
    f
  end
end

Then your condition is simply:

array.frequency.values.max > 2

Note: this is available as part of Facets.
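
As a side note, Ruby 2.7 and later ship Enumerable#tally, which builds the same element-to-count hash without a monkey patch. It is not available on the Ruby 2.4 used in the question, so treat the following as a sketch for newer versions. It also uses all? instead of max, since `values.max > 2` raises NoMethodError on an empty array (max returns nil).

```ruby
array = ["1", "3", "3", "55", "3", "2"]

# Enumerable#tally (Ruby 2.7+) returns an element => count hash.
counts = array.tally
# counts == { "1" => 1, "3" => 3, "55" => 1, "2" => 1 }

# all? is true for an empty list, so an empty array passes without a nil check.
within_limit = counts.values.all? { |n| n <= 2 }
# within_limit == false, because "3" occurs three times
```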

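【Discussion】:

each_with_object is usually less awkward than reduce, because with reduce you have to deliberately hand the hash on to the next round.
@tadman Honestly, I forgot it was part of Enumerable now. Still, I think it's great to expose people to map and reduce and their flexibility when appropriate.
Oh, certainly! It's just that reduce and inject have given way to some of these more special-purpose methods. Interestingly, all the other Enumerable methods can be implemented with inject; it is the fundamental building block of that library.
Nice. Alternatively, array.each_with_object(Hash.new(0)) { |value, counts| return false if (counts[value] += 1) > 2 }; true (or == 3).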
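
The one-liner in the last comment relies on return, so it only makes sense inside a method. A minimal sketch of how it might be wrapped, assuming a hypothetical method name at_most_twice?; note that each_with_object yields the element first and the accumulator second, which is the order used here.

```ruby
# Hypothetical method name; the comment above only gives the one-liner.
def at_most_twice?(array)
  array.each_with_object(Hash.new(0)) do |value, counts|
    # return leaves the whole method as soon as a third occurrence is seen.
    return false if (counts[value] += 1) > 2
  end
  true
end

at_most_twice?(["1", "3", "3", "55", "3", "2"])
# => false
at_most_twice?(["20", "10", "20", "10"])
# => true
```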

【Solution 3】:

I've benchmarked all of the options for you :)

Running each test 1024 times. Test will take about 34 seconds.
_akuhn is faster than _vlasiak by 16x ± 1.0
_vlasiak is faster than _wayne by 3.5x ± 0.1
_wayne is faster than _cary by 10.0% ± 1.0%
_cary is faster than _oneneptune by 10.09% ± 1.0%
_oneneptune is similar to _coreyward
_coreyward is faster than _tadman by 10.0% ± 1.0%
_tadman is faster than _sagarpandya82 by 10.0% ± 1.0%
_sagarpandya82 is faster than _glykyo by 80.0% ± 1.0%

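As you can see, @akuhn's answer performs much better than the other algorithms, because it exits early as soon as a match is found.

Note: I edited the answers so that they all produce the same result, but did not edit any of them for optimization.

Here is the script that produced the benchmark: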

require 'fruity'

arr = Array.new(1000) { |seed|
  # seed is used to create the same array on each script run,
  # hence the same benchmark results will be produced
  Random.new(seed).rand(1..10).to_s
}

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end

compare do
  _coreyward do
    arr.reduce(Hash.new(0)) { |counts, value|
      counts[value] += 1
      counts
    }.values.max <= 2
  end

  _wayne do
    arr.group_by(&:itself).none? do |_key, values|
      values.count > 2
    end
  end

  _sagarpandya82 do
    arr.sort_by(&:to_i).each_cons(3).none? { |a,b,c| a == b && b == c }
  end

  _tadman do
    arr.sort.slice_when { |a,b| a != b }.map(&:length).max.to_i <= 2
  end

  _cary do
    arr.difference(arr.uniq*2).empty?
  end

  _akuhn do
    count = Hash.new(0)
    arr.none? { |each| (count[each] += 1) > 2 }
  end

  _oneneptune do
    arr.each_with_object(Hash.new(0)) { |element,counts|
      counts[element] += 1
    }.values.max < 3
  end

  _glykyo do
    arr.uniq.map{ |element| arr.count(element) }.max <= 2
  end

  _vlasiak do
    arr.none? { |el| arr.count(el) > 2 }
  end

end

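【Discussion】:

This is the best case for _akuhn, since it stops after 20 or so elements while the others keep iterating over all 1000 elements. Interestingly, for (1..1000).to_a * 2, _akuhn is still the fastest, though "only" by 10%.
Coincidentally, the three most computationally efficient algorithms are also the three I found most interesting.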
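
The best-case point in the first comment can be made concrete by counting how many elements the early-exit check actually visits. This is only a sketch with a hypothetical helper, visited_before_exit; it is not part of the benchmark script and says nothing about timings.

```ruby
# Hypothetical helper: runs the early-exit check and reports how many
# elements the block saw before none? stopped.
def visited_before_exit(arr)
  visited = 0
  count = Hash.new(0)
  result = arr.none? do |each|
    visited += 1
    (count[each] += 1) > 2
  end
  [result, visited]
end

visited_before_exit(["1", "3", "3", "55", "3", "2"])
# => [false, 5]    stops at the third "3"

visited_before_exit((1..1000).to_a * 2)
# => [true, 2000]  nothing occurs three times, so every element is visited
```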

【Solution 4】:

Try this

count = Hash.new(0)
array.none? { |each| (count[each] += 1) > 2 }
# => true or false

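How does this work?

Hash.new(0) creates a hash with a default value of 0
none? runs the block for the elements and returns whether no element matches
count[each] += 1 increments the count (there is no nil case, since the default value is 0)

This is an optimal solution, since it breaks as soon as the first offending element is found. All other solutions posted here either scan the entire array or have even worse complexity.

Note: if you want to know which elements appear more than twice (for example, to print an error message), use find or find_all instead of none?.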
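
A short sketch of the find and find_all variants mentioned in the note. The sample array and variable names are assumptions for illustration. Each pass needs a fresh counting hash, and find_all returns every occurrence past the second, so uniq is used to list each offender once.

```ruby
array = ["1", "3", "3", "55", "3", "2", "3"]

# find returns the first element whose running count exceeds 2 (or nil).
count = Hash.new(0)
first_offender = array.find { |each| (count[each] += 1) > 2 }
# first_offender == "3"

# find_all keeps every occurrence past the second, so uniq removes repeats.
count = Hash.new(0)
offenders = array.find_all { |each| (count[each] += 1) > 2 }.uniq
# offenders == ["3"]
```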

【Discussion】:

Nice. It's faster, even for (1..1000).to_a * 2. Looks like you'll soon earn a well-deserved silver Ruby badge!

【Solution 5】:

Here is another way, using the method Array#difference:

def twice_at_most?(arr)
  arr.difference(arr.uniq*2).empty?
end

where Array#difference is defined as follows:

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end

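Having found many uses for Array#difference, I have proposed that it be adopted as a core method. The documentation in that proposal explains how the method works and gives examples of its use.

Let's try it.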

twice_at_most? [1, 4, 2, 4, 1, 3, 4]
  #=> false

Here

arr.uniq*2
  #=> [1, 4, 2, 3, 1, 4, 2, 3] 
arr.difference(arr.uniq*2)
  #=> [4]

Another example:

twice_at_most? [1, 4, 2, 4, 1, 3, 5]
  #=> true
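
One caveat for newer Rubies: since Ruby 2.6, Array has a built-in difference method that behaves like `-` and removes every occurrence of the given elements, so reopening Array as above replaces the built-in. A sketch of the same occurrence-by-occurrence subtraction as a standalone helper, under the hypothetical name multiset_difference:

```ruby
# Hypothetical helper name, used so the built-in Array#difference is left untouched.
def multiset_difference(arr, other)
  remaining = other.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
  # Drop an element only while `other` still has an unmatched copy of it.
  arr.reject { |e| remaining[e] > 0 && (remaining[e] -= 1) }
end

def twice_at_most?(arr)
  multiset_difference(arr, arr.uniq * 2).empty?
end

multiset_difference([1, 4, 2, 4, 1, 3, 4], [1, 4, 2, 3] * 2)
# => [4]
twice_at_most?([1, 4, 2, 4, 1, 3, 5])
# => true
```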

【Discussion】:

【Solution 6】:

It seems to me that this might be a pretty simple solution:

def no_more_than_twice_occur?(array)
  array.none? { |el| array.count(el) > 2 }
end


no_more_than_twice_occur?(["1", "3", "3", "55", "3", "2"]) # => false
no_more_than_twice_occur?(["20", "10", "20", "10"]) # => true

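【Discussion】:

【Solution 7】:

To avoid a lot of temporary overhead, just sort the array and then split it into chunks of equal elements. You can then find the longest chunk: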

def max_count(arr)
  arr.sort.slice_when { |a,b| a != b }.map(&:length).max.to_i
end

max_count(%w[ 1 3 3 55 3 2 ])
# => 3

max_count(%w[ 1 3 55 3 2 ])
# => 2

max_count([ ])
# => 0
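
The intermediate steps of max_count, spelled out for the first example. This is only a sketch to show what sort and slice_when produce; the variable names are illustrative.

```ruby
arr = %w[1 3 3 55 3 2]

sorted = arr.sort
# sorted == ["1", "2", "3", "3", "3", "55"]  (string order, not numeric)

# slice_when starts a new chunk wherever two neighbours differ.
chunks = sorted.slice_when { |a, b| a != b }.to_a
# chunks == [["1"], ["2"], ["3", "3", "3"], ["55"]]

lengths = chunks.map(&:length)
# lengths == [1, 1, 3, 1], so the longest run is 3
```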

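【Discussion】:

I don't think you need to_i.
Sort is O(n log n); it introduces overhead rather than avoiding it.
@CarySwoveland .to_i handles the empty case, otherwise you'd get nil.
Ah, [].max #=> nil. .max || 0 might be more self-documenting.
@CarySwoveland "Give me a number, man! I don't care if it's zero." That's how I read .to_i. Perhaps or 0 would help, though.

【Solution 8】:

Just for fun, here's a way that uses each_cons together with none?, as Wayne Conrad did in his answer.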

 arr.sort_by(&:to_i).each_cons(3).none? { |a,b,c| a == b && b == c }

【Discussion】:

Maybe mention the O(n log n) complexity to explain why this is only for fun?

【Solution 9】:

Here's an all-in-one method for you.

def lessThanThree(arr)
  arr.each_with_object(Hash.new(0)) { |element,counts| counts[element] += 1 }.values.max < 3
end

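Basically, it takes the array and iterates over it, building a hash that counts each occurrence. The values method then returns an array of all the counts, and max finds the largest one. We check whether that maximum is less than 3: if so, the method returns true, otherwise false. If you need to do something else in either case, branch on the result instead of returning true or false.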
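
One edge case worth noting: for an empty array, values.max is nil, and nil < 3 raises NoMethodError. A hedged sketch of a variant that also accepts [], under the hypothetical name fewer_than_three?:

```ruby
# Hypothetical snake_case variant of lessThanThree that also accepts [].
def fewer_than_three?(arr)
  counts = arr.each_with_object(Hash.new(0)) { |element, tally| tally[element] += 1 }
  # all? is true for an empty list, whereas [].max is nil and nil < 3 raises.
  counts.values.all? { |n| n < 3 }
end

fewer_than_three?(["1", "3", "3", "55", "3", "2"])
# => false
fewer_than_three?([])
# => true
```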

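【Discussion】:

Isn't the ternary redundant? x < 3 ? true : false is exactly the same as x < 3
@Glyoko It is - I added that sentence at the end to explain that they could replace it with a code block, since I assumed they would want to do something else... my mistake, will fix this

【Solution 10】:

For each unique item in the array, count how many times that element appears in the array. Of those counts, check whether the maximum is at most 2.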

def max_occurence_at_most_2?(array)
  array.uniq.map{ |element| array.count(element) }.max <= 2
end

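Not optimized for speed.

【Discussion】:

This is a good answer - but the one edit you could make so that it matches the original question is
It may work, but technically it is quite messy, since the larger the array gets, the slower it becomes.
Thanks @OneNeptune. That's true, tadman, but imo it reads a little more clearly this way. It's a question of simplicity versus speed, depending on the requirements of the tool.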