Longest common prefix and suffix of arrays: answers

【Question title】: Longest common prefix and suffix of arrays
【Posted】: 2014-10-13 21:25:28
【Question】:

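What is the best way to get the longest common prefix (the subarray starting at index 0 of the original) and the longest common suffix (the subarray ending at index -1 of the original) of two arrays? For example, given two arrays: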

[:foo, 1, :foo, 0, nil, :bar, "baz", false]
[:foo, 1, :foo, 0, true, :bar, false]

the longest common prefix of these arrays is:

[:foo, 1, :foo, 0]

and the longest common suffix of these arrays is:

[false]

When the elements at index 0/-1 of the original arrays differ, the common prefix/suffix should be an empty array.

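【Question comments】:

Do you mean "suffix" rather than "affix"? (A suffix is an affix, but the way you use the word implies suffix.)
@matt Right. That was my mistake.

Tags: ruby arrays longest-prefix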

【Solution 1】:

One possible solution:

a1 = [:foo, 1, 0, nil, :bar, "baz", false]
a2 = [:foo, 1, 0, true, :bar, false]

a1.zip(a2).take_while { |x, y| x == y }.map(&:first)
#=> [:foo, 1, 0]

To find the common suffix, reverse the input arrays and then reverse the output array:

a1.reverse.zip(a2.reverse).take_while { |x, y| x == y }.map(&:first).reverse
#=> [false]

Edge case: zip pads the "argument" array with nil values:

a1 = [true, nil, nil]
a2 = [true]

a1.zip(a2).take_while { |x, y| x == y }.map(&:first)
#=> [true, nil, nil]

This can be avoided by truncating the initial array to the length of the second array:

a1[0...a2.size].zip(a2).take_while { |x, y| x == y }.map(&:first)
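
As a small sketch of how the truncation behaves in both argument orders, the expression can be wrapped in a helper. The name prefix_via_zip is an assumption made for illustration; it is not part of the original answer.

```ruby
# Hypothetical helper (the name is an assumption) wrapping the truncated zip.
def prefix_via_zip(a1, a2)
  # Truncating a1 keeps zip from pairing its extra elements with nil.
  a1[0...a2.size].zip(a2).take_while { |x, y| x == y }.map(&:first)
end

p prefix_via_zip([true, nil, nil], [true])   # longer array first  => [true]
p prefix_via_zip([true], [true, nil, nil])   # shorter array first => [true]
p prefix_via_zip([nil, 1], [nil, 2, 3])      # nil as a real common element => [nil]
```

Only the receiver needs truncating, because zip never yields more pairs than the receiver has elements.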

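【Discussion】:

Is there a way to overcome the edge case? If the arrays contain no nil, you could apply compact.
The edge case is unlikely to occur. It only happens when the prefix "is" the entire length of one array (no suffix). I guess it could still happen, but it is unlikely. The real problem with the edge case is in the suffix case.
One way is to take the minimum length of the two arrays and cut off the other end before doing the zip.
@sawa Yes, a1[0...a2.size].zip(a2[0...a1.size]) would work as a "minimal zip"
@sawa Even simpler: zip does not extend the initial array. I have updated the answer.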

【Solution 2】:

Another solution, without the edge case:

a1 = [:foo, 1, 0, nil, :bar, "baz", false]
a2 = [:foo, 1, 0, true, :bar, false]

a1.take_while.with_index {|v,i| a2.size > i && v == a2[i] }

Edit: improved performance

class Array
  def common_prefix_with(other)
    other_size = other.size
    take_while.with_index {|v,i| other_size > i && v == other[i] }
  end
end

a1.common_prefix_with a2
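
The same index-based idea can be applied from the other end to get the suffix without reversing either array. The following is only a sketch; the method name common_suffix_with is a hypothetical counterpart and does not appear in the original answer.

```ruby
class Array
  # Hypothetical counterpart to common_prefix_with (not from the original answer).
  def common_suffix_with(other)
    limit = [size, other.size].min
    n = 0
    # Walk backwards from the last element using negative indices.
    n += 1 while n < limit && self[-1 - n] == other[-1 - n]
    last(n)   # last(0) returns [], so no special case is needed
  end
end

a1 = [:foo, 1, 0, nil, :bar, "baz", false]
a2 = [:foo, 1, 0, true, :bar, false]

p a1.common_suffix_with(a2)   # => [false]
```

Capping the loop at the shorter length keeps the negative index from running past the start of either array.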

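【Discussion】:

Fine, but it calls size on every iteration
@Stefan I once raised the same point with Sergio. But he told me that size (or length) is a very cheap method and that there is no harm in calling it inside the iteration.
@sawa You are right, the length only has to be looked up, not computed
BroiSatse's second answer seems to be the fastest of all the answers.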

【Solution 3】:

Three solutions: brutish (#3, the initial answer), better (#2, first edit), and best (#1, a variant of @Stefan's answer, second edit).

a = [:foo, 1, :foo, 0, nil, :bar, "baz", false]
b = [:foo, 1, :foo, 0, true, "baz", false]

c = [:foo,1,:goo]
d = [:goo,1,:new]

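Note that b differs slightly from the OP's example.

Unless stated otherwise below, the common suffix is computed by reversing the arrays, applying common_prefix, and then reversing the result.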

#1

A variant of Stefan's answer that drops zip and map (and keeps his technique of truncating one array to at most the length of the other):

def common_prefix(a,b)
  a[0,b.size].take_while.with_index { |e,i| e == b[i] }
end

common_prefix(a,b)
  #=> [:foo, 1, :foo, 0]
common_prefix(c,d)
  #=> []
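
The reversal rule stated earlier (reverse the arrays, apply common_prefix, reverse the result) could be sketched on top of this common_prefix as follows. The name common_suffix_by_reversal is an assumption chosen to avoid clashing with the common_suffix defined in #2.

```ruby
def common_prefix(a, b)
  a[0, b.size].take_while.with_index { |e, i| e == b[i] }
end

# Hypothetical wrapper (the name is an assumption): reverse both inputs,
# take their common prefix, and reverse the result back.
def common_suffix_by_reversal(a, b)
  common_prefix(a.reverse, b.reverse).reverse
end

a = [:foo, 1, :foo, 0, nil, :bar, "baz", false]
b = [:foo, 1, :foo, 0, true, "baz", false]
c = [:foo, 1, :goo]
d = [:goo, 1, :new]

p common_suffix_by_reversal(a, b)   # => ["baz", false]
p common_suffix_by_reversal(c, d)   # => []
```

This costs two extra array copies for the reversed inputs, which is the price of reusing the prefix method unchanged.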

#2

def common_prefix(a,b)
  any, arr = a.zip(b).chunk { |e,f| e==f }.first
  any ? arr.map(&:first) : []
end

def common_suffix(a,b)
  n = [a.size, b.size].min   # align both arrays at their ends
  any, arr = a.last(n).zip(b.last(n)).chunk { |e,f| e==f }.to_a.last
  any ? arr.map(&:first) : []
end

common_prefix(a,b)
  #=> [:foo, 1, :foo, 0]
  # Note: a.zip(b).chunk { |e,f| e==f }.to_a
  #  => [[true, [[:foo, :foo], [1, 1], [:foo, :foo], [0, 0]]],
  #      [false, [[nil, true], [:bar, "baz"], ["baz", false], [false, nil]]]]

common_suffix(a,b)
  #=> ["baz", false]
  # Note: with n = 7, a.last(n).zip(b.last(n)).chunk { |e,f| e==f }.to_a
  #  => [[false, [[1, :foo], [:foo, 1], [0, :foo], [nil, 0], [:bar, true]]],
  #      [true, [["baz", "baz"], [false, false]]]]

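When :first is sent to the enumerator returned by Enumerable#chunk, only the enumerator's first element is returned. It should therefore be comparable in efficiency to using Enumerable#take_while.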

common_prefix(c,d)
  #=> []
common_suffix(c,d)
  #=> []
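
One way to probe the note on Enumerable#chunk above is to count how often the block runs. This is only a sketch with made-up arrays. Note also that a.zip(b) still builds the full array of pairs before chunk sees any of them, so the early exit applies only to the comparison step.

```ruby
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 2, 3, 0, 5, 6, 7, 8]

calls = 0
# Count block invocations while asking only for the first chunk.
any, arr = a.zip(b).chunk { |e, f| calls += 1; e == f }.first

p any                 # => true
p arr.map(&:first)    # => [1, 2, 3]
p calls               # expected to be smaller than a.size if chunk stops early
```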

#3

def common_prefix(a,b)
  a[0,(0..[a.size, b.size].min).max_by { |n| (a[0,n]==b[0,n]) ? n : -1 }]
end

common_prefix(a,b)
  #=> [:foo, 1, :foo, 0]

common_prefix(c,d)
  #=> []

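【Discussion】:

【Solution 4】:

Gentlemen, start your engines!

Methods tested

I tested the second method given by @stefan, the two methods proposed by @BroiSatse, and the three methods I offered.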

class Array #for broiSatse2
  def common_prefix_with(other)
    other_size = other.size
    take_while.with_index {|v,i| other_size > i && v == other[i] }
  end
end

class Cars
  def self.stefan(a1,a2)
    a1[0...a2.size].zip(a2).take_while { |x, y| x == y }.map(&:first)
  end

  def self.broiSatse1(a1,a2)
    a1.take_while.with_index {|v,i| a2.size > i && v == a2[i] }
  end

  def self.broiSatse2(a1,a2)
    a1.common_prefix_with(a2)
  end

  def self.cary1(a,b)
    a[0,b.size].take_while.with_index { |e,i| e == b[i] }
  end

  def self.cary2(a,b)
    any, arr = a.zip(b).chunk { |e,f| e==f }.first
    any ? arr.map(&:first) : []
  end

  def self.cary3(a,b)
    a[0,(0..[a.size, b.size].min).max_by { |n| (a[0,n]==b[0,n]) ? n : -1 }]
  end
end

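I did not include the solution given by @6ftDan (not to be confused with 5' Dan or 7' Dan) because it did not pass all the tests.

Construction of test arrays

random_arrays(n) constructs a pair of arrays. n is the size of the smaller of the two; the larger has size n+1.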

def random_arrays(n)
  m = rand(n)
  # make arrays the same for the first m elements
  a = Array.new(m,0)
  b = Array.new(m,0)
  if m < n
    # make the elements at index m (the (m+1)th elements) differ
    a << 0
    b << 1
    # randomly assign 0s and 1s to the remaining elements
    (n-m-1).times { a << rand(2); b << rand(2) }  if m < n - 1
  end
  # make arrays unequal in length
  (rand(2) == 0) ? a << 0 : b << 0
  [a,b]
end
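
To make the construction concrete, here is a hand-written pair of the shape random_arrays(5) would produce if rand(n) happened to return m = 2 and the extra element went to b. The values after the forced mismatch are arbitrary and chosen for illustration only.

```ruby
n = 5
m = 2

# First m elements equal, index m forced to differ, the rest arbitrary 0s/1s.
a = [0, 0,  0,  1, 0]
b = [0, 0,  1,  1, 1]
b << 0   # make the arrays unequal in length (b gets the extra element here)

p [a.size, b.size].min                  # => 5, i.e. n
p [a.size, b.size].max                  # => 6, i.e. n + 1
p a.zip(b).index { |x, y| x != y }      # => 2, i.e. m
p a[0, m]                               # => [0, 0], the expected common prefix
```

The expected common prefix is therefore always exactly m elements long, which gives every tested method a different stopping point per pair.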

Confirm that the methods tested give the same results

N = 10000 #size of smaller of two input arrays
methods = Cars.methods(false)

# ensure all methods produce the same results and that
# all deal with edge cases properly
20.times do |i|
  test = case i
         when 0 then [[0,1],[1,1]]
         when 1 then [[0],[]]
         when 2 then [[0,0,0],[0,0]]
         else
         random_arrays(N)
         end  
  soln = Cars.send(methods.first, *test)
  methods[1..-1].each  do |m|
    unless soln == Cars.send(m, *test)
      puts "#{m} has incorrect solution for #{test}"
      exit
    end
  end
end

puts "All methods yield the same answers\n"

Benchmark

require 'benchmark'

I = 1000 #number of array pairs to test

@arr = I.times.with_object([]) { |_,a| a << random_arrays(N) } #test arrays

#test method m 
def testit(m)
  I.times { |i| Cars.send(m, *@arr[i]) }
end    

Benchmark.bm(12) { |bm| methods.each { |m| bm.report(m) { testit(m) } } }

Results

All methods yield the same answers

                   user     system      total        real
stefan        11.260000   0.050000  11.310000 ( 11.351626)
broiSatse1     0.860000   0.000000   0.860000 (  0.872256)
broiSatse2     0.720000   0.010000   0.730000 (  0.717797)
cary1          0.680000   0.000000   0.680000 (  0.684847)
cary2         13.130000   0.040000  13.170000 ( 13.215581)
cary3         51.840000   0.120000  51.960000 ( 52.188477)

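【Discussion】:

It looks like your first method is even faster than broi's. Sorry for missing that. I have upvoted your answer, though.
The famous expression, "Gentlemen, start your engines!", dates back to the mid-20th century, when female race car drivers were extremely rare. Had a woman offered an answer to this question, I would not have used the phrase.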

【Solution 5】:

This works for arrays of unique elements.

Prefix

x = [:foo, 1, 0, nil, :bar, "baz", false]
y = [:foo, 1, 0, true, :bar, false]
(x & y).select.with_index {|item,index| x.index(item) == index}

Output: => [:foo, 1, 0]

Run it on the reversed arrays to get the suffix

(x.reverse & y.reverse).select.with_index {|item,index| x.reverse.index(item) == index}

Output: => [false]
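
The restriction to unique elements follows from the two building blocks used here: Array#& drops duplicates, and Array#index returns the position of the first match only. A short sketch of what that implies for repeated elements, with a position-based comparison shown for contrast:

```ruby
x = [:foo, :foo]
y = [:foo, :foo]

p(x & y)           # => [:foo]  (Array#& removes duplicates)
p x.index(:foo)    # => 0       (always the first occurrence)

# The index-based approach therefore loses the repeated element:
p (x & y).select.with_index { |item, index| x.index(item) == index }
# => [:foo]

# Comparing position by position keeps it:
p x.take_while.with_index { |v, i| i < y.size && v == y[i] }
# => [:foo, :foo]
```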

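【Discussion】:

This does not work, for example, for x = [:foo, :foo]; y = [:foo, :foo], which returns [:foo].
Ah, index is not a good solution when there are duplicates. So I take it these arrays are not unique?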