【Question title】: Merging overlapping strings
【Posted】: 2020-04-19 18:03:00
【Question】:

Suppose I need to merge two overlapping strings like this:

def mergeOverlap(s1: String, s2: String): String = ???

mergeOverlap("", "")       // ""
mergeOverlap("", "abc")    // abc  
mergeOverlap("xyz", "abc") // xyzabc 
mergeOverlap("xab", "abc") // xabc

I can write this function using the answer to one of my previous questions:

def mergeOverlap(s1: String, s2: String): String = { 
  val n = s1.tails.find(tail => s2.startsWith(tail)).map(_.size).getOrElse(0)
  s1 ++ s2.drop(n)
}
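
To make the behavior of that one-liner concrete, here is a small sketch of what `tails` produces for the `"xab"` / `"abc"` example. The names `candidates` and `firstMatch` are only illustrative. Because each candidate suffix is compared against `s2` from scratch, the approach can take quadratic time in the worst case.

```scala
// Suffixes of s1 in the order `tails` yields them: longest first, empty string last.
val candidates = "xab".tails.toList

// The first (longest) suffix of s1 that s2 starts with is the overlap.
val firstMatch = candidates.find(tail => "abc".startsWith(tail))
```

Here `candidates` is `List("xab", "ab", "b", "")` and `firstMatch` is `Some("ab")`, so two characters are dropped from `"abc"` before concatenation.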

Could you suggest a simpler, and possibly more efficient, implementation of mergeOverlap?

【Question comments】:

    Tags: string scala collections


    【Solution 1】:

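    You can find the overlap between the two strings in time proportional to their total length, O(n + k), using the algorithm that computes the prefix function. The prefix function of a string at index i is defined as the length of the longest suffix ending at index i that is also a prefix of the whole string (excluding the trivial case where that suffix is the entire prefix up to i).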
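
    As an illustration of that definition, the following is a minimal sketch of the textbook prefix function for a single string. It is not the code from this answer, and the name `prefixFunction` is chosen here only for the example.

    ```scala
    // prefixFunction(s)(i) = length of the longest proper prefix of s
    // that is also a suffix of s.take(i + 1).
    def prefixFunction(s: String): Vector[Int] = {
      val pi = new Array[Int](s.length)
      for (i <- 1 until s.length) {
        // Start from the border found at the previous index and shrink it
        // until it can be extended by s(i) or becomes empty.
        var k = pi(i - 1)
        while (k > 0 && s(i) != s(k)) k = pi(k - 1)
        if (s(i) == s(k)) k += 1
        pi(i) = k
      }
      pi.toVector
    }
    ```

    For example, `prefixFunction("aabaaab")` yields `Vector(0, 1, 0, 1, 2, 2, 3)`: the last value says that the three-character suffix `"aab"` is also a prefix of the string.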

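    For more explanation of the definition and of the algorithm that computes it, see these links:

    Here is an implementation of a modified algorithm that computes the longest prefix of the second argument that is equal to a suffix of the first argument: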

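    import scala.collection.mutable.ArrayBuffer
    
    def overlap(hasSuffix: String, hasPrefix: String): Int = {
      // borders(k): length of the longest proper prefix of hasPrefix.take(k)
      // that is also its suffix (the prefix function of hasPrefix).
      val borders = ArrayBuffer(0, 0)
    
      // Overlap after reading currentCharacter, given the overlap before it.
      // On a mismatch, fall back to shorter and shorter borders of hasPrefix.
      def extend(previousOverlap: Int, currentCharacter: Char): Int = {
        val currentOverlap = Iterator.iterate(previousOverlap)(overlap => borders(overlap))
          .find(overlap =>
            overlap == 0 ||
            hasPrefix.lift(overlap).contains(currentCharacter))
          .getOrElse(0)
        val updatedOverlap = currentOverlap +
          (if (hasPrefix.lift(currentOverlap).contains(currentCharacter)) 1 else 0)
        updatedOverlap
      }
    
      // First build the prefix function of hasPrefix, then scan hasSuffix once.
      // (Falling back through overlaps recorded for hasSuffix itself would give
      // wrong results, e.g. for overlap("xaaab", "aab").)
      for (index <- 1 until hasPrefix.length)
        borders += extend(borders(index), hasPrefix(index))
      hasSuffix.foldLeft(0)(extend)
    }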
    

    mergeOverlap is then just

    def mergeOverlap(s1: String, s2: String) = 
      s1 ++ s2.drop(overlap(s1, s2))
    

    And some tests of this implementation:

    scala> mergeOverlap("", "")    
    res0: String = ""
    
    scala> mergeOverlap("abc", "")
    res1: String = abc
    
    scala> mergeOverlap("", "abc")
    res2: String = abc
    
    scala> mergeOverlap("xyz", "abc")
    res3: String = xyzabc
    
    scala> mergeOverlap("xab", "abc")
    res4: String = xabc
    
    scala> mergeOverlap("aabaaab", "aab")
    res5: String = aabaaab
    
    scala> mergeOverlap("aabaaab", "aabc")
    res6: String = aabaaabc
    
    scala> mergeOverlap("aabaaab", "bc")
    res7: String = aabaaabc
    
    scala> mergeOverlap("aabaaab", "bbc")
    res8: String = aabaaabbc
    
    scala> mergeOverlap("ababab", "ababc")
    res9: String = abababc
    
    scala> mergeOverlap("ababab", "babc")
    res10: String = abababc
    
    scala> mergeOverlap("abab", "aab")
    res11: String = ababaab
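
    Hand-picked cases like these can miss corner cases in the fallback logic, so one option is to compare an implementation against a brute-force reference on every pair of short strings. The sketch below is an assumption about how such a check could look; `naiveOverlap`, `allStrings` and `mismatches` are hypothetical helper names, not part of the answer.

    ```scala
    // Reference: try every suffix of s1, longest first (quadratic, but easy to trust).
    def naiveOverlap(s1: String, s2: String): Int =
      s1.tails.find(tail => s2.startsWith(tail)).map(_.length).getOrElse(0)

    // All strings over the given alphabet with length 0 to maxLength.
    def allStrings(alphabet: String, maxLength: Int): Seq[String] =
      (0 to maxLength).flatMap { length =>
        (1 to length).foldLeft(Seq("")) { (acc, _) =>
          for (s <- acc; c <- alphabet.toSeq) yield s + c
        }
      }

    // Input pairs on which `candidate` disagrees with the reference.
    def mismatches(candidate: (String, String) => Int): Seq[(String, String)] = {
      val inputs = allStrings("ab", 5)
      for {
        s1 <- inputs
        s2 <- inputs
        if candidate(s1, s2) != naiveOverlap(s1, s2)
      } yield (s1, s2)
    }
    ```

    Passing an overlap function with the signature `(String, String) => Int` to `mismatches` should return an empty sequence if that function agrees with the reference on all pairs of strings over `a` and `b` up to length 5.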
    

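    【Comments】:

    • Thank you very much. The solution is not simple, but it is definitely very efficient. It looks interesting. I will study this algorithm and try to implement it myself.
    • Unfortunately, I am having a hard time understanding how updatedOverlap is computed :( so I asked a new question (stackoverflow.com/questions/63923861/…).
    【Solution 2】: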

    It is not tail-recursive, but it is a very simple algorithm.

    def mergeOverlap(s1: String, s2: String): String =
      if (s2 startsWith s1) s2
      else s1.head +: mergeOverlap(s1.tail, s2)
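
    If the lack of tail recursion is a concern for long inputs, the same idea can be written with an index instead of repeatedly taking `s1.tail`. This is only a sketch under that assumption; the name `mergeOverlapTailrec` is hypothetical, and the worst case is still quadratic.

    ```scala
    import scala.annotation.tailrec

    def mergeOverlapTailrec(s1: String, s2: String): String = {
      // Find the smallest start such that s1.substring(start) is a prefix of s2.
      @tailrec
      def loop(start: Int): String =
        // regionMatches compares without allocating substrings; it returns false
        // when the remaining part of s1 is longer than s2, and true for length 0.
        if (s1.regionMatches(start, s2, 0, s1.length - start)) s1.substring(0, start) + s2
        else loop(start + 1)
      loop(0)
    }
    ```

    For instance, `mergeOverlapTailrec("xab", "abc")` returns `"xabc"`, the same result as the recursive version above.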
    

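    【Comments】:

    • Yes, but it is very inefficient (and efficiency is what the OP asked for).
    • I asked for a simpler or a more efficient implementation. I will update the question to emphasize that.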