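Here's my solution, which borders on brute force from the OP's perspective. It isn't bothered by order (I threw in a random shuffle to confirm that), and the list may contain non-matching elements as well as other, independent matches. It assumes that "overlap" means not a proper subset but independent strings that share elements at the end of one and the start of the other: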
from collections import defaultdict
from random import choice, shuffle

def overlap(a, b):
    """ get the maximum overlap of a & b plus where the overlap starts """

    overlaps = []

    for i in range(len(b)):
        for j in range(len(a)):
            # a.endswith(prefix, j) tests whether a[j:] ends with prefix
            if a.endswith(b[:i + 1], j):
                overlaps.append((i, j))

    # i is the overlap length minus one; (0, -1) signals no overlap at all
    return max(overlaps) if overlaps else (0, -1)

lst = ['SGALWDV', 'GALWDVP', 'ALWDVPS', 'LWDVPSP', 'WDVPSPV', 'NONSEQUITUR']

shuffle(lst)  # to verify order doesn't matter

overlaps = defaultdict(list)

while len(lst) > 1:

    overlaps.clear()

    # score every ordered pair (a, b): end of a against start of b
    for a in lst:
        for b in lst:
            if a == b:
                continue

            amount, start = overlap(a, b)
            overlaps[amount].append((start, a, b))

    maximum = max(overlaps)

    # amount 0 means at most one shared character, which is treated as no overlap
    if maximum == 0:
        break

    start, a, b = choice(overlaps[maximum])  # pick one among equals

    # replace the two strings with their merged form
    lst.remove(a)
    lst.remove(b)
    lst.append(a[:start] + b)

print(*lst)
OUTPUT
% python3 test.py
NONSEQUITUR SGALWDVPSPV
%
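It computes all the overlaps and combines the pair with the largest overlap into a single element, replacing the original two, then starts the process over until we're down to a single element or there are no overlaps left.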
The overlap() function is horribly inefficient and can likely be improved, but that doesn't matter if this isn't the type of matching the OP wants.
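As one possible direction for that improvement, here is a minimal sketch that tries candidate overlap lengths from longest to shortest and stops at the first hit, so it makes one endswith() call per candidate length instead of one per (i, j) pair. The name longest_overlap is hypothetical, and its return value is an assumption that differs from overlap() above: it reports the actual number of shared characters, and (0, len(a)) when nothing matches, so the main loop would need adjusting before this could be swapped in.

```python
def longest_overlap(a, b):
    """Return (length, start) for the longest suffix of a that is a prefix of b.

    length is the number of shared characters and start is the index in a
    where the shared part begins. (0, len(a)) means there is no overlap.
    """
    # Try the longest candidate first so the first match is the maximum.
    for length in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:length]):
            return length, len(a) - length
    return 0, len(a)


length, start = longest_overlap('SGALWDV', 'GALWDVP')
merged = 'SGALWDV'[:start] + 'GALWDVP'  # same merge rule as a[:start] + b
print(length, start, merged)  # 6 1 SGALWDVP
```

Because a one-character overlap is reported here as length 1 rather than 0, a loop built on this version would merge strings that share only a single character unless it breaks at a higher threshold.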