Checking if a list has duplicate lists: answers

【Question title】: Checking if a list has duplicate lists
【Posted】: 2017-06-08 03:48:09
【Question description】:

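Given a list of lists, I want to make sure that no two lists have the same values in the same order. For example, with my_list = [[1, 2, 4, 6, 10], [12, 33, 81, 95, 110], [1, 2, 4, 6, 10]], it should tell me that a duplicate list exists, namely [1, 2, 4, 6, 10].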

I used a while loop, but it doesn't work the way I want. Does anyone know how to fix this code:

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
r = len(routes) - 1
i = 0
while r != 0:
    if cmp(routes[i], routes[i + 1]) == 0:
        print "Yes, they are duplicate lists!"
    r -= 1
    i += 1

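【Question discussion】:

Tags: python list duplicates

【Solution 1】:

You can count the occurrences inside a set comprehension, converting each sublist to a tuple so that it can be hashed and the results are deduplicated: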

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
dups = {tuple(x) for x in routes if routes.count(x)>1}

print(dups)

Result:

{(1, 2, 4, 6, 10)}

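Simple enough, but the repeated calls to count cause a lot of looping under the hood. Another approach, which also relies on hashing but has lower complexity, is to use collections.Counter: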

from collections import Counter

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]

c = Counter(map(tuple,routes))
dups = [k for k,v in c.items() if v>1]

print(dups)

Result:

[(1, 2, 4, 6, 10)]

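(This just counts the sublists after converting them to tuples, which fixes the hashing problem, and then builds the list of dups with a list comprehension, keeping only the items that occur more than once.)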
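
The tuple conversion matters because lists are mutable and therefore unhashable, so they cannot be members of a set or keys of a Counter. A minimal sketch of the difference (the variable names here are only illustrative):

```python
route = [1, 2, 4, 6, 10]

# A list is unhashable, so putting it in a set raises TypeError.
try:
    {route}
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# Tuples with the same items hash fine, and equal tuples collapse to one entry.
unique_routes = {tuple(route), tuple([1, 2, 4, 6, 10])}

print(list_is_hashable)  # False
print(unique_routes)     # {(1, 2, 4, 6, 10)}
```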

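Now, if you only want to detect whether there are any duplicate lists (without printing them), you could:

- convert the list of lists to a list of tuples, so that they can be hashed in a set
- compare the length of the list with the length of the set

If there are duplicates, the lengths differ: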

routes_tuple = [tuple(x) for x in routes]    
print(len(routes_tuple)!=len(set(routes_tuple)))

Or, since being able to use map in Python 3 is rare enough to be worth mentioning:

print(len(set(map(tuple,routes))) != len(routes))

【Discussion】:

【Solution 2】:

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
dups = set()

for route in routes:
    if tuple(route) in dups:
        print('%s is a duplicate route' % route)
    else:
        dups.add(tuple(route))
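
If only a yes/no answer is needed, the same seen-set idea can stop at the first repeat. The following is a sketch; has_duplicate_route is a hypothetical helper name, not part of the answer above:

```python
def has_duplicate_route(routes):
    """Return True as soon as one route repeats an earlier one."""
    seen = set()
    for route in routes:
        key = tuple(route)  # lists are unhashable, tuples are not
        if key in seen:
            return True
        seen.add(key)
    return False


routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
print(has_duplicate_route(routes))      # True
print(has_duplicate_route(routes[:2]))  # False
```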

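【Discussion】:

【Solution 3】:

I'm not sure whether you want an external library, but I have one that contains a function created explicitly for this purpose: iteration_utilities.duplicates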

>>> from iteration_utilities import duplicates

>>> my_list = [[1, 2, 4, 6, 10], [12, 33, 81, 95, 110], [1, 2, 4, 6, 10]]

>>> list(duplicates(my_list, key=tuple))
[[1, 2, 4, 6, 10]]

Note that this also works without key=tuple, but it will then have O(n*n) behavior instead of O(n).

>>> list(duplicates(my_list))
[[1, 2, 4, 6, 10]]

In case it matters, it also preserves the order of appearance (with or without key):

>>> list(duplicates([[1], [2], [3], [1], [2], [3]]))
[[1], [2], [3]]

If you are only interested in whether there are any duplicates, you can use any on it instead of list:

>>> any(duplicates([[1], [2], [3], [1], [2], [3]]))
True
>>> any(duplicates([[1], [2], [3]]))
False
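
For readers who prefer not to add a dependency, similar behavior can be approximated with a small generator. This is only a sketch of the idea, not the library's actual implementation; iter_duplicates is a hypothetical name, and it assumes that key always returns something hashable:

```python
def iter_duplicates(iterable, key=tuple):
    """Yield each item whose key was already seen, in order of appearance."""
    seen = set()
    for item in iterable:
        k = key(item)
        if k in seen:
            yield item
        else:
            seen.add(k)


my_list = [[1, 2, 4, 6, 10], [12, 33, 81, 95, 110], [1, 2, 4, 6, 10]]
print(list(iter_duplicates(my_list)))         # [[1, 2, 4, 6, 10]]
print(any(iter_duplicates([[1], [2], [3]])))  # False
```

In this sketch, an item that occurs three times is yielded twice, once for each repeat after the first occurrence.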

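【Discussion】:

【Solution 4】:

for x in routes:
    print(x, routes.count(x))

This prints each list along with the number of times it occurs. Alternatively, you can show only the lists that occur more than once:

new_list = []

for x in routes:
    if routes.count(x) > 1:
        # record each repeated list only once
        if x not in new_list:
            new_list.append(x)

for x in new_list:
    print(x, routes.count(x))

Hope this helps!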

【Discussion】:

【Solution 5】:

def duplicate(lst):
    cntrin=0
    cntrout=0
    for i in lst:
        cntrin=0
        for k in lst:
            if i==k:
                cntrin=cntrin+1
        if cntrin>1:
            cntrout=cntrout+1
    if cntrout>0:
        return True
    else:
        return False

Enjoy!
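
The same pairwise comparison can be written more compactly with itertools.combinations. Like the function above it is quadratic, but it needs no hashing, so it also works when the sublists themselves contain unhashable items. A sketch, with has_duplicate as an assumed name:

```python
from itertools import combinations


def has_duplicate(lst):
    """Return True if any two elements of lst compare equal."""
    return any(a == b for a, b in combinations(lst, 2))


print(has_duplicate([[1, 2], [3, 4], [1, 2]]))  # True
print(has_duplicate([[1, 2], [3, 4]]))          # False
```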

【Discussion】:

【Solution 6】:

You can solve this compactly with the unique() function from the numpy library. As described in the numpy docs, you can proceed like this:

import numpy as np

unique_lists_in_list = np.unique(routes , axis=0)

Then you can compare the lengths. If they are the same, routes contains no duplicates.

if len(unique_lists_in_list) == len(routes):
    print("All lists inside the routes list aren't duplicate")

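【Discussion】:

【Solution 7】:

Use a list comprehension with a "backward" slice that keeps only the items not already seen earlier in the list (this preserves order):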

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]

routes_cleaned = [l for i, l in enumerate(routes) if l not in routes[:i]]

print(len(routes_cleaned) == len(routes))
print(routes_cleaned)

Output

False
[[1, 2, 4, 6, 10], [1, 3, 8, 9, 10]]
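
Because `l not in routes[:i]` rescans the earlier items for every element, the comprehension above is quadratic. On Python 3.7+, where dicts keep insertion order, a linear-time variant is possible if the sublists are converted to tuples. A sketch under that assumption:

```python
routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
routes_cleaned = [list(t) for t in dict.fromkeys(map(tuple, routes))]

print(len(routes_cleaned) == len(routes))  # False
print(routes_cleaned)                      # [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10]]
```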

【Discussion】: