Code to find the unique number is slow and inefficient? Answers

【Question title】: Code to Find Unique Number Slow and Inefficient?
【Posted】: 2019-05-21 16:00:38
【Question description】:

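I've recently been working on a Codewars problem that asks for the unique number in a list. My code works, but it is very inefficient, and I'm not sure why.

I think the problem may be that I copy the list on every iteration (maybe). Here is the code I submitted: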

def find_uniq(arr):
    equal_check = 0
    for i in arr:
        arr_new = arr.copy()
        arr_new.remove(i)
        if i not in arr_new:
            equal_check = i
    return equal_check

【Comments】:

Sounds like a job for profiling.
list(set(seq)) is simple and easy.
By the way, once you have found a unique number using the copy, can you return early?
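
The early-return idea from the last comment could be sketched as follows. This is only an illustration: the name find_uniq_early_return is made up here, and it assumes the input contains exactly one value that occurs once. Each iteration still builds a new list and scans it, so the worst case remains quadratic; returning early only helps when the unique value sits near the front.

```python
def find_uniq_early_return(arr):
    """Return the first value that occurs only once, or None if there is none."""
    for index, value in enumerate(arr):
        # Everything except the element at this position (still an O(n) copy).
        rest = arr[:index] + arr[index + 1:]
        if value not in rest:
            # Stop as soon as a unique value is found instead of scanning on.
            return value
    return None


print(find_uniq_early_return([1, 1, 1, 2, 1, 1]))  # 2
```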

Tags: python python-3.x

【Solution 1】:

Use collections.Counter and keep the values whose count is 1:

from collections import Counter 

def find_uniq(arr):
    c = Counter(arr)
    return [number for number,count in c.most_common() if count == 1]


print(find_uniq( [1,2,3,4,2,3,4,5,6,4,5,6,7,8,9])) # [1, 7, 8, 9]

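Counting takes roughly O(2*n), which is O(n) because the constant factor 2 is dropped. Strictly speaking, most_common() also sorts the entries by count, which adds an O(k log k) step for k distinct values; iterating over c.items() instead avoids that sort.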
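
The functions in this answer return every value that occurs once. For the kata in the question, where a single number is expected, a variant might look like the sketch below. The name find_single_uniq is hypothetical, and the code assumes exactly one value occurs once.

```python
from collections import Counter


def find_single_uniq(arr):
    """Return the one value that occurs exactly once, or None if there is none."""
    counts = Counter(arr)  # one pass over the input
    # One pass over the distinct values, with no sorting involved.
    for number, count in counts.items():
        if count == 1:
            return number
    return None


print(find_single_uniq([1, 1, 1, 2, 1, 1]))  # 2
```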

collections.defaultdict with int, keeping the values whose count is 1:

# defaultdict
from collections import defaultdict

def find_uniq(arr):
    c = defaultdict(int)
    for a in arr:
        c[a] += 1
    return [number for number,count in c.items() if count == 1]


print(find_uniq( [1,2,3,4,2,3,4,5,6,4,5,6,7,8,9])) # [1, 7, 8, 9]

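This also takes roughly O(2*n), which is O(n) because 2 is a constant. Thanks to C optimizations in the implementation it is slightly faster than Counter (see e.g. "Surprising results with Python timeit: Counter() vs defaultdict() vs dict()"), though the gap depends on the Python version.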
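
Which of the two is faster can differ between Python versions and inputs, so it may be worth measuring on your own machine. Below is a minimal timing sketch using timeit. The function names (uniq_counter, uniq_defaultdict, compare) and the sample data are invented for illustration, and no particular result is implied.

```python
import timeit
from collections import Counter, defaultdict


def uniq_counter(arr):
    c = Counter(arr)
    return [n for n, count in c.items() if count == 1]


def uniq_defaultdict(arr):
    c = defaultdict(int)
    for a in arr:
        c[a] += 1
    return [n for n, count in c.items() if count == 1]


def compare(arr, repeats=20):
    """Return the total seconds each variant needs for `repeats` runs."""
    return {
        "Counter": timeit.timeit(lambda: uniq_counter(arr), number=repeats),
        "defaultdict": timeit.timeit(lambda: uniq_defaultdict(arr), number=repeats),
    }


if __name__ == "__main__":
    # Hypothetical test data: every value twice, plus a single unique -1.
    data = list(range(5000)) * 2 + [-1]
    print(compare(data))
```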

Plain dicts with setdefault or test/add, keeping the values whose count is 1:

# normal dict - setdefault
def find_uniq(arr):
    c = dict()
    for a in arr:
        c.setdefault(a,0)
        c[a] += 1
    return [number for number,count in c.items() if count == 1]


print(find_uniq( [1,2,3,4,2,3,4,5,6,4,5,6,7,8,9])) # [1, 7, 8, 9]

# normal dict - test and add 
def find_uniq(arr):
    c = dict()
    for a in arr:
        if a in c:
            c[a] += 1
        else:
            c[a] = 1

    return [number for number,count in c.items() if count == 1]


print(find_uniq( [1,2,3,4,2,3,4,5,6,4,5,6,7,8,9])) # [1, 7, 8, 9]

setdefault creates the default value on every call. It is slower than Counter or defaultdict, but faster than the test/add version.
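
A third plain-dict idiom, not part of the original answer, is dict.get with a default. It needs one lookup and one store per element and no explicit branch. This is a sketch only (the name find_uniq_get is made up), and no claim is made here about how its speed compares with the variants above.

```python
def find_uniq_get(arr):
    """Count with dict.get and return the values that occur exactly once."""
    c = {}
    for a in arr:
        # get() returns 0 for unseen keys, so no separate membership test is needed.
        c[a] = c.get(a, 0) + 1
    return [number for number, count in c.items() if count == 1]


print(find_uniq_get([1, 2, 3, 4, 2, 3, 4, 5, 6, 4, 5, 6, 7, 8, 9]))  # [1, 7, 8, 9]
```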

itertools.groupby (needs a sorted list!), keeping the values whose count is 1:

from itertools import groupby

def find_uniq(arr):
    return [k for (k,p) in groupby(sorted(arr)) if len(list(p)) == 1]

print(find_uniq( [1,2,3,4,2,3,4,5,6,4,5,6,7,8,9])) # [1, 7, 8, 9]

groupby needs a sorted list. Sorting the list alone is O(n * log n), so taken together this is slower than the other approaches.
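
The reason for the sort is that groupby only merges consecutive equal elements. A small demonstration (the helper name run_lengths is made up for this example):

```python
from itertools import groupby


def run_lengths(seq):
    """Return (value, length) for each run of consecutive equal elements."""
    return [(key, len(list(group))) for key, group in groupby(seq)]


unsorted_input = [2, 1, 2]
# Without sorting, the two 2s are not adjacent, so each looks unique.
print(run_lengths(unsorted_input))          # [(2, 1), (1, 1), (2, 1)]
# After sorting, equal values sit next to each other and are counted together.
print(run_lengths(sorted(unsorted_input)))  # [(1, 1), (2, 2)]
```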

【Discussion】:

Thank you very much for the super detailed reply! It was a big help!