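No, because the in operator first computes the hash of the string (linear complexity) and then, if the set contains an element with the same hash, performs an additional equality comparison (again linear). So a == b is your best option, since it skips the hash computation and avoids constructing the set.
The constant time complexity of the in operator refers to the size of the set. With respect to the length of the string, it is still linear.
However, the results below seem inconsistent with that: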
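The hash-then-compare order described above can be observed with a small sketch. Tracked is a hypothetical wrapper class, not something from the question, and the behaviour shown is that of CPython's set implementation:

```python
calls = []  # shared log of which special methods the set lookup invokes


class Tracked:
    """Hypothetical wrapper around a string that logs __hash__ and __eq__."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        calls.append("hash")
        return hash(self.value)

    def __eq__(self, other):
        calls.append("eq")
        return isinstance(other, Tracked) and self.value == other.value


container = {Tracked("aaa")}  # building the set hashes its element once

calls.clear()
hit = Tracked("aaa") in container   # equal hash found, so an equality check follows
hit_calls = list(calls)

calls.clear()
miss = Tracked("bbb") in container  # no equal hash expected, so no equality check
miss_calls = list(calls)

print(hit, hit_calls)
print(miss, miss_calls)
```

Barring a hash collision between the two strings, the first lookup should log a hash call followed by an equality call, and the second only a hash call. A plain a == b performs just the comparison.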
$ python3 -m timeit "'aaa' == 'aaa'"
20000000 loops, best of 5: 18.9 nsec per loop
$ python3 -m timeit "'aaa' == 'bbb'"
10000000 loops, best of 5: 22.1 nsec per loop
$ python3 -m timeit "'aaa' in {'aaa'}"
20000000 loops, best of 5: 17 nsec per loop
$ python3 -m timeit "'aaa' in {'bbb'}"
20000000 loops, best of 5: 16 nsec per loop
A slightly modified example seems to make more sense:
$ python3 -m timeit "a = 'aaa'; b = 'bbb'; a in {b}"
5000000 loops, best of 5: 63 nsec per loop
$ python3 -m timeit "a = 'aaa'; b = 'bbb'; a in {a}"
5000000 loops, best of 5: 55.7 nsec per loop
$ python3 -m timeit "a = 'aaa'; b = 'bbb'; a == b"
10000000 loops, best of 5: 30.1 nsec per loop
$ python3 -m timeit "a = 'aaa'; b = 'bbb'; a == a"
10000000 loops, best of 5: 27 nsec per loop
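Note that the four commands above also time the two assignments on every loop. As a rough sketch, the same comparison can be run through the timeit module with the assignments moved into setup, so that only the expression itself is measured. The loop count and the results dictionary are choices made for this illustration, and the numbers will differ between machines and Python versions:

```python
import timeit

setup = "a = 'aaa'; b = 'bbb'"  # runs once per repetition, outside the timed loop
statements = ["a in {b}", "a in {a}", "a == b", "a == a"]
loops = 200_000

results = {}
for stmt in statements:
    # repeat() returns the total seconds of each repetition; keep the fastest one
    best = min(timeit.repeat(stmt, setup=setup, number=loops, repeat=5))
    results[stmt] = best / loops * 1e9  # nanoseconds per loop
    print(f"{stmt:10s} {results[stmt]:7.1f} nsec per loop")
```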
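The reason for the difference between the two sets of timings can be found in the disassembled code:
from dis import dis

def a():
    'aaa' in {'aaa'}

def b():
    x = 'aaa'
    'aaa' in {x}

# dis() prints the listing itself and returns None, hence the "None" lines in the output
print(dis(a))
print(dis(b))
Output: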
  5           0 LOAD_CONST               1 ('aaa')
              2 LOAD_CONST               2 (frozenset({'aaa'}))
              4 COMPARE_OP               6 (in)
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
None
 12           0 LOAD_CONST               1 ('aaa')
              2 STORE_FAST               0 (x)

 13           4 LOAD_CONST               1 ('aaa')
              6 LOAD_FAST                0 (x)
              8 BUILD_SET                1
             10 COMPARE_OP               6 (in)
             12 POP_TOP
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE
None
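Because the first function uses only literals, Python can apply one of its (extremely rare) optimizations: it stores a prebuilt frozenset as a constant and loads it with LOAD_CONST instead of constructing the set at runtime.
For more realistic code, however, this is no longer possible, and the expensive BUILD_SET has to be executed.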
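The same check can be done programmatically rather than by reading the listing. The sketch below assumes a CPython version that performs this constant folding. The function names literal_only and with_variable and the two helpers are hypothetical, introduced only for this illustration:

```python
import dis


def literal_only():
    return 'aaa' in {'aaa'}  # set literal made only of constants


def with_variable():
    x = 'aaa'
    return 'aaa' in {x}      # set literal that depends on a local variable


def opnames(func):
    """Return the list of bytecode instruction names for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


def has_frozenset_const(func):
    """True if the compiler stored a prebuilt frozenset among func's constants."""
    return any(isinstance(c, frozenset) for c in func.__code__.co_consts)


for func in (literal_only, with_variable):
    print(func.__name__,
          "BUILD_SET" in opnames(func),
          has_frozenset_const(func))
```

If the optimization applies, literal_only carries a frozenset constant and contains no BUILD_SET instruction, while with_variable has to build the set on every call.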