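- startswith and in both return a Boolean.
- The in operator is a membership test; for strings, it checks whether the substring occurs anywhere in the string, while startswith only checks the beginning.
- Either test can be applied with a list-comprehension or filter.
- A list-comprehension with in was the fastest implementation tested.
- If case is not an issue, consider mapping all the words to lowercase.
  - l = list(map(str.lower, l))
- Tested with Python 3.10.0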
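To make the difference between the two tests concrete, here is a minimal sketch. The sample list and the names words_list, prefix_matches, substring_matches and lowered are made up for illustration and are not part of the original answer.

```python
# Hypothetical sample data, chosen to show where the two tests differ.
words_list = ['Threes', 'threes', 'one-three', 'twos']
wanted = 'three'

# startswith only matches at the beginning of each string.
prefix_matches = [w for w in words_list if w.startswith(wanted)]

# in matches the substring anywhere in the string.
substring_matches = [w for w in words_list if wanted in w]

# Lowercasing first makes both tests case-insensitive.
lowered = list(map(str.lower, words_list))
lowered_prefix_matches = [w for w in lowered if w.startswith(wanted)]

print(prefix_matches)          # ['threes']
print(substring_matches)       # ['threes', 'one-three']
print(lowered_prefix_matches)  # ['threes', 'threes']
```

Note that lowercasing changes the returned values as well as the test, so keep the original list around if the original casing matters.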
filter:
- filter returns a filter object, so list() is used to show all the matching values in a list.
l = ['ones', 'twos', 'threes']
wanted = 'three'
# using startswith
result = list(filter(lambda x: x.startswith(wanted), l))
# using in
result = list(filter(lambda x: wanted in x, l))
print(result)
[out]:
['threes']
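Because filter returns a lazy iterator rather than a list, it can only be consumed once. The following sketch reuses the same sample list to show this; the names matches, first_pass and second_pass are illustrative only.

```python
l = ['ones', 'twos', 'threes']
wanted = 'three'

# filter() returns a lazy filter object; nothing is evaluated yet.
matches = filter(lambda x: wanted in x, l)

first_pass = list(matches)   # consumes the iterator
second_pass = list(matches)  # the iterator is now exhausted

print(first_pass)   # ['threes']
print(second_pass)  # []
```

If the matches are needed more than once, store the list() result rather than the filter object.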
list-comprehension:
l = ['ones', 'twos', 'threes']
wanted = 'three'
# using startswith
result = [v for v in l if v.startswith(wanted)]
# using in
result = [v for v in l if wanted in v]
print(result)
[out]:
['threes']
Which implementation is faster?
- Tested in JupyterLab using the words corpus from nltk v3.6.5, which has 236736 words
- Words with 'three':
['three', 'threefold', 'threefolded', 'threefoldedness', 'threefoldly', 'threefoldness', 'threeling', 'threeness', 'threepence', 'threepenny', 'threepennyworth', 'threescore', 'threesome']
from nltk.corpus import words

wanted = 'three'  # the search term used in the timings
%timeit list(filter(lambda x: x.startswith(wanted), words.words()))
[out]:
64.8 ms ± 856 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit list(filter(lambda x: wanted in x, words.words()))
[out]:
54.8 ms ± 528 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [v for v in words.words() if v.startswith(wanted)]
[out]:
57.5 ms ± 634 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit [v for v in words.words() if wanted in v]
[out]:
50.2 ms ± 791 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
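Outside Jupyter, a similar comparison can be sketched with the standard-library timeit module. This is an illustrative setup, not the benchmark reported here: the word list is synthetic (so nltk is not required), and the names sample_words and candidates are assumptions. In the timings above, words.words() is evaluated inside each timed statement, so its cost is part of every measurement; in this sketch the list is built once, outside the timed calls. Timings vary by machine, so none are shown.

```python
import timeit

# Synthetic stand-in for the corpus (assumption: not the nltk data).
sample_words = ['ones', 'twos', 'threes', 'threefold'] * 10_000
wanted = 'three'

# The four variants, wrapped as zero-argument callables.
candidates = {
    'filter + startswith': lambda: list(filter(lambda x: x.startswith(wanted), sample_words)),
    'filter + in': lambda: list(filter(lambda x: wanted in x, sample_words)),
    'comprehension + startswith': lambda: [v for v in sample_words if v.startswith(wanted)],
    'comprehension + in': lambda: [v for v in sample_words if wanted in v],
}

for name, func in candidates.items():
    # Best of 3 repeats, 10 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    print(f'{name}: {best * 1e3:.2f} ms per call')
```

Taking the minimum of several repeats reduces noise from other processes; it is one common convention, not the only reasonable one.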