【Question title】: Using next() to search objects is extremely slow in Python
【Posted】: 2021-10-04 10:58:17
【Question】:

Let's assume this example with two classes:

from timeit import Timer
import numpy as np

# Python version 3.8.3

class Value:
    def __init__(self, val):
        self.val = val
class Object:
    def __init__(self):
        self.d = dict()
        self.counter = 0
    def add_obj(self, obj):
        self.d[self.counter] = obj
        self.counter += 1
    def return_val(self):
        return self.d

obj = Object()
for _ in range(100000):
    name = chr(np.random.randint(ord('a'), ord('z')))
    info = Value(name)
    obj.add_obj(info)

mydict = obj.return_val()

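Now I simply want to use next() to search the dict and retrieve the first matching element, or None if there is no match. However, I observed that both f1 and f3 are considerably slower than the conventional f2 approach. Any tips on performing this search more efficiently are appreciated. For f3, I used the suggestion given here: Is next() in python really that fast?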

def f1():
    matched_elem = next((elem for elem in mydict.values() if str(elem.val) == 'a'), None)
    return matched_elem

def f2():
    for elem in mydict.values():
        if 'a' == str(elem.val):
            return elem
    return None

def f3():
    matched_elem = (elem for elem in mydict.values() if str(elem.val) == 'a')

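    # Note: a generator object is never None, so this check never fires;
    # next() without a default raises StopIteration when nothing matches.
    if matched_elem is None:
        return None
    return next(matched_elem)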
print(Timer(f1).timeit())
print(Timer(f2).timeit())
print(Timer(f3).timeit())

Output:

0.6597451590000001
0.397709414
0.682832415
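As a side note on f3, here is a minimal sketch of how a generator expression and next() behave when nothing matches. The dict `data` is a hypothetical stand-in rather than the `mydict` from the question. It suggests that the `is None` check cannot detect "no match", and that a default argument to next() (as in f1) is what avoids the exception.

```python
# Hypothetical stand-in data: a small dict with no value equal to "a".
data = {0: "b", 1: "c"}

gen = (v for v in data.values() if v == "a")
gen_is_none = gen is None   # a generator object is never None, even with no matches

try:
    next(gen)               # no default: an exhausted generator raises StopIteration
    raised = False
except StopIteration:
    raised = True

# Passing a default, as f1 does, returns that default instead of raising.
fallback = next((v for v in data.values() if v == "a"), None)
```
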

【Comments】:

Tags: python search next


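【Solution 1】:

The original problem was that f2 always exited after the first comparison: the else: clause was in the wrong place. With the edit, what you are seeing is the extra overhead of creating a generator object, calling next, and returning its result. The difference is small, and it matters less the longer it takes to find a matching value. timeit calls the function one million times by default, so the totals you printed (in seconds) are numerically the average time per call in microseconds.

I ran the code 10 times in a loop and recorded the position of the first 'a'; the results are below. The first column is that position, t1 and t2 are the timings for f1 and f2, and pct is t2 as a percentage of t1. Note that as the number of iterations needed to find 'a' grows, the relative difference almost disappears, while the absolute difference stays roughly constant at about 0.2 to 0.4 μs: that is the generator overhead.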
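To make the unit conversion concrete, here is a small sketch of turning a timeit total into a per-call figure. The function `noop` and the reduced `number` are assumptions chosen only to keep the run short; they are not part of the question's code.

```python
from timeit import Timer

def noop():
    # Hypothetical placeholder for the function being measured.
    return None

number = 100_000  # fewer calls than timeit's default of 1_000_000, to keep this quick
total_seconds = Timer(noop).timeit(number=number)  # total time for all calls, in seconds

# Average time per call in microseconds.
per_call_us = total_seconds / number * 1e6
```

With the default of one million calls, dividing by `number` and multiplying by 1e6 cancel out, which is why the printed seconds can be read directly as microseconds per call.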

10  t1: 1.909 μs, t2: 1.599 μs, diff: 0.310 pct: 83
2   t1: 0.887 μs, t2: 0.574 μs, diff: 0.313 pct: 64
30  t1: 4.510 μs, t2: 4.241 μs, diff: 0.269 pct: 94
8   t1: 1.703 μs, t2: 1.389 μs, diff: 0.314 pct: 81
5   t1: 1.314 μs, t2: 0.992 μs, diff: 0.322 pct: 75
45  t1: 6.670 μs, t2: 6.450 μs, diff: 0.219 pct: 96
86  t1: 12.689 μs, t2: 12.517 μs, diff: 0.172 pct: 98
9   t1: 1.946 μs, t2: 1.592 μs, diff: 0.354 pct: 81
25  t1: 4.210 μs, t2: 3.873 μs, diff: 0.337 pct: 91
59  t1: 9.064 μs, t2: 8.627 μs, diff: 0.437 pct: 95

Code

from timeit import Timer
import numpy as np

class Value:
    def __init__(self, val):
        self.val = val
class Object:
    def __init__(self):
        self.d = dict()
        self.counter = 0
    def add_obj(self, obj):
        self.d[self.counter] = obj
        self.counter += 1
    def return_val(self):
        return self.d

def f1():
    matched_elem = next((elem for elem in mydict.values() if str(elem.val) == 'a'), None)
    return matched_elem

def f2():
    for elem in mydict.values():
        if 'a' == str(elem.val):
            return elem
    return None

def countit():
    for i, elem in enumerate(mydict.values()):
        if str(elem.val) == 'a':
            return i

for _ in range(10):

    obj = Object()
    for _ in range(99999):
        name = chr(np.random.randint(ord('a'), ord('z')))
        info = Value(name)
        obj.add_obj(info)
    mydict = obj.return_val()

    c = countit()
    t1 = Timer(f1).timeit()
    t2 = Timer(f2).timeit()
    print(f"{c:< 4} t1: {t1:2.3f} μs, t2: {t2:2.3f} μs, diff: {t1-t2:2.3f} pct: {int(t2/t1*100)}")
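The run above depends on where the random data happens to place the first 'a'. As a rough variant, the sketch below fixes that position explicitly so the two approaches can be compared at chosen distances. The helper names (`make_values`, `find_next`, `find_loop`, `compare`) and the reduced call count are assumptions for illustration, and actual timings will vary by machine.

```python
from timeit import Timer

class Value:
    def __init__(self, val):
        self.val = val

def make_values(first_match_index, size=1000):
    """Build a dict whose first 'a' sits at first_match_index (hypothetical helper)."""
    return {i: Value("a" if i >= first_match_index else "b") for i in range(size)}

def find_next(d):
    # Generator expression plus next() with a default, as in f1.
    return next((e for e in d.values() if str(e.val) == "a"), None)

def find_loop(d):
    # Plain for loop with an early return, as in f2.
    for e in d.values():
        if str(e.val) == "a":
            return e
    return None

def compare(first_match_index, number=20_000):
    """Return average microseconds per call for (find_next, find_loop)."""
    d = make_values(first_match_index)
    t_next = Timer(lambda: find_next(d)).timeit(number=number) / number * 1e6
    t_loop = Timer(lambda: find_loop(d)).timeit(number=number) / number * 1e6
    return t_next, t_loop

if __name__ == "__main__":
    for idx in (1, 10, 100):
        t_next, t_loop = compare(idx)
        print(f"{idx:<4} next: {t_next:.3f} us, loop: {t_loop:.3f} us, diff: {t_next - t_loop:.3f}")
```

If the explanation in this answer holds, the diff column should stay in the same small range while both timings grow with the index.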

【Discussion】:
