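【Posted】: 2013-08-03 13:53:57
【Question】:
I have these two implementations for computing the length of a finite generator while keeping the data for further processing: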
import itertools


def count_generator1(generator):
    """- build a list with the generator data
    - get the length of the data
    - return both the length and the original data (in a list)
    WARNING: the memory use is unbounded, and infinite generators will block this"""
    items = list(generator)
    return len(items), items


def count_generator2(generator):
    """- get two generators from the original generator
    - get the length of the data from one of them
    - return both the length and the original data, as returned by tee
    WARNING: tee can use up an unbounded amount of memory, and infinite generators will block this"""
    for_length, saved = itertools.tee(generator, 2)
    return sum(1 for _ in for_length), saved
Both have drawbacks, and both do the job. Could someone comment on them, or even suggest a better alternative?
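To make the comparison concrete, here is a minimal usage sketch. The two functions are restated in compact form only so that the snippet runs on its own, and `squares` is a hypothetical finite generator invented for the demonstration.

```python
import itertools


def count_generator1(generator):
    # Materialize everything in a list, then measure it.
    items = list(generator)
    return len(items), items


def count_generator2(generator):
    # Split the stream in two; exhaust one copy to count it.
    for_length, saved = itertools.tee(generator, 2)
    return sum(1 for _ in for_length), saved


def squares(n):
    # Hypothetical finite generator, used only for this demonstration.
    for i in range(n):
        yield i * i


n1, data1 = count_generator1(squares(5))
n2, data2 = count_generator2(squares(5))

# data1 is a list and can be traversed repeatedly;
# data2 is a tee iterator and can be traversed only once.
data2_items = list(data2)
```

Because `for_length` is fully exhausted before `saved` is read, `tee` has to buffer every item internally, so the peak memory use of the second version is comparable to that of the list. The `itertools` documentation notes that in this consumption pattern `list()` is generally faster than `tee()`.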
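【Comments】:
-
There is no way to know the length of a generator without consuming the whole thing.
-
I know. That is not the question.
-
Note: if you don't need the exact length, you can use operator.length_hint() (Python 3.4+) to get an estimated length without consuming the iterator. See PEP 424 - A method for exposing a length hint
-
@J.F.Sebastian That is a nice addition in 3.4.
-
@gonvaled: length_hint will call __length_hint__(), which is tricky to implement on a generator.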
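The behaviour discussed in the last few comments can be sketched as follows. `numbers` and `SizedIter` are hypothetical names introduced for illustration only. The wrapper assumes the caller already knows how many items the generator will produce, which is exactly the part that is hard to get in general.

```python
import operator


def numbers(n):
    # Hypothetical finite generator, used only for this demonstration.
    for i in range(n):
        yield i


class SizedIter:
    """Hypothetical wrapper: an iterator that carries a known remaining count."""

    def __init__(self, iterable, length):
        self._it = iter(iterable)
        self._remaining = length  # assumption: supplied correctly by the caller

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates before the decrement
        self._remaining -= 1
        return item

    def __length_hint__(self):
        return self._remaining


# List iterators implement __length_hint__, so an estimate is available.
list_hint = operator.length_hint(iter([1, 2, 3]))

# Plain generators do not, so length_hint falls back to its default (0).
gen_hint = operator.length_hint(numbers(3))
gen_hint_default = operator.length_hint(numbers(3), 10)

# The wrapper reports what is left after one item has been consumed.
wrapped = SizedIter(numbers(3), 3)
next(wrapped)
wrapped_hint = operator.length_hint(wrapped)
```

The hint is only an estimate: nothing checks that the count passed to `SizedIter` matches what the generator actually yields.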