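The list comprehension is clearly the most immediate and the easiest to remember, besides being quite pythonic!
Still, among the proposed solutions it is not the fastest (I ran my tests on Windows using Python 3.8.3):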
import timeit
from itertools import compress
import random
from operator import itemgetter
import numpy as np
import pandas as pd

__N_TESTS__ = 10_000

vector = [str(x) for x in range(100)]
filter_indeces = sorted(random.sample(range(100), 10))
filter_boolean = random.choices([True, False], k=100)

# Different ways of selecting elements given indices

# list comprehension
def f1(v, f):
    return [v[i] for i in f]

# itemgetter (returns a tuple, not a list)
def f2(v, f):
    return itemgetter(*f)(v)

# using pandas.Series
# this is immensely slow
def f3(v, f):
    return list(pd.Series(v)[f])

# using map and __getitem__
def f4(v, f):
    return list(map(v.__getitem__, f))

# using enumerate!
def f5(v, f):
    return [x for i, x in enumerate(v) if i in f]

# using numpy array
def f6(v, f):
    return list(np.array(v)[f])

print("{:30s}:{:f} secs".format("List comprehension", timeit.timeit(lambda: f1(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Operator.itemgetter", timeit.timeit(lambda: f2(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using Pandas series", timeit.timeit(lambda: f3(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using map and __getitem__", timeit.timeit(lambda: f4(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Enumeration (Why anyway?)", timeit.timeit(lambda: f5(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Numpy", timeit.timeit(lambda: f6(vector, filter_indeces), number=__N_TESTS__)))
My results were:

List comprehension            :0.007113 secs
Operator.itemgetter           :0.003247 secs
Using Pandas series           :2.977286 secs
Using map and __getitem__     :0.005029 secs
Enumeration (Why anyway?)     :0.135156 secs
Numpy                         :0.157018 secs
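
One caveat about itemgetter, the fastest option in the timings above: it returns a tuple rather than a list, it returns the bare element when given a single index, and it raises TypeError when given no index at all. If you need a list in every case, a small wrapper along these lines might help. This is only a sketch: select_by_index is a hypothetical helper name of my own, it is not part of the benchmark, and I have not timed it.

```python
from operator import itemgetter


def select_by_index(values, indices):
    """Return a list holding values[i] for each i in indices (hypothetical helper)."""
    indices = list(indices)
    if not indices:
        # itemgetter() called with no arguments raises TypeError
        return []
    if len(indices) == 1:
        # with a single key, itemgetter returns the bare item, not a 1-tuple
        return [values[indices[0]]]
    # with two or more keys, itemgetter returns a tuple of the looked-up items
    return list(itemgetter(*indices)(values))


vector = [str(x) for x in range(100)]
print(select_by_index(vector, [3, 17, 42]))  # ['3', '17', '42']
print(select_by_index(vector, [5]))          # ['5']
print(select_by_index(vector, []))           # []
```

The extra branches and the list() conversion add some overhead, so the gap to the plain list comprehension is probably smaller than the raw itemgetter number suggests.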