Here is a fun way to solve the problem: a robust function that returns a generator:
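def combine_item_pairs(l1, l2):
    # Start with every key from l1; the second slot is False until l2 fills it.
    D = {k: [v, False] for k, v in l1}
    for key, value in l2:
        if key in D:
            D[key][1] = value
        else:
            # Key appears only in l2, so the first slot stays False.
            D[key] = [False, value]
    # Python 2; on Python 3 use D.items() instead of D.iteritems().
    return (tuple([key] + value) for key, value in D.iteritems())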
Using it:
>>> list(combine_item_pairs(list_a, list_b))
[('item_2', 'attribute_y', False), ('item_3', 'attribute_z', 'attribute_p'), ('item_1', 'attribute_x', 'attribute_n')]
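The function above relies on `dict.iteritems()`, which exists only on Python 2. As a minimal sketch of a Python 3 port (the name `combine_item_pairs_py3` is hypothetical, and the sample lists are assumed to match the data used in the timing section further down), something like this should behave the same way:

```python
def combine_item_pairs_py3(l1, l2):
    # Hypothetical Python 3 port: same logic, but dict.items() replaces iteritems().
    merged = {k: [v, False] for k, v in l1}
    for key, value in l2:
        if key in merged:
            merged[key][1] = value
        else:
            merged[key] = [False, value]
    return (tuple([key] + value) for key, value in merged.items())


# Sample data, assumed to have the same shape as list_a and list_b.
list_a = [('item_1', 'attribute_x'), ('item_2', 'attribute_y'), ('item_3', 'attribute_z')]
list_b = [('item_1', 'attribute_n'), ('item_3', 'attribute_p')]

result = list(combine_item_pairs_py3(list_a, list_b))
print(result)
```

Since Python 3.7 dicts keep insertion order, so the tuples should come out in the order the keys were first seen rather than in the arbitrary order shown above.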
Here is a bonus solution (same interface, but more efficient):
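from itertools import groupby
from operator import itemgetter as I

def combine_item_pairs(l1, l2):
    # Sort the merged pairs so groupby sees equal keys side by side, then pad
    # each group with False and trim it to (key, value, value_or_False).
    return (tuple(([k] + [I(1)(i) for i in g] + [False])[:3])
            for k, g in groupby(sorted(l1 + l2), key=I(0)))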
Result:
>>> list(combine_item_pairs(list_a, list_b))
[('item_1', 'attribute_n', 'attribute_x'), ('item_2', 'attribute_y', False), ('item_3', 'attribute_p', 'attribute_z')]
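Note: this solution becomes less efficient if the lists need heavy sorting or have many missing values. Also, every absence currently shows up only as a False in the last slot of the tuple, so there is no way to tell which list was missing the item. That is the price of efficiency; use this version on large data sets where it does not matter which list lacks the item.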
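To make that limitation concrete, here is a small sketch. `combine_sorted` is just the groupby one-liner unrolled under a hypothetical name, and `combine_tagged` is one possible variant (not part of the original answer) that tags each pair with its source list before sorting so the slot position is preserved. It assumes each key appears at most once per list.

```python
from itertools import groupby
from operator import itemgetter


def combine_sorted(l1, l2):
    # The groupby approach, unrolled: pad with False and trim to three items.
    for key, group in groupby(sorted(l1 + l2), key=itemgetter(0)):
        values = [value for _, value in group]
        yield tuple(([key] + values + [False])[:3])


def combine_tagged(l1, l2):
    # Hypothetical variant: tag each pair with 0 (from l1) or 1 (from l2),
    # so a missing value leaves False in the slot of the list that lacked it.
    tagged = [(k, 0, v) for k, v in l1] + [(k, 1, v) for k, v in l2]
    for key, group in groupby(sorted(tagged), key=itemgetter(0)):
        slots = [False, False]
        for _, source, value in group:
            slots[source] = value
        yield (key, slots[0], slots[1])


pair = [('item_2', 'attribute_y')]
only_left = list(combine_sorted(pair, []))
only_right = list(combine_sorted([], pair))
tagged_right = list(combine_tagged([], pair))
print(only_left, only_right, tagged_right)
```

With `combine_sorted`, both calls give the same tuple, so the source list cannot be recovered. The tagged variant keeps that information at the cost of building an extra list of triples.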
Edit: timings:
a = [('item_1', 'attribute_x'), ('item_2', 'attribute_y'), ('item_3', 'attribute_z')]
b = [('item_1', 'attribute_n'), ('item_3', 'attribute_p')]
def inbar(l1, l2):
    D = {k: [v, False] for k, v in l1}
    for key, value in l2:
        if key in D:
            D[key][1] = value
        else:
            D[key] = [False, value]
    return (tuple([key] + value) for key, value in D.iteritems())

def solus(l1, l2):
    dict_a, dict_b = dict(l1), dict(l2)
    items = sorted({i for i, _ in l1 + l2})
    return [(i, dict_a.get(i, False), dict_b.get(i, False)) for i in items]
import timeit # running each timer 3 times just to be sure.
print timeit.Timer('inbar(a, b)', 'from __main__ import a, b, inbar').repeat()
# [2.2363221572247483, 2.1427426716407836, 2.1545361420851963]
# [2.2058199808040575, 2.137495707329387, 2.178640404817184]
# [2.4588094406466743, 2.4221991975274215, 2.3586636366037856]
print timeit.Timer('solus(a, b)', 'from __main__ import a, b, solus').repeat()
# [5.841498824468664, 5.951693880486182, 5.866254325691159]
# [5.843569212526087, 5.919173415087307, 6.027018876010061]
# [6.41402184345621, 6.229860036924308, 6.562849100520403]
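One thing worth keeping in mind when reading these numbers: `inbar` returns a generator expression, so the call `inbar(a, b)` builds the dict but never builds the result tuples, while `solus` returns a fully built list. A rough sketch of a like-for-like comparison on Python 3 could wrap the generator in `list()`. The reduced `number` is an assumption chosen to keep the run short, and no timings are claimed here.

```python
import timeit

a = [('item_1', 'attribute_x'), ('item_2', 'attribute_y'), ('item_3', 'attribute_z')]
b = [('item_1', 'attribute_n'), ('item_3', 'attribute_p')]


def inbar(l1, l2):
    D = {k: [v, False] for k, v in l1}
    for key, value in l2:
        if key in D:
            D[key][1] = value
        else:
            D[key] = [False, value]
    # Python 3: items() instead of iteritems().
    return (tuple([key] + value) for key, value in D.items())


def solus(l1, l2):
    dict_a, dict_b = dict(l1), dict(l2)
    items = sorted({i for i, _ in l1 + l2})
    return [(i, dict_a.get(i, False), dict_b.get(i, False)) for i in items]


# list(...) forces the generator so both functions do the full amount of work.
# number is lowered from timeit's default of 1,000,000 to keep this quick.
inbar_times = timeit.repeat(lambda: list(inbar(a, b)), repeat=3, number=10000)
solus_times = timeit.repeat(lambda: solus(a, b), repeat=3, number=10000)
print(inbar_times)
print(solus_times)
```

Passing a callable to `timeit.repeat` avoids the `from __main__ import ...` setup string, which only works when the functions live in the main module.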