【Question title】: Joining Two List Of Dictionaries With Same Values And Different Keys
【Posted】: 2019-12-26 01:15:38
【Question description】:

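I need help with a problem that has to be solved without pandas or numpy. I have two lists of dictionaries, list1 and list2. I need to sort list2 by 'post_code' and then group it by 'code' before joining list1 and list2 on two different keys that hold the same values. In list1, the key 'practice' is equivalent to the key 'code' in the sorted list2. I need to join list1 and list2 on the equivalent keys 'practice' and 'code'.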

list1=
[{'bnf_code': '0101010G0AAABAB',
  'items': 2,
  'practice': 'N81013',
  'bnf_name': 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F',
  'nic': 5.98,
  'act_cost': 5.56,
  'quantity': 1000},
 {'bnf_code': '0101021B0AAAHAH',
  'items': 1,
  'practice': 'A81001',
  'bnf_name': 'Alginate_Raft-Forming Oral Susp S/F',
  'nic': 1.95,
  'act_cost': 1.82,
  'quantity': 500},
 {'bnf_code': '0101021B0AAALAL',
  'items': 12,
  'practice': 'A81002',
  'bnf_name': 'Sod Algin/Pot Bicarb_Susp S/F',
  'nic': 64.51,
  'act_cost': 59.95,
  'quantity': 6300},
 {'bnf_code': '0101021B0AAAPAP',
  'items': 3,
  'practice': 'A81004',
  'bnf_name': 'Sod Alginate/Pot Bicarb_Tab Chble 500mg',
  'nic': 9.21,
  'act_cost': 8.55,
  'quantity': 180},
 {'bnf_code': '0101021B0BEADAJ',
  'items': 6,
  'practice': 'A81003',
  'bnf_name': 'Gaviscon Infant_Sach 2g (Dual Pack) S/F',
  'nic': 28.92,
  'act_cost': 26.84,
  'quantity': 90}]

list2=
[{'code': 'A81001',
  'name': 'THE DENSHAM SURGERY',
  'addr_1': 'THE HEALTH CENTRE',
  'addr_2': 'LAWSON STREET',
  'borough': 'STOCKTON ON TEES',
  'village': 'CLEVELAND',
  'post_code': 'TS18 1HU'},
 {'code': 'A81002',
  'name': 'QUEENS PARK MEDICAL CENTRE',
  'addr_1': 'QUEENS PARK MEDICAL CTR',
  'addr_2': 'FARRER STREET',
  'borough': 'STOCKTON ON TEES',
  'village': 'CLEVELAND',
  'post_code': 'TS18 2AW'},
 {'code': 'A81003',
  'name': 'VICTORIA MEDICAL PRACTICE',
  'addr_1': 'THE HEALTH CENTRE',
  'addr_2': 'VICTORIA ROAD',
  'borough': 'HARTLEPOOL',
  'village': 'CLEVELAND',
  'post_code': 'TS26 8DB'},
 {'code': 'A81004',
  'name': 'WOODLANDS ROAD SURGERY',
  'addr_1': '6 WOODLANDS ROAD',
  'addr_2': None,
  'borough': 'MIDDLESBROUGH',
  'village': 'CLEVELAND',
  'post_code': 'TS1 3BE'},
 {'code': 'N81013',
  'name': 'SPRINGWOOD SURGERY',
  'addr_1': 'SPRINGWOOD SURGERY',
  'addr_2': 'RECTORY LANE',
  'borough': 'GUISBOROUGH',
  'village': None,
  'post_code': 'TS14 7DJ'}]

I have been able to sort list2 by post_code and group it by code, but I don't know how to join list1 and list2. Here is the code I currently use for sorting and grouping.

import itertools
from operator import itemgetter
sorted_post_code = sorted(list2, key=itemgetter('post_code'))
for key, group in itertools.groupby(sorted_post_code, key=lambda x:x['code']):
    #print (key),
    print (list(group))

The expected output is

joined_list=
list1=
[{'bnf_code': '0101010G0AAABAB',
  'items': 2,
  'practice': 'N81013',
  'bnf_name': 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F',
  'nic': 5.98,
  'act_cost': 5.56,
  'quantity': 1000,
  'code': 'N81013',
  'name': 'SPRINGWOOD SURGERY',
  'addr_1': 'SPRINGWOOD SURGERY',
  'addr_2': 'RECTORY LANE',
  'borough': 'GUISBOROUGH',
  'village': None,
  'post_code': 'TS14 7DJ'},
 {'bnf_code': '0101021B0AAAHAH',
  'items': 1,
  'practice': 'A81001',
  'bnf_name': 'Alginate_Raft-Forming Oral Susp S/F',
  'nic': 1.95,
  'act_cost': 1.82,
  'quantity': 500,
  'code': 'A81001',
  'name': 'THE DENSHAM SURGERY',
  'addr_1': 'THE HEALTH CENTRE',
  'addr_2': 'LAWSON STREET',
  'borough': 'STOCKTON ON TEES',
  'village': 'CLEVELAND',
  'post_code': 'TS18 1HU'},
 {'bnf_code': '0101021B0AAALAL',
  'items': 12,
  'practice': 'A81002',
  'bnf_name': 'Sod Algin/Pot Bicarb_Susp S/F',
  'nic': 64.51,
  'act_cost': 59.95,
  'quantity': 6300,
  'code': 'A81002',
  'name': 'QUEENS PARK MEDICAL CENTRE',
  'addr_1': 'QUEENS PARK MEDICAL CTR',
  'addr_2': 'FARRER STREET',
  'borough': 'STOCKTON ON TEES',
  'village': 'CLEVELAND',
  'post_code': 'TS18 2AW'},
 {'bnf_code': '0101021B0AAAPAP',
  'items': 3,
  'practice': 'A81004',
  'bnf_name': 'Sod Alginate/Pot Bicarb_Tab Chble 500mg',
  'nic': 9.21,
  'act_cost': 8.55,
  'quantity': 180,
  'code': 'A81004',
  'name': 'WOODLANDS ROAD SURGERY',
  'addr_1': '6 WOODLANDS ROAD',
  'addr_2': None,
  'borough': 'MIDDLESBROUGH',
  'village': 'CLEVELAND',
  'post_code': 'TS1 3BE'},
 {'bnf_code': '0101021B0BEADAJ',
  'items': 6,
  'practice': 'A81003',
  'bnf_name': 'Gaviscon Infant_Sach 2g (Dual Pack) S/F',
  'nic': 28.92,
  'act_cost': 26.84,
  'quantity': 90,
  'code': 'A81003',
  'name': 'VICTORIA MEDICAL PRACTICE',
  'addr_1': 'THE HEALTH CENTRE',
  'addr_2': 'VICTORIA ROAD',
  'borough': 'HARTLEPOOL',
  'village': 'CLEVELAND',
  'post_code': 'TS26 8DB'}]

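【Comments】:

  • What do you mean by "joining" list1 and list2? Please show the desired output. Your code does not do any joining at all.
  • What is practices?
  • If joining means grouping, then you may not need sorting at all. You could use something like a dict to group by the desired key.
  • I have corrected it. Sorry about that. It should be list2, not practices.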

Tags: python list dictionary join merge


【Solution 1】:

A defaultdict can handle grouping operations quite well. You can use a dict to update your grouped elements:

from collections import defaultdict

groups = defaultdict(dict)

# to show this explicitly you can start with two loops
# not the most efficient, but it shows the process
for item in list1:
    k = item['practice']
    groups[k].update(item)

for item in list2:
    k = item['code']
    groups[k].update(item)

# where groups.values() will have your "joined" 
# dictionaries
groups
{
  "N81013": {
    "bnf_code": "0101010G0AAABAB",
    "items": 2,
    "practice": "N81013",
    "bnf_name": "Co-Magaldrox_Susp 195mg/220mg/5ml S/F",
    "nic": 5.98,
    "act_cost": 5.56,
    "quantity": 1000,
    "code": "N81013",
    "name": "SPRINGWOOD SURGERY",
    "addr_1": "SPRINGWOOD SURGERY",
    "addr_2": "RECTORY LANE",
    "borough": "GUISBOROUGH",
    "village": null,
    "post_code": "TS14 7DJ"
  },
  "A81001": {
    "bnf_code": "0101021B0AAAHAH",
    "items": 1,
    "practice": "A81001",
    "bnf_name": "Alginate_Raft-Forming Oral Susp S/F",
    "nic": 1.95,
    "act_cost": 1.82,
    "quantity": 500,
    "code": "A81001",
    "name": "THE DENSHAM SURGERY",
    "addr_1": "THE HEALTH CENTRE",
    "addr_2": "LAWSON STREET",
    "borough": "STOCKTON ON TEES",
    "village": "CLEVELAND",
    "post_code": "TS18 1HU"
  },
  "A81002": {
    "bnf_code": "0101021B0AAALAL",
    "items": 12,
    "practice": "A81002",
    "bnf_name": "Sod Algin/Pot Bicarb_Susp S/F",
    "nic": 64.51,
    "act_cost": 59.95,
    "quantity": 6300,
    "code": "A81002",
    "name": "QUEENS PARK MEDICAL CENTRE",
    "addr_1": "QUEENS PARK MEDICAL CTR",
    "addr_2": "FARRER STREET",
    "borough": "STOCKTON ON TEES",
    "village": "CLEVELAND",
    "post_code": "TS18 2AW"
  },
  "A81004": {
    "bnf_code": "0101021B0AAAPAP",
    "items": 3,
    "practice": "A81004",
    "bnf_name": "Sod Alginate/Pot Bicarb_Tab Chble 500mg",
    "nic": 9.21,
    "act_cost": 8.55,
    "quantity": 180,
    "code": "A81004",
    "name": "WOODLANDS ROAD SURGERY",
    "addr_1": "6 WOODLANDS ROAD",
    "addr_2": null,
    "borough": "MIDDLESBROUGH",
    "village": "CLEVELAND",
    "post_code": "TS1 3BE"
  },
  "A81003": {
    "bnf_code": "0101021B0BEADAJ",
    "items": 6,
    "practice": "A81003",
    "bnf_name": "Gaviscon Infant_Sach 2g (Dual Pack) S/F",
    "nic": 28.92,
    "act_cost": 26.84,
    "quantity": 90,
    "code": "A81003",
    "name": "VICTORIA MEDICAL PRACTICE",
    "addr_1": "THE HEALTH CENTRE",
    "addr_2": "VICTORIA ROAD",
    "borough": "HARTLEPOOL",
    "village": "CLEVELAND",
    "post_code": "TS26 8DB"
  }
}
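
Note that groups.values() returns a view object rather than a list, so it can be iterated but not indexed. A minimal sketch of converting the grouped result into a plain list, using shortened stand-in records (the names sample1 and sample2 and their trimmed fields are assumptions for illustration, not the real data):

```python
from collections import defaultdict

# Shortened stand-ins for list1 and list2 (illustrative only).
sample1 = [{'practice': 'N81013', 'items': 2},
           {'practice': 'A81001', 'items': 1}]
sample2 = [{'code': 'A81001', 'post_code': 'TS18 1HU'},
           {'code': 'N81013', 'post_code': 'TS14 7DJ'}]

groups = defaultdict(dict)
for item in sample1:
    groups[item['practice']].update(item)
for item in sample2:
    groups[item['code']].update(item)

# dict.values() is a view: wrap it in list() before indexing or slicing.
joined = list(groups.values())
print(joined[0])
```

Because dictionaries keep insertion order, the rows come out in the order their keys were first seen, which here is the order of sample1.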

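In general, dictionaries are well suited to grouping operations because their keys are unique. A more optimized approach might be to zip the two lists together, since you will be doing updates either way: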

from itertools import zip_longest
from collections import defaultdict

groups = defaultdict(dict)


def group_item(a, b):
    a_key, b_key = a['practice'] if a else None, b['code'] if b else None
    return a_key, b_key

for a, b in zip_longest(list1, list2):
    ak, bk = group_item(a, b)
    if ak:
        groups[ak].update(a)
    if bk:
        groups[bk].update(b)

# sort list of groups.values() now
list(groups.values())
[{'bnf_code': '0101010G0AAABAB', 'items': 2, 'practice': 'N81013', 'bnf_name': 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F', 'nic': 5.98, 'act_cost': 5.56, 'quantity': 1000, 'code': 'N81013', 'name': 'SPRINGWOOD SURGERY', 'addr_1': 'SPRINGWOOD SURGERY', 'addr_2': 'RECTORY LANE', 'borough': 'GUISBOROUGH', 'village': None, 'post_code': 'TS14 7DJ'}, {'code': 'A81001', 'name': 'THE DENSHAM SURGERY', 'addr_1': 'THE HEALTH CENTRE', 'addr_2': 'LAWSON STREET', 'borough': 'STOCKTON ON TEES', 'village': 'CLEVELAND', 'post_code': 'TS18 1HU', 'bnf_code': '0101021B0AAAHAH', 'items': 1, 'practice': 'A81001', 'bnf_name': 'Alginate_Raft-Forming Oral Susp S/F', 'nic': 1.95, 'act_cost': 1.82, 'quantity': 500}, {'code': 'A81002', 'name': 'QUEENS PARK MEDICAL CENTRE', 'addr_1': 'QUEENS PARK MEDICAL CTR', 'addr_2': 'FARRER STREET', 'borough': 'STOCKTON ON TEES', 'village': 'CLEVELAND', 'post_code': 'TS18 2AW', 'bnf_code': '0101021B0AAALAL', 'items': 12, 'practice': 'A81002', 'bnf_name': 'Sod Algin/Pot Bicarb_Susp S/F', 'nic': 64.51, 'act_cost': 59.95, 'quantity': 6300}, {'code': 'A81003', 'name': 'VICTORIA MEDICAL PRACTICE', 'addr_1': 'THE HEALTH CENTRE', 'addr_2': 'VICTORIA ROAD', 'borough': 'HARTLEPOOL', 'village': 'CLEVELAND', 'post_code': 'TS26 8DB', 'bnf_code': '0101021B0BEADAJ', 'items': 6, 'practice': 'A81003', 'bnf_name': 'Gaviscon Infant_Sach 2g (Dual Pack) S/F', 'nic': 28.92, 'act_cost': 26.84, 'quantity': 90}, {'bnf_code': '0101021B0AAAPAP', 'items': 3, 'practice': 'A81004', 'bnf_name': 'Sod Alginate/Pot Bicarb_Tab Chble 500mg', 'nic': 9.21, 'act_cost': 8.55, 'quantity': 180, 'code': 'A81004', 'name': 'WOODLANDS ROAD SURGERY', 'addr_1': '6 WOODLANDS ROAD', 'addr_2': None, 'borough': 'MIDDLESBROUGH', 'village': 'CLEVELAND', 'post_code': 'TS1 3BE'}]

I use zip_longest here so that, if list1 and list2 are not the same length, the loop is not cut short by the size difference. To sort by post_code, do the same as before:

import operator

x = sorted(groups.values(), key=operator.itemgetter('post_code'))

This assumes the key is present, though. For a more general approach, a lambda might be better, using get with a default return value:

x = sorted(groups.values(), key=lambda x: x.get('post_code', ' '))
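
To make the difference concrete, here is a small sketch with made-up records (the records list and the code 'X00000' are hypothetical) in which one entry has no post_code key. itemgetter raises KeyError on it, while get with a default lets the sort finish:

```python
from operator import itemgetter

# Hypothetical records; the second one has no 'post_code' key.
records = [{'code': 'A81002', 'post_code': 'TS18 2AW'},
           {'code': 'X00000'},
           {'code': 'A81004', 'post_code': 'TS1 3BE'}]

try:
    sorted(records, key=itemgetter('post_code'))
    strict_ok = True
except KeyError:
    # itemgetter fails as soon as it meets a record without the key
    strict_ok = False

# get() falls back to ' ', so records without a post code sort first
lenient = sorted(records, key=lambda r: r.get('post_code', ' '))
codes = [r['code'] for r in lenient]
print(strict_ok, codes)
```

A key that is present but set to None is a different case: get returns None for it, and comparing None with a string raises TypeError in Python 3, so such values would need their own handling.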


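【Comments】:

  • Thank you very much. How do I remove the "dict_values" in the output? I get an error when I try to manipulate the output.
  • TypeError: 'dict_values' object is not callable
  • Are you trying to call it anywhere? You should be able to use sorted(groups.values(), key=<lambda>). What are you doing on the line that raises that error?

【Solution 2】: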

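As I understand it, you want each dictionary in list1 to contain all the entries of the dictionary in list2 whose 'code' value matches its 'practice' value.

If so, you can simply update a dictionary with the entries of another dictionary. Missing key:value pairs are added, and the values of existing keys are updated.

So I ended up with a double for loop, which I run before any sorting. You may need to adapt it to your needs.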

for entry2 in list2:
    for entry1 in list1:
        if entry2['code'] == entry1['practice']:
            entry1.update(entry2)

A very long explanation of the different ways to merge dictionaries can be found here: https://stackoverflow.com/a/26853961/6218902
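
The nested loop compares every pair of entries and modifies list1 in place. As an alternative sketch (not part of the original answer), you could index list2 by 'code' once and build new dictionaries, which leaves both inputs unchanged and keeps one output row per list1 entry. The helper name join_on and the shortened sample1/sample2 data are assumptions for illustration:

```python
def join_on(left, right, left_key, right_key):
    """Left-join two lists of dicts where left[left_key] == right[right_key]."""
    # One pass over `right` builds a lookup table keyed by the join value.
    lookup = {row[right_key]: row for row in right}
    # Rows of `left` without a match are kept as they are (merged with {}).
    return [{**row, **lookup.get(row[left_key], {})} for row in left]

# Shortened stand-ins for list1 and list2 (illustrative only).
sample1 = [{'practice': 'N81013', 'items': 2},
           {'practice': 'A81001', 'items': 1},
           {'practice': 'A81001', 'items': 7}]
sample2 = [{'code': 'A81001', 'post_code': 'TS18 1HU'},
           {'code': 'N81013', 'post_code': 'TS14 7DJ'}]

joined = join_on(sample1, sample2, 'practice', 'code')
joined.sort(key=lambda r: r.get('post_code', ' '))
for row in joined:
    print(row)
```

With this shape, two list1 rows that share the same practice both survive the join. If list2 contained the same code twice, the later entry would win in the lookup table.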

【Comments】:

  • Wow! Thank you very much.