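【Question title】: Merge two dicts in separate list by id
【Posted】: 2019-10-24 08:31:42
【Question】:

I am trying to merge objects based on the specs key. Most of the key structure is consistent. The merge should happen only when company_name is the same (in this example I have only one company_name) and only if (name, {color, type, license, description}) are equal across the multiple lists.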

[
{
    "company_name": "GreekNLC",
    "metadata": [
        {
            "name": "Bob",
            "details": [
                {
                    "color": "black",
                    "type": "bmw",
                    "license": "4DFLK",
                    "specs": [
                        {
                            "properties": [
                                {
                                    "info": [
                                        "sedan",
                                        "germany"
                                    ]
                                },
                                {
                                    "info": [
                                        "drive",
                                        "expensive"
                                    ]
                                }
                            ]
                        }
                    ],
                    "description": "amazing car"
                }
            ]
        },
        {
            "name": "Bob",
            "car_details": [
                {
                    "color": "black",
                    "type": "bmw",
                    "license": "4DFLK",
                    "specs": [
                        {
                            "properties": [
                                {
                                    "info": [
                                        "powerful",
                                        "convertable"
                                    ]
                                },
                                {
                                    "info": [
                                        "drive",
                                        "expensive"
                                    ]
                                }
                            ]
                        }
                    ],
                    "description": "amazing car"
                }
            ]
        }
    ]
}
]

I expect the following output:

[
{
    "company_name": "GreekNLC",
    "metadata": [
        {
            "name": "Bob",
            "details": [
                {
                    "color": "black",
                    "type": "bmw",
                    "license": "4DFLK",
                    "specs": [
                        {
                            "properties": [
                                {
                                    "info": [
                                        "powerful",
                                        "convertable"
                                    ]
                                },
                                {
                                    "info": [
                                        "sedan",
                                        "germany"
                                    ]
                                },
                                {
                                    "info": [
                                        "drive",
                                        "expensive"
                                    ]
                                }
                            ]
                        }
                    ],
                    "description": "amazing car"
                }
            ]
        }
    ]
}
]

My code so far:

headers = ['color', 'license', 'type', 'description']

def _key(d):
  return [d.get(i) for i in headers]

def get_specs(b):
  _specs = [c['properties'] for i in b for c in i['specs']]
  return [{"properties": [i for b in _specs for i in b]}]

def merge(d):
  new_merged_list = [[a, list(b)] for a, b in groupby(sorted(d, key=_key), key=_key)]
  k = [{**dict(zip(headers, a)), 'specs': get_specs(b)} for a, b in new_merged_list]
  return k

result = {'name': merge(c.get("details")) for i in data for c in i.get("metadata")}

print(json.dumps(result))

But it does not work. I get this:

{"name": [{"color": "black", "specs": [{"properties": [{"info": 
["amazing", "strong"]}]}]}]}

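【Comments】:

  • The ast.literal_eval method expects a string argument, and you were passing a list object. You have just corrected it :)
  • Yes, but I still don't see the expected output above. Can you help, @scriptmonster?

Tags: python arrays merge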


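【Solution 1】:

What you want to do is similar to a group-by on: company_name, name, color, type, license, description.

You can flatten every car's properties into (key, value) tuple pairs, run a set operation on the generated tuples to drop duplicates, group them by the composite key, and rebuild the list.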

from collections import defaultdict
from collections.abc import Hashable

def merge_spec_props(company_data):
    # Flatten every property into a (composite key, hashable property) pair.
    # Assumes every metadata entry stores its cars under 'car_details'.
    keyed_tuples = (
        (
            (
                co['company_name'],
                user['name'],
                car_detail['color'],
                car_detail['type'],
                car_detail['license'],
                car_detail['description'],
            ),
            # Materialize the property as a tuple of (key, value) pairs so the
            # set below can compare it by value; only unhashable values
            # (such as lists) are converted to tuples.
            tuple(
                (k, v if isinstance(v, Hashable) else tuple(v))
                for k, v in prop.items()
            ),
        )
        for co in company_data
        for user in co['metadata']
        for car_detail in user['car_details']
        for spec in car_detail['specs']
        for prop in spec['properties']
    )
    # The set drops properties that repeat under the same composite key.
    uniq = set(keyed_tuples)
    grouped = defaultdict(list)
    for k, spec in uniq:
        grouped[k].append(spec)

    # Rebuild one nested record per composite key.
    merged_lst = [
        {
            'company_name': company_name,
            'metadata': [{
                'name': username,
                'car_details': [{
                    'color': car_color,
                    'type': car_type,
                    'license': car_license,
                    'specs': [dict(spec) for spec in specs],
                    'description': desc,
                }],
            }],
        }
        for (company_name, username, car_color, car_type, car_license, desc), specs
        in grouped.items()
    ]

    return merged_lst

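Note that this implementation is very specific to your data, so the function is unlikely to be reusable for data of a different shape. If any of the key fields (for example description) differ between two car_details entries, they are not merged; each one ends up in its own company entry in the result.

It is also worth noting that this does not merge on intermediate fields. One possible approach is to convert the data into a tree and do a post-order traversal to obtain the merged structure.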
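A rough sketch of what merging on intermediate fields could look like is given below. It is not the tree conversion itself but a recursive, level-by-level merge in the same spirit: records that agree on every field except their child list are grouped, their children are pooled, and the pooled children are merged before the parent is emitted. The names freeze, dedupe, merge_level and LEVELS are hypothetical, and the sample assumes every metadata entry uses the key details (the input in the question mixes details and car_details).

```python
def freeze(value):
    """Return a hashable fingerprint of nested dicts/lists (for comparison only)."""
    if isinstance(value, dict):
        return tuple(sorted((k, freeze(v)) for k, v in value.items()))
    if isinstance(value, list):
        return tuple(freeze(v) for v in value)
    return value

def dedupe(items):
    """Drop repeated items while keeping first-seen order."""
    seen, out = set(), []
    for item in items:
        ident = freeze(item)
        if ident not in seen:
            seen.add(ident)
            out.append(item)
    return out

def merge_level(records, child_keys):
    """Merge records level by level; child_keys names the nested list at each depth."""
    if not child_keys:
        return dedupe(records)
    child_key, *rest = child_keys
    groups = {}
    for rec in records:
        # Records are "the same" when all fields except the child list match.
        ident = freeze({k: v for k, v in rec.items() if k != child_key})
        merged = groups.setdefault(ident, {**rec, child_key: []})
        merged[child_key].extend(rec.get(child_key, []))
    # Children are merged first, then the parent record is emitted.
    return [{**rec, child_key: merge_level(rec[child_key], rest)}
            for rec in groups.values()]

# Assumed nesting for the data in the question.
LEVELS = ["metadata", "details", "specs", "properties"]

def car(*props):
    """Build one sample car record (illustrative helper)."""
    return {"color": "black", "type": "bmw", "license": "4DFLK",
            "specs": [{"properties": [{"info": list(p)} for p in props]}],
            "description": "amazing car"}

data = [{
    "company_name": "GreekNLC",
    "metadata": [
        {"name": "Bob", "details": [car(("sedan", "germany"), ("drive", "expensive"))]},
        {"name": "Bob", "details": [car(("powerful", "convertable"), ("drive", "expensive"))]},
    ],
}]

merged = merge_level(data, LEVELS)
props = merged[0]["metadata"][0]["details"][0]["specs"][0]["properties"]
print(props)
```

With this sample, the two Bob entries collapse into a single user with a single car whose one specs entry holds the three distinct properties. Unlike the set-based version, the order of first appearance is preserved and the properties stay nested under specs[0]["properties"].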

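【Comments】:

  • Works like a charm, thanks @Oluwafemi Sule. One quick question though: along with { "info": [ "sedan", "germany" ]}, I sometimes have { "info": [ "sedan", "germany" ], "tire_size": "2.45.6"}, and for those items tire_size comes out as "tire_size": ['2','45','6'], which is not what I expected. Why does that happen? I want the output to be { "info": [ "sedan", "germany" ], "tire_size": "2.45.6"}.
  • That is because the value is being converted to a tuple. If a spec has more than one field, the last iteration needs to move next to the key-value tuple pair. You can check whether the value is hashable before converting it to a tuple. I have updated the answer to reflect this handling.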
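
To illustrate the point made in the last comment, here is a small sketch of why an unconditional tuple() call breaks string values and how the Hashable check avoids it. The helper name freeze_prop is hypothetical; the sample property is the one from the comment above.

```python
from collections.abc import Hashable

def freeze_prop(prop):
    """Turn a property dict into a hashable tuple of (key, value) pairs."""
    return tuple(
        # Strings are already hashable and stay intact; lists become tuples.
        (k, v if isinstance(v, Hashable) else tuple(v))
        for k, v in prop.items()
    )

prop = {"info": ["sedan", "germany"], "tire_size": "2.45.6"}

# Converting every value unconditionally splits the string into characters.
naive = tuple((k, tuple(v)) for k, v in prop.items())
print(naive[1])      # ('tire_size', ('2', '.', '4', '5', '.', '6'))

# With the Hashable check, only the list is converted.
frozen = freeze_prop(prop)
print(dict(frozen))  # {'info': ('sedan', 'germany'), 'tire_size': '2.45.6'}
```

The info list still comes back as a tuple after dict(frozen); json.dumps serializes tuples as JSON arrays, so the printed JSON looks the same as with lists.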