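【Question title】: Selective merge on Python dictionaries
【Posted】: 2018-10-06 13:02:27
【Question description】:

I have two dictionaries in Python (d1, d2), and I need to copy the missing "id" items from d2 into d1, ignoring any other differences (for example, an extra "child" present in only one of them). In effect, the resulting dictionary should be just d1 with the "id" items added. I tried merging, but it doesn't work because data is lost either way.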

d1 = {
    "parent": {
        "name": "Axl",
        "surname": "Doe",
        "children": [
            {
                "name": "John",
                "surname": "Doe"
            },                
            {
                "name": "Jane",
                "surname": "Doe",
                "children": [
                    {
                        "name": "Jim",
                        "surname": "Doe"
                    },
                    {
                        "name": "Kim",
                        "surname": "Doe"
                    }
                ]
            }
        ]
    }
}

d2 = {
    "parent": {
        "id": 1,
        "name": "Axl",
        "surname": "Doe",
        "children": [
            {
                "id": 2,
                "name": "John",
                "surname": "Doe"
            },
            {
                "id": 3,
                "name": "Jane",
                "surname": "Doe",
                "children": [
                    {
                        "id": 4,
                        "name": "Jim",
                        "surname": "Doe"
                    },
                    {
                        "id": 5,
                        "name": "Kim",
                        "surname": "Doe"
                    },
                    {
                        "id": 6,
                        "name": "Bill",
                        "surname": "Doe"
                    },
                ]
            }
        ]
    }
}

result = {
    "parent": {
        "id": 1,
        "name": "Axl",
        "surname": "Doe",
        "children": [
            {
                "id": 2,
                "name": "John",
                "surname": "Doe"
            },
            {
                "id": 3,
                "name": "Jane",
                "surname": "Doe",
                "children": [
                    {
                        "id": 4,
                        "name": "Jim",
                        "surname": "Doe"
                    },
                    {
                        "id": 5,
                        "name": "Kim",
                        "surname": "Doe"
                    }
                ]
            }
        ]
    }
}

Any ideas?
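To illustrate why a plain merge falls short here, the following is a minimal sketch using trimmed-down stand-ins for d1 and d2 (`small_d1` and `small_d2` are illustrative names and data, not part of the question). It assumes the merge attempted was a top-level one such as dict unpacking, which the question does not actually specify.

```python
# Trimmed-down stand-ins for d1 and d2 (illustrative data only).
small_d1 = {"parent": {"name": "Jane", "children": [{"name": "Jim"}]}}
small_d2 = {
    "parent": {
        "id": 3,
        "name": "Jane",
        "children": [{"id": 4, "name": "Jim"}, {"id": 6, "name": "Bill"}],
    }
}

# Dict unpacking merges only the top level: the later "parent" wins wholesale.
d2_wins = {**small_d1, **small_d2}
d1_wins = {**small_d2, **small_d1}

# Letting d2 win drags in the unwanted extra child ("Bill").
has_extra_child = len(d2_wins["parent"]["children"]) == 2

# Letting d1 win keeps the right children but none of the ids.
missing_ids = "id" not in d1_wins["parent"]
```

Either order replaces the whole "parent" value, so the ids and the child list cannot be combined without walking the nested structure.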

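【Question discussion】:

  • Are the names unique?
  • Actually, no... this is a sample dict, just to show the structure
  • Are the children's names unique? That is, can a single parent have only one child per (name, surname)? I'm just trying to see how we can solve this without matching against the entire sequence of children and incurring an O(n^2) cost.
  • Yes, that's correct. What I meant is that names are not unique across different parents

Tags: python dictionary merge compare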


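【Solution 1】:

I match children according to a key function, in this case the "name" and "surname" attributes.

I then go over id_lookup_dict (named d2 in your example) and try to match each of its children with a child of main_dict. If I find a match, I recurse into it.

In the end, main_dict (d1 in your example) is filled with IDs :-)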

import operator

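# main_dict is d1 and id_lookup_dict is d2 from the question;
# main_dict is modified in place
main_dict, id_lookup_dict = d1, d2

root = main_dict["parent"]
lookup_root = id_lookup_dict["parent"]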

keyfunc = operator.itemgetter("name", "surname")

def _recursive_fill_id(root, lookup_root, keyfunc):
    """Recursively fill root node with IDs

    Matches nodes according to keyfunc
    """
    root["id"] = lookup_root["id"]

    # Fetch children
    root_children = root.get("children")

    # There are no children
    if root_children is None:
        return

    children_left = len(root_children)

    # Create a dict mapping the key identifying a child to the child
    # This avoids a hefty lookup cost and requires a single iteration.
    children_dict = dict(zip(map(keyfunc, root_children), root_children))

    # .get() avoids a KeyError when the lookup node has no "children" key
    for lookup_child in lookup_root.get("children", ()):
        lookup_key = keyfunc(lookup_child)
        matching_child = children_dict.get(lookup_key)

        if matching_child is not None:
            _recursive_fill_id(matching_child, lookup_child, keyfunc)

            # Short circuit in case all children were filled
            children_left -= 1
            if not children_left:
                break

_recursive_fill_id(root, lookup_root, keyfunc)
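As a quick check of the approach above, here is a self-contained sketch that rebuilds the question's data and compares the outcome with the expected result. The names `fill_ids` and `person` are hypothetical helpers introduced only for this example, and working on a deep copy (so that d1 stays untouched) is an assumption of the sketch; the answer's code modifies the dictionary in place.

```python
import copy
import operator

keyfunc = operator.itemgetter("name", "surname")


def fill_ids(node, lookup_node, keyfunc):
    """Copy "id" from lookup_node into node, then recurse into matched children."""
    node["id"] = lookup_node["id"]
    children = node.get("children")
    if not children:
        return
    # Index this node's children by (name, surname) for constant-time matching.
    by_key = {keyfunc(child): child for child in children}
    for lookup_child in lookup_node.get("children", ()):
        match = by_key.get(keyfunc(lookup_child))
        if match is not None:
            fill_ids(match, lookup_child, keyfunc)


def person(name, pid=None, children=None):
    """Hypothetical helper that builds one node in the question's shape."""
    node = {"name": name, "surname": "Doe"}
    if pid is not None:
        node["id"] = pid
    if children is not None:
        node["children"] = children
    return node


d1 = {"parent": person("Axl", children=[
    person("John"),
    person("Jane", children=[person("Jim"), person("Kim")]),
])}
d2 = {"parent": person("Axl", 1, [
    person("John", 2),
    person("Jane", 3, [person("Jim", 4), person("Kim", 5), person("Bill", 6)]),
])}
expected = {"parent": person("Axl", 1, [
    person("John", 2),
    person("Jane", 3, [person("Jim", 4), person("Kim", 5)]),
])}

# Work on a copy so that d1 itself is left unchanged.
result = copy.deepcopy(d1)
fill_ids(result["parent"], d2["parent"], keyfunc)
```

Because only nodes that exist in d1 are visited, "Bill" (present only in d2) never makes it into the result.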

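【Discussion】:

  • Actually, I tried the one you deleted, and it seems to work fine! Do you think this one is more efficient? Honestly, the previous one was the more general solution. Thanks a lot for such a quick fix!
  • It's still the same; I just changed the names to make it more readable.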
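The lookup dict in the answer above relies on (name, surname) being unique among one parent's children, which the asker confirmed in the question comments. The sketch below shows what would happen if that assumption did not hold; the sibling data and the "age" field are made up for the illustration.

```python
import operator

keyfunc = operator.itemgetter("name", "surname")

# Hypothetical siblings, two of which share the same (name, surname).
children = [
    {"name": "Jim", "surname": "Doe", "age": 7},
    {"name": "Kim", "surname": "Doe", "age": 5},
    {"name": "Jim", "surname": "Doe", "age": 2},
]

index = {keyfunc(child): child for child in children}

# Only two distinct keys survive; the later "Jim" overwrites the earlier one.
surviving_jim_age = index[("Jim", "Doe")]["age"]

# A simple guard that detects the collision before any ids are copied.
has_duplicates = len(index) != len(children)
```

With duplicate keys, one of the siblings would silently never receive an id, so a length check like the one above is a cheap safeguard if uniqueness is not guaranteed.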
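【Solution 2】:

I'd like to add an iterative answer as an alternative to the recursive one, since it may prove more efficient.

It will not hit the recursion limit, and it should be a bit faster: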

import operator

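# main_dict is d1 and id_lookup_dict is d2 from the question;
# main_dict is modified in place
main_dict, id_lookup_dict = d1, d2

root = main_dict["parent"]
lookup_root = id_lookup_dict["parent"]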

keyfunc = operator.itemgetter("name", "surname")

def _iterative_fill_id(root, lookup_root, keyfunc):
    """Iteratively fill root node with IDs

    Matches nodes according to keyfunc, using an explicit stack
    instead of recursion
    """
    # Stack of (node, lookup node) pairs still waiting to be processed
    matching_nodes = [(root, lookup_root)]

    while matching_nodes:
        root, lookup_root = matching_nodes.pop()
        root["id"] = lookup_root["id"]

        # Fetch children
        root_children = root.get("children")

        # There are no children
        if root_children is None:
            continue

        children_left = len(root_children)

        # Create a dict mapping the key identifying a child to the child
        # This avoids a hefty lookup cost and requires a single iteration.
        children_dict = dict(zip(map(keyfunc, root_children), root_children))

        # .get() avoids a KeyError when the lookup node has no "children" key
        for lookup_child in lookup_root.get("children", ()):
            lookup_key = keyfunc(lookup_child)
            matching_child = children_dict.get(lookup_key)

            if matching_child is not None:
                matching_nodes.append((matching_child, lookup_child))

                # Short circuit in case all children were filled
                children_left -= 1
                if not children_left:
                    break


_iterative_fill_id(root, lookup_root, keyfunc)
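To make the point about the recursion limit concrete, the sketch below runs an explicit-stack version on a chain that is deeper than Python's recursion limit. `fill_ids_iteratively` and `make_chain` are hypothetical names for this example, and the chain data is synthetic. A recursive version would be expected to raise RecursionError at this depth, since it uses one stack frame per level.

```python
import operator
import sys

keyfunc = operator.itemgetter("name", "surname")


def fill_ids_iteratively(root, lookup_root, keyfunc):
    """Copy ids into matching nodes using an explicit stack, not recursion."""
    stack = [(root, lookup_root)]
    while stack:
        node, lookup_node = stack.pop()
        node["id"] = lookup_node["id"]
        children = node.get("children")
        if not children:
            continue
        by_key = {keyfunc(child): child for child in children}
        for lookup_child in lookup_node.get("children", ()):
            match = by_key.get(keyfunc(lookup_child))
            if match is not None:
                stack.append((match, lookup_child))


def make_chain(depth, with_ids):
    """Build a single-child chain `depth` nodes long, without recursion."""
    root = {"name": "n0", "surname": "Doe"}
    if with_ids:
        root["id"] = 0
    node = root
    for level in range(1, depth):
        child = {"name": f"n{level}", "surname": "Doe"}
        if with_ids:
            child["id"] = level
        node["children"] = [child]
        node = child
    return root


# Twice the interpreter's recursion limit, so plain recursion could not reach the bottom.
depth = sys.getrecursionlimit() * 2
chain = make_chain(depth, with_ids=False)
lookup_chain = make_chain(depth, with_ids=True)

fill_ids_iteratively(chain, lookup_chain, keyfunc)

# Walk down to the deepest node to confirm it received its id.
node = chain
while "children" in node:
    node = node["children"][0]
deepest_id = node["id"]
```

The stack here is an ordinary list on the heap, so its size is bounded by available memory rather than by the interpreter's call depth.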

【Discussion】:
