【Question title】: How to create a tree from a list of subtrees?
【Posted】: 2019-05-17 00:56:09
【Question】:

Suppose we have a bunch of subtrees that look like this:

subtree1 = {
    "id": "root",
    "children": [
        {
            "id": "file",
            "caption": "File",
            "children": []
        },
        {
            "id": "edit",
            "caption": "Edit",
            "children": []
        },
        {
            "id": "tools",
            "caption": "Tools",
            "children": [
                {
                    "id": "packages",
                    "caption": "Packages",
                    "children": []
                }
            ]
        },
        {
            "id": "help",
            "caption": "Help",
            "children": []
        },
    ]
}

subtree2 = {
    "id": "root",
    "children": [
        {
            "id": "file",
            "caption": "File",
            "children": [
                {"caption": "New"},
                {"caption": "Exit"},
            ]        
        }
    ]
}

subtree3 = {
    "id": "root",
    "children": [
        {
            "id": "edit",
            "children": [
                {"caption": "Copy"},
                {"caption": "Cut"},
                {"caption": "Paste"},
            ]
        },
        {
            "id": "help",
            "children": [
                {"caption": "About"},
            ]
        }
    ]
}

subtree4 = {
    "id": "root",
    "children": [
        {
            "id": "edit",
            "children": [
                {
                    "id": "text",
                    "caption": "Text",
                    "children": [
                        { "caption": "Insert line before" },
                        { "caption": "Insert line after" }
                    ]
                }
            ]
        }
    ]
}

I'm trying to figure out how to write a merge function such that doing the following:

tree0 = merge(subtree1, subtree2)
tree0 = merge(tree0, subtree3)
tree0 = merge(tree0, subtree4)

would produce:

tree0 = {
    "id": "root",
    "children": [
        {
            "id": "file",
            "caption": "File",
            "children": [
                {"caption": "New"},
                {"caption": "Exit"},
            ]   
        },
        {
            "id": "edit",
            "caption": "Edit",
            "children": [
                {"caption": "Copy"},
                {"caption": "Cut"},
                {"caption": "Paste"},
                {
                    "id": "text",
                    "caption": "Text",
                    "children": [
                        { "caption": "Insert line before" },
                        { "caption": "Insert line after" }
                    ]
                }
            ]
        },
        {
            "id": "tools",
            "caption": "Tools",
            "children": [
                {
                    "id": "packages",
                    "caption": "Packages",
                    "children": []
                }
            ]
        },
        {
            "id": "help",
            "caption": "Help",
            "children": [
                {"caption": "About"},
            ]
        },
    ]
}

But doing something like this:

tree1 = merge(subtree1, subtree2)
tree1 = merge(tree1, subtree4)
tree1 = merge(tree1, subtree3)

would produce:

tree1 = {
    "id": "root",
    "children": [
        {
            "id": "file",
            "caption": "File",
            "children": [
                {"caption": "New"},
                {"caption": "Exit"},
            ]   
        },
        {
            "id": "edit",
            "caption": "Edit",
            "children": [
                {
                    "id": "text",
                    "caption": "Text",
                    "children": [
                        { "caption": "Insert line before" },
                        { "caption": "Insert line after" }
                    ]
                },
                {"caption": "Copy"},
                {"caption": "Cut"},
                {"caption": "Paste"},
            ]
        },
        {
            "id": "tools",
            "caption": "Tools",
            "children": [
                {
                    "id": "packages",
                    "caption": "Packages",
                    "children": []
                }
            ]
        },
        {
            "id": "help",
            "caption": "Help",
            "children": [
                {"caption": "About"},
            ]
        },
    ]
}

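In other words, loading the subtrees in the same order will always produce the same tree, but using the same list of subtrees in a different order is not guaranteed to produce the same tree (because the children lists may be extended in a different order).

I have already tried to code this, but I don't know how the merge algorithm should behave, and that is my question. Could anyone provide code, pseudocode, or an explanation so that I can implement it?

PS: Below you'll find a random attempt that I thought could lead me to victory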

if __name__ == '__main__':
    from collections import defaultdict

    subtree1 = {
        "id": "root",
        "children": [
            {
                "id": "file",
                "caption": "File",
                "children": []
            },
            {
                "id": "edit",
                "caption": "Edit",
                "children": []
            },
            {
                "id": "tools",
                "caption": "Tools",
                "children": [
                    {
                        "id": "packages",
                        "caption": "Packages",
                        "children": []
                    }
                ]
            },
            {
                "id": "help",
                "caption": "Help",
                "children": []
            },
        ]
    }

    subtree2 = {
        "id": "root",
        "children": [
            {
                "id": "file",
                "caption": "File",
                "children": [
                    {"caption": "New"},
                    {"caption": "Exit"},
                ]
            }
        ]
    }

    subtree3 = {
        "id": "root",
        "children": [
            {
                "id": "edit",
                "children": [
                    {"caption": "Copy"},
                    {"caption": "Cut"},
                    {"caption": "Paste"},
                ]
            },
            {
                "id": "help",
                "children": [
                    {"caption": "About"},
                ]
            }
        ]
    }

    subtree4 = {
        "id": "root",
        "children": [
            {
                "id": "edit",
                "children": [
                    {
                        "id": "text",
                        "caption": "Text",
                        "children": [
                            {"caption": "Insert line before"},
                            {"caption": "Insert line after"}
                        ]
                    }
                ]
            }
        ]
    }

    lst = [
        subtree1,
        subtree2,
        subtree3,
        subtree4
    ]

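    def traverse(node, path=()):
        # Yield every node together with the tuple of ids leading to it.
        # The path is passed explicitly instead of relying on a shared
        # mutable default argument.
        yield node, path

        for c in node.get("children", []):
            yield from traverse(c, path + (c.get("id", None),))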

    # Levels & Hooks
    dct_levels = defaultdict(list)
    dct_hooks = defaultdict(list)
    for subtree in lst:
        for n, p in traverse(subtree):
            if p not in dct_levels[len(p)]:
                dct_levels[len(p)].append(p)
            dct_hooks[p].append(n)

    print(dct_levels)
    print(dct_hooks[("file",)])

    # Merge should happen here
    tree = {
        "id": "root",
        "children": []
    }

    for level in range(1, max(dct_levels.keys()) + 1):
        print("populating level", level, dct_levels[level])

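But I'm not sure whether I'm building the right structures/helpers here, since it is still unclear to me how the whole algorithm works... which is what this question is all about.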

【Comments】:

    Tags: python tree


    【Answer 1】:

    Tested with your examples on Python 3.5.

    from copy import deepcopy
    
    
    def merge(x: dict, y: dict) -> dict:
        'Merge subtrees x y, and return the results as a new tree.'
        return merge_inplace(deepcopy(x), y)
    
    
    def merge_inplace(dest: dict, src: dict) -> dict:
        'Merge subtree src into dest, and return dest.'
    
        # perform sanity checks to make the code more rock solid
        # feel free to remove those lines if you don't need
        assert dest.get('id'), 'Cannot merge anonymous subtrees!'
        assert dest.get('id') == src.get('id'), 'Identity mismatch!'
    
        # merge attributes
        dest.update((k, v) for k, v in src.items() if k != 'children')
    
        # merge children
        if not src.get('children'):  # nothing to do, so just exit
            return dest
        elif not dest.get('children'):  # if the children list didn't exist
            dest['children'] = []  # then create an empty list for it
    
        named_dest_children = {
            child['id']: child
            for child in dest['children']
            if 'id' in child
        }
        for child in src['children']:
            if 'id' not in child:  # anonymous child, just append a copy of it
                dest['children'].append(deepcopy(child))
            elif child['id'] in named_dest_children:  # override a named subtree
                merge_inplace(named_dest_children[child['id']], child)
            else:  # create a new subtree (copied, so dest never aliases src)
                new_child = deepcopy(child)
                dest['children'].append(new_child)
                named_dest_children[child['id']] = new_child
        return dest
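
To see the order dependence described in the question, here is a compact check. It restates the merge logic above in condensed form so that the snippet runs on its own; the small `base`, `copy_items` and `text_menu` trees and the `edit_captions` helper are made-up stand-ins, not part of the original answer.

```python
from copy import deepcopy


def merge(x: dict, y: dict) -> dict:
    """Return a new tree; condensed restatement of the merge logic above."""
    dest = deepcopy(x)
    _merge_inplace(dest, deepcopy(y))
    return dest


def _merge_inplace(dest: dict, src: dict) -> None:
    # Attributes from src override those already present in dest.
    dest.update((k, v) for k, v in src.items() if k != "children")
    if not src.get("children"):
        return
    children = dest.setdefault("children", [])
    named = {c["id"]: c for c in children if "id" in c}
    for child in src["children"]:
        if "id" in child and child["id"] in named:
            _merge_inplace(named[child["id"]], child)  # same id: recurse
        else:
            children.append(child)  # anonymous or new id: append
            if "id" in child:
                named[child["id"]] = child


# Hypothetical miniature versions of subtree1, subtree3 and subtree4.
base = {"id": "root", "children": [
    {"id": "edit", "caption": "Edit", "children": []}]}
copy_items = {"id": "root", "children": [
    {"id": "edit", "children": [{"caption": "Copy"}]}]}
text_menu = {"id": "root", "children": [
    {"id": "edit", "caption": "Edit...", "children": [
        {"id": "text", "caption": "Text", "children": []}]}]}


def edit_captions(tree: dict) -> list:
    """Captions of the children of the first top-level item."""
    return [c["caption"] for c in tree["children"][0]["children"]]


a = merge(merge(base, copy_items), text_menu)
b = merge(merge(base, text_menu), copy_items)
print(edit_captions(a))             # ['Copy', 'Text']
print(edit_captions(b))             # ['Text', 'Copy']
print(a["children"][0]["caption"])  # Edit...
```

The same inputs merged in a different order yield a different child order, and the later `caption` for `edit` overrides the earlier one.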
    

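    【Comments】:

    • This answer is really good, and I think it could already be used in real cases... There is one small case where I'm not sure what the output should be; please take a look at test2 in this code... If I'm not mistaken, in Sublime, when an item has the same ID, the new attributes override the old ones, so in that code the item root/preferences would end up with "caption": "NewPreferences", "command": "do_something". I'll check Sublime again to be more certain, but I think that's what would happen. That said, very nice answer, +1
    • @BPL I've edited the answer to override attributes - if subtrees are not allowed inside them, they can be copied in a single line
    【Answer 2】:

    You can use itertools.groupby with recursion: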

    from itertools import groupby
    def merge(*args):
       if len(args) < 2 or any('id' not in i for i in args):
          return list(args)
       _d = [(a, list(b)) for a, b in groupby(sorted(args, key=lambda x:x['id']), key=lambda x:x['id'])]
       return [{**{j:k for h in b for j, k in h.items()}, 'id':a, 'children':merge(*[i for c in b for i in c['children']])} for a, b in _d]
    

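    Through args, this solution treats each dictionary passed in as a member of a children list. This accounts for the possibility that two or more dictionaries with different ids are passed to merge, e.g. {'id':'root', 'children':[...]} and {'id':'root2', 'children':[...]}. In that case the solution returns the list [{'id':'root', 'children':[...]}, {'id':'root2', 'children':[...]}], because the differing ids offer no way to match them. Consequently, in the context of this question, you need to index into the returned list to get its single element, the merged dict with id 'root':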

    import json
    tree0 = merge(subtree1, subtree2)[0]
    tree0 = merge(tree0, subtree3)[0]
    tree0 = merge(tree0, subtree4)[0]
    print(json.dumps(tree0, indent=4))
    

    Output:

    {
      "id": "root",
      "children": [
        {
            "id": "edit",
            "caption": "Edit",
            "children": [
                {
                    "caption": "Copy"
                },
                {
                    "caption": "Cut"
                },
                {
                    "caption": "Paste"
                },
                {
                    "id": "text",
                    "caption": "Text",
                    "children": [
                        {
                            "caption": "Insert line before"
                        },
                        {
                            "caption": "Insert line after"
                        }
                    ]
                }
            ]
        },
        {
            "id": "file",
            "caption": "File",
            "children": [
                {
                    "caption": "New"
                },
                {
                    "caption": "Exit"
                }
            ]
        },
        {
            "id": "help",
            "caption": "Help",
            "children": [
                {
                    "caption": "About"
                }
            ]
        },
        {
            "id": "tools",
            "caption": "Tools",
            "children": [
                {
                    "id": "packages",
                    "caption": "Packages",
                    "children": []
                }
            ]
          }
       ]
    }
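
For readers who find the one-liner hard to follow, here is an unrolled sketch of the same groupby idea. The name `merge_nodes` and the sample `left`/`right` trees are made up for illustration, and one detail differs on purpose: it uses `node.get("children", [])`, so a named node without a `children` key does not raise a `KeyError`.

```python
from itertools import groupby


def merge_nodes(*nodes: dict) -> list:
    """Merge sibling nodes that share an id; returns a list of merged nodes."""
    # With fewer than two nodes, or any anonymous node, nothing can be matched.
    if len(nodes) < 2 or any("id" not in node for node in nodes):
        return list(nodes)

    def by_id(node: dict) -> str:
        return node["id"]

    merged = []
    # groupby only groups adjacent items, hence the (stable) sort by id first.
    for node_id, group in groupby(sorted(nodes, key=by_id), key=by_id):
        group = list(group)
        combined = {}
        for node in group:
            combined.update(node)  # later nodes override earlier attributes
        combined["id"] = node_id
        # Pool the children of every node in the group and merge them in turn.
        pooled = [child for node in group for child in node.get("children", [])]
        combined["children"] = merge_nodes(*pooled)
        merged.append(combined)
    return merged


left = {"id": "root", "children": [
    {"id": "file", "caption": "File", "children": []}]}
right = {"id": "root", "children": [
    {"id": "file", "children": [{"caption": "New"}]}]}

tree = merge_nodes(left, right)[0]
print(tree)
```

Because of the sort, named siblings come out ordered by id, which is why the output above lists edit, file, help, tools rather than the order used in the question.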
    

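    【Comments】:

    • Not everything should be code-golfed. A 132-character line (not counting indentation) with a five-for-loop list comprehension that is recursive and uses only single-character variable names. This is just horrible practice and leads to unmaintainable code.
    • Thanks for your answer. The reason I didn't accept/award this one is that Arnie97 provided a good solution first and the community considered it the better answer... While I like compact code, ruohola has a point about clever code, which this definitely is... so there you go, +1 ;)
    【Answer 3】:

    Hand-coding the merging of JSON documents/objects is probably not the best solution. DRY!
    I've used the genson, jsonschema and jsonmerge packages for the merge here.

    genson generates a JSON Schema from JSON instance documents.
    jsonschema validates JSON instance documents against a JSON Schema.
    jsonmerge merges objects/JSON documents by extending JSON Schema.

    Let's first generate the JSON Schema from the JSON instances.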

    trees = (subtree1, subtree2, subtree3, subtree4)
    schema_builder = genson.SchemaBuilder()
    for tree in trees:
        schema_builder.add_object(tree)
    
    schema = schema_builder.to_schema()
    

    Now specify the merge strategies.

    schema['properties']['children']['mergeStrategy'] = 'arrayMergeById'
    schema['properties']['children']['items']['properties']['children']['mergeStrategy'] = 'append'
    

    The arrayMergeById strategy merges objects by their id property. The append strategy collects objects into an array.
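
To make the two strategies concrete, here is a plain-Python sketch of their effect on lists. It does not use jsonmerge and is not its implementation; `append_items` and `merge_by_id` are hypothetical helper names, and combining matched items with `dict.update` is a simplification (jsonmerge merges matched items according to the strategies given further down in the schema).

```python
def append_items(base: list, head: list) -> list:
    """'append'-style: keep every item from both lists, base first."""
    return base + head


def merge_by_id(base: list, head: list) -> list:
    """'arrayMergeById'-style: items sharing an id are combined,
    unmatched items from head are added at the end."""
    result = [dict(item) for item in base]  # shallow copies of base items
    index = {item["id"]: item for item in result if "id" in item}
    for item in head:
        match = index.get(item.get("id"))
        if match is not None:
            match.update(item)  # simplification: values from head win
        else:
            new_item = dict(item)
            result.append(new_item)
            if "id" in new_item:
                index[new_item["id"]] = new_item
    return result


menus = merge_by_id(
    [{"id": "file", "caption": "File"}, {"id": "edit", "caption": "Edit"}],
    [{"id": "edit", "caption": "Edit!"}, {"id": "help", "caption": "Help"}],
)
print([m["caption"] for m in menus])  # ['File', 'Edit!', 'Help']

items = append_items([{"caption": "Copy"}], [{"caption": "Cut"}])
print(items)  # [{'caption': 'Copy'}, {'caption': 'Cut'}]
```

In the schema above, the first behavior is applied to the top-level children list (named menus) and the second to the nested children lists (menu items).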
    Here is the full code:

    import genson
    import jsonmerge
    import jsonschema
    
    subtree1 = {
        "id":
        "root",
        "children": [
            {
                "id": "file",
                "caption": "File",
                "children": []
            },
            {
                "id": "edit",
                "caption": "Edit",
                "children": []
            },
            {
                "id": "tools",
                "caption": "Tools",
                "children": [{
                    "id": "packages",
                    "caption": "Packages",
                    "children": []
                }]
            },
            {
                "id": "help",
                "caption": "Help",
                "children": []
            },
        ]
    }
    
    subtree2 = {
        "id":
        "root",
        "children": [{
            "id": "file",
            "caption": "File",
            "children": [
                {
                    "caption": "New"
                },
                {
                    "caption": "Exit"
                },
            ]
        }]
    }
    
    subtree3 = {
        "id":
        "root",
        "children": [{
            "id":
            "edit",
            "children": [
                {
                    "caption": "Copy"
                },
                {
                    "caption": "Cut"
                },
                {
                    "caption": "Paste"
                },
            ]
        }, {
            "id": "help",
            "children": [
                {
                    "caption": "About"
                },
            ]
        }]
    }
    
    subtree4 = {
        "id":
        "root",
        "children": [{
            "id":
            "edit",
            "children": [{
                "id":
                "text",
                "caption":
                "Text",
                "children": [{
                    "caption": "Insert line before"
                }, {
                    "caption": "Insert line after"
                }]
            }]
        }]
    }
    
    trees = (subtree1, subtree2, subtree3, subtree4)
    schema_builder = genson.SchemaBuilder()
    for tree in trees:
        schema_builder.add_object(tree)
    
    schema = schema_builder.to_schema()
    print("Validating schema...", end='')
    for tree in trees:
        jsonschema.validate(tree, schema)
    print(' done')
    schema['properties']['children']['mergeStrategy'] = 'arrayMergeById'
    schema['properties']['children']['items']['properties']['children']['mergeStrategy'] = 'append'
    
    merger = jsonmerge.Merger(schema=schema)
    tree = merger.merge(subtree1, subtree2)
    tree = merger.merge(tree, subtree3)
    tree = merger.merge(tree, subtree4)
    print(tree)
    

    【Comments】:
