How to iterate over a complicated dictionary? Answers

【Question title】: How to iterate complicated dictionary?
【Posted】: 2021-03-06 12:53:04
【Question description】:

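First of all, I could not find an answer on Stack Overflow that is specific to my problem, so I am asking for your advice. I want to iterate over the nested dictionary below to build the string shown further down. Could you please advise?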

features = {
    "ptf_overall2": {
        "1": {
            "groupBy": {
                "1": {"column": "country"},
                "2": {"column": "measurement_group"},
                "3": {"column": "bpid"},
            }
        },
        "2": {
            "number_of_journeys_customer_eligible": {
                "operation": "countDistinct",
                "column": "journeyinstanceid",
            }
        },
        "3": {
            "number_of_journeys_customer_been_contacted": {
                "operation": "sum",
                "column": "journey_email_been_sent_flag",
            }
        },
    }
}

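Basically, I need to identify each aggregation operation, the columns associated with it, and their order, and then attach those columns to the operation as shown below. The order is very important to me.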

ptf_overall2.groupBy('country', 'measurement_group', 'bpid').

With the following iteration I get the error shown below.

for i in features.get("ptf_overall2"):
    print(features.get("ptf_overall2")[i])
    for j in features.get("ptf_overall2")[i]:
        print(features.get("ptf_overall2")[j])

Error

KeyError: 'GroupBy'
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<command-2704284371067158> in <module>
      2   #print(features.get('ptf_overall2')[i])
      3   for j in features.get('ptf_overall2')[i]:
----> 4     print(features.get('ptf_overall2')[j])

KeyError: 'GroupBy'
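
The traceback points at the inner loop: `j` is a key of the inner dictionary `features["ptf_overall2"][i]`, but the failing line looks it up in the outer dictionary `features["ptf_overall2"]`, whose only keys are "1", "2" and "3". A minimal sketch of the nested walk that pairs each inner key with its own value by using `.items()`; the `features` dictionary here is a trimmed copy of the one in the question, and `visited` is an illustrative name:

```python
features = {
    "ptf_overall2": {
        "1": {"groupBy": {"1": {"column": "country"}, "2": {"column": "bpid"}}},
        "2": {
            "number_of_journeys_customer_eligible": {
                "operation": "countDistinct",
                "column": "journeyinstanceid",
            }
        },
    }
}

visited = []
for step, inner in features["ptf_overall2"].items():
    # `inner` is the dictionary stored under "1", "2", ...
    for name, spec in inner.items():
        # `name` belongs to `inner`, so `spec` is always the matching value
        visited.append((step, name))
        print(step, name, spec)
```

Iterating over `.items()` avoids indexing by hand, so a key can no longer be looked up in the wrong level of the structure.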

【Question comments】:

I don't follow what the expected output is here.
@AKX The output should be ptf_overall2.groupBy('country', 'measurement_group', 'bpid')

Tags: python dictionary hash nested iteration

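【Solution 1】:

Maybe something like this. It should work for any dictionary that has exactly one top-level key whose value is a dictionary of dictionaries, one of which has a groupBy key.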

features = {
    "ptf_overall2": {
        "1": {
            "groupBy": {
                "1": {"column": "country"},
                "2": {"column": "measurement_group"},
                "3": {"column": "bpid"},
            }
        },
        "2": {
            "number_of_journeys_customer_eligible": {
                "operation": "countDistinct",
                "column": "journeyinstanceid",
            }
        },
        "3": {
            "number_of_journeys_customer_been_contacted": {
                "operation": "sum",
                "column": "journey_email_been_sent_flag",
            }
        },
    }
}


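def get_group_by(d):
    assert (
        len(d) == 1
    ), "dictionary must have exactly 1 top-level key"
    key = next(iter(d))  # find first (only) key
    for value in d[key].values():
        if "groupBy" in value:  # found the `groupBy` item...
            # Sort numerically so that "10" comes after "9", not after "1"
            # (assumes the keys are numeric strings, as in the example).
            ordered = sorted(value["groupBy"].items(), key=lambda item: int(item[0]))
            columns = [repr(c["column"]) for _, c in ordered]
            # Join by hand: tuple formatting would print ('country',) for one column.
            return f"{key}.groupBy({', '.join(columns)})"
    return None  # no `groupBy` item found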


print(get_group_by(features))

The output is as expected:

ptf_overall2.groupBy('country', 'measurement_group', 'bpid')
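
The question also asks for the aggregation operations to be attached in order. The sketch below extends the same idea under an assumption the thread does not confirm: that the target is a chained string of the form `name.groupBy(...).agg(operation('column').alias('key'), ...)`. The helper name `build_expression` and that output shape are hypothetical.

```python
features = {
    "ptf_overall2": {
        "1": {
            "groupBy": {
                "1": {"column": "country"},
                "2": {"column": "measurement_group"},
                "3": {"column": "bpid"},
            }
        },
        "2": {
            "number_of_journeys_customer_eligible": {
                "operation": "countDistinct",
                "column": "journeyinstanceid",
            }
        },
        "3": {
            "number_of_journeys_customer_been_contacted": {
                "operation": "sum",
                "column": "journey_email_been_sent_flag",
            }
        },
    }
}


def numeric_order(mapping):
    """Yield values of a dict with keys "1", "2", ... in numeric key order."""
    for _, value in sorted(mapping.items(), key=lambda item: int(item[0])):
        yield value


def build_expression(d):
    """Hypothetical helper: build 'name.groupBy(...).agg(...)' as a string."""
    [(name, steps)] = d.items()  # raises ValueError unless exactly one key
    group_cols, aggs = [], []
    for step in numeric_order(steps):
        for key, spec in step.items():
            if key == "groupBy":
                group_cols = [repr(c["column"]) for c in numeric_order(spec)]
            else:
                # Assumed output shape: operation('column').alias('key')
                aggs.append(
                    f"{spec['operation']}({spec['column']!r}).alias({key!r})"
                )
    return f"{name}.groupBy({', '.join(group_cols)}).agg({', '.join(aggs)})"


print(build_expression(features))
```

The steps are sorted by the integer value of their keys, so the aggregations appear in the order the numbering gives, regardless of how the dictionary was assembled.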

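【Comments】:

Thank you very much @AKX for the quick answer. But is there a way to avoid hard-coding "groupBy" in the code, and instead find the key automatically and combine the columns?
What key would you find automatically? In the end you do want a groupBy expression...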
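
One possible reading of the follow-up, offered only as a guess at the intent: instead of searching for the literal "groupBy", treat any inner key whose value consists solely of numbered {"column": ...} entries as a method name applied to those columns. The helper name `column_list_calls` and this detection rule are assumptions, not something the thread settles.

```python
features = {
    "ptf_overall2": {
        "1": {
            "groupBy": {
                "1": {"column": "country"},
                "2": {"column": "measurement_group"},
                "3": {"column": "bpid"},
            }
        },
        "2": {
            "number_of_journeys_customer_eligible": {
                "operation": "countDistinct",
                "column": "journeyinstanceid",
            }
        },
    }
}


def column_list_calls(d):
    """Hypothetical helper: one 'name.method(cols...)' string per column list."""
    [(name, steps)] = d.items()  # raises ValueError unless exactly one key
    calls = []
    for _, step in sorted(steps.items(), key=lambda item: int(item[0])):
        for method, spec in step.items():
            entries = list(spec.values())
            # A "column list" is a non-empty dict whose values are all
            # dictionaries of the exact form {"column": ...}
            is_column_list = bool(entries) and all(
                isinstance(e, dict) and set(e) == {"column"} for e in entries
            )
            if is_column_list:
                ordered = sorted(spec.items(), key=lambda item: int(item[0]))
                cols = ", ".join(repr(c["column"]) for _, c in ordered)
                calls.append(f"{name}.{method}({cols})")
    return calls


print(column_list_calls(features))
```

With this rule the method name comes from the data, so a differently named key with the same shape would be picked up too; whether that is desirable depends on what other keys the real configuration contains.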