Answers: Converting nested JSON structures to Pandas DataFrames

【Question title】: Converting nested JSON structures to Pandas DataFrames
【Posted】: 2021-03-30 15:56:38
【Question description】:

I have been struggling with a nested structure in JSON and with how to convert it into the right form:

{
"id": "0c576f35-d704-4fa8-8cbb-311c6be36358",
"employee_id": null,
"creator_id": "16ca2db9-206c-4e18-891d-a00a5252dbd3",
"closed_by_id": null,
"request_number": 23,
"priority": "2",
"form_id": "urlaub-weitere-abwesenheiten",
"status": "opened",
"name": "Urlaub & weitere Abwesenheiten",
"read_by_employee": false,
"custom_status": {
    "id": 15793,
    "name": "In Bearbeitung HR"
},
"due_date": null,
"created_at": "2021-03-29T15:18:37.572040+02:00",
"updated_at": "2021-03-29T15:22:15.590156+02:00",
"closed_at": null,
"archived_at": null,
"attachment_count": 1,
"category": {
    "id": "payroll-time-management",
    "name": "Payroll, Time & Attendance"
},
"public_comment_count": 0,
"form_data": [
    {
        "field_id": "subcategory",
        "values": [
            "Time & Attendance - Manage monthly/year-end consolidation and report"
        ]
    },
    {
        "field_id": "separator-2",
        "values": [
            null
        ]
    },
    {
        "field_id": "art-der-massnahme",
        "values": [
            "Fortbildung"
        ]
    },
    {
        "field_id": "bezeichnung-der-schulung-kurses",
        "values": [
            "dfgzhujiko"
        ]
    },
    {
        "field_id": "startdatum",
        "values": [
            "2021-03-26"
        ]
    },
    {
        "field_id": "enddatum",
        "values": [
            "2021-03-27"
        ]
    },
    {
        "field_id": "freistellung",
        "values": [
            "nein"
        ]
    },
    {
        "field_id": "mit-bildungsurlaub",
        "values": [
            ""
        ]
    },
    {
        "field_id": "kommentarfeld_fortbildung",
        "values": [
            ""
        ]
    },
    {
        "field_id": "separator",
        "values": [
            null
        ]
    },
    {
        "field_id": "instructions",
        "values": [
            null
        ]
    },
    {
        "field_id": "entscheidung-hr-bp",
        "values": [
            "Zustimmen"
        ]
    },
    {
        "field_id": "kommentarfeld-hr-bp",
        "values": [
            "wsdfghjkmhnbgvfcdxsybvnm,"
        ]
    },
    {
        "field_id": "individuelle-abstimmung",
        "values": [
            ""
        ]
    }
],
"form_files": [
    {
        "id": 30129,
        "filename": "empty_background.png",
        "field_id": "anhang"
    }
],
"visible_by_employee": false,
"organization_ids": [],
"need_edit_by_employee": false,
"attachments": []

}

A simple attempt with a pandas DataFrame:

Request = pd.DataFrame.from_dict(pd.json_normalize(data), orient='columns')

This gives almost the right shape:
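As a rough illustration of that result, the sketch below runs `pd.json_normalize` on a trimmed-down copy of the record (most keys are omitted, and the variable names `data` and `flat` are only illustrative). Nested dictionaries such as `custom_status` are flattened into dotted column names, while list-valued keys such as `form_data` and `form_files` remain as a single list inside one cell.

```python
import pandas as pd

# Trimmed-down copy of the record from the question (most keys omitted).
data = {
    "id": "0c576f35-d704-4fa8-8cbb-311c6be36358",
    "request_number": 23,
    "custom_status": {"id": 15793, "name": "In Bearbeitung HR"},
    "form_data": [
        {"field_id": "startdatum", "values": ["2021-03-26"]},
        {"field_id": "enddatum", "values": ["2021-03-27"]},
    ],
    "form_files": [
        {"id": 30129, "filename": "empty_background.png", "field_id": "anhang"}
    ],
}

flat = pd.json_normalize(data)

# Nested dicts become dotted columns; lists stay as one object per cell.
print(flat.columns.tolist())
print(type(flat.loc[0, "form_data"]))
```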

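How can I split out the dictionaries in the form_data and form_files columns? I have done a lot of research but still have a lot of trouble with this. How do I split form_data into columns, rather than into rows with the ID as meta?

【Question comments】:

Tags: python json pandas dataframe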

【Solution 1】:

You can do it like this.

Pass the DataFrame and the name of the nested column as arguments to the function:

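import ast
import pandas as pd

def explode_node(child_df, column_value):
    """Expand a column of list-of-dict values into one row per list item."""
    # Drop rows without nested data; copy so the caller's frame is not modified.
    child_df = child_df.dropna(subset=[column_value]).copy()
    # If the nested data is stored as strings, parse it into Python objects.
    if isinstance(child_df[column_value].iloc[0], str):
        child_df[column_value] = child_df[column_value].apply(ast.literal_eval)
    # Normalize each row's list, keyed by the parent row index, then join the parent columns back.
    expanded_child_df = (pd.concat({i: pd.json_normalize(x) for i, x in child_df.pop(column_value).items()}).reset_index(level=1, drop=True).join(child_df, how='right', lsuffix='_left', rsuffix='_right').reset_index(drop=True))
    expanded_child_df.columns = [str(c).lower() for c in expanded_child_df.columns]

    return expanded_child_df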
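
A possible alternative, not part of the answer above: `pd.json_normalize` accepts `record_path` and `meta` arguments, which can produce one row per `form_data` entry, and a pivot can then turn each `field_id` into its own column. This is only a sketch on a trimmed-down copy of the record. The names `long_df`, `wide_df` and `files_df` are illustrative, and taking the first element of each `values` list assumes every field holds a single value.

```python
import pandas as pd

# Trimmed-down copy of the record from the question (most keys omitted).
data = {
    "id": "0c576f35-d704-4fa8-8cbb-311c6be36358",
    "request_number": 23,
    "form_data": [
        {"field_id": "startdatum", "values": ["2021-03-26"]},
        {"field_id": "enddatum", "values": ["2021-03-27"]},
        {"field_id": "separator", "values": [None]},
    ],
    "form_files": [
        {"id": 30129, "filename": "empty_background.png", "field_id": "anhang"}
    ],
}

# One row per form_data entry, carrying the parent "id" along as metadata.
long_df = pd.json_normalize(data, record_path="form_data", meta=["id"])

# Each "values" cell is a list; keep its first element (assumption: single-valued fields).
long_df["value"] = long_df["values"].apply(lambda v: v[0] if v else None)

# One column per field_id, one row per request id.
wide_df = long_df.pivot(index="id", columns="field_id", values="value").reset_index()
wide_df.columns.name = None

# form_files records have their own "id", so prefix them to avoid a clash with the meta "id".
files_df = pd.json_normalize(
    data, record_path="form_files", meta=["id"], record_prefix="file_"
)

print(wide_df)
print(files_df)
```

With several records, passing a list of such dictionaries instead of a single one should yield one row per request in `wide_df`, provided each `id` and `field_id` pair occurs only once.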

【Comments】:

Thank you for your support! :*
Glad to help. If it helped, please mark the answer as accepted.