【Question title】: pandas json_normalize key error with a particular JSON attribute
【Posted】: 2023-03-16 19:09:02
【Question】:

I have the following JSON (written here as a Python dict):

mytestdata = {
    "success": True,
    "message": "",
    "data": {
        "totalCount": 95,
        "goal": [
            {
                "user_id": 123455,
                "user_email": "john.smith@test.com",
                "user_first_name": "John",
                "user_last_name": "Smith",
                "people_goals": [
                    {
                        "goal_id": 545555,
                        "goal_name": "test goal name",
                        "goal_owner": "123455",
                        "goal_narrative": "",
                        "goal_type": {
                            "id": 1,
                            "name": "Team"
                        },
                        "goal_create_at": "1595874095",
                        "goal_modified_at": "1595874095",
                        "goal_created_by": "123455",
                        "goal_updated_by": "123455",
                        "goal_start_date": "1593561600",
                        "goal_target_date": "1601424000",
                        "goal_progress": "34",
                        "goal_progress_color": "#ff9933",
                        "goal_status": "1",
                        "goal_permission": "internal,team",
                        "goal_category": [],
                        "goal_owner_full_name": "John Smith",
                        "goal_team_id": "766754",
                        "goal_team_name": "",
                        "goal_workstreams": []
                    }
                ]
            }
        ]
    }
}

I am trying to use json_normalize to show all the details in "people_goals" together with "user_last_name", "user_first_name", "user_email", and "user_id". So far I can show "people_goals", "user_first_name", "user_last_name", and "user_email" with this code:

import pandas as pd

df2 = pd.json_normalize(data=mytestdata['data'], record_path=['goal', 'people_goals'],
                        meta=[['goal','user_first_name'], ['goal','user_last_name'], ['goal','user_email']], errors='ignore')

However, I run into a problem when I try to include ['goal', 'user_id'] in meta=[]. The error is:

TypeError                                 Traceback (most recent call last)
<ipython-input-192-b7a124a075a0> in <module>
      7 df2 = pd.json_normalize(data=mytestdata['data'], record_path=['goal', 'people_goals'], 
      8                         meta=[['goal','user_first_name'], ['goal','user_last_name'], ['goal','user_email'], ['goal','user_id']],
----> 9                         errors='ignore')
     10 
     11 # df2 = pd.json_normalize(data=mytestdata['data'], record_path=['goal', 'people_goals'])

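The only difference I can see with "user_id" is that it is not a string. Am I missing something here?

【Comments】:

  • Your code runs fine when I try it.
  • It looks like the problem was with an older version of pandas. Once I upgraded pandas to 1.1.1, it started working. Thanks @NikhilKhandelwal

Tags: json pandas dataframe typeerror normalize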
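
A minimal sketch for checking this on your own installation, assuming a trimmed-down copy of the question's data (the variable name `data` and the shortened goal record are illustrative, not from the original post). It prints the installed pandas version and runs the call with ['goal', 'user_id'] included in meta; on the versions the commenters report as working, it should complete without a TypeError.

```python
import pandas as pd

# Trimmed copy of the question's payload (illustrative subset of fields).
data = {
    "totalCount": 95,
    "goal": [
        {
            "user_id": 123455,  # integer, unlike the other meta fields
            "user_email": "john.smith@test.com",
            "user_first_name": "John",
            "user_last_name": "Smith",
            "people_goals": [
                {
                    "goal_id": 545555,
                    "goal_name": "test goal name",
                    "goal_type": {"id": 1, "name": "Team"},
                }
            ],
        }
    ],
}

print("pandas version:", pd.__version__)

# Same call shape as in the question, with ['goal', 'user_id'] added to meta.
df2 = pd.json_normalize(
    data=data,
    record_path=["goal", "people_goals"],
    meta=[
        ["goal", "user_id"],
        ["goal", "user_first_name"],
        ["goal", "user_last_name"],
        ["goal", "user_email"],
    ],
    errors="ignore",
)

# Meta columns are named by joining the path with ".", e.g. "goal.user_id".
print(df2.T.to_string())
```

If this raises the same TypeError, the pandas version is the first thing to compare against the one mentioned in the comment above.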


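【Solution 1】:

Your code runs on my platform. I have stopped using the record_path and meta parameters for two reasons: a) they are hard to get right, and b) there are compatibility issues between pandas versions.

So I now call json_normalize() several times to expand the JSON step by step, or use pd.Series. Both approaches are shown as examples.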

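# Approach 1: explode each list column, then expand the dicts with pd.Series
df = pd.json_normalize(data=mytestdata['data']).explode("goal")
# expand each "goal" dict into columns, then get one row per people_goals entry
df = pd.concat([df, df["goal"].apply(pd.Series)], axis=1).drop(columns="goal").explode("people_goals")
# expand each "people_goals" dict into columns
df = pd.concat([df, df["people_goals"].apply(pd.Series)], axis=1).drop(columns="people_goals")
# expand the nested "goal_type" dict into "id" and "name"
df = pd.concat([df, df["goal_type"].apply(pd.Series)], axis=1).drop(columns="goal_type")
df.T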

# Approach 2: call json_normalize() repeatedly, exploding one list level per pass
df2 = pd.json_normalize(pd.json_normalize(
    pd.json_normalize(data=mytestdata['data']).explode("goal").to_dict(orient="records")
).explode("goal.people_goals").to_dict(orient="records"))
df2.T

print(df.T.to_string())

Output

                                        0
totalCount                             95
user_id                            123455
user_email            john.smith@test.com
user_first_name                      John
user_last_name                      Smith
goal_id                            545555
goal_name                  test goal name
goal_owner                         123455
goal_narrative                           
goal_create_at                 1595874095
goal_modified_at               1595874095
goal_created_by                    123455
goal_updated_by                    123455
goal_start_date                1593561600
goal_target_date               1601424000
goal_progress                          34
goal_progress_color               #ff9933
goal_status                             1
goal_permission             internal,team
goal_category                          []
goal_owner_full_name           John Smith
goal_team_id                       766754
goal_team_name                           
goal_workstreams                       []
id                                      1
name                                 Team
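
The step-by-step idea can also be written as a self-contained snippet. This is only a sketch under stated assumptions: it uses a trimmed copy of the question's data, the names `data`, `users`, `exploded`, `goals`, and `flat` are illustrative, and it assumes every user has at least one entry in "people_goals" (an empty list would leave a NaN after explode(), which json_normalize() cannot flatten).

```python
import pandas as pd

# Trimmed copy of the question's payload (illustrative subset of fields).
data = {
    "totalCount": 95,
    "goal": [
        {
            "user_id": 123455,
            "user_email": "john.smith@test.com",
            "user_first_name": "John",
            "user_last_name": "Smith",
            "people_goals": [
                {
                    "goal_id": 545555,
                    "goal_name": "test goal name",
                    "goal_type": {"id": 1, "name": "Team"},
                }
            ],
        }
    ],
}

# Step 1: one row per user; "people_goals" is still a list column here.
users = pd.json_normalize(data["goal"])

# Step 2: one row per (user, goal) pair; reset the index so it is unique.
exploded = users.explode("people_goals").reset_index(drop=True)

# Step 3: flatten the goal dicts; nested "goal_type" becomes
# "goal_type.id" and "goal_type.name".
goals = pd.json_normalize(exploded["people_goals"].tolist())

# Step 4: put user columns and goal columns side by side.
flat = pd.concat([exploded.drop(columns="people_goals"), goals], axis=1)

print(flat.T.to_string())
```

Resetting the index before the final concat keeps the row alignment unambiguous when a user has several goals, since explode() repeats the original index for each exploded row.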

【Comments】:

  • Thanks, this is very helpful.