【Question title】: Pandas: Combining DataFrames with nested arrays or merging the JSON output
【Posted】: 2019-01-11 13:11:13
【Question description】:

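I am starting from a standard DataFrame and building several subset DataFrames of aggregated data, each containing nested arrays. I then need to combine those subset DataFrames so that they give me the expected JSON output. (I used MaxU's answer to format most of my code; see "Convert Pandas Dataframe to nested JSON".)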

The first few rows of my standard DataFrame, df (I can provide all 58 rows of this example if needed):

    ID         PRI_AFF   PRI_DEP      LOA    STATE
0   5571             M              Basic        A
1   5030             T  14700000     Blue        A
2   5030             T  14700000     Blue        A
3   5030             T  14700000     Blue        A
4   4014             T  14700000     Blue        A
5   2230             T  14700000      UFM        A
6   2230             T  14700000      UFM        A
7   2150             F  95011000   Bronze        A
8   2150             F  95011000   Bronze        A
9   2150             F  95011000   Bronze        A
10  2150             F  95011000   Bronze        A

From here I use the following Python:

 import pandas as pd

 # Count unique IDs per PRI_DEP and pivot each category into its own column
 PAFF_df = df.groupby(['PRI_DEP','PRI_AFF'])['ID'].nunique().unstack().reset_index().fillna(0)
 LOA_df = df.groupby(['PRI_DEP','LOA'])['ID'].nunique().unstack().reset_index().fillna(0)
 ST_df = df.groupby(['PRI_DEP','STATE'])['ID'].nunique().unstack().reset_index().fillna(0)

 # Nest each PRI_DEP's counts as a one-element list of dicts.
 # 'records' is spelled out because newer pandas releases reject the 'r' shorthand.
 Nested_PAFF_df = (PAFF_df.groupby(['PRI_DEP'], as_index=True)
       .apply(lambda x: x[['A','E','F','L','M','T']].to_dict('records'))
       .reset_index()
       .rename(columns={0:'Primary_Affiliation'}))

 Nested_LOA_df = (LOA_df.groupby(['PRI_DEP'], as_index=True)
       .apply(lambda x: x[['Basic','Blue','Bronze','Invalid','UFM']].to_dict('records'))
       .reset_index()
       .rename(columns={0:'LOA'}))

 Nested_ST_df = (ST_df.groupby(['PRI_DEP'], as_index=True)
       .apply(lambda x: x[['A','E']].to_dict('records'))
       .reset_index()
       .rename(columns={0:'STATE'}))

Calling .to_json(orient='records') on each of these gives me properly nested JSON.
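The three nested frames above follow the same pattern, so it could be folded into one function. This is only a minimal sketch: `nested_counts` is a hypothetical helper name, not part of pandas. The sample rows are copied from the table above, and the blank PRI_DEP in row 0 is assumed to be a single space (as the JSON output suggests). It takes whatever categories are present in the data instead of a hard-coded column list.

```python
import pandas as pd

# First 11 rows from the question's table; the blank PRI_DEP is assumed to be " ".
df = pd.DataFrame({
    "ID": [5571, 5030, 5030, 5030, 4014, 2230, 2230, 2150, 2150, 2150, 2150],
    "PRI_AFF": ["M", "T", "T", "T", "T", "T", "T", "F", "F", "F", "F"],
    "PRI_DEP": [" "] + ["14700000"] * 6 + ["95011000"] * 4,
    "LOA": ["Basic", "Blue", "Blue", "Blue", "Blue", "UFM", "UFM",
            "Bronze", "Bronze", "Bronze", "Bronze"],
    "STATE": ["A"] * 11,
})


def nested_counts(frame, column, label):
    """Hypothetical helper: unique-ID counts per PRI_DEP, nested under `label`."""
    counts = (frame.groupby(["PRI_DEP", column])["ID"]
                   .nunique()
                   .unstack(fill_value=0))        # one column per category
    # Wrap each row's dict in a one-element list, matching the question's shape.
    records = [[row] for row in counts.to_dict("records")]
    return pd.DataFrame({"PRI_DEP": counts.index.tolist(), label: records})


paff = nested_counts(df, "PRI_AFF", "Primary_Affiliation")
loa = nested_counts(df, "LOA", "LOA")
state = nested_counts(df, "STATE", "STATE")
print(paff.to_json(orient="records"))
```

One difference from the code above: because `unstack(fill_value=0)` never introduces NaN, the counts should come out as integers here, not as floats like 0.0.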

Primary Affiliation JSON:

[{"PRI_DEP":" ","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"14700000","Primary_Affiliation":[{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}]},{"PRI_DEP":"95011000","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"Null","Primary_Affiliation":[{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"ST010000","Primary_Affiliation":[{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}]}] 

LOA JSON:

[{"PRI_DEP":" ","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}]},{"PRI_DEP":"14700000","LOA":[{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}]},{"PRI_DEP":"95011000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"Null","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"ST010000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}]}] 

STATE JSON:

[{"PRI_DEP":" ","STATE":[{"A":2.0,"E":0.0}]},{"PRI_DEP":"14700000","STATE":[{"A":23.0,"E":1.0}]},{"PRI_DEP":"95011000","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"Null","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"ST010000","STATE":[{"A":2.0,"E":0.0}]}] 

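Now I would like to combine all of these into a single JSON document, keyed by PRI_DEP.

So the desired JSON should look like this (updated and reformatted for readability):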

[{"PRI_DEP":" ",
    "Primary_Affiliation":
        [{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}],
    "LOA": 
        [{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}],
    "STATE":
        [{"A":2.0,"E":0.0}]},
 {"PRI_DEP":"14700000",
    "Primary_Affiliation": 
        [{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}],
    "LOA": 
        [{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}],
    "STATE":
        [{"A":23.0,"E":1.0}]}, 
 {"PRI_DEP":"95011000",
    "Primary_Affiliation":
        [{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}],
    "LOA":
        [{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
    "STATE":
        [{"A":1.0,"E":0.0}]},
 {"PRI_DEP":"Null",
    "Primary_Affiliation": 
        [{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}],
    "LOA":
        [{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
    "STATE":
        [{"A":1.0,"E":0.0}]},
 {"PRI_DEP":"ST010000",
    "Primary_Affiliation":
        [{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}],
    "LOA":
        [{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}],
    "STATE":
        [{"A":2.0,"E":0.0}]}]

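【Comments】:

  • It looks like your desired JSON is truncated. Could you update it?
  • I deliberately included only the first record, but I will update it with the remaining ones. There are only a few.

Tags: python json python-2.7 pandas dictionary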


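【Solution 1】:

I have been trying different ways of combining the DataFrames, and I think I found the answer.

After the Python code from my original post (which sets up the nested groups), I did the following: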

Group_frames = [Nested_PAFF_df.set_index('PRI_DEP'), Nested_LOA_df.set_index('PRI_DEP'), Nested_ST_df.set_index('PRI_DEP')]
result = pd.concat(Group_frames, axis=1).reset_index()
print(result.to_json(orient='records'))
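To make this easier to try in isolation, here is a minimal self-contained sketch of the same `pd.concat(..., axis=1)` step. The three small frames are stand-ins for Nested_PAFF_df, Nested_LOA_df and Nested_ST_df, filled with two PRI_DEP values taken from the JSON in the question. The lowercase variable names are just for this example.

```python
import json
import pandas as pd

# Stand-ins for the three nested frames, using values from the question's output.
nested_paff = pd.DataFrame({
    "PRI_DEP": ["14700000", "95011000"],
    "Primary_Affiliation": [
        [{"A": 0.0, "E": 3.0, "F": 0.0, "M": 1.0, "L": 1.0, "T": 19.0}],
        [{"A": 0.0, "E": 0.0, "F": 1.0, "M": 0.0, "L": 0.0, "T": 0.0}],
    ],
})
nested_loa = pd.DataFrame({
    "PRI_DEP": ["14700000", "95011000"],
    "LOA": [
        [{"Blue": 14.0, "UFM": 5.0, "Invalid": 1.0, "Bronze": 4.0, "Basic": 0.0}],
        [{"Blue": 0.0, "UFM": 0.0, "Invalid": 0.0, "Bronze": 1.0, "Basic": 0.0}],
    ],
})
nested_st = pd.DataFrame({
    "PRI_DEP": ["14700000", "95011000"],
    "STATE": [[{"A": 23.0, "E": 1.0}], [{"A": 1.0, "E": 0.0}]],
})

# Index every frame by PRI_DEP so concat lines the rows up by key, not by position.
frames = [f.set_index("PRI_DEP") for f in (nested_paff, nested_loa, nested_st)]
result = pd.concat(frames, axis=1).reset_index()

# Round-trip through json only to pretty-print and inspect the structure.
combined = json.loads(result.to_json(orient="records"))
print(json.dumps(combined, indent=2))
```

Since `pd.concat` aligns on the index with an outer join by default, a PRI_DEP that is missing from one of the frames would still appear in the result, with null in that frame's column.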

【Comments】:
