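【Posted】: 2019-01-11 13:11:13
【Question】:
I'm starting from a standard DataFrame and building several subset DataFrames of summarized data that contain nested arrays. I then need to combine those subset DataFrames in a way that gives me the expected JSON output. (Most of my code is modeled on MaxU's answer to "Convert Pandas Dataframe to nested JSON".)
The first few rows of my standard DataFrame, df (I can provide all 58 rows used in this example if needed):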
      ID  PRI_AFF   PRI_DEP     LOA  STATE
0   5571        M             Basic      A
1   5030        T  14700000    Blue      A
2   5030        T  14700000    Blue      A
3   5030        T  14700000    Blue      A
4   4014        T  14700000    Blue      A
5   2230        T  14700000     UFM      A
6   2230        T  14700000     UFM      A
7   2150        F  95011000  Bronze      A
8   2150        F  95011000  Bronze      A
9   2150        F  95011000  Bronze      A
10  2150        F  95011000  Bronze      A
From here I use the following Python:
import pandas as pd

# Count unique IDs per PRI_DEP, with one column per category value.
PAFF_df = df.groupby(['PRI_DEP','PRI_AFF'])['ID'].nunique().unstack().reset_index().fillna(0)
LOA_df = df.groupby(['PRI_DEP','LOA'])['ID'].nunique().unstack().reset_index().fillna(0)
ST_df = df.groupby(['PRI_DEP','STATE'])['ID'].nunique().unstack().reset_index().fillna(0)

# Fold the count columns of each PRI_DEP into a list of dicts.
# 'records' is the full name of the orient abbreviated as 'r' in older pandas.
Nested_PAFF_df = (PAFF_df.groupby(['PRI_DEP'], as_index=True)
                  .apply(lambda x: x[['A','E','F','L','M','T']].to_dict('records'))
                  .reset_index()
                  .rename(columns={0:'Primary_Affiliation'}))
Nested_LOA_df = (LOA_df.groupby(['PRI_DEP'], as_index=True)
                 .apply(lambda x: x[['Basic','Blue','Bronze','Invalid','UFM']].to_dict('records'))
                 .reset_index()
                 .rename(columns={0:'LOA'}))
Nested_ST_df = (ST_df.groupby(['PRI_DEP'], as_index=True)
                .apply(lambda x: x[['A','E']].to_dict('records'))
                .reset_index()
                .rename(columns={0:'STATE'}))
Calling .to_json(orient='records') on each of these gives me the properly nested JSON.
Primary Affiliation JSON:
[{"PRI_DEP":" ","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"14700000","Primary_Affiliation":[{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}]},{"PRI_DEP":"95011000","Primary_Affiliation":[{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"Null","Primary_Affiliation":[{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}]},{"PRI_DEP":"ST010000","Primary_Affiliation":[{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}]}]
LOA JSON:
[{"PRI_DEP":" ","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}]},{"PRI_DEP":"14700000","LOA":[{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}]},{"PRI_DEP":"95011000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"Null","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}]},{"PRI_DEP":"ST010000","LOA":[{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}]}]
STATE JSON:
[{"PRI_DEP":" ","STATE":[{"A":2.0,"E":0.0}]},{"PRI_DEP":"14700000","STATE":[{"A":23.0,"E":1.0}]},{"PRI_DEP":"95011000","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"Null","STATE":[{"A":1.0,"E":0.0}]},{"PRI_DEP":"ST010000","STATE":[{"A":2.0,"E":0.0}]}]
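For readers who want to reproduce one of these outputs, here is a minimal sketch of the count, pivot and nest steps, run on only the eleven sample rows shown earlier (so the counts differ from the 58-row results). It assumes the blank PRI_DEP in row 0 is a single space, as the JSON suggests. The names loa_counts, value_cols and nested_loa are made up for this illustration, and the nesting uses a plain list comprehension rather than groupby().apply(), which should be equivalent here because each PRI_DEP occupies exactly one row after unstack().

```python
import json

import pandas as pd

# The eleven sample rows from the question. Assumption: the blank
# PRI_DEP of row 0 is a single space, matching the JSON output.
df = pd.DataFrame({
    "ID": [5571, 5030, 5030, 5030, 4014, 2230, 2230, 2150, 2150, 2150, 2150],
    "PRI_AFF": ["M"] + ["T"] * 6 + ["F"] * 4,
    "PRI_DEP": [" "] + ["14700000"] * 6 + ["95011000"] * 4,
    "LOA": ["Basic"] + ["Blue"] * 4 + ["UFM"] * 2 + ["Bronze"] * 4,
    "STATE": ["A"] * 11,
})

# Step 1: count distinct IDs per (PRI_DEP, LOA), then pivot LOA into columns.
loa_counts = (df.groupby(["PRI_DEP", "LOA"])["ID"].nunique()
                .unstack()
                .fillna(0)
                .reset_index())

# Step 2: fold the count columns of each row into a one-element list of dicts.
value_cols = [c for c in loa_counts.columns if c != "PRI_DEP"]
nested_loa = pd.DataFrame({
    "PRI_DEP": loa_counts["PRI_DEP"],
    "LOA": [[rec] for rec in loa_counts[value_cols].to_dict("records")],
})

# Step 3: serialize; list-of-dict cells become nested JSON arrays.
nested_loa_json = nested_loa.to_json(orient="records")
print(json.dumps(json.loads(nested_loa_json), indent=1))
```

Deriving value_cols from the frame avoids hard-coding category names such as 'Invalid', which do not occur in this small sample.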
Now I'd like to represent all of these in a single JSON, keyed by PRI_DEP.
So the desired JSON should look like this (updated for readability):
[{"PRI_DEP":" ",
"Primary_Affiliation":
[{"A":0.0,"E":0.0,"F":0.0,"M":2.0,"L":0.0,"T":0.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":1.0}],
"STATE":
[{"A":2.0,"E":0.0}]},
{"PRI_DEP":"14700000",
"Primary_Affiliation":
[{"A":0.0,"E":3.0,"F":0.0,"M":1.0,"L":1.0,"T":19.0}],
"LOA":
[{"Blue":14.0,"UFM":5.0,"Invalid":1.0,"Bronze":4.0,"Basic":0.0}],
"STATE":
[{"A":23.0,"E":1.0}]},
{"PRI_DEP":"95011000",
"Primary_Affiliation":
[{"A":0.0,"E":0.0,"F":1.0,"M":0.0,"L":0.0,"T":0.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
"STATE":
[{"A":1.0,"E":0.0}]},
{"PRI_DEP":"Null",
"Primary_Affiliation":
[{"A":0.0,"E":1.0,"F":0.0,"M":0.0,"L":0.0,"T":0.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":0.0,"Bronze":1.0,"Basic":0.0}],
"STATE":
[{"A":1.0,"E":0.0}]},
{"PRI_DEP":"ST010000",
"Primary_Affiliation":
[{"A":1.0,"E":0.0,"F":0.0,"M":0.0,"L":0.0,"T":1.0}],
"LOA":
[{"Blue":0.0,"UFM":0.0,"Invalid":1.0,"Bronze":0.0,"Basic":1.0}],
"STATE":
[{"A":2.0,"E":0.0}]}]
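One possible way to get this shape, offered as a sketch rather than a confirmed answer, is to merge the three nested frames on PRI_DEP and serialize the result once. The example below uses small hand-built stand-ins for Nested_PAFF_df, Nested_LOA_df and Nested_ST_df, with values copied from the first two records above. The helper combine_on_key is hypothetical and not part of pandas.

```python
import json
from functools import reduce

import pandas as pd

# Hand-built stand-ins for the three nested frames (first two PRI_DEP values only).
nested_paff = pd.DataFrame({
    "PRI_DEP": [" ", "14700000"],
    "Primary_Affiliation": [
        [{"A": 0.0, "E": 0.0, "F": 0.0, "M": 2.0, "L": 0.0, "T": 0.0}],
        [{"A": 0.0, "E": 3.0, "F": 0.0, "M": 1.0, "L": 1.0, "T": 19.0}],
    ],
})
nested_loa = pd.DataFrame({
    "PRI_DEP": [" ", "14700000"],
    "LOA": [
        [{"Blue": 0.0, "UFM": 0.0, "Invalid": 0.0, "Bronze": 1.0, "Basic": 1.0}],
        [{"Blue": 14.0, "UFM": 5.0, "Invalid": 1.0, "Bronze": 4.0, "Basic": 0.0}],
    ],
})
nested_st = pd.DataFrame({
    "PRI_DEP": [" ", "14700000"],
    "STATE": [
        [{"A": 2.0, "E": 0.0}],
        [{"A": 23.0, "E": 1.0}],
    ],
})


def combine_on_key(frames, key="PRI_DEP"):
    """Hypothetical helper: merge the frames pairwise on `key`.

    how="outer" keeps a key that is missing from one frame; its cell
    becomes NaN and is serialized as null.
    """
    return reduce(lambda left, right: left.merge(right, on=key, how="outer"), frames)


combined = combine_on_key([nested_paff, nested_loa, nested_st])

# One record per PRI_DEP, each carrying all three nested arrays.
combined_json = combined.to_json(orient="records")
print(json.dumps(json.loads(combined_json), indent=1))
```

With the real data, passing [Nested_PAFF_df, Nested_LOA_df, Nested_ST_df] to the same helper should yield one record per PRI_DEP. Whether an outer or an inner merge is the right choice depends on whether every PRI_DEP is guaranteed to appear in all three frames.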
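【Comments】:
-
It looks like your desired JSON is truncated. Can you update it?
-
I deliberately included only the first record, but I'll update it with the rest. There are only a few.
Tags: python json python-2.7 pandas dictionary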