Convert dataframe into specific JSON: answers

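【Question title】: Convert dataframe into specific JSON
【Posted】: 2021-11-25 12:27:00
【Question description】:

I want to convert my DataFrame into a specific JSON. I tried using to_dict(), but so far I have not found the right parameters to reproduce the output.

Do you have any ideas?

My code: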

import pandas as pd
data = {
    'alt' : ["BeattheBeachmark NEW", "BeattheBeachmark NEW"],
    'Mod' : ["GA", "GA"],
    'Pers' : ["Movment", "Movment"],
    'Vie' : ["Inprogress", "Inprogress"],
    'Actions' : ["Clear", "Add"]
}

df = pd.DataFrame(data)

My desired output:

result = {
    "alt" : {
        "BeattheBeachmark NEW" : {
            "Mod" : {
                "GA" :  {
                    "Pers" : {
                        "Movment" : {
                            "Vie" : {
                                "Inprogress" : {
                                    'Actions' : ["Clear", "Add"]
                                }
                            }
                        }
                    }

                }
            }

        }
    }
}

【Question comments】:

pandas.pydata.org/docs/reference/api/…

Tags: python json pandas dictionary

【Solution 1】:

You can group the DataFrame by "alt", "Mod", and so on, and build your dictionary along the way:

import pandas as pd
import json
data = {
    'alt' : ["BeattheBeachmark NEW", "BeattheBeachmark NEW"],
    'Mod' : ["GA", "GA"],
    'Pers' : ["Movment", "Movment"],
    'Vie' : ["Inprogress", "Inprogress"],
    'Actions' : ["Clear", "Add"]
}

df = pd.DataFrame(data)
output_dict = {'alt': {}}

# Group by one column per nesting level; each *_level dict collects that level's values.
for alt_key, alt_grp in df.groupby("alt"):
    mod_level = {}
    output_dict['alt'][alt_key] = {"Mod": mod_level}
    for mod_key, mod_grp in alt_grp.groupby("Mod"):
        pers_level = {}
        mod_level[mod_key] = {"Pers": pers_level}
        for pers_key, pers_grp in mod_grp.groupby("Pers"):
            vie_level = {}
            pers_level[pers_key] = {"Vie": vie_level}
            for vie_key, vie_grp in pers_grp.groupby("Vie"):
                # Innermost level: keep the remaining column as a plain list.
                vie_level[vie_key] = {"Actions": list(vie_grp["Actions"])}

print(json.dumps(output_dict, indent=4))

Output:

{
    "alt": {
        "BeattheBeachmark NEW": {
            "Mod": {
                "GA": {
                    "Pers": {
                        "Movment": {
                            "Vie": {
                                "Inprogress": {
                                    "Actions": [
                                        "Clear",
                                        "Add"
                                    ]
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Edit: for archival purposes, I have added a recursive solution for this kind of problem, which makes it more general:

import pandas as pd
import json
data = {
    'alt' : ["BeattheBeachmark NEW", "BeattheBeachmark NEW"],
    'Mod' : ["GA", "GA"],
    'Pers' : ["Movment", "Movment"],
    'Vie' : ["Inprogress", "Inprogress"],
    'Actions' : ["Clear", "Add"]
}

df_in = pd.DataFrame(data)
output_dict = dict()

def extract_columns(df, col, output_dict):
    """Nest column number `col` and every later column of df into output_dict."""
    name = df.columns[col]
    if col == len(df.columns) - 1:
        # Last column: store its values as a plain list.
        output_dict[name] = list(df[name])
    else:
        # Otherwise create one nested dict per distinct value and recurse into each group.
        output_dict[name] = dict()
        for value, group in df.groupby(name):
            output_dict[name][value] = dict()
            extract_columns(group, col + 1, output_dict[name][value])


extract_columns(df_in, 0, output_dict)

print(json.dumps(output_dict, indent=4))
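
A non-recursive variant of the same idea can be sketched as follows: group by all key columns at once and walk down the nested dictionary with dict.setdefault. The helper name nest_dataframe and its parameters key_cols and value_col are illustrative assumptions, not part of the answer above; sort=False is passed so that groups keep their order of first appearance.

```python
import json

import pandas as pd


def nest_dataframe(df, key_cols, value_col):
    """Hypothetical helper: nest value_col under one dict level per key column."""
    root = {}
    # Grouping by a list of columns yields one (keys, sub-frame) pair per
    # distinct combination of key values.
    for keys, group in df.groupby(key_cols, sort=False):
        if not isinstance(keys, tuple):
            # Some pandas versions return a scalar when there is a single key.
            keys = (keys,)
        node = root
        for col, key in zip(key_cols, keys):
            # Descend into {col: {key: {...}}}, creating levels as needed.
            node = node.setdefault(col, {}).setdefault(key, {})
        # tolist() gives plain Python objects, which json.dumps accepts.
        node[value_col] = group[value_col].tolist()
    return root


df = pd.DataFrame({
    'alt': ["BeattheBeachmark NEW", "BeattheBeachmark NEW"],
    'Mod': ["GA", "GA"],
    'Pers': ["Movment", "Movment"],
    'Vie': ["Inprogress", "Inprogress"],
    'Actions': ["Clear", "Add"],
})

nested = nest_dataframe(df, ["alt", "Mod", "Pers", "Vie"], "Actions")
print(json.dumps(nested, indent=4))
```

Unlike the recursive version, this sketch takes the key columns explicitly, so the value column does not have to be the last one in the DataFrame.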

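【Comments】:

【Solution 2】:

To get the same dictionary as in your example, you can iterate over the DataFrame's columns and build the dictionary (with literal evaluation to help, since df.to_json returns a string and you need a list):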

import ast
your_dict = {}

for col in df.columns:
    your_dict[col] = df[col].to_json(orient='records')
    your_dict[col] = ast.literal_eval(your_dict[col])

print(your_dict)

This gives you:

{'alt': ['BeattheBeachmark NEW', 'BeattheBeachmark NEW'],
 'Mod': ['GA', 'GA'],
 'Pers': ['Movment', 'Movment'],
 'Vie': ['Inprogress', 'Inprogress'],
 'Actions': ['Clear', 'Add']}
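
For comparison, here is a minimal sketch of two other ways to arrive at the same flat column-to-list dictionary, assuming the df from the question. json.loads parses the string returned by to_json directly (ast.literal_eval only understands Python literals, so it would raise on JSON values such as null or true), and DataFrame.to_dict(orient="list") avoids the string round trip altogether. The variable names via_json and via_to_dict are illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame({
    'alt': ["BeattheBeachmark NEW", "BeattheBeachmark NEW"],
    'Mod': ["GA", "GA"],
    'Pers': ["Movment", "Movment"],
    'Vie': ["Inprogress", "Inprogress"],
    'Actions': ["Clear", "Add"],
})

# Parse the JSON string produced by Series.to_json with the json module.
via_json = {col: json.loads(df[col].to_json(orient="records")) for col in df.columns}

# Or let pandas build the {column: [values]} dictionary directly.
via_to_dict = df.to_dict(orient="list")

print(via_to_dict)
```

Both yield the flat structure shown above, not the nested one requested in the question.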

【Comments】: