【Question title】: How can I create a nested JSON file from a Pandas dataframe in Python?
【Posted】: 2020-07-12 17:08:13
【Question】:

I have the following dataframe:

 df.head(2)

 Ord  MOT  MVT  CUST  CreationSla  CreationPlanned  CreationProposed  PickupSla  PickupPlanned  PickupProposed
 12   TR   TT   DEA   12-3-2020    12-3-2020        12-3-2020         14-3-2020  14-3-2020      14-3-2020
 15   ZR   TD   DET   15-3-2020    15-3-2020        15-3-2020         16-3-2020  16-3-2020      16-3-2020

I want to create a nested JSON file in the following format:

Expected output

{
    "Ord" : "12",
    "MOT" : "TR",
    "MVT" : "TT",
    "CUST" : "DEA",
    "milestone" : {
        "creation" : {
            "sla" : "12-3-2020",
            "plan" : "12-3-2020",
            "proposed" : "12-3-2020"
        },
        "Pickup" : {
            "sla" : "14-3-2020",
            "plan" : "14-3-2020",
            "proposed" : "14-3-2020"
        }
    }
}

How can I do this in Python?

【Comments】:

    Tags: python json python-3.x pandas dataframe


    【Answer 1】:

    First, iterate over the rows of the dataframe.

    Then build a dict for each row. The full idea looks like this:

    result = []
    # Prefixes whose columns should be grouped into a nested dict
    predefined_columns = ['Creation', 'Pickup', 'Departure']
    # For each column: the matching prefix, or False if it stays top-level
    mapping_cl = []
    for column in df.columns:
        flag = False
        for sub_str in predefined_columns:
            if sub_str in column:
                flag = True
                mapping_cl.append(sub_str)
                break
        if not flag:
            mapping_cl.append(False)
    for index, row in df.iterrows():
        item = {}
        # Create an empty nested dict for every prefix found in the columns
        for cl in mapping_cl:
            if cl:
                item[cl] = {}
        for i, column in enumerate(df.columns):
            if mapping_cl[i]:
                # 'CreationSla' -> nested key 'Sla' under 'Creation'
                cl_name = column.split(mapping_cl[i])[-1]
                item[mapping_cl[i]][cl_name] = row[column]
            else:
                item[column] = row[column]
        result.append(item)
    

    Now result is the list of dicts you want.
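The loop above keeps the original suffixes (Sla, Planned, Proposed) as keys and leaves each group at the top level. Below is a minimal sketch of one way to reach the exact shape in the question (a milestone wrapper, renamed keys, string values) and write it to a file. The GROUPS and SUFFIX_KEYS mappings, the row_to_nested helper, and the output file name are assumptions made for illustration; they are not part of the original answer.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Ord": [12, 15],
    "MOT": ["TR", "ZR"],
    "MVT": ["TT", "TD"],
    "CUST": ["DEA", "DET"],
    "CreationSla": ["12-3-2020", "15-3-2020"],
    "CreationPlanned": ["12-3-2020", "15-3-2020"],
    "CreationProposed": ["12-3-2020", "15-3-2020"],
    "PickupSla": ["14-3-2020", "16-3-2020"],
    "PickupPlanned": ["14-3-2020", "16-3-2020"],
    "PickupProposed": ["14-3-2020", "16-3-2020"],
})

# Assumed mappings: column prefix -> group key, column suffix -> nested key
GROUPS = {"Creation": "creation", "Pickup": "Pickup"}
SUFFIX_KEYS = {"Sla": "sla", "Planned": "plan", "Proposed": "proposed"}


def row_to_nested(row):
    """Turn one flat row (a dict) into the nested shape from the question."""
    record, milestone = {}, {}
    for column, value in row.items():
        for prefix, group_key in GROUPS.items():
            if column.startswith(prefix):
                suffix = column[len(prefix):]
                key = SUFFIX_KEYS.get(suffix, suffix)
                milestone.setdefault(group_key, {})[key] = str(value)
                break
        else:
            # No prefix matched: keep the column at the top level
            record[column] = str(value)
    record["milestone"] = milestone
    return record


records = [row_to_nested(row) for row in df.to_dict("records")]

# Hypothetical output location; any writable path works
output_path = Path(tempfile.gettempdir()) / "nested_orders.json"
with open(output_path, "w", encoding="utf-8") as fh:
    json.dump(records, fh, indent=4)
```

Supporting another group such as Departure should only require one more entry in GROUPS, provided its columns follow the same prefix-plus-suffix naming.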

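    【Comments】:

    • What if I also have a Departure sub-column with its own timestamps?
    • The key point is that you need to predefine all the column-name prefixes you want to split on. You then check whether each column matches an entry in that list and, if it does, split it. I will change my code to fit that.
    • That would be great.
    • Hope that helps.
    • I have just edited it; hopefully this post now solves your problem for as many predefined columns as you need.
    【Answer 2】:

    You can create a JSON template and fill it with the data: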

    import json

    d = """{
        "Ord" : "%s",
        "MOT" : "%s",
        "MVT" : "%s",
        "CUST" : "%s",
        "milestone" : {
            "creation" : {
                "sla" : "%s",
                "plan" : "%s",
                "proposed" : "%s"
            },
            "Pickup" : {
                "sla" : "%s",
                "plan" : "%s",
                "proposed" : "%s"
            }
        }
    }
    """
    js = []
    
    for item in df.values:
        js.append(json.loads(d%tuple(item.tolist())))
    
    print(json.dumps(js))
    

    Output:

    [{"Ord": "12", "MOT": "TR", "MVT": "TT", "CUST": "DEA", "milestone": {"creation": {"sla": "12-3-2020", "plan": "12-3-2020", "proposed": "12-3-2020"}, "Pickup": {"sla": "14-3-2020", "plan": "14-3-2020", "proposed": "14-3-2020"}}}, {"Ord": "15", "MOT": "ZR", "MVT": "TD", "CUST": "DET", "milestone": {"creation": {"sla": "15-3-2020", "plan": "15-3-2020", "proposed": "15-3-2020"}, "Pickup": {"sla": "16-3-2020", "plan": "16-3-2020", "proposed": "16-3-2020"}}}]
    

    【Comments】:

    • @Rahulrajan you can use json.dumps; I changed the answer.
    • A template with keyword arguments, filled from a dict, would be more robust than this.
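The keyword-argument idea from the last comment could look roughly like the sketch below. It uses Python's `%(name)s` placeholders so that each value is matched by column name, not by position. The template text and variable names are illustrative assumptions, and the sample data is copied from the question.

```python
import json

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Ord": [12, 15],
    "MOT": ["TR", "ZR"],
    "MVT": ["TT", "TD"],
    "CUST": ["DEA", "DET"],
    "CreationSla": ["12-3-2020", "15-3-2020"],
    "CreationPlanned": ["12-3-2020", "15-3-2020"],
    "CreationProposed": ["12-3-2020", "15-3-2020"],
    "PickupSla": ["14-3-2020", "16-3-2020"],
    "PickupPlanned": ["14-3-2020", "16-3-2020"],
    "PickupProposed": ["14-3-2020", "16-3-2020"],
})

# Each placeholder names the dataframe column it is filled from,
# so reordering the columns does not change the result.
template = """{
    "Ord": "%(Ord)s",
    "MOT": "%(MOT)s",
    "MVT": "%(MVT)s",
    "CUST": "%(CUST)s",
    "milestone": {
        "creation": {
            "sla": "%(CreationSla)s",
            "plan": "%(CreationPlanned)s",
            "proposed": "%(CreationProposed)s"
        },
        "Pickup": {
            "sla": "%(PickupSla)s",
            "plan": "%(PickupPlanned)s",
            "proposed": "%(PickupProposed)s"
        }
    }
}"""

# to_dict("records") yields one {column: value} dict per row
records = [json.loads(template % row) for row in df.to_dict("records")]
print(json.dumps(records, indent=4))
```

Any string template shares one limitation: a value containing a double quote or a backslash produces invalid JSON. Building a dict and letting `json.dumps` serialize it avoids that.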
    【Answer 3】:

    Since you mentioned Pandas, I use wide_to_long and then groupby to build your format. Note that you will need to change level if the data format changes.

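    import pandas as pd

    # Reshape so each row is one (order, milestone) pair and each column is a suffix (Planned, Proposed, Sla)
    s = pd.wide_to_long(df, stubnames=['Creation', 'Pickup'], i=['Ord', 'MOT', 'MVT', 'CUST'], j='type', suffix=r'\w+').stack().unstack(level=-2)
    # One dict per order: the id columns plus a nested 'milestone' dict
    js = [{**dict(zip(s.index.names[:-1], x)), **{'milestone': y.reset_index(level=[0, 1, 2, 3], drop=True).to_dict('index')}} for x, y in s.groupby(level=[0, 1, 2, 3])]
    js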
    [{'Ord': 12, 'MOT': 'TR', 'MVT': 'TT', 'CUST': 'DEA',
     'milestone':
      {'Creation':
      {'Planned': '12-3-2020', 'Proposed': '12-3-2020', 'Sla': '12-3-2020'}, 'Pickup': {'Planned': '14-3-2020', 'Proposed': '14-3-2020', 'Sla': '14-3-2020'}}},
     {'Ord': 15, 'MOT': 'ZR', 'MVT': 'TD', 'CUST': 'DET',
      'milestone':
      {'Creation':
      {'Planned': '15-3-2020', 'Proposed': '15-3-2020', 'Sla': '15-3-2020'}, 'Pickup': {'Planned': '16-3-2020', 'Proposed': '16-3-2020', 'Sla': '16-3-2020'}}}]
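To make the reshaping easier to follow, the sketch below splits the chained call into its two intermediate frames. The sample data is copied from the question; the names id_columns and long_df are introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Ord": [12, 15],
    "MOT": ["TR", "ZR"],
    "MVT": ["TT", "TD"],
    "CUST": ["DEA", "DET"],
    "CreationSla": ["12-3-2020", "15-3-2020"],
    "CreationPlanned": ["12-3-2020", "15-3-2020"],
    "CreationProposed": ["12-3-2020", "15-3-2020"],
    "PickupSla": ["14-3-2020", "16-3-2020"],
    "PickupPlanned": ["14-3-2020", "16-3-2020"],
    "PickupProposed": ["14-3-2020", "16-3-2020"],
})

id_columns = ["Ord", "MOT", "MVT", "CUST"]

# Step 1: wide to long. The stub names become the columns, and the text
# after each stub (Sla, Planned, Proposed) becomes the 'type' index level.
long_df = pd.wide_to_long(
    df,
    stubnames=["Creation", "Pickup"],
    i=id_columns,
    j="type",
    suffix=r"\w+",
)
print(long_df)

# Step 2: move the stub names into the index and 'type' into the columns,
# so each row is one (order, milestone) pair.
s = long_df.stack().unstack(level=-2)
print(s)
```

The `level=[0, 1, 2, 3]` arguments in the answer refer to the four id columns in this index, which is why they have to change if the set of id columns changes.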
    

    【Comments】:
