【Question title】:How to create a nested, custom JSON format for a DataFrame
【Posted】:2022-10-13 15:41:36
【Question description】:

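I want to create sub-categories from an existing DataFrame. The change I need is at the column level, not in the data: some columns share a base name and differ only in a prefix such as halo_ or delta_, while the rest have unrelated names. The DataFrame has these columns (sample table):
|payer_id|payer_name|halo_payer_name|delta_payer_name|halo_desc|delta_desc|halo_operations|delta_notes|halo_processed_data|delta_processed_data|extra|insurance_company|
I want these columns grouped into a halo group: halo_payer_name|halo_desc|halo_operations|halo_processed_data|
I want these columns grouped into a delta group: delta_payer_name|delta_desc|delta_notes|delta_processed_data|
The remaining columns form one group, so that the converted JSON has this layout: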

{
    "schema": {
        "fields": [{
                "payer_details": [{
                        "name": "payer_id",
                        "type": "string"
                    },
                    {
                        "name": "payer_name",
                        "type": "string"
                    },
                    {
                        "name": "extra",
                        "type": "string"
                    },
                    {
                        "name": "insurance_company",
                        "type": "string"
                    }
                ]
            },
            {
                "halo": [{
                        "name": "halo_payer_name",
                        "type": "string"
                    },
                    {
                        "name": "halo_desc",
                        "type": "string"
                    },
                    {
                        "name": "halo_operations",
                        "type": "string"
                    },
                    {
                        "name": "halo_processed_data",
                        "type": "string"
                    }
                ]
            }, {
                "delta": [{
                        "name": "delta_payer_name",
                        "type": "string"
                    },
                    {
                        "name": "delta_desc",
                        "type": "string"
                    },
                    {
                        "name": "delta_notes",
                        "type": "string"
                    },
                    {
                        "name": "delta_processed_data",
                        "type": "string"
                    }
                ]
            }
        ],
        "pandas_version": "1.4.0"
    },
    "masterdata": [{
        "payer_details": [{
            "payer_id": "",
            "payer_name": "",
            "extra": "",
            "insurance_company": ""
        }],
        "halo": [{
            "halo_payer_name": "",
            "halo_desc": "",
            "halo_operations": "",
            "halo_processed_data": ""
        }],
        "delta": [{
            "delta_payer_name": "",
            "delta_desc": "",
            "delta_notes": "",
            "delta_processed_data": ""
        }]
    }]
}

I could not find a solution for this case, because it groups by column rather than by data.

【Question comments】:

    Tags: json dataframe group-by


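    【Solution 1】:

    I came across a post today that helped with my case: take the data from the DataFrame, loop over it to build the records, insert them into a dictionary, and then convert the whole thing to a JSON file. The reference that helped me is link. The solution to this problem looks like this: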

    schema={
        "schema": {
            "fields": [{
                    "payer_details": [{
                            "name": "payer_id",
                            "type": "string"
                        },
                        {
                            "name": "payer_name",
                            "type": "string"
                        },
                        {
                            "name": "extra",
                            "type": "string"
                        },
                        {
                            "name": "insurance_company",
                            "type": "string"
                        }
                    ]
                },
                {
                    "halo": [{
                            "name": "halo_payer_name",
                            "type": "string"
                        },
                        {
                            "name": "halo_desc",
                            "type": "string"
                        },
                        {
                            "name": "halo_operations",
                            "type": "string"
                        },
                        {
                            "name": "halo_processed_data",
                            "type": "string"
                        }
                    ]
                }, {
                    "delta": [{
                            "name": "delta_payer_name",
                            "type": "string"
                        },
                        {
                            "name": "delta_desc",
                            "type": "string"
                        },
                        {
                            "name": "delta_notes",
                            "type": "string"
                        },
                        {
                            "name": "delta_processed_data",
                            "type": "string"
                        }
                    ]
                }
            ],
            "pandas_version": "1.4.0"
        },
        "masterdata": []
    }
    


    I derived the schema above to fit my needs.
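The hand-written schema above could also be generated from the column names. The following is only a sketch under a few assumptions: the helper names `group_columns` and `build_schema` are hypothetical, the `halo_` and `delta_` prefixes identify their groups, every other column belongs to `payer_details`, and every field is typed as `"string"` as in the schema above.

```python
def group_columns(columns, prefixes=("halo", "delta"), default_group="payer_details"):
    """Map each group name to its columns, keeping the original column order."""
    groups = {default_group: []}
    groups.update({prefix: [] for prefix in prefixes})
    for col in columns:
        for prefix in prefixes:
            if col.startswith(prefix + "_"):
                groups[prefix].append(col)
                break
        else:
            # No known prefix matched, so the column goes to the default group.
            groups[default_group].append(col)
    return groups


def build_schema(groups, pandas_version="1.4.0"):
    """Build the nested schema dict, with one {group: [fields]} entry per group."""
    fields = [
        {group: [{"name": col, "type": "string"} for col in cols]}
        for group, cols in groups.items()
    ]
    return {
        "schema": {"fields": fields, "pandas_version": pandas_version},
        "masterdata": [],
    }


# Column names taken from the sample table in the question.
columns = [
    "payer_id", "payer_name", "halo_payer_name", "delta_payer_name",
    "halo_desc", "delta_desc", "halo_operations", "delta_notes",
    "halo_processed_data", "delta_processed_data", "extra", "insurance_company",
]
groups = group_columns(columns)
schema = build_schema(groups)
```

With a real DataFrame, `list(df.columns)` could be passed in place of the hard-coded `columns` list.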

    # Build one nested record per DataFrame row; every value is stored as a string.
    payer_list = []
    for i in df.index:
        case = {
            "payer_details": [{
                "payer_id": str(df.at[i, "payer_id"]),
                "payer_name": str(df.at[i, "payer_name"]),
                "extra": str(df.at[i, "extra"]),
                "insurance_company": str(df.at[i, "insurance_company"]),
            }],
            "halo": [{
                "halo_payer_name": str(df.at[i, "halo_payer_name"]),
                "halo_desc": str(df.at[i, "halo_desc"]),
                "halo_operations": str(df.at[i, "halo_operations"]),
                "halo_processed_data": str(df.at[i, "halo_processed_data"]),
            }],
            "delta": [{
                "delta_payer_name": str(df.at[i, "delta_payer_name"]),
                "delta_desc": str(df.at[i, "delta_desc"]),
                "delta_notes": str(df.at[i, "delta_notes"]),
                "delta_processed_data": str(df.at[i, "delta_processed_data"]),
            }],
        }
        payer_list.append(case)
    # Attach the per-row records to the schema dictionary defined earlier.
    schema["masterdata"] = payer_list
    

    Create an empty list, run the loop, append each record to the list, and then attach the list to the schema.

    【Comments】:
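The same loop can be written once for any set of column groups, followed by the JSON conversion the answer mentions. This is a minimal sketch rather than the answer's own code: `GROUPS`, `build_masterdata`, and the one-row sample DataFrame are made up for illustration, and the `fields` list is left empty here for brevity.

```python
import json

import pandas as pd

# Column groups copied from the question; adjust for other DataFrames.
GROUPS = {
    "payer_details": ["payer_id", "payer_name", "extra", "insurance_company"],
    "halo": ["halo_payer_name", "halo_desc", "halo_operations", "halo_processed_data"],
    "delta": ["delta_payer_name", "delta_desc", "delta_notes", "delta_processed_data"],
}


def build_masterdata(df, groups):
    """Return one nested record per row (hypothetical helper mirroring the loop)."""
    # Convert each group's columns to strings, then to a list of per-row dicts.
    per_group = {
        name: df[cols].astype(str).to_dict(orient="records")
        for name, cols in groups.items()
    }
    # Wrap each group's row dict in a single-element list, as in the target layout.
    return [
        {name: [records[i]] for name, records in per_group.items()}
        for i in range(len(df))
    ]


# Tiny made-up frame, purely for illustration.
all_columns = [col for cols in GROUPS.values() for col in cols]
df = pd.DataFrame([{col: f"{col}_value" for col in all_columns}])

# Schema stub: the "fields" entries are omitted here for brevity.
document = {"schema": {"fields": [], "pandas_version": "1.4.0"}, "masterdata": []}
document["masterdata"] = build_masterdata(df, GROUPS)

# Serialize the whole dictionary; json.dump(document, file_obj) would write a file instead.
json_text = json.dumps(document, indent=4)
```

Converting with `astype(str)` keeps the behaviour of the original loop, where every value is formatted as a string, so missing values come out as the text "nan" rather than JSON null.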
