【Question title】: Export Nested JSON to CSV using Python
【Posted】: 2019-08-01 20:42:05
【Question description】:

I have the following JSON, which I got from Xero. It is nested JSON, and I am trying to create a flat table from it and then export that table to CSV.

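I have written the Python code below, but I am struggling to flatten the nested JSON. Initially I get the data from Xero and use json.dumps to serialise the datetime values. The JSON export shown here comes from Postman. When I fetch the JSON with Python, the dates have the form 'UpdatedDateUTC': datetime.datetime(2018, 10, 24, 12, 53, 55, 930000), so I use json.dumps to serialise them.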
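As a small illustration of the serialisation step described above, the sketch below shows how `json.dumps(..., default=str)` turns a `datetime` value into a plain string. The record is a made-up stand-in for one element of the list returned by the API, not real Xero output.

```python
import datetime
import json

# Hypothetical record shaped like one invoice in the list returned by the API.
record = {
    "InvoiceID": "8289ab9d-2134-4601-8622-e7fdae4b6d89",
    "UpdatedDateUTC": datetime.datetime(2018, 10, 24, 12, 53, 55, 930000),
}

# json.dumps cannot serialise datetime objects by itself. default=str is
# called for every object it does not support, so the datetime is written
# out as its str() representation.
serialised = json.dumps([record], default=str)

# After a round trip the date is an ordinary string, not a datetime.
restored = json.loads(serialised)
print(restored[0]["UpdatedDateUTC"])  # 2018-10-24 12:53:55.930000
```

Only values that `json` cannot handle go through `default`; the structure of the lists and dicts is left as it is.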

When I produce the first export:

df = pd.read_json(b_str)
df.to_csv(path+'invoices.csv')

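The CSV file looks like this:

The next step is to flatten the Contact and CreditNotes columns and make them part of the main table. So instead of the Contact column I would have 8 new columns: ContactID, ContactNumber, Name, Addresses, Phones, ContactGroups, ContactPersons, HasValidationErrors. The CreditNotes column would go through a similar process.

I tried to replicate the approach on this link, but with no luck. I get an export that looks like this. The contacts_with_id dataframe is shown on multiple rows rather than multiple columns. I cannot figure out what I am doing wrong.

I have also used the flatten_json function, but with no luck either.

I do not really need to make this particular approach work. I just want to find a way to export the nested JSON to a readable CSV file.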


Python code:

from xero import Xero
from xero.auth import PrivateCredentials
with open("E:\\privatekey.pem") as keyfile:
    rsa_key = keyfile.read()
credentials = PrivateCredentials('BHK1ZBEKIL4WM0BLKLIOT65PSIA43N', rsa_key)
xero = Xero(credentials)

import json
import pandas as pd
from pandas.io.json import json_normalize #package for flattening json in pandas df

# The following is a list
a_list = xero.invoices.all()

# The following is a string. Serialised Datetime
b_str = json.dumps(a_list, default=str)

path='E:\\MyDrive\\Python Workspaces\\'
df = pd.read_json(b_str)
df.to_csv(path+'invoices.csv')

# ********************* FLATTEN JSON *****************

dd = json.loads(b_str)

contacts_with_id = pd.io.json.json_normalize(dd, record_path='Contact', meta='InvoiceID',
                                    record_prefix='Contact.')

df_final = pd.merge(contacts_with_id, df, how='inner', on='InvoiceID')
df_final.to_csv(path+'invoices_final.csv')

The JSON is as follows:

{
"Id": "568d1686-7c53-4f22-a93f-754589a246a7",
"Status": "OK",
"ProviderName": "Rest API",
"DateTimeUTC": "/Date(1552234854959)/",
"Invoices": [
    {
        "Type": "ACCPAY",
        "InvoiceID": "8289ab9d-2134-4601-8622-e7fdae4b6d89",
        "InvoiceNumber": "10522",
        "Reference": "10522",
        "Payments": [],
        "CreditNotes": [],
        "Prepayments": [],
        "Overpayments": [],
        "AmountDue": 102,
        "AmountPaid": 0,
        "AmountCredited": 0,
        "CurrencyRate": 1,
        "HasErrors": false,
        "IsDiscounted": false,
        "HasAttachments": false,
        "Contact": {
            "ContactID": "d1dba397-0f0b-4819-a6ce-2839b7be5008",
            "ContactNumber": "c03bbcb5-fb0b-4f46-83f0-8687f754488b",
            "Name": "Micro",
            "Addresses": [],
            "Phones": [],
            "ContactGroups": [],
            "ContactPersons": [],
            "HasValidationErrors": false
        },
        "DateString": "2017-02-06T00:00:00",
        "Date": "/Date(1486339200000+0000)/",
        "DueDateString": "2017-03-08T00:00:00",
        "DueDate": "/Date(1488931200000+0000)/",
        "Status": "AUTHORISED",
        "LineAmountTypes": "Exclusive",
        "LineItems": [],
        "SubTotal": 85,
        "TotalTax": 17,
        "Total": 102,
        "UpdatedDateUTC": "/Date(1529940362110+0000)/",
        "CurrencyCode": "GBP"
    },
    {
        "Type": "ACCREC",
        "InvoiceID": "9e37150f-88a5-4213-a085-b30c5e01c2bf",
        "InvoiceNumber": "(13)",
        "Reference": "",
        "Payments": [],
        "CreditNotes": [
            {
                "CreditNoteID": "3c5c7dec-534a-46e0-ad1b-f0f69822cfd5",
                "CreditNoteNumber": "(12)",
                "ID": "3c5c7dec-534a-46e0-ad1b-f0f69822cfd5",
                "AppliedAmount": 1200,
                "DateString": "2011-05-04T00:00:00",
                "Date": "/Date(1304467200000+0000)/",
                "LineItems": [],
                "Total": 7800
            },
            {
                "CreditNoteID": "af38e37f-4ba3-4208-a193-a32b418c2bbc",
                "CreditNoteNumber": "(14)",
                "ID": "af38e37f-4ba3-4208-a193-a32b418c2bbc",
                "AppliedAmount": 2600,
                "DateString": "2011-05-04T00:00:00",
                "Date": "/Date(1304467200000+0000)/",
                "LineItems": [],
                "Total": 2600
            }
        ],
        "Prepayments": [],
        "Overpayments": [],
        "AmountDue": 0,
        "AmountPaid": 0,
        "AmountCredited": 3800,
        "CurrencyRate": 1,
        "HasErrors": false,
        "IsDiscounted": false,
        "HasAttachments": false,
        "Contact": {
            "ContactID": "58164bd6-5225-4f30-ad89-35140db5b624",
            "ContactNumber": "d0b420b8-4a58-40d1-9717-8525edda7658",
            "Name": "FSales (1)",
            "Addresses": [],
            "Phones": [],
            "ContactGroups": [],
            "ContactPersons": [],
            "HasValidationErrors": false
        },
        "DateString": "2011-05-04T00:00:00",
        "Date": "/Date(1304467200000+0000)/",
        "DueDateString": "2011-06-03T00:00:00",
        "DueDate": "/Date(1307059200000+0000)/",
        "Status": "PAID",
        "LineAmountTypes": "Exclusive",
        "LineItems": [],
        "SubTotal": 3166.67,
        "TotalTax": 633.33,
        "Total": 3800,
        "UpdatedDateUTC": "/Date(1529943661150+0000)/",
        "CurrencyCode": "GBP",
        "FullyPaidOnDate": "/Date(1304467200000+0000)/"
    },
    {
        "Type": "ACCPAY",
        "InvoiceID": "1ddea7ec-a0d5-457a-a8fd-cfcdc2099d51",
        "InvoiceNumber": "01596057543",
        "Reference": "",
        "Payments": [
            {
                "PaymentID": "fd639da3-c009-47df-a4bf-98ccd5c68e43",
                "Date": "/Date(1551657600000+0000)/",
                "Amount": 173.86,
                "Reference": "",
                "CurrencyRate": 1,
                "HasAccount": false,
                "HasValidationErrors": false
            }
        ],
        "CreditNotes": [],
        "Prepayments": [],
        "Overpayments": [],
        "AmountDue": 0,
        "AmountPaid": 173.86,
        "AmountCredited": 0,
        "CurrencyRate": 1,
        "HasErrors": false,
        "IsDiscounted": false,
        "HasAttachments": true,
        "Contact": {
            "ContactID": "309afb74-0a3b-4d68-85e8-2259ca5acd13",
            "ContactNumber": "91eef1f0-5fe6-45d7-b739-1ab5352a5523",
            "Name": "Company AAA",
            "Addresses": [],
            "Phones": [],
            "ContactGroups": [],
            "ContactPersons": [],
            "HasValidationErrors": false
        },
        "DateString": "2019-02-23T00:00:00",
        "Date": "/Date(1550880000000+0000)/",
        "DueDateString": "2019-03-21T00:00:00",
        "DueDate": "/Date(1553126400000+0000)/",
        "Status": "PAID",
        "LineAmountTypes": "Exclusive",
        "LineItems": [],
        "SubTotal": 144.88,
        "TotalTax": 28.98,
        "Total": 173.86,
        "UpdatedDateUTC": "/Date(1551777481907+0000)/",
        "CurrencyCode": "GBP",
        "FullyPaidOnDate": "/Date(1551657600000+0000)/"
    },
    {
        "Type": "ACCPAY",
        "InvoiceID": "ba5ff3b1-1058-4645-80da-5475c23da949",
        "InvoiceNumber": "Q0603",
        "Reference": "",
        "Payments": [],
        "CreditNotes": [],
        "Prepayments": [],
        "Overpayments": [],
        "AmountDue": 213.24,
        "AmountPaid": 0,
        "AmountCredited": 0,
        "CurrencyRate": 1,
        "HasErrors": false,
        "IsDiscounted": false,
        "HasAttachments": true,
        "Contact": {
            "ContactID": "f0473b41-da92-4397-9d2c-741812f2475c",
            "ContactNumber": "1f124969-de8d-40b8-8140-d4997511b0dc",
            "Name": "BTelcom",
            "Addresses": [],
            "Phones": [],
            "ContactGroups": [],
            "ContactPersons": [],
            "HasValidationErrors": false
        },
        "DateString": "2019-03-05T00:00:00",
        "Date": "/Date(1551744000000+0000)/",
        "DueDateString": "2019-03-21T00:00:00",
        "DueDate": "/Date(1553126400000+0000)/",
        "Status": "SUBMITTED",
        "LineAmountTypes": "Exclusive",
        "LineItems": [],
        "SubTotal": 177.7,
        "TotalTax": 35.54,
        "Total": 213.24,
        "UpdatedDateUTC": "/Date(1552068778417+0000)/",
        "CurrencyCode": "GBP"
    }
]

}

【Question discussion】:

    Tags: python json pandas nested


    【Solution 1】:

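    I have had to do this before:

    The code below flattens the whole nested JSON, then iterates through the resulting keys to build the rows. Each flattened key contains the index of the invoice it belongs to, and that index decides which row of the table the value goes into.

    There are 4 invoices, so this creates 4 rows, one per invoice. Hopefully this is what you are looking for.

    NOTE: some issues you might run into:

    One thing to consider when flattening JSON whose nested lists have different lengths: as long as a single row has ONE value for a given column, that column has to be created, even if every other row is null. Each entry in the Payments list has 7 fields. So if one invoice has 8 payments while all the others have only 1, the flat file needs 56 extra columns (8 payments × 7 fields) to hold them all.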
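If the column growth described in the note becomes a problem, one possible alternative is to keep the nested lists in a separate "long" table, with one row per payment and the invoice ID repeated on each row. The sketch below assumes pandas 1.0 or later (for `pd.json_normalize`) and uses made-up, trimmed invoices; the `Payment.` prefix is an arbitrary choice for illustration.

```python
import pandas as pd

# Trimmed, made-up invoices with the same shape as the sample data.
invoices = [
    {"InvoiceID": "A", "Total": 102, "Payments": []},
    {"InvoiceID": "B", "Total": 300, "Payments": [
        {"PaymentID": "p1", "Amount": 100.0},
        {"PaymentID": "p2", "Amount": 200.0},
    ]},
    {"InvoiceID": "C", "Total": 173.86, "Payments": [
        {"PaymentID": "p3", "Amount": 173.86},
    ]},
]

# record_path names the nested list to expand into rows; meta lists the
# parent fields to repeat on each of those rows.
payments = pd.json_normalize(
    invoices,
    record_path="Payments",
    meta=["InvoiceID"],
    record_prefix="Payment.",
)

# One row per payment. Invoices with no payments contribute no rows.
print(payments)
```

With this layout the number of columns stays fixed however many payments an invoice has, at the cost of a second CSV file that has to be joined back to the invoices on InvoiceID.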

    jsonStr = '''{
    "Id": "568d1686-7c53-4f22-a93f-754589a246a7",
    "Status": "OK",
    "ProviderName": "Rest API",
    "DateTimeUTC": "/Date(1552234854959)/",
    "Invoices": [
        {
            "Type": "ACCPAY",
            "InvoiceID": "8289ab9d-2134-4601-8622-e7fdae4b6d89",
            "InvoiceNumber": "10522",
            "Reference": "10522",
            "Payments": [],
            "CreditNotes": [],
            "Prepayments": [],
            "Overpayments": [],
            "AmountDue": 102,
            "AmountPaid": 0,
            "AmountCredited": 0,
            "CurrencyRate": 1,
            "HasErrors": false,
            "IsDiscounted": false,
            "HasAttachments": false,
            "Contact": {
                "ContactID": "d1dba397-0f0b-4819-a6ce-2839b7be5008",
                "ContactNumber": "c03bbcb5-fb0b-4f46-83f0-8687f754488b",
                "Name": "Micro",
                "Addresses": [],
                "Phones": [],
                "ContactGroups": [],
                "ContactPersons": [],
                "HasValidationErrors": false
            },
            "DateString": "2017-02-06T00:00:00",
            "Date": "/Date(1486339200000+0000)/",
            "DueDateString": "2017-03-08T00:00:00",
            "DueDate": "/Date(1488931200000+0000)/",
            "Status": "AUTHORISED",
            "LineAmountTypes": "Exclusive",
            "LineItems": [],
            "SubTotal": 85,
            "TotalTax": 17,
            "Total": 102,
            "UpdatedDateUTC": "/Date(1529940362110+0000)/",
            "CurrencyCode": "GBP"
        },
        {
            "Type": "ACCREC",
            "InvoiceID": "9e37150f-88a5-4213-a085-b30c5e01c2bf",
            "InvoiceNumber": "(13)",
            "Reference": "",
            "Payments": [],
            "CreditNotes": [
                {
                    "CreditNoteID": "3c5c7dec-534a-46e0-ad1b-f0f69822cfd5",
                    "CreditNoteNumber": "(12)",
                    "ID": "3c5c7dec-534a-46e0-ad1b-f0f69822cfd5",
                    "AppliedAmount": 1200,
                    "DateString": "2011-05-04T00:00:00",
                    "Date": "/Date(1304467200000+0000)/",
                    "LineItems": [],
                    "Total": 7800
                },
                {
                    "CreditNoteID": "af38e37f-4ba3-4208-a193-a32b418c2bbc",
                    "CreditNoteNumber": "(14)",
                    "ID": "af38e37f-4ba3-4208-a193-a32b418c2bbc",
                    "AppliedAmount": 2600,
                    "DateString": "2011-05-04T00:00:00",
                    "Date": "/Date(1304467200000+0000)/",
                    "LineItems": [],
                    "Total": 2600
                }
            ],
            "Prepayments": [],
            "Overpayments": [],
            "AmountDue": 0,
            "AmountPaid": 0,
            "AmountCredited": 3800,
            "CurrencyRate": 1,
            "HasErrors": false,
            "IsDiscounted": false,
            "HasAttachments": false,
            "Contact": {
                "ContactID": "58164bd6-5225-4f30-ad89-35140db5b624",
                "ContactNumber": "d0b420b8-4a58-40d1-9717-8525edda7658",
                "Name": "FSales (1)",
                "Addresses": [],
                "Phones": [],
                "ContactGroups": [],
                "ContactPersons": [],
                "HasValidationErrors": false
            },
            "DateString": "2011-05-04T00:00:00",
            "Date": "/Date(1304467200000+0000)/",
            "DueDateString": "2011-06-03T00:00:00",
            "DueDate": "/Date(1307059200000+0000)/",
            "Status": "PAID",
            "LineAmountTypes": "Exclusive",
            "LineItems": [],
            "SubTotal": 3166.67,
            "TotalTax": 633.33,
            "Total": 3800,
            "UpdatedDateUTC": "/Date(1529943661150+0000)/",
            "CurrencyCode": "GBP",
            "FullyPaidOnDate": "/Date(1304467200000+0000)/"
        },
        {
            "Type": "ACCPAY",
            "InvoiceID": "1ddea7ec-a0d5-457a-a8fd-cfcdc2099d51",
            "InvoiceNumber": "01596057543",
            "Reference": "",
            "Payments": [
                {
                    "PaymentID": "fd639da3-c009-47df-a4bf-98ccd5c68e43",
                    "Date": "/Date(1551657600000+0000)/",
                    "Amount": 173.86,
                    "Reference": "",
                    "CurrencyRate": 1,
                    "HasAccount": false,
                    "HasValidationErrors": false
                }
            ],
            "CreditNotes": [],
            "Prepayments": [],
            "Overpayments": [],
            "AmountDue": 0,
            "AmountPaid": 173.86,
            "AmountCredited": 0,
            "CurrencyRate": 1,
            "HasErrors": false,
            "IsDiscounted": false,
            "HasAttachments": true,
            "Contact": {
                "ContactID": "309afb74-0a3b-4d68-85e8-2259ca5acd13",
                "ContactNumber": "91eef1f0-5fe6-45d7-b739-1ab5352a5523",
                "Name": "Company AAA",
                "Addresses": [],
                "Phones": [],
                "ContactGroups": [],
                "ContactPersons": [],
                "HasValidationErrors": false
            },
            "DateString": "2019-02-23T00:00:00",
            "Date": "/Date(1550880000000+0000)/",
            "DueDateString": "2019-03-21T00:00:00",
            "DueDate": "/Date(1553126400000+0000)/",
            "Status": "PAID",
            "LineAmountTypes": "Exclusive",
            "LineItems": [],
            "SubTotal": 144.88,
            "TotalTax": 28.98,
            "Total": 173.86,
            "UpdatedDateUTC": "/Date(1551777481907+0000)/",
            "CurrencyCode": "GBP",
            "FullyPaidOnDate": "/Date(1551657600000+0000)/"
        },
        {
            "Type": "ACCPAY",
            "InvoiceID": "ba5ff3b1-1058-4645-80da-5475c23da949",
            "InvoiceNumber": "Q0603",
            "Reference": "",
            "Payments": [],
            "CreditNotes": [],
            "Prepayments": [],
            "Overpayments": [],
            "AmountDue": 213.24,
            "AmountPaid": 0,
            "AmountCredited": 0,
            "CurrencyRate": 1,
            "HasErrors": false,
            "IsDiscounted": false,
            "HasAttachments": true,
            "Contact": {
                "ContactID": "f0473b41-da92-4397-9d2c-741812f2475c",
                "ContactNumber": "1f124969-de8d-40b8-8140-d4997511b0dc",
                "Name": "BTelcom",
                "Addresses": [],
                "Phones": [],
                "ContactGroups": [],
                "ContactPersons": [],
                "HasValidationErrors": false
            },
            "DateString": "2019-03-05T00:00:00",
            "Date": "/Date(1551744000000+0000)/",
            "DueDateString": "2019-03-21T00:00:00",
            "DueDate": "/Date(1553126400000+0000)/",
            "Status": "SUBMITTED",
            "LineAmountTypes": "Exclusive",
            "LineItems": [],
            "SubTotal": 177.7,
            "TotalTax": 35.54,
            "Total": 213.24,
            "UpdatedDateUTC": "/Date(1552068778417+0000)/",
            "CurrencyCode": "GBP"
        }
    ]
    }'''
    
    
    
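    import json
    import pandas as pd
    import re
    
    def flatten_json(y):
        """Flatten nested dicts and lists into one dict with '_'-joined keys."""
        out = {}
        def flatten(x, name=''):
            if isinstance(x, dict):
                for a in x:
                    flatten(x[a], name + a + '_')
            elif isinstance(x, list):
                # List items are keyed by position, e.g. Invoices_0_Type
                for i, a in enumerate(x):
                    flatten(a, name + str(i) + '_')
            else:
                out[name[:-1]] = x  # drop the trailing '_'
        flatten(y)
        return out
    
    jsonObj = json.loads(jsonStr)
    flat = flatten_json(jsonObj)
    
    results = pd.DataFrame()
    special_cols = []  # top-level keys that carry no row index
    
    columns_list = list(flat.keys())
    for item in columns_list:
        try:
            # The first _<digits>_ in a key is the invoice index (the row)
            row_idx = re.findall(r'_(\d+)_', item)[0]
        except IndexError:
            # No index in the key: a top-level field such as Id
            special_cols.append(item)
            continue
        # Everything after that index becomes the column name
        column = re.findall(r'_\d+_(.*)', item)[0]
        column = column.replace('_', '')
    
        row_idx = int(row_idx)
        value = flat[item]
    
        results.loc[row_idx, column] = value
    
    # Top-level fields are repeated on every row. A top-level key that shares
    # its name with an invoice field (here Status) overwrites that column.
    for item in special_cols:
        results[item] = flat[item]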
    

    Output:

    print (results.to_string())
         Type                             InvoiceID InvoiceNumber Reference  AmountDue  AmountPaid  AmountCredited  CurrencyRate  HasErrors  IsDiscounted  HasAttachments                      ContactContactID                  ContactContactNumber  ContactName  ContactHasValidationErrors           DateString                        Date        DueDateString                     DueDate Status LineAmountTypes  SubTotal  TotalTax    Total              UpdatedDateUTC CurrencyCode              CreditNotes0CreditNoteID CreditNotes0CreditNoteNumber                        CreditNotes0ID  CreditNotes0AppliedAmount CreditNotes0DateString            CreditNotes0Date  CreditNotes0Total              CreditNotes1CreditNoteID CreditNotes1CreditNoteNumber                        CreditNotes1ID  CreditNotes1AppliedAmount CreditNotes1DateString            CreditNotes1Date  CreditNotes1Total             FullyPaidOnDate                    Payments0PaymentID               Payments0Date  Payments0Amount Payments0Reference  Payments0CurrencyRate Payments0HasAccount Payments0HasValidationErrors                                    Id ProviderName            DateTimeUTC
    0  ACCPAY  8289ab9d-2134-4601-8622-e7fdae4b6d89         10522     10522     102.00        0.00             0.0           1.0      False         False           False  d1dba397-0f0b-4819-a6ce-2839b7be5008  c03bbcb5-fb0b-4f46-83f0-8687f754488b        Micro                       False  2017-02-06T00:00:00  /Date(1486339200000+0000)/  2017-03-08T00:00:00  /Date(1488931200000+0000)/     OK       Exclusive     85.00     17.00   102.00  /Date(1529940362110+0000)/          GBP                                   NaN                          NaN                                   NaN                        NaN                    NaN                         NaN                NaN                                   NaN                          NaN                                   NaN                        NaN                    NaN                         NaN                NaN                         NaN                                   NaN                         NaN              NaN                NaN                    NaN                 NaN                          NaN  568d1686-7c53-4f22-a93f-754589a246a7     Rest API  /Date(1552234854959)/
    1  ACCREC  9e37150f-88a5-4213-a085-b30c5e01c2bf          (13)                 0.00        0.00          3800.0           1.0      False         False           False  58164bd6-5225-4f30-ad89-35140db5b624  d0b420b8-4a58-40d1-9717-8525edda7658   FSales (1)                       False  2011-05-04T00:00:00  /Date(1304467200000+0000)/  2011-06-03T00:00:00  /Date(1307059200000+0000)/     OK       Exclusive   3166.67    633.33  3800.00  /Date(1529943661150+0000)/          GBP  3c5c7dec-534a-46e0-ad1b-f0f69822cfd5                         (12)  3c5c7dec-534a-46e0-ad1b-f0f69822cfd5                     1200.0    2011-05-04T00:00:00  /Date(1304467200000+0000)/             7800.0  af38e37f-4ba3-4208-a193-a32b418c2bbc                         (14)  af38e37f-4ba3-4208-a193-a32b418c2bbc                     2600.0    2011-05-04T00:00:00  /Date(1304467200000+0000)/             2600.0  /Date(1304467200000+0000)/                                   NaN                         NaN              NaN                NaN                    NaN                 NaN                          NaN  568d1686-7c53-4f22-a93f-754589a246a7     Rest API  /Date(1552234854959)/
    2  ACCPAY  1ddea7ec-a0d5-457a-a8fd-cfcdc2099d51   01596057543                 0.00      173.86             0.0           1.0      False         False            True  309afb74-0a3b-4d68-85e8-2259ca5acd13  91eef1f0-5fe6-45d7-b739-1ab5352a5523  Company AAA                       False  2019-02-23T00:00:00  /Date(1550880000000+0000)/  2019-03-21T00:00:00  /Date(1553126400000+0000)/     OK       Exclusive    144.88     28.98   173.86  /Date(1551777481907+0000)/          GBP                                   NaN                          NaN                                   NaN                        NaN                    NaN                         NaN                NaN                                   NaN                          NaN                                   NaN                        NaN                    NaN                         NaN                NaN  /Date(1551657600000+0000)/  fd639da3-c009-47df-a4bf-98ccd5c68e43  /Date(1551657600000+0000)/           173.86                                       1.0               False                        False  568d1686-7c53-4f22-a93f-754589a246a7     Rest API  /Date(1552234854959)/
    3  ACCPAY  ba5ff3b1-1058-4645-80da-5475c23da949         Q0603               213.24        0.00             0.0           1.0      False         False            True  f0473b41-da92-4397-9d2c-741812f2475c  1f124969-de8d-40b8-8140-d4997511b0dc      BTelcom                       False  2019-03-05T00:00:00  /Date(1551744000000+0000)/  2019-03-21T00:00:00  /Date(1553126400000+0000)/     OK       Exclusive    177.70     35.54   213.24  /Date(1552068778417+0000)/          GBP                                   NaN                          NaN                                   NaN                        NaN                    NaN                         NaN                NaN                                   NaN                          NaN                                   NaN                        NaN                    NaN                         NaN                NaN                         NaN                                   NaN                         NaN              NaN                NaN                    NaN                 NaN                          NaN  568d1686-7c53-4f22-a93f-754589a246a7     Rest API  /Date(1552234854959)/
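If only the nested Contact object needs to become columns, a shorter route may be `pd.json_normalize` applied to the Invoices list. This is a sketch under assumptions rather than part of the code above: it needs pandas 1.0 or later, the sample is made up and trimmed, and list-valued fields such as Payments are left unexpanded in a single cell.

```python
import io
import json

import pandas as pd

# Trimmed, made-up sample with the same shape as the JSON above.
json_str = '''{
  "Status": "OK",
  "Invoices": [
    {"InvoiceID": "A", "Status": "AUTHORISED", "Payments": [],
     "Contact": {"ContactID": "c1", "Name": "Micro", "Phones": []}},
    {"InvoiceID": "B", "Status": "PAID", "Payments": [{"Amount": 173.86}],
     "Contact": {"ContactID": "c2", "Name": "Company AAA", "Phones": []}}
  ]
}'''

data = json.loads(json_str)

# One row per invoice; nested dicts become <parent>_<child> columns.
# Lists such as Payments are NOT expanded and stay as a single cell.
df = pd.json_normalize(data["Invoices"], sep="_")

# Write to an in-memory buffer here; pass a file path to write to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(df[["InvoiceID", "Status", "Contact_Name"]])
```

Because only the Invoices list is passed in, the invoice-level Status values are kept and the top-level Status is not mixed into the table.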
    

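    【Discussion】:

    • Many thanks @chitown88 for your reply. It is very close to what I need. When I run it on the whole JSON file, I only get 15 rows, and the output uses up all the available columns in Excel, which reports "File not loaded successfully". I get the JSON string from this statement: b_str = json.dumps(a_list, default=str). Any idea why this happens? Is json.dumps changing the initial format of the JSON?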
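One possible explanation, not confirmed in the thread: the question's code describes a_list as a bare list of invoices, whereas the sample JSON wraps that list in an "Invoices" key. If that is the case, the flattened keys start with the invoice index (for example 0_InvoiceID) and the row pattern, which expects an underscore before the index, no longer finds it. The sketch below uses made-up data and a copy of the flattening idea from the answer to show the difference.

```python
import re

def flatten_json(y):
    """Same flattening idea as in the answer: join nested keys with '_'."""
    out = {}
    def flatten(x, name=""):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + "_")
        elif isinstance(x, list):
            for i, item in enumerate(x):
                flatten(item, name + str(i) + "_")
        else:
            out[name[:-1]] = x
    flatten(y)
    return out

# Made-up stand-in for the bare list assumed to come back from the API.
a_list = [
    {"InvoiceID": "A", "Payments": [{"Amount": 1.0}]},
    {"InvoiceID": "B", "Payments": []},
]

row_pattern = re.compile(r"_(\d+)_")  # the row pattern used in the answer

bare = flatten_json(a_list)                    # keys like 0_InvoiceID
wrapped = flatten_json({"Invoices": a_list})   # keys like Invoices_0_InvoiceID

# With the bare list, invoice-level keys carry no matching index, so each
# one would be treated as a separate top-level column.
unmatched = [key for key in bare if row_pattern.search(key) is None]
print(unmatched)

# Wrapping the list restores the key shape the answer's code expects.
print(all(row_pattern.search(key) for key in wrapped))
```

Under that assumption, every scalar field of every invoice would turn into its own column, which would fit the very wide output, and the few rows that remain would come from the indexes of nested lists rather than from invoices. Passing {"Invoices": a_list} to the flattening step may be enough to restore one row per invoice. json.dumps with default=str only converts values it cannot serialise and does not restructure the data.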