【Question title】: How to remove a parent while parsing JSON into a pandas dataframe?
【Posted】: 2021-02-24 03:58:24
【Question】:

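I want to parse the data from the following API response into a pandas dataframe. I guess there is an extra parent level in this JSON file that is causing the problem. How can I remove it and parse the data correctly?

URL: "https://api.covid19india.org/state_district_wise.json"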

    import pandas as pd
    URL = "https://api.covid19india.org/state_district_wise.json"
    df = pd.read_json(URL)
    df.head()

The code above does not work and gives the wrong output. Please help.
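
To illustrate the "extra parent" guess, here is a minimal sketch using a small hand-made dict in the same shape as the API response (state name, then `districtData` and `statecode`). The sample values are invented for illustration, and `pd.DataFrame` is used as a stand-in for `pd.read_json` so the example runs without a network call; the assumption is that `read_json` with its default orientation lays the data out in much the same way.

```python
import pandas as pd

# Hand-made sample in the shape of the API response.
# The numbers are invented for illustration, not real data.
sample = {
    "Andaman and Nicobar Islands": {
        "districtData": {"Nicobars": {"active": 0, "confirmed": 0}},
        "statecode": "AN",
    },
    "West Bengal": {
        "districtData": {"Purulia": {"active": 350, "confirmed": 5609}},
        "statecode": "WB",
    },
}

# The outer keys (states) become columns and the second-level keys become
# the index, so every 'districtData' cell still holds a whole nested dict.
df = pd.DataFrame(sample)
print(df.index.tolist())
print(df.columns.tolist())
print(type(df.loc["districtData", "West Bengal"]))
```

Only the first two levels are unpacked, which is why the per-district numbers never show up as rows or columns.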

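【Question comments】:

  • Can you specify the output you want to achieve?
  • Please refer to the answer I accepted below. That is the output I was expecting.

Tags: python json pandas api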


【Solution 1】:

Parsing nested structures in Python is painful; here is a solution that works for your data:

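import pandas as pd
import requests

URL = "https://api.covid19india.org/state_district_wise.json"
d = requests.get(URL).json()

# Walk the nesting: state -> 'districtData' -> district -> 'delta'.
# Note: the 'districtData' column ends up holding the state name and
# the 'State' column the district name.
L = []
for k, v in d.items():                      # k: state name
    for k1, v1 in v.items():                # k1: 'districtData' or 'statecode'
        if isinstance(v1, dict):
            for k2, v2 in v1.items():       # k2: district name
                if isinstance(v2, dict):
                    for k3, v3 in v2.items():
                        if isinstance(v3, dict):   # k3: nested 'delta' dict
                            # Prefix the nested keys, e.g. 'delta.confirmed'
                            d1 = {f'{k3}.{k4}': v4 for k4, v4 in v3.items()}
                            d2 = {'districtData': k, 'State': k2, 'statecode': v['statecode']}
                            d3 = {**d2, **v2, **d1}
                            del d3[k3]             # drop the unflattened dict
                            L.append(d3)

df = pd.DataFrame(L)

print(df)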

                    districtData                     State statecode  \
0               State Unassigned                Unassigned        UN   
1    Andaman and Nicobar Islands                  Nicobars        AN   
2    Andaman and Nicobar Islands  North and Middle Andaman        AN   
3    Andaman and Nicobar Islands             South Andaman        AN   
4    Andaman and Nicobar Islands                   Unknown        AN   
..                           ...                       ...       ...   
767                  West Bengal           Purba Bardhaman        WB   
768                  West Bengal           Purba Medinipur        WB   
769                  West Bengal                   Purulia        WB   
770                  West Bengal         South 24 Parganas        WB   
771                  West Bengal            Uttar Dinajpur        WB   

                                                 notes  active  confirmed  \
0                                                            0          0   
1    District-wise numbers are out-dated as cumulat...       0          0   
2    District-wise numbers are out-dated as cumulat...       0          1   
3    District-wise numbers are out-dated as cumulat...      19         51   
4                                                          148       4442   
..                                                 ...     ...        ...   
767                                                        618       8773   
768                                                       1424      16548   
769                                                        350       5609   
770                                                       1899      27445   
771                                                        358       5197   

     deceased  recovered  delta.confirmed  delta.deceased  delta.recovered  
0           0          0                0               0                0  
1           0          0                0               0                0  
2           0          1                0               0                0  
3           0         32                0               0                0  
4          60       4234                0               0                0  
..        ...        ...              ...             ...              ...  
767        74       8081                0               0                0  
768       212      14912                0               0                0  
769        33       5226                0               0                0  
770       501      25045                0               0                0  
771        55       4784                0               0                0  

[772 rows x 11 columns]
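
As a possible alternative to the nested loops, the same flattening can be sketched with `pd.json_normalize`, which expands nested dicts into dotted column names. This is only a sketch under stated assumptions: it runs on a small hand-made sample in the shape of the API response (values copied from a few rows of the output above, structure assumed from the loop code), and the column names `state` and `district` are illustrative choices that differ from the ones used in the answer.

```python
import pandas as pd

# Hand-made sample in the shape of the API response (assumed structure);
# replace it with requests.get(URL).json() to use the live data.
sample = {
    "Andaman and Nicobar Islands": {
        "statecode": "AN",
        "districtData": {
            "Nicobars": {
                "notes": "", "active": 0, "confirmed": 0,
                "deceased": 0, "recovered": 0,
                "delta": {"confirmed": 0, "deceased": 0, "recovered": 0},
            },
            "South Andaman": {
                "notes": "", "active": 19, "confirmed": 51,
                "deceased": 0, "recovered": 32,
                "delta": {"confirmed": 0, "deceased": 0, "recovered": 0},
            },
        },
    },
    "West Bengal": {
        "statecode": "WB",
        "districtData": {
            "Purulia": {
                "notes": "", "active": 350, "confirmed": 5609,
                "deceased": 33, "recovered": 5226,
                "delta": {"confirmed": 0, "deceased": 0, "recovered": 0},
            },
        },
    },
}

# Build one record per district, carrying the state name and code along.
records = [
    {"state": state, "statecode": info["statecode"], "district": district, **stats}
    for state, info in sample.items()
    for district, stats in info["districtData"].items()
]

# json_normalize flattens the nested 'delta' dict into 'delta.<key>' columns.
df = pd.json_normalize(records)
print(df)
```

The list comprehension removes the two parent levels (state and `districtData`), and `json_normalize` takes care of the remaining `delta` nesting, so no key needs to be deleted by hand.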

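【Comments】:

  • Thank you very much for solving this. I found a cleaner version of the same data, but this solution gave me a lot of insight into handling and parsing unwanted JSON.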