【Question title】: Pandas json_normalize returns KeyError
【Posted】: 2021-03-26 05:54:54
【Question description】:

I have a dataset from a JSON file in the following format:

data = {'data': {'content': [{'gender': 'Female',
    'id': 'covid-1004200003256',
    'state_code': '3272',
    'district_code': '3272040',
    'subdistrict_code': '3272040004',
    'latitude': -6.906,
    'longitude': 106.923,
    'state_name': 'KOTA SUKABUMI',
    'district_name': 'Gunungpuyuh',
    'subdistrict_name': 'Karamat',
    'stage': 'Isolated',
    'status': 'SUSPECT'},
   {'gender': 'Female',
    'id': 'covid-1004200003255',
    'state_code': '3272',
    'district_code': '3272040',
    'subdistrict_code': '3272040004',
    'latitude': -6.906,
    'longitude': 106.923,
    'state_name': 'KOTA SUKABUMI',
    'district_name': 'Gunungpuyuh',
    'subdistrict_name': 'Karamat',
    'stage': 'Isolated',
    'status': 'SUSPECT',
    }]}}

So I want to build a DataFrame from it with json_normalize:

df = pd.json_normalize(data, 'content')
df.head(10)

But it raises:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-36-4d8ad8c8743a> in <module>()
----> 1 df = pd.json_normalize(data, 'content')
      2 df.head(10)

3 frames
/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep, max_level)
    334                 records.extend(recs)
    335 
--> 336     _recursive_extract(data, record_path, {}, level=0)
    337 
    338     result = DataFrame(records)

/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _recursive_extract(data, path, seen_meta, level)
    307         else:
    308             for obj in data:
--> 309                 recs = _pull_records(obj, path[0])
    310                 recs = [
    311                     nested_to_record(r, sep=sep, max_level=max_level)

/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _pull_records(js, spec)
    246         if has non iterable value.
    247         """
--> 248         result = _pull_field(js, spec)
    249 
    250         # GH 31507 GH 30145, GH 26284 if result is not list, raise TypeError if not

/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _pull_field(js, spec)
    237                 result = result[field]
    238         else:
--> 239             result = result[spec]
    240         return result
    241 

KeyError: 'content'

Any ideas on how to fix this?

【Comments】:

    Tags: python json python-3.x pandas dataframe


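    【Answer 1】:

    Your call fails because a record_path given as a single string is looked up only among the top-level keys of the object you pass in. The only top-level key of data is 'data'; 'content' sits one level below it, so the lookup raises KeyError: 'content'.

    So pass data['data'] instead, which makes 'content' a top-level key: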

    In [934]: df = pd.json_normalize(data['data'], 'content')
    
    In [934]: df
    Out[934]: 
       gender                   id state_code district_code subdistrict_code  latitude  longitude     state_name district_name subdistrict_name     stage   status
    0  Female  covid-1004200003256       3272       3272040       3272040004    -6.906    106.923  KOTA SUKABUMI   Gunungpuyuh          Karamat  Isolated  SUSPECT
    1  Female  covid-1004200003255       3272       3272040       3272040004    -6.906    106.923  KOTA SUKABUMI   Gunungpuyuh          Karamat  Isolated  SUSPECT
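
As an alternative to slicing the dictionary first, record_path also accepts a list of keys that is followed one level at a time. The sketch below assumes pandas 1.0 or later (where pd.json_normalize is available at the top level) and uses a trimmed copy of the question's data with only four of the original fields, so the frame is narrower than the one shown above.

```python
import pandas as pd

# Trimmed copy of the question's data (only four of the original fields kept).
data = {
    "data": {
        "content": [
            {"gender": "Female", "id": "covid-1004200003256",
             "stage": "Isolated", "status": "SUSPECT"},
            {"gender": "Female", "id": "covid-1004200003255",
             "stage": "Isolated", "status": "SUSPECT"},
        ]
    }
}

# A list-valued record_path is walked key by key: data -> content.
df = pd.json_normalize(data, record_path=["data", "content"])
print(df)
```

This keeps the whole object intact, which can be convenient if you later want to pull sibling fields in through the meta argument.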
    


      【Answer 2】:

      Try passing the list of records directly:

      df = pd.json_normalize(data['data']['content'])
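
To see why the original call fails and that both fixes agree, here is a minimal check. It assumes pandas 1.0 or later and a trimmed copy of the question's data; the variable names via_record_path and direct are only illustrative.

```python
import pandas as pd

# Trimmed copy of the question's data (only four of the original fields kept).
data = {
    "data": {
        "content": [
            {"gender": "Female", "id": "covid-1004200003256",
             "latitude": -6.906, "status": "SUSPECT"},
            {"gender": "Female", "id": "covid-1004200003255",
             "latitude": -6.906, "status": "SUSPECT"},
        ]
    }
}

# The outer dict has a single key, so 'content' is not reachable from it.
print(list(data.keys()))  # ['data']

# The original call from the question: the lookup of 'content' fails.
try:
    pd.json_normalize(data, "content")
except KeyError as exc:
    print("KeyError:", exc)

# Fix 1: step into data['data'], then name the record list.
via_record_path = pd.json_normalize(data["data"], "content")

# Fix 2: hand the list of records over directly.
direct = pd.json_normalize(data["data"]["content"])

# Both routes should produce the same frame for flat records like these.
print(via_record_path.equals(direct))
```

For flat records like these, passing the list directly is the shorter option; the record_path form becomes more useful once you also need fields from the enclosing object.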
      
