How to transform a JSON SList to a pandas dataframe? Answers

【Question title】: How to transform JSON SList to pandas dataframe?
【Posted】: 2021-04-19 02:57:55
【Question description】:

a = ['{"type": "book",', 
     '"title": "sometitle",', 
     '"author": [{"name": "somename"}],', 
     '"year": "2000",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '',
     '{"type": "book",', 
     '"title": "sometitle2",', 
     '"author": [{"name": "somename2"}],', 
     '"year": "2001",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '']

I have this complicated SList, and I ultimately want to get it into a tidy pandas dataframe.

I have tried many approaches, for example:

i = iter(a)
b = dict(zip(i, i))

Unfortunately, this creates a dictionary that looks even worse:

{'{"type": "book",':
...

Before, I had an SList of dictionaries; now I have a dictionary of dictionaries.
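
A minimal sketch of why the zip approach produces that result: passing the same iterator to zip twice pairs each line with the line that follows it, so every other line becomes a key and the lines in between become values. The shortened list `lines` below is a hypothetical stand-in for `a`, not the original data.

```python
# Shortened, hypothetical stand-in for the list `a` from the question.
lines = ['{"type": "book",', '"title": "sometitle",', '"year": "2000"}', '']

it = iter(lines)
# zip(it, it) pulls two items from the same iterator per step,
# so it yields (line 0, line 1), (line 2, line 3), ...
pairs = dict(zip(it, it))

for key, value in pairs.items():
    print(repr(key), "->", repr(value))
```

No JSON is parsed at any point here, so the keys and values are still raw text fragments.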

I also tried

pd.json_normalize(a)

but this raises the error message AttributeError: 'str' object has no attribute 'values'
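
For context, pd.json_normalize expects already-parsed data (a dict or a list of dicts), not JSON text, which is consistent with the error above. The sketch below assumes a single complete record has already been assembled into one string; `record_text` is a hypothetical, shortened example rather than the original data.

```python
import json

import pandas as pd

# Hypothetical, shortened record assembled into one complete JSON string.
record_text = (
    '{"type": "book", "title": "sometitle", '
    '"author": [{"name": "somename"}], "year": "2000"}'
)

# Parse the text into a dict first; json_normalize works on parsed objects.
record = json.loads(record_text)

# Expand the nested "author" list into rows and carry the other fields along.
flat = pd.json_normalize(record, record_path="author", meta=["type", "title", "year"])
print(flat)
```

Here the author's "name" becomes its own column next to the meta fields.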

I also tried

r = json.dumps(a.l)
loaded_r = json.loads(r)
print(loaded_r)

but this just yields a list again

['{"type": "book",',
...

Again, in the end I want a pandas dataframe like this

type   title        author      year ...

book   sometitle    somename    2000 ...
book   sometitle2   somename2   2001 ...

Clearly, I haven't yet reached the point where I can feed the data to a pandas function. Every time I try, the functions scream at me...

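【Comments】:

Your data is not in a valid format.
Yes, I believe that is the core of my problem. This is how I received it from someone else's script.
When I copied your data it returned an error, probably because of the single quotes on each line. Could you check that the sample data you shared is valid? As it stands, it ends up raising an error.
I've changed it to an MWE. Some lines are missing now, but there are two complete observations.

Tags: python pandas jupyter-notebook ipython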

【Solution 1】:

a = ['{"type": "book",', 
     '"title": "sometitle",', 
     '"author": [{"name": "somename"}],', 
     '"year": "2000",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '',
     '{"type": "book",', 
     '"title": "sometitle2",', 
     '"author": [{"name": "somename2"}],', 
     '"year": "2001",', 
     '"identifier": [{"type": "ISBN", "id": "1234567890"}],', 
     '"publisher": "somepublisher"}', '']

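import json
import pandas as pd

# Treat each empty string as a record separator: replace it with a comma,
# join everything into one string, strip the trailing comma, and wrap the
# result in brackets so that it parses as a single JSON array.
b = "[%s]" % ''.join([',' if i == '' else i for i in a]).strip(',')
data = json.loads(b)     # list of dicts, one per record
df = pd.DataFrame(data)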

print(df)

   type       title                   author  year  \
0  book   sometitle   [{'name': 'somename'}]  2000   
1  book  sometitle2  [{'name': 'somename2'}]  2001   

                               identifier      publisher  
0  [{'type': 'ISBN', 'id': '1234567890'}]  somepublisher  
1  [{'type': 'ISBN', 'id': '1234567890'}]  somepublisher
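
In the output above, the author and identifier columns still hold lists of dicts, whereas the question asks for plain values such as somename. One possible follow-up step, sketched here under the assumption that every author entry has a "name" key and every identifier entry has "type" and "id" keys, is to join the nested values into strings. The `data` literal repeats the two sample records so the block runs on its own.

```python
import pandas as pd

# The two sample records, already parsed into dicts.
data = [
    {"type": "book", "title": "sometitle", "author": [{"name": "somename"}],
     "year": "2000", "identifier": [{"type": "ISBN", "id": "1234567890"}],
     "publisher": "somepublisher"},
    {"type": "book", "title": "sometitle2", "author": [{"name": "somename2"}],
     "year": "2001", "identifier": [{"type": "ISBN", "id": "1234567890"}],
     "publisher": "somepublisher"},
]
df = pd.DataFrame(data)

# Collapse each list of author dicts into a comma-separated string of names.
df["author"] = df["author"].map(lambda authors: ", ".join(a["name"] for a in authors))

# Collapse each list of identifier dicts into "TYPE:id" strings.
df["identifier"] = df["identifier"].map(
    lambda ids: ", ".join(f'{i["type"]}:{i["id"]}' for i in ids)
)

print(df[["type", "title", "author", "year"]])
```

Joining with a comma keeps records with several authors on one row; whether that or one row per author is preferable depends on the later analysis.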

【Discussion】:

Perhaps this only fits your sample data exactly.
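
If the real data is less regular than the sample (for example, several empty strings in a row, or none at all between records), a variant that does not depend on the separators is to let the standard library's json.JSONDecoder.raw_decode read one object at a time. This is only a sketch: the helper name `parse_concatenated_json` and the list `lines` are hypothetical, and it assumes the lines, once joined, contain nothing but complete JSON objects and whitespace.

```python
import json

import pandas as pd


def parse_concatenated_json(lines):
    """Parse text lines that hold back-to-back JSON objects."""
    text = "\n".join(lines)
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here
        # (this also covers the blank separator lines).
        if text[pos].isspace():
            pos += 1
            continue
        # Returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# Hypothetical input with two blank separators in a row.
lines = [
    '{"type": "book", "title": "some, title",',
    '"year": "2000"}',
    '',
    '',
    '{"type": "book", "title": "sometitle2",',
    '"year": "2001"}',
]

records = parse_concatenated_json(lines)
df = pd.DataFrame(records)
print(df)
```

Malformed input still raises json.JSONDecodeError, which is arguably preferable to silently building a wrong table.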