【Question title】: python pandas - TypeError when parsing JSON: string indices must be integers
【Posted】: 2023-03-31 04:46:01
【Question description】:

Records in the JSON file look like this (note what "nutrients" looks like):

{
"id": 21441,
"description": "KENTUCKY FRIED CHICKEN, Fried Chicken, EXTRA CRISPY, Wing, meat and skin with breading",
"tags": ["KFC"],
"manufacturer": "Kentucky Fried Chicken",
"group": "Fast Foods",
"portions": [
{
"amount": 1,
"unit": "wing, with skin",
"grams": 68.0
},
...
],
"nutrients": [
{
"value": 20.8,
"units": "g",
"description": "Protein",
"group": "Composition"
},
{
"value": 29.2,
"units": "g",
"description": "Total lipid (fat)",
"group": "Composition"
},
...
]
}

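The following is code from an exercise in a book*. It involves some data wrangling and combines the nutrients of every food into one large table: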

import pandas as pd
import json

db = pd.read_json("foods-2011-10-03.json")

nutrients = []

for rec in db:
     fnuts = pd.DataFrame(rec["nutrients"])
     fnuts["id"] = rec["id"]
     nutrients.append(fnuts)

However, I get the following error and cannot figure out why:


TypeError                                 Traceback (most recent call last)
<ipython-input-23-ac63a09efd73> in <module>()
      1 for rec in db:
----> 2     fnuts = pd.DataFrame(rec["nutrients"])
      3     fnuts["id"] = rec["id"]
      4     nutrients.append(fnuts)
      5

TypeError: string indices must be integers

*This is an example from the book Python for Data Analysis.

【Question comments】:

Tags: python json pandas typeerror


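【Solution 1】:

for rec in db iterates over the column names, so each rec is a string such as "id", and rec["nutrients"] raises the TypeError. To iterate over the rows instead, use iterrows: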

for id, rec in db.iterrows():
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)
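
To make the cause of the error concrete, here is a minimal sketch. The two-record DataFrame is a made-up stand-in for the real file, and the names labels, error_message and row_ids are only for illustration:

```python
import pandas as pd

# Hypothetical two-record stand-in for the food database.
db = pd.DataFrame({
    "id": [1, 2],
    "nutrients": [
        [{"value": 20.8, "units": "g", "description": "Protein"}],
        [{"value": 29.2, "units": "g", "description": "Total lipid (fat)"}],
    ],
})

# Iterating over the DataFrame itself yields the column labels (strings).
labels = [rec for rec in db]

# Indexing a string with another string reproduces the error from the question.
try:
    "id"["nutrients"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# iterrows() yields (index, row) pairs, where each row is a Series.
row_ids = [int(row["id"]) for _, row in db.iterrows()]
```

Here labels holds the column names rather than records, which is why the original loop ends up indexing a string.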

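This is somewhat slow, because iterrows has to build a Series object for every row. itertuples is faster; but since you only care about two columns, iterating over those two Series directly is probably fastest: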

for id, value in zip(db['id'], db['nutrients']):
    fnuts = pd.DataFrame(value)
    fnuts["id"] = id
    nutrients.append(fnuts)
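
For comparison, a rough sketch of the itertuples variant mentioned above, followed by one way to stack the pieces into a single table. The sample data and the name all_nutrients are assumptions for illustration, and pd.concat is my guess at the natural next step rather than something stated in the question:

```python
import pandas as pd

# Hypothetical stand-in for the food database: two foods, three nutrient rows.
db = pd.DataFrame({
    "id": [1, 2],
    "nutrients": [
        [
            {"value": 20.8, "units": "g", "description": "Protein"},
            {"value": 29.2, "units": "g", "description": "Total lipid (fat)"},
        ],
        [{"value": 5.0, "units": "g", "description": "Protein"}],
    ],
})

nutrients = []
# itertuples() yields namedtuples; columns are read as attributes.
for row in db.itertuples(index=False):
    fnuts = pd.DataFrame(row.nutrients)
    fnuts["id"] = row.id
    nutrients.append(fnuts)

# Stack the per-food frames into one table with a fresh 0..n-1 index.
all_nutrients = pd.concat(nutrients, ignore_index=True)
```

Attribute access like row.id only works when the column names are valid Python identifiers, which happens to be the case here.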

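【Comments】:

  • Thanks, that works great! Has the way this iteration works changed since the book was written, or should this be added to the book's errata?
  • Sorry, I don't know much about the history of pandas, and I haven't read the book.
【Solution 2】: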

The code works fine, but for it to work the JSON has to look like this:

[{
"id": 21441,
"description": "KENTUCKY FRIED CHICKEN, Fried Chicken, EXTRA CRISPY, Wing, meat and skin with breading",
"tags": ["KFC"],
"manufacturer": "Kentucky Fried Chicken",
"group": "Fast Foods",
"portions": [
{"amount": 1,
"unit": "wing, with skin",
"grams": 68.0}],
"nutrients": [{
"value": 20.8,
"units": "g",
"description": "Protein",
"group": "Composition"
},
{"description": "Total lipid (fat)",
"group": "Composition",
"units": "g",
"value": 29.2}]}]

This is an example with just one record.

【Comments】:
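
A hedged note on when the array wrapper matters: the original loop runs unchanged if db is a plain list of dicts, which is what the json module returns for a JSON array. The sketch below assumes that loading route instead of pd.read_json, and uses a trimmed inline string (the name raw is hypothetical) in place of the file:

```python
import json
import pandas as pd

# Hypothetical inline stand-in for the file contents: a JSON array of records.
raw = """
[
  {"id": 21441,
   "nutrients": [
     {"value": 20.8, "units": "g", "description": "Protein", "group": "Composition"},
     {"value": 29.2, "units": "g", "description": "Total lipid (fat)", "group": "Composition"}
   ]}
]
"""

# json.loads (or json.load on an open file) returns a plain list of dicts.
db = json.loads(raw)

nutrients = []
for rec in db:  # each rec is a dict here, so rec["nutrients"] works
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)
```

With pd.read_json the result is a DataFrame, so iterating over it still yields column names regardless of how the file is wrapped.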

【Solution 3】:

Amadan has answered the question, but before seeing his answer I managed to work around the problem like this:

    for i in range(len(db)):
        rec = db.loc[i]
        fnuts = pd.DataFrame(rec["nutrients"])
        fnuts["id"] = rec["id"]
        nutrients.append(fnuts)
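
One caveat worth sketching: db.loc[i] looks rows up by index label, so the loop above relies on the default 0..n-1 index. A positional variant with iloc, shown here on made-up data with a deliberately non-default index, avoids that assumption:

```python
import pandas as pd

# Hypothetical stand-in data with string index labels instead of 0..n-1.
db = pd.DataFrame(
    {
        "id": [1, 2],
        "nutrients": [
            [{"value": 20.8, "units": "g", "description": "Protein"}],
            [{"value": 29.2, "units": "g", "description": "Total lipid (fat)"}],
        ],
    },
    index=["a", "b"],
)

nutrients = []
for i in range(len(db)):
    rec = db.iloc[i]  # positional lookup, independent of the index labels
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)
```

With this index, db.loc[0] would raise a KeyError, while iloc keeps working.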
    

【Comments】:
