【Question title】: Parsing JSON to CSV using Python: AttributeError: 'unicode' object has no attribute 'keys'
【Posted】: 2017-06-06 03:36:14
【Question description】:

I have a nested JSON dataset containing multiple entries that look like this:

{
"coordinates": null,
"acoustic_features": {
    "instrumentalness": "0.00479",
    "liveness": "0.18",
    "speechiness": "0.0294",
    "danceability": "0.634",
    "valence": "0.342",
    "loudness": "-8.345",
    "tempo": "125.044",
    "acousticness": "0.00035",
    "energy": "0.697",
    "mode": "1",
    "key": "6"
},
"artist_id": "b2980c722a1ace7a30303718ce5491d8",
"place": null,
"geo": null,
"tweet_lang": "en",
"source": "Share.Radionomy.com",
"track_title": "8eeZ",
"track_id": "cd52b3e5b51da29e5893dba82a418a4b",
"artist_name": "Dominion",
"entities": {
    "hashtags": [{
        "text": "nowplaying",
        "indices": [0, 11]
    }, {
        "text": "goth",
        "indices": [51, 56]
    }, {
        "text": "deathrock",
        "indices": [57, 67]
    }, {
        "text": "postpunk",
        "indices": [68, 77]
    }],
    "symbols": [],
    "user_mentions": [],
    "urls": [{
        "indices": [28, 50],
        "expanded_url": "cathedral13.com/blog13",
        "display_url": "cathedral13.com/blog13",
        "url": "t.co/Tatf4hEVkv"
    }]
},
"created_at": "2014-01-01 05:54:21",
"text": "#nowplaying Dominion - 8eeZ Tatf4hEVkv #goth #deathrock #postpunk",
"user": {
    "location": "middle of nowhere",
    "lang": "en",
    "time_zone": "Central Time (US & Canada)",
    "name": "Cathedral 13",
    "entities": null,
    "id": 81496937,
    "description": "I\u2019m a music junkie who is currently responsible for Cathedral 13 internet radio (goth, deathrock, post-punk)which has been online since 06/20/02."
},
"id": 418243774842929150
}

I want to convert it to a csv file with multiple columns, each holding the corresponding entry of every JSON object. Here is the Python code I wrote:

import json
import csv
from pprint import pprint
data = []
with open('data_subset.json') as data_file:
    for line in data_file:
        data.append(json.loads(line))

# open a file for writing
data_csv = open('Data_csv.csv', 'w')
# create the csv writer object
csvwriter = csv.writer(data_csv)

for i in range(1,10):
    count = 0
    for dat in data[i]:
        if count == 0:
             header = dat.keys()
             csvwriter.writerow(header)
             count += 1
        csvwriter.writerow(emp.values())
data_csv.close()

When I run the code above, I get the error: AttributeError: 'unicode' object has no attribute 'keys'. What could be the problem?

【Question comments】:

    Tags: python json csv nested


    【Solution 1】:

    You can read the JSON file in one go like this:

    import json

    with open('a.txt') as data_file:
        data = json.load(data_file)
    

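    You now have the parsed JSON in data (a dictionary if the file holds a single object, a list if it holds an array of objects).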
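The loop in the question reads the file line by line, which suggests it holds one JSON object per line. json.load expects a single JSON document, so for that layout each line has to be parsed on its own. The sketch below assumes that layout; read_json_lines is a hypothetical helper, the in-memory file stands in for data_subset.json, and the second record is made up. It also shows where the AttributeError comes from: iterating over a dict yields its keys, which are strings (unicode on Python 2), so calling .keys() on them fails.

```python
import io
import json

# Stand-in for data_subset.json, assuming one JSON object per line.
# The first record reuses values from the question; the second is made up.
sample_file = io.StringIO(
    '{"artist_id": "b2980c722a1ace7a30303718ce5491d8", "user": {"id": 81496937}}\n'
    '{"artist_id": "hypothetical-id", "user": {"id": 1}}\n'
)


def read_json_lines(handle):
    """Parse one JSON object per non-empty line and return a list of dicts."""
    return [json.loads(line) for line in handle if line.strip()]


data = read_json_lines(sample_file)

# Iterating over a dict yields its keys, not nested dicts. In the question,
# `for dat in data[i]` therefore binds `dat` to a string, and `dat.keys()`
# raises AttributeError.
first_record = data[0]
items_seen = [type(item).__name__ for item in first_record]
```

With this layout, the fix for the question's loop is to iterate over data itself (for record in data) and call record.keys() on each record rather than on the items inside it.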

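    Since you only need specific entries from the JSON in the csv (for example, entities is not saved to the csv), you can keep a custom list of column headers and then loop over the data to write the specific keys to the csv: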

    import csv

    # Example to save the artist_id and user id; can be extended for the actual data
    header = ['artist_id', 'id']
    
    # open a file for writing
    # ('wb' is the Python 2 mode; on Python 3 use open('Data_csv.csv', 'w', newline=''))
    data_csv = open('Data_csv.csv', 'wb')
    
    # create the csv writer object
    csvwriter = csv.writer(data_csv)
    
    # write the csv header
    csvwriter.writerow(header)
    
    for entry in data:
        csvwriter.writerow([entry['artist_id'], entry['user']['id']])
    
    data_csv.close()
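To get one column per field, as asked for in the comments, the nested dictionaries can be flattened before writing. The following is a minimal Python 3 sketch, not part of the original answer: flatten and write_csv are hypothetical helpers, only one level of nesting is lifted, entities is skipped, and a nested key that collides with an existing column is prefixed with its parent name (so the user's id becomes user_id). The sample record is a shortened version of the one in the question.

```python
import csv
import io


def flatten(record, skip=("entities",)):
    """Copy top-level scalars, then lift one level of nested dict fields.

    Keys listed in `skip` are left out. A nested key that collides with an
    existing column is prefixed with its parent name (e.g. "user_id").
    """
    row = {}
    nested = []
    for key, value in record.items():
        if key in skip:
            continue
        if isinstance(value, dict):
            nested.append((key, value))
        else:
            row[key] = value
    for parent, child in nested:
        for key, value in child.items():
            if key in skip:
                continue
            column = key if key not in row else parent + "_" + key
            row[column] = value
    return row


def write_csv(records, handle):
    """Write flattened records to `handle` and return the column order."""
    rows = [flatten(record) for record in records]
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)  # keep first-seen order
    writer = csv.DictWriter(handle, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


# Shortened version of the record shown in the question.
sample = {
    "coordinates": None,
    "acoustic_features": {"tempo": "125.044", "key": "6"},
    "artist_id": "b2980c722a1ace7a30303718ce5491d8",
    "entities": {"hashtags": [{"text": "goth", "indices": [51, 56]}]},
    "user": {"lang": "en", "entities": None, "id": 81496937},
    "id": 418243774842929150,
}

buffer = io.StringIO()
columns = write_csv([sample], buffer)
```

For a real file on Python 3, pass a handle opened with open('Data_csv.csv', 'w', newline='') instead of the StringIO buffer. null values in the JSON become None and are written as empty cells.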
    

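    【Comments】:

    • The actual json file has 10000 entries in the above format, so I think I need to loop over the JSON objects and store them in an array. I want the csv file to have the following columns: {coordinates, instrumentalness, liveness, speechiness, danceability, valence, loudness, tempo, acousticness, energy, mode, key, artist_id, place, geo, tweet_lang, source, track_title, track_id, artist_name, created_at, text, location, lang, time_zone, name, entities, id, description}. Also, the entities, which contain the hashtags, can have a variable number of text and indices fields.
    • @AsmitaPoddar, I have updated the answer based on your input. You can add other fields from the json to write them to the csv.
    • What if I want to add a hashtags column to my csv file?
    • What if I want to add hashtags to my csv file?
    • What do you mean by hashtags? Can you show a sample row?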
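One possible way to handle the hashtags question, since each record can carry a variable number of them, is to join their text values into a single cell. This is only a sketch: hashtag_column is a hypothetical helper, and the ";" separator is an assumption (any character that does not occur in a hashtag would do). The sample reuses the hashtags from the record in the question.

```python
def hashtag_column(record, separator=";"):
    """Join the text of every hashtag under record["entities"]["hashtags"].

    Returns an empty string when entities or hashtags is missing or null.
    """
    entities = record.get("entities") or {}
    hashtags = entities.get("hashtags") or []
    return separator.join(tag["text"] for tag in hashtags)


# Hashtags taken from the sample record in the question.
sample = {
    "entities": {
        "hashtags": [
            {"text": "nowplaying", "indices": [0, 11]},
            {"text": "goth", "indices": [51, 56]},
            {"text": "deathrock", "indices": [57, 67]},
            {"text": "postpunk", "indices": [68, 77]},
        ]
    }
}

hashtags = hashtag_column(sample)
```

The returned string can be appended to the list passed to csvwriter.writerow, with a matching 'hashtags' entry added to the header.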