【Posted】: 2021-11-15 15:03:37
【Question】:
I am trying to understand how to filter JSON data with Python. My JSON looks like this:
[
  {
    "comments_full": []
  },
  {
    "comments_full": [
      {
        "comment_id": "433934735000014",
        "comment_url": "https:\\/\\/facebook.com\\/433934735000014",
        "commenter_id": "100002886314120",
        "commenter_url": "https:\\/\\/facebook.com\\/loubnaharifi?fref=nf&rc=p&refid=52&__tn__=R",
        "commenter_name": "Loubna Harifi",
        "commenter_meta": null,
        "comment_text": "\\u00c0 18h \\u00e7a commence",
        "comment_time": 1636502400000,
        "comment_image": null,
        "comment_reactors": [
          {
            "name": "Bouygues Telecom",
            "link": "https:\\/\\/facebook.com\\/bouyguestelecom\\/?fref=pb",
            "type": "like"
          }
        ],
        "comment_reactions": {
          "like": 55,
          "love": 12,
          "haha": 4,
          "wow": 1,
          "sad": 1,
          "angry": 4
        },
        "comment_reaction_count": 77,
        "replies": [
          {
            "comment_id": "433935588333262",
            "comment_url": "https:\\/\\/facebook.com\\/433935588333262",
            "commenter_id": "94533530492",
            "commenter_url": "https:\\/\\/facebook.com\\/bouyguestelecom\\/?rc=p&refid=52&__tn__=%7ERR",
            "commenter_name": "Bouygues Telecom",
            "commenter_meta": null,
            "comment_text": "Oui tout \\u00e0 fait ! RDV \\u00e0 18h \\ud83d\\ude42",
            "comment_time": 1636502400000,
            "comment_image": null,
            "comment_reactors": [
              {
                "name": "Maryline Moss",
                "link": "https:\\/\\/facebook.com\\/mary.poilue.92?fref=pb",
                "type": "like"
              },
              {
                "name": "Jess Robic",
                "link": "https:\\/\\/facebook.com\\/JessicaRbc91?fref=pb",
                "type": "like"
              }
            ],
            "comment_reactions": {
              "like": 55,
              "love": 12,
              "haha": 4,
              "wow": 1,
              "sad": 1,
              "angry": 4
            },
            "comment_reaction_count": 77
...
What I want to extract is:
- comment_id
- commenter_name
- comment_text
This is what I have tried so far:
df_ori[["comments_full"]].to_excel(r'C:/Users/stefa/OneDrive/Bureau/Scrap website/Last test/Scrapped_FB.xlsx', index = None, header=True)
cSvFilePath = "C:/Users/stefa/OneDrive/Bureau/Scrap website/Last test/Scrapped_FB.csv"
jsonFilePath = "C:/Users/stefa/OneDrive/Bureau/Scrap website/Last test/Scrapped_FB.json"
# Read the CSV and add the data to a dictionary
data = {}
with open(cSvFilePath, encoding="cp437", errors='ignore') as csvFile:
    csvReader = csv.DictReader(csvFile)
    for csvRow in csvReader:
        hmid = csvRow["comment_text"]
        data[hmid] = csvRow
file = dataframe(data, columns= ['comments_full'])
file.to_json(r'C:/Users/stefa/OneDrive/Bureau/Scrap website/Last test/Scrapped_FB.json',orient='split')
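For reference, the three fields listed above can probably be pulled straight out of the parsed JSON, without a round trip through Excel or CSV. The following is only a minimal sketch using the standard `json` module. It assumes every post carries a `comments_full` list and that `replies` nests comments of the same shape. The `SAMPLE` string is a trimmed, well-formed stand-in for the truncated data in the question (its closing brackets are a guess), and the helper names `iter_comments`, `extract_rows` and `FIELDS` are hypothetical.

```python
import json

# Trimmed, well-formed sample modeled on the structure in the question.
# The closing brackets are an assumption: the original data is cut off.
SAMPLE = r"""
[
  {"comments_full": []},
  {"comments_full": [
    {
      "comment_id": "433934735000014",
      "commenter_name": "Loubna Harifi",
      "comment_text": "\u00c0 18h \u00e7a commence",
      "replies": [
        {
          "comment_id": "433935588333262",
          "commenter_name": "Bouygues Telecom",
          "comment_text": "Oui tout \u00e0 fait ! RDV \u00e0 18h \ud83d\ude42"
        }
      ]
    }
  ]}
]
"""

# The three keys the question asks for.
FIELDS = ("comment_id", "commenter_name", "comment_text")


def iter_comments(comments):
    """Yield each comment dict, followed by its replies, depth-first."""
    for comment in comments:
        yield comment
        # "replies" may be missing or null on some comments, so fall back to [].
        yield from iter_comments(comment.get("replies") or [])


def extract_rows(posts):
    """Return one dict per comment or reply, keeping only the FIELDS keys."""
    rows = []
    for post in posts:
        for comment in iter_comments(post.get("comments_full") or []):
            rows.append({key: comment.get(key) for key in FIELDS})
    return rows


rows = extract_rows(json.loads(SAMPLE))
for row in rows:
    # json.dumps escapes non-ASCII text, so printing is safe on any console.
    print(json.dumps(row))
```

With real data, `json.loads(SAMPLE)` would be replaced by `json.load` on an open file. If a spreadsheet is still wanted, the resulting list of dicts should be usable as input to `pandas.DataFrame(rows)`.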
【Comments】:
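- Please post a valid JSON file, i.e. one with the proper closing brackets instead of the ... at the end. Also, what error or problem do you get with your current code?
- Stack Overflow is not a tutorial service. There are thousands of questions on this site that deal with parsing JSON in Python and extracting data from it, and you should read those before asking a new question. There are also thousands of tutorial websites that cover this exact topic.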