【Question title】: (Closed) JSON to CSV conversion for LARGE DATASET
【Posted】: 2016-01-30 05:14:33
【Question】:

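I have a .txt file containing more than a million JSON entities, generated by a Python program, and the entities do not all have the same keys. Here is just a sample: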

{
    "category": "Athlete", 
    "website": "example.com", 
    "talking_about_count": 560, 
    "description": "xxx", 
    "id": "123"
}
{
    "category": "Community", 
    "talking_about_count": 0, 
    "name": "The Second Civil War",
    "likes": 26, 
    "id": "234", 
    "is_published": true
}

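Although each JSON entity has different attributes, they do share some of them. The resulting .csv file should contain the columns category, website, talking_about_count, description, id, name, likes, and is_published, like this: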

"category","website","talking_about_count","name","likes","description","id","is_published"
"Athlete","example.com","560","","","xxx","123",""
"Community","","0","The Second Civil War","26","","234","True"

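https://json-csv.com/ does a good job, but it cannot handle datasets with more than 1000 entities.

I want to create a CSV from this .txt file of a million JSON entities, and I would like to know whether there is a better way to approach this.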

【Comments】:

  • You could try their desktop converter, which claims to handle files of any size: json-csv.com/download
  • @maxymoo Thanks! I'll give it a try!

Tags: json csv


【Solution 1】:

Here is a solution using jq.

If the file filter.jq contains

  (reduce (.[]|keys_unsorted[]) as $k ({};.[$k]="")) as $o   # object with all keys
| ($o  | keys_unsorted), (.[] | $o * . | [.[]])              # generate header and data
| @csv                                                       # convert to csv

and data.json contains the sample data, then the command

jq -M -s -r -f filter.jq data.json

will produce the output

"category","website","talking_about_count","description","id","name","likes","is_published"
"Athlete","example.com",560,"xxx","123","","",""
"Community","",0,"","234","The Second Civil War",26,true
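
Note that the -s (slurp) option reads the entire input into a single array before the filter runs, so memory use grows with the size of the file. For a file with a million entities, a two-pass streaming approach may be easier on memory. The following is a minimal Python sketch of that idea, not part of the jq solution: the first pass collects the union of keys, the second writes the rows. The function names `iter_json_objects` and `json_stream_to_csv` are made up for illustration, and the sketch assumes every top-level value in the file is a flat JSON object (nested values would simply be written via `str()`).

```python
import csv
import json


def iter_json_objects(stream, chunk_size=1 << 16):
    """Yield each top-level JSON object from a stream of concatenated objects."""
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        pos = 0
        while True:
            # Skip whitespace between consecutive objects.
            while pos < len(buffer) and buffer[pos].isspace():
                pos += 1
            if pos >= len(buffer):
                break
            try:
                obj, pos = decoder.raw_decode(buffer, pos)
            except json.JSONDecodeError:
                if not chunk:
                    raise  # end of input and the remainder is not valid JSON
                break  # object is probably incomplete: read another chunk
            yield obj
        buffer = buffer[pos:]  # keep only the unparsed tail
        if not chunk:
            break


def json_stream_to_csv(src_path, dst_path):
    """Convert concatenated JSON objects in src_path to a CSV at dst_path."""
    # Pass 1: collect the union of keys, in first-seen order.
    columns = {}
    with open(src_path, encoding="utf-8") as src:
        for record in iter_json_objects(src):
            for key in record:
                columns.setdefault(key, None)

    # Pass 2: write one row per object, leaving missing keys empty.
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(
            dst, fieldnames=list(columns), restval="", quoting=csv.QUOTE_ALL
        )
        writer.writeheader()
        for record in iter_json_objects(src):
            writer.writerow(record)
    return list(columns)
```

It could be called as `json_stream_to_csv("data.txt", "data.csv")`, where both file names are placeholders. Only the column names and one chunk of text (plus any partially read object) are held in memory at a time, at the cost of reading the file twice. Unlike the jq output above, `csv.QUOTE_ALL` quotes every field, including numbers and booleans, which is closer to the layout shown in the question.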

【Discussion】:
