【Question title】: Converting large, ill-formatted .json file to csv
【Posted】: 2015-02-26 23:23:32
【Question description】:

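I have very little experience with Python and .json files. I would like to convert a large .json file that I received from someone else into a .csv file for use in Excel.

The file is formatted as follows: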

{
"bedrock": {
    "basinAge": "",
    "basinName": "",
    "basinSetting": "",
    "basinSource": "",
    "basinType": "",
    "division": "ATLANTICPLAIN",
    "primary": "ATLANTICPLAIN",
    "province": "COASTALPLAIN",
    "section": "MISSISSIPPIALLUVIALPLAIN"
},
"country": "US",
"county": "ButlerCounty",
"crow": {
    "PERIOD_RAN": "",
    "SITE_PERIO": "",
    "SURFACE_EL": "",
    "VS30_RANGE": "",
    "ZDRIFT": "",
    "ZONE": "",
    "ZPLEIS": "",
    "Zhol": "",
    "condition": "",
    "firmThickness": "",
    "geobed": "",
    "geodes": "",
    "geophone": "",
    "meas_type": "",
    "resonance": "",
    "sitelocation": "",
    "sitenumber": "",
    "sitevs30": "",
    "slope": "",
    "slopevel": "",
    "soilThickness": "",
    "veltofirm": "",
    "vs30": ""
},
"embaymentDepth": 27.176477284750096,
"file": "../../data\\anderson\\anderson-et-al-2003-MoDOT.json",
"geologicClass": "YNa",
"geology": "al",
"geologySource": "fullerton",
"lat": 36.790518,
"latlon": [
    [
        "36.7905",
        "-90.2025"
    ],
    "232.0000",
    0.00172447,
    "stable"
],
"location": "BridgeA-3709",
"lon": -90.202518,
"profile": {
    "entry": {
        "0": [
            0,
            146.185,
            "Empty"
        ],
        "1": [
            2.91874,
            194.378,
            "Empty"
        ],
        "2": [
            4.11277,
            228.112,
            "Empty"
        ],
        "3": [
            6.10282,
            221.687,
            "Empty"
        ],
        "4": [
            7.9602,
            221.687,
            "Empty"
        ],
        "5": [
            8.09287,
            220.08,
            "Empty"
        ],
        "6": [
            8.09287,
            216.867,
            "Empty"
        ],
        "7": [
            14.063,
            260.241,
            "Empty"
        ],
        "8": [
            18.0431,
            279.518,
            "Empty"
        ],
        "9": [
            22.1559,
            282.731,
            "Empty"
        ],
        "10": [
            26.0033,
            281.124,
            "Empty"
        ],
        "11": [
            29.9834,
            276.305,
            "Empty"
        ],
        "12": [
            36.0862,
            293.976,
            "Empty"
        ],
        "13": [
            41.9237,
            435.341,
            "Empty"
        ],
        "14": [
            48.0265,
            557.43,
            "Empty"
        ],
        "15": [
            54.1294,
            640.964,
            "Empty"
        ],
        "16": [
            59.8342,
            726.104,
            "Empty"
        ],
        "17": [
            68.1924,
            "Empty",
            "Empty"
        ]
    },
    "units": [
        "m",
        "m/s",
        "m/s"
    ]
},
"sedEnd": "",
"sedStack": "",
"sedStart": "",
"sedSubsurface": "",
"sedSurficial": "",
"sedVaneer": "",
"site": "SASW",
"state": "MO",
"terrain": "16",
"terrainvel": "246",
"vs30": {
    "profileListed": {
        "units": "",
        "value": "None"
    },
    "siteListed": {},
    "stationlisted": {
        "method": "",
        "units": "",
        "value": ""
    },
    "units": "m/s",
    "value": 232.2477304197259,
    "wald": "",
    "yong": ""
},
"vsz": [
    146.185,
    146.185,
    147.1733748014587,
    156.68616932663514,
    166.6758658515508,
    174.5091277144315,
    180.04135355419726,
    184.37079145300547,
    187.52878874899267,
    190.10050728694824,
    192.25770054815115,
    194.09311720243602,
    195.6737567374426,
    197.04922523230243,
    200.16214171750644,
    203.0924940564577,
    205.75028418598887,
    208.17185007705515,
    210.97976183739868,
    213.5984956154859,
    216.02447901610586,
    218.27823757348278,
    220.44997122220872,
    222.4921122273311,
    224.40458487503645,
    226.19935932262254,
    227.84822291575128,
    229.40085613024158,
    230.86555428020594,
    232.2477304197259,
    265.0897348970574,
    "",
    "",
    "",
    "",
    "",
    ""
]
}

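There are 1000 entries like the one above, and each has the same keys. After some research online, I am fairly sure I need to flatten the entries, but I don't know how to do that programmatically. Some of the categorical keys that are followed by a series of sub-keys ("bedrock", "crow", etc.) can be discarded if necessary.

【Comments】: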

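  • Could you show an example of the output you expect?
  • Strictly speaking, this does not look like JSON (u'' is not a JSON string definition) but like Python data you produced after already importing it from JSON.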

Tags: python json csv flatten


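【Solution 1】:

The first step, without a doubt, is to parse the file with a JSON parser. Then write code that walks the resulting dictionary and extracts the data.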
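
A minimal sketch of those two steps follows. The helper name `flatten` and the dotted column names are illustrative choices, not something the original answer specifies; nested lists are flattened by position, and empty objects (such as "siteListed") simply contribute no columns.

```python
import json


def flatten(value, prefix=""):
    """Recursively flatten nested dicts and lists into one dict with dotted keys."""
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # Lists are keyed by position: "latlon.0.1", "vsz.3", ...
        items = ((str(i), v) for i, v in enumerate(value))
    else:
        # Scalar leaf: store it under the accumulated key.
        flat[prefix] = value
        return flat
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


# A small excerpt in the shape of the entry from the question.
sample = json.loads(
    '{"bedrock": {"division": "ATLANTICPLAIN"}, "lat": 36.790518, '
    '"latlon": [["36.7905", "-90.2025"], "232.0000"]}'
)
row = flatten(sample)
```

Each flattened entry is then a single-level dictionary, which maps naturally onto one CSV row.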

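I'm not sure what you mean by "ill-formatted"; it looks like valid JSON. If you have trouble parsing it with Python's json module, you could try processing it with Python's yaml module. YAML is essentially a superset of JSON, but it is more tolerant of minor formatting issues such as unwanted commas.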

http://pymotw.com/2/json/

https://pypi.python.org/pypi/PyYAML
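
Before switching parsers, it may help to find out where the json module actually rejects the file. The sketch below uses only the standard library; `try_parse` is a hypothetical helper name, and the two input strings are made-up examples (the second has a trailing comma).

```python
import json


def try_parse(text):
    """Return (data, None) on success, or (None, message) locating the first error."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        # lineno, colno and msg are attributes of json.JSONDecodeError.
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"


good, good_error = try_parse('{"country": "US", "lat": 36.790518}')
bad, bad_error = try_parse('{"country": "US",\n "lat": 36.790518,\n}')
```

If the reported position points at something small like a stray comma, fixing the file or falling back to a more lenient parser are both reasonable options.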

【Discussion】:

  • Thanks for the help - I'm trying to use the writerow function from csv.writer, but it only takes one argument. Is there a way to append the data extracted from my JSON dictionary? E.g., I need an alternative to: with open('profile_database.csv','wb') as f: w=csv.writer(f) w.writerow(data['0'].keys(),data['0']['bedrock'].keys(),data['0']['crow'].keys(),data['0']['profile'].keys(),data['0']['profile']['entry'].keys(),data['0']['vs30'].keys(),data['0']['vs30']['stationlisted'].keys(),'profile')
  • Never mind, I realized I can create a variable. Thanks again!
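
The point about writerow can be sketched as follows: it accepts a single iterable, so the separate key lists have to be combined into one list first. This assumes, as the comment suggests, that the parsed data is a dictionary keyed by "0", "1", and so on; the two entries, their values, and the helper `header_and_row` are invented for illustration, and only one level of nesting is handled.

```python
import csv
import io

# Made-up stand-in for the parsed data, keyed by "0", "1", ... as in the comment.
data = {
    "0": {"country": "US", "bedrock": {"division": "ATLANTICPLAIN"},
          "vs30": {"units": "m/s", "value": 232.2}},
    "1": {"country": "US", "bedrock": {"division": "INTERIORPLAINS"},
          "vs30": {"units": "m/s", "value": 301.5}},
}


def header_and_row(entry):
    """Build one combined header list and one value list for an entry."""
    header, row = [], []
    for key, value in entry.items():
        if isinstance(value, dict):
            # Expand one level of nesting into "parent.child" columns.
            for sub_key, sub_value in value.items():
                header.append(f"{key}.{sub_key}")
                row.append(sub_value)
        else:
            header.append(key)
            row.append(value)
    return header, row


# In-memory buffer for demonstration; for a real file on Python 3 use
# open("profile_database.csv", "w", newline="") instead of mode "wb".
buffer = io.StringIO()
writer = csv.writer(buffer)
header, _ = header_and_row(data["0"])
writer.writerow(header)  # one list argument, not several
for index in sorted(data, key=int):
    _, row = header_and_row(data[index])
    writer.writerow(row)
csv_text = buffer.getvalue()
```

Because every entry is said to share the same keys, taking the header from the first entry is enough here; csv.DictWriter would be a safer choice if that were not guaranteed.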