【Question Title】: Transform JSON file to reduce levels
【Posted】: 2019-07-04 19:14:53
【Question】:

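I receive a JSON file containing test results. The data hierarchy is [{date, [ test -> {time, result}]}], and we need to transform it into something more "usable", such as {date&time, (test: result)}.

We are using this code: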

import json

with open('test_result.json', 'r') as f:
    records = json.load(f)  # the file holds a list of per-date records

transformed = {}

for main_struct in records:
    main_date = main_struct.get('date')

    for main_key, main_value in main_struct.items():
        if isinstance(main_value, list):
            # A list holds the readings of one test, one dict per time slot
            for reading in main_value:
                transform_key = main_date + 'T' + reading.get('time')
                if transform_key not in transformed:
                    transformed[transform_key] = {}
                for inner_key, inner_value in reading.items():
                    if inner_key == 'time':
                        continue  # the time is already part of transform_key
                    if inner_key == 'value':
                        transform_inner_key = main_key
                    else:
                        transform_inner_key = main_key + "." + inner_key
                    transformed[transform_key][transform_inner_key] = inner_value
        else:
            # Scalar top-level fields (date, generated_by) are copied unchanged
            transformed[main_key] = main_value

with open('transform.json', 'w') as outfile:
    json.dump(transformed, outfile, sort_keys=True, indent=4)

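Although the code works, I have some concerns about its readability, and there may be libraries that would help reduce its complexity.

This is the content of the JSON input file: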

[
    {
        "date": "2019-05-19",
        "test1": [
            { "time": "14:00:00", "value": 10 },
            { "time": "15:00:00", "value": 12 },
            { "time": "17:00:00", "value": 16 }
        ],
        "test2": [
            { "time": "14:00:00", "value": 11 },
            { "time": "16:00:00", "value": 15 },
            { "time": "17:00:00", "value": 17 }
        ],
        "test3": [
            { "time": "15:00:00", "value": "B", "additionalInfo": "Additional information at 15h" },
            { "time": "16:00:00", "value": "C" },
            { "time": "17:00:00", "value": "D", "additionalInfo": "Additional information at 17h" }
        ],
        "generated_by": "author of tests"
    }
]

This is the expected result:

{
    "2019-05-19T14:00:00": {
        "test1": 10,
        "test2": 11
    },
    "2019-05-19T15:00:00": {
        "test1": 12,
        "test3": "B",
        "test3.additionalInfo": "Additional information at 15h"
    },
    "2019-05-19T16:00:00": {
        "test2": 15,
        "test3": "C"
    },
    "2019-05-19T17:00:00": {
        "test1": 16,
        "test2": 17,
        "test3": "D",
        "test3.additionalInfo": "Additional information at 17h"
    },
    "date": "2019-05-19",
    "generated_by": "author of tests"
}

【Comments】:

    Tags: python json transformation transpose


    【Solution 1】:
    data = '''  [
        {
           "date": "2019-05-19",
           "test1": [
               { "time": "14:00:00", "value": 10 },
               { "time": "15:00:00", "value": 12 },
               { "time": "17:00:00", "value": 16 }
            ],
           "test2": [
               { "time": "14:00:00", "value": 11 },
               { "time": "16:00:00", "value": 15 },
               { "time": "17:00:00", "value": 17 }
            ],
           "test3": [
               { "time": "15:00:00", "value": "B", "additionalInfo": "Additional information at 15h" },
               { "time": "16:00:00", "value": "C" },
               { "time": "17:00:00", "value": "D", "additionalInfo": "Additional information at 17h" }
            ],
            "generated_by": "author of tests"
        }
    ]'''
    
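    import json
    from collections import defaultdict

    # Missing timestamp keys are created as empty dicts on first access
    out = defaultdict(dict)
    for i in json.loads(data):
        d = i['date']
        for k, v in i.items():
            # Only lists hold test readings; scalar fields are handled below
            if not isinstance(v, list):
                continue
            for vv in v:
                # The key is "<date>T<time>"; the test name maps to its value
                out[d + 'T' + vv['time']][k] = vv['value']
                if 'additionalInfo' in vv:
                    out[d + 'T' + vv['time']][k + '.additionalInfo'] = vv['additionalInfo']
        # Copy the top-level metadata unchanged
        out['date'] = d
        out['generated_by'] = i['generated_by']


    print(json.dumps(out, indent=4))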
    

    Prints:

    {
        "2019-05-19T14:00:00": {
            "test1": 10,
            "test2": 11
        },
        "2019-05-19T15:00:00": {
            "test1": 12,
            "test3": "B",
            "test3.additionalInfo": "Additional information at 15h"
        },
        "2019-05-19T17:00:00": {
            "test1": 16,
            "test2": 17,
            "test3": "D",
            "test3.additionalInfo": "Additional information at 17h"
        },
        "2019-05-19T16:00:00": {
            "test2": 15,
            "test3": "C"
        },
        "date": "2019-05-19",
        "generated_by": "author of tests"
    }
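
    The printed keys appear in insertion order, so 17:00:00 comes before 16:00:00, whereas the expected result in the question is sorted. The question's own code gets that ordering by passing sort_keys=True to json.dump. A minimal sketch of the same option with json.dumps follows; the small `out` mapping here is a hypothetical stand-in for the one built above, not the full data set.

```python
import json

# Hypothetical stand-in for the `out` mapping built by the solution above,
# deliberately inserted out of order.
out = {
    "2019-05-19T14:00:00": {"test2": 11, "test1": 10},
    "2019-05-19T17:00:00": {"test1": 16},
    "2019-05-19T16:00:00": {"test2": 15},
    "generated_by": "author of tests",
    "date": "2019-05-19",
}

# sort_keys=True sorts the keys at every nesting level, so the timestamp
# keys (which start with digits) come before "date" and "generated_by".
text = json.dumps(out, indent=4, sort_keys=True)
print(text)
```

    Sorting happens only during serialization; the `out` mapping itself keeps its insertion order.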
    

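    【Discussion】:

    • I like the way you did this and I will test the code this way, but "test" is not actually a prefix of the test names... each test can have its own specific name, such as "temperature", "humidity" or "wind gust".
    • My code is generic and not tied to specific labels/items, number of items, etc. The structure is the only constant: [{date, [ test_name -> {time, result, additional_fields}]}]
    • @outon Updated my answer.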
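
    The comments above ask for a version that does not depend on specific test names or on a fixed set of extra fields. One possible sketch, assuming only the structure [{date, [ test_name -> {time, value, additional_fields}]}], is shown below. The function name flatten_records, the sample test names and the "unit" field are hypothetical and used for illustration only. If the input holds several records, scalar top-level fields such as "date" are overwritten by the last record, as in the code above.

```python
import json
from collections import defaultdict


def flatten_records(records):
    """Merge per-test readings into one dict keyed by '<date>T<time>'."""
    out = defaultdict(dict)
    for record in records:
        date = record['date']
        for name, value in record.items():
            if not isinstance(value, list):
                # Scalar top-level fields (date, generated_by, ...) are copied as-is.
                out[name] = value
                continue
            for reading in value:
                slot = out[date + 'T' + reading['time']]
                for field, field_value in reading.items():
                    if field == 'time':
                        continue  # already encoded in the key
                    # 'value' maps to the bare test name; any other field
                    # gets a dotted suffix, e.g. 'temperature.unit'.
                    key = name if field == 'value' else name + '.' + field
                    slot[key] = field_value
    return dict(out)


# Hypothetical sample data following the structure described in the comments.
sample = json.loads('''[{
    "date": "2019-05-19",
    "temperature": [{"time": "14:00:00", "value": 21.5, "unit": "C"}],
    "humidity": [{"time": "14:00:00", "value": 40}],
    "generated_by": "author of tests"
}]''')

result = flatten_records(sample)
print(json.dumps(result, indent=4, sort_keys=True))
```

    Looping over every field of a reading, rather than checking for 'additionalInfo' by name, keeps the function independent of which extra fields a test happens to carry.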