【Question title】: Problem parsing JSON file - fields not stored correctly
【Posted】: 2021-05-17 05:50:53
【Question】:

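I have a JSON file that I am trying to parse. My script has some print statements to check whether I am parsing the individual fields correctly. Inside the for loop, ids and ip_metadata[ip] both look correct. However, when I print the whole dictionary at the end, the ids list of every entry is always the one from the last record. Only ids has this problem; the other fields are stored correctly. I don't understand why ids is not stored correctly.

JSON file: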

{
    "count": 4,
    "data": [
        {
            "ip": "1.2.3.4",
            "cty": {
                "country": "US",
                "organization": "ABC"
            },
            "info": {
                "p": [
                    {
                        "id": 123,
                        "grp": "A"
                    },
                    {
                        "id": 234,
                        "grp": "B"
                    },
                    {
                        "id": 345,
                        "grp": "C"
                    },
                    {
                        "id": 456,
                        "grp": "D"
                    }
                ]
            }
        },
        {
            "ip": "2.3.4.5",
            "cty": {
                "country": "US",
                "organization": "ABC"
            },
            "info": {
                "p": [
                    {
                        "id": 111,
                        "grp": "A"
                    },
                    {
                        "id": 222,
                        "grp": "B"
                    },
                    {
                        "id": 333,
                        "grp": "C"
                    },
                    {
                        "id": 444,
                        "grp": "D"
                    }
                ]
            }
        },
        {
            "ip": "1.2.3.1",
            "cty": {
                "country": "AU",
                "organization": "ABC"
            },
            "info": {
                "p": [
                    {
                        "id": 222,
                        "grp": "A"
                    },
                    {
                        "id": 333,
                        "grp": "B"
                    },
                    {
                        "id": 444,
                        "grp": "C"
                    }
                ]
            }
        },
        {
            "ip": "10.2.3.4",
            "cty": {
                "country": "US",
                "organization": "DDD"
            },
            "info": {
                "p": [
                    {
                        "id": 555,
                        "grp": "A"
                    },
                    {
                        "id": 666,
                        "grp": "B"
                    },
                    {
                        "id": 777,
                        "grp": "C"
                    },
                    {
                        "id": 888,
                        "grp": "D"
                    }
                ]
            }
        }
    ],
    "status": "ok"
}

My Python script is:

import json
import glob
from collections import defaultdict

ip_metadata = defaultdict(list)

def main():
    for json_file in glob.glob("test_input/test_json.json"):
        with open(json_file, "r") as fin:
            ids = []
            json_data = json.load(fin)
            if json_data["count"] > 0:
                for data in json_data['data']:
                    ip = data['ip']
                    country = data['cty']['country']
                    organization = data['cty']['organization']
                    ids[:] = []
                    ids_2 = data['info']['p']
                    for idss in ids_2:
                        id = idss['id']
                        grp = idss['grp']
                        ids.append((id,grp))
                    
                    print(ids)
                    ip_metadata[ip].append((country,organization,ids))
                    print(ip_metadata[ip])
                    
    print("=============================")
    for k, v in ip_metadata.items():
        print(k,v)
    
if __name__ == '__main__':
    main()

The output is:

[(123, 'A'), (234, 'B'), (345, 'C'), (456, 'D')]
[('US', 'ABC', [(123, 'A'), (234, 'B'), (345, 'C'), (456, 'D')])]
[(111, 'A'), (222, 'B'), (333, 'C'), (444, 'D')]
[('US', 'ABC', [(111, 'A'), (222, 'B'), (333, 'C'), (444, 'D')])]
[(222, 'A'), (333, 'B'), (444, 'C')]
[('AU', 'ABC', [(222, 'A'), (333, 'B'), (444, 'C')])]
[(555, 'A'), (666, 'B'), (777, 'C'), (888, 'D')]
[('US', 'DDD', [(555, 'A'), (666, 'B'), (777, 'C'), (888, 'D')])]
=============================
1.2.3.4 [('US', 'ABC', [(555, 'A'), (666, 'B'), (777, 'C'), (888, 'D')])]
2.3.4.5 [('US', 'ABC', [(555, 'A'), (666, 'B'), (777, 'C'), (888, 'D')])]
1.2.3.1 [('AU', 'ABC', [(555, 'A'), (666, 'B'), (777, 'C'), (888, 'D')])]
10.2.3.4 [('US', 'DDD', [(555, 'A'), (666, 'B'), (777, 'C'), (888, 'D')])]

【Comments】:

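  • Why are you doing ids[:] = []? It empties the contents of ids.
  • @rdas That is because I only want to store the id and grp values of each ip in its own data block. Otherwise ids would contain the id and grp values of every ip.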
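
To make the point of these comments concrete, here is a minimal sketch of what slice assignment does to a list that has already been stored elsewhere. The names shared, stored, fresh and fresh_stored are illustrative only and do not come from the original script.

```python
# Slice assignment empties the existing list object in place, so every
# container that already holds a reference to it sees the change.
shared = []
stored = []
for batch in ([1, 2], [3, 4]):
    shared[:] = []          # same object, just emptied
    shared.extend(batch)
    stored.append(shared)   # stores a reference, not a copy

same_object = stored[0] is stored[1]
print(stored)               # both entries show the last batch
print(same_object)          # both entries are one and the same list

# Rebinding the name creates a new list object on every iteration.
fresh_stored = []
for batch in ([1, 2], [3, 4]):
    fresh = []              # new object each time
    fresh.extend(batch)
    fresh_stored.append(fresh)
print(fresh_stored)         # each entry keeps its own batch
```

In the question's script the same thing happens one level deeper: each tuple appended to ip_metadata[ip] contains a reference to the single ids list, which is why the prints inside the loop look right while the final dump shows the last record's values everywhere.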

Tags: json python-3.x dictionary


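【Solution 1】:

I managed to fix this by removing the ids = [] line under the open() line and replacing ids[:] = [] with ids = []. I don't quite understand why this fixes the problem, because I don't know why the value of ids would overwrite the ids values previously stored under other keys. But it works now.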
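
A likely explanation for why the change works: ids = [] inside the loop binds the name to a brand-new list on each iteration, so each stored tuple keeps its own list, whereas ids[:] = [] kept reusing one list object for every record. Below is a sketch of the parsing step written that way. The function name collect_metadata and the shortened inline sample are assumptions made for illustration; the original script reads the data from test_input/test_json.json instead.

```python
import json
from collections import defaultdict


def collect_metadata(json_data):
    """Build ip -> [(country, organization, [(id, grp), ...])]."""
    ip_metadata = defaultdict(list)
    if json_data.get("count", 0) > 0:
        for data in json_data["data"]:
            # The comprehension creates a new list per record,
            # so no two records share the same list object.
            ids = [(p["id"], p["grp"]) for p in data["info"]["p"]]
            ip_metadata[data["ip"]].append(
                (data["cty"]["country"], data["cty"]["organization"], ids)
            )
    return ip_metadata


# Shortened sample in the same shape as the JSON file in the question.
sample = json.loads("""
{"count": 2, "data": [
  {"ip": "1.2.3.4", "cty": {"country": "US", "organization": "ABC"},
   "info": {"p": [{"id": 123, "grp": "A"}, {"id": 234, "grp": "B"}]}},
  {"ip": "1.2.3.1", "cty": {"country": "AU", "organization": "ABC"},
   "info": {"p": [{"id": 222, "grp": "A"}]}}
], "status": "ok"}
""")

result = collect_metadata(sample)
for ip, entries in result.items():
    print(ip, entries)
```

Returning the dictionary from a function rather than filling a module-level one is a design choice of this sketch, not something the original answer requires; the essential part is only that a new list is created for each record.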

【Discussion】:
