【Question title】: Remove key-value pair based on grouping from dictionaries in python
【Posted】: 2019-09-17 14:40:08
【Question】:

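I have a JSON file, A.json, that contains multiple dictionaries. Within each brand, I want to remove the key-value pairs of the "model" key that are common to all dictionaries of that brand.

For example, consider the brand "Ford":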

{"Number": '123', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00", "Mustang3":"1.00", "Mustang4":"1.64"}}

{"Number": '891', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang8":"3.00", "Mustang3":"1.00", "Mustang6":"1.64"}}

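The keys inside model that are common to both dictionaries are Mustang1 and Mustang3, so I remove those two key-value pairs from model. The final dictionaries are: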

 {"Number": '123', "brand": "Ford", "model":{"Mustang2":"3.00", "Mustang4":"1.64"}}
{"Number": '891', "brand": "Ford", "model":{"Mustang8":"3.00", "Mustang6":"1.64"}}
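
The comparison described above can be sketched with dictionary key views, which support set operations in Python 3. This is only an illustration on the two Ford records; the variable names a, b and common are chosen here for the example and are not part of the question.

```python
# The two Ford records from the example (a and b are illustrative names).
a = {"Number": "123", "brand": "Ford",
     "model": {"Mustang1": "2.64", "Mustang2": "3.00", "Mustang3": "1.00", "Mustang4": "1.64"}}
b = {"Number": "891", "brand": "Ford",
     "model": {"Mustang1": "2.64", "Mustang8": "3.00", "Mustang3": "1.00", "Mustang6": "1.64"}}

# dict.keys() returns a set-like view, so & yields the keys present in both.
common = a["model"].keys() & b["model"].keys()

# Rebuild each model dict without the shared keys.
for record in (a, b):
    record["model"] = {k: v for k, v in record["model"].items() if k not in common}

print(sorted(common))  # ['Mustang1', 'Mustang3']
print(a["model"])      # {'Mustang2': '3.00', 'Mustang4': '1.64'}
print(b["model"])      # {'Mustang8': '3.00', 'Mustang6': '1.64'}
```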

A.json

{"Number": '123', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00", "Mustang3":"1.00", "Mustang4":"1.64"}}
{"Number": '321', "brand": "Toyota", "model":{"Camry":"2.64", "Prius":"3.00", "Corolla":"1.00", "Tundra":"1.64"}}
{"Number": '111', "brand": "Honda", "model":{"Accord":"2.64", "Civic":"3.00", "Insight":"1.00", "Pilot":"1.64"}}
{"Number": '891', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang8":"3.00", "Mustang3":"1.00", "Mustang6":"1.64"}}
{"Number": '745', "brand": "Toyota", "model":{"Camry":"2.64", "Sienna":"3.00", "4Runner":"1.00", "Prius":"1.64"}}
{"Number": '325', "brand": "Honda", "model":{"Accord":"2.64", "Passport":"3.00", "HR-V":"1.00", "Pilot":"1.64"}}
{"Number": '745', "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}}
{"Number": '325', "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}}

Expected result: result.json

{"Number": '123', "brand": "Ford", "model":{"Mustang2":"3.00", "Mustang4":"1.64"}}
{"Number": '321', "brand": "Toyota", "model":{"Corolla":"1.00", "Tundra":"1.64"}}
{"Number": '111', "brand": "Honda", "model":{"Civic":"3.00", "Insight":"1.00"}}
{"Number": '891', "brand": "Ford", "model":{"Mustang8":"3.00", "Mustang6":"1.64"}}
{"Number": '745', "brand": "Toyota", "model":{"Sienna":"3.00", "4Runner":"1.00"}}
{"Number": '325', "brand": "Honda", "model":{"Passport":"3.00", "HR-V":"1.00"}}
{"Number": '745', "brand": "Accura", "model":{}}
{"Number": '325', "brand": "Accura", "model":{}}

【Comments】:

    Tags: python json dictionary grouping key-value


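    【Solution 1】:

    First, your A.json is not a valid JSON file: the records are not wrapped in a list and some strings use single quotes. Here is a corrected version: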

    [{"Number": "123", "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00", "Mustang3":"1.00", "Mustang4":"1.64"}},
    {"Number": "321", "brand": "Toyota", "model":{"Camry":"2.64", "Prius":"3.00", "Corolla":"1.00", "Tundra":"1.64"}},
    {"Number": "111", "brand": "Honda", "model":{"Accord":"2.64", "Civic":"3.00", "Insight":"1.00", "Pilot":"1.64"}},
    {"Number": "891", "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang8":"3.00", "Mustang3":"1.00", "Mustang6":"1.64"}},
    {"Number": "745", "brand": "Toyota", "model":{"Camry":"2.64", "Sienna":"3.00", "4Runner":"1.00", "Prius":"1.64"}},
    {"Number": "325", "brand": "Honda", "model":{"Accord":"2.64", "Passport":"3.00", "HR-V":"1.00", "Pilot":"1.64"}},
    {"Number": "745", "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}},
    {"Number": "325", "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}}]
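
If correcting the file by hand is not practical, one possible alternative (an assumption added here, not part of the answer) is to read the original file line by line. Each line in the question happens to be a valid Python dict literal, so ast.literal_eval can parse it. The sketch below uses two lines copied from the question in place of a real file; raw_text is a name chosen for the example.

```python
import ast
import json

# Two lines copied from the question's A.json. They are Python dict literals,
# not valid JSON, because '123' and '321' use single quotes.
raw_text = """\
{"Number": '123', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00", "Mustang3":"1.00", "Mustang4":"1.64"}}
{"Number": '321', "brand": "Toyota", "model":{"Camry":"2.64", "Prius":"3.00", "Corolla":"1.00", "Tundra":"1.64"}}
"""

# Parse every non-empty line as a Python literal and collect the dicts in a list.
ds = [ast.literal_eval(line) for line in raw_text.splitlines() if line.strip()]

# Serializing the list yields valid JSON text, a list of objects with double quotes.
json_text = json.dumps(ds)
print(json_text)
```

This only works as long as every line is a literal that ast.literal_eval accepts; it is not a general JSON repair tool.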
    

    The contents of the file should then be parsed with the json module:

    import io # to test without a file
    f = io.StringIO(json_text) # json_text is a string containing the text above
    
    import json
    ds = json.load(f)
    

    Second, build, for each brand, a set of the model keys that are common to all of its records:

    common_by_brand = {}
    for d in ds:
        if d["brand"] in common_by_brand:
            common_by_brand[d["brand"]] &= set(d["model"])
        else:
            common_by_brand[d["brand"]] = set(d["model"])
        # {'Ford': {'Mustang1', 'Mustang3'}, 'Toyota': {'Camry', 'Prius'}, 'Honda': {'Accord', 'Pilot'}, 'Accura': {'TLX', 'MDX'}}
    

    Third, iterate over the list and remove those common models:

    for d in ds:
        common = common_by_brand[d["brand"]]
        d["model"] = {k: v for k, v in d["model"].items() if k not in common}
    # [{'Number': '123', 'brand': 'Ford', 'model': {'Mustang2': '3.00', 'Mustang4': '1.64'}}, {'Number': '321', 'brand': 'Toyota', 'model': {'Corolla': '1.00', 'Tundra': '1.64'}}, {'Number': '111', 'brand': 'Honda', 'model': {'Civic': '3.00', 'Insight': '1.00'}}, {'Number': '891', 'brand': 'Ford', 'model': {'Mustang8': '3.00', 'Mustang6': '1.64'}}, {'Number': '745', 'brand': 'Toyota', 'model': {'Sienna': '3.00', '4Runner': '1.00'}}, {'Number': '325', 'brand': 'Honda', 'model': {'Passport': '3.00', 'HR-V': '1.00'}}, {'Number': '745', 'brand': 'Accura', 'model': {}}, {'Number': '325', 'brand': 'Accura', 'model': {}}]
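
The second and third steps can also be wrapped in a single function. The following is a sketch rather than the answerer's code: the name remove_common_models is hypothetical, and it uses collections.defaultdict with set.intersection instead of the in-place &= update. The demonstration uses the two Honda records from the question.

```python
from collections import defaultdict


def remove_common_models(records):
    """Return copies of records without the model keys shared by every record of the same brand."""
    # Collect one set of model keys per record, grouped by brand.
    key_sets = defaultdict(list)
    for rec in records:
        key_sets[rec["brand"]].append(set(rec["model"]))

    # Intersect the sets of each brand to get the keys present in all of its records.
    common_by_brand = {brand: set.intersection(*sets) for brand, sets in key_sets.items()}

    # Build new records; the input list and its dicts are left unmodified.
    return [
        {**rec, "model": {k: v for k, v in rec["model"].items()
                          if k not in common_by_brand[rec["brand"]]}}
        for rec in records
    ]


honda = [
    {"Number": "111", "brand": "Honda",
     "model": {"Accord": "2.64", "Civic": "3.00", "Insight": "1.00", "Pilot": "1.64"}},
    {"Number": "325", "brand": "Honda",
     "model": {"Accord": "2.64", "Passport": "3.00", "HR-V": "1.00", "Pilot": "1.64"}},
]
result = remove_common_models(honda)
print(result[0]["model"])  # {'Civic': '3.00', 'Insight': '1.00'}
print(result[1]["model"])  # {'Passport': '3.00', 'HR-V': '1.00'}
```

As with the loop above, a brand that appears in only one record has all of its model keys treated as common, so that record ends up with an empty model.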
    

    Fourth, write the result out in JSON format (an in-memory buffer stands in for the file here):

    g = io.StringIO()
    json.dump(ds, g)
    print(g.getvalue())
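
To write an actual file rather than an in-memory buffer, something like the following could be used. It is a minimal sketch: the file name result.json comes from the question, the temporary directory is used only so the example does not touch the working directory, and indent=2 is a formatting choice made here.

```python
import json
import os
import tempfile

# A small stand-in for the processed list (two records from the expected output).
ds = [
    {"Number": "745", "brand": "Accura", "model": {}},
    {"Number": "325", "brand": "Accura", "model": {}},
]

# Write to result.json inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "result.json")
with open(out_path, "w", encoding="utf-8") as g:
    json.dump(ds, g, indent=2)

# Read the file back to confirm it round-trips.
with open(out_path, encoding="utf-8") as f:
    reloaded = json.load(f)
print(reloaded == ds)  # True
```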
    

    Output, reformatted for readability:

    [{"Number": "123", "brand": "Ford", "model": {"Mustang2": "3.00", "Mustang4": "1.64"}},
    {"Number": "321", "brand": "Toyota", "model": {"Corolla": "1.00", "Tundra": "1.64"}},
    {"Number": "111", "brand": "Honda", "model": {"Civic": "3.00", "Insight": "1.00"}},
    {"Number": "891", "brand": "Ford", "model": {"Mustang8": "3.00", "Mustang6": "1.64"}},
    {"Number": "745", "brand": "Toyota", "model": {"Sienna": "3.00", "4Runner": "1.00"}},
    {"Number": "325", "brand": "Honda", "model": {"Passport": "3.00", "HR-V": "1.00"}},
    {"Number": "745", "brand": "Accura", "model": {}},
    {"Number": "325", "brand": "Accura", "model": {}}]
    

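    【Comments】:

    • Thank you very much. I wasn't sure how to get the common keys across the dictionaries. This is great!
    【Solution 2】:

    First, you need to load the JSON in Python using the built-in json library.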

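    Then there are several ways to achieve this. For example, you can iterate over each dict and update a Counter at each iteration. Then you remove every key that was counted more than once.

    Finally, use the json library again to dump the resulting dicts to a new file.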
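
The Counter idea can be adapted to the per-brand requirement by keeping one Counter per brand. The sketch below is one possible reading of this answer, not code from it; the names counts and seen are illustrative. It removes keys counted more than once within a brand, which matches "common to all records" only when each brand has exactly two records, as in the question's data.

```python
from collections import Counter, defaultdict

ds = [
    {"Number": "321", "brand": "Toyota",
     "model": {"Camry": "2.64", "Prius": "3.00", "Corolla": "1.00", "Tundra": "1.64"}},
    {"Number": "745", "brand": "Toyota",
     "model": {"Camry": "2.64", "Sienna": "3.00", "4Runner": "1.00", "Prius": "1.64"}},
    {"Number": "745", "brand": "Accura", "model": {"TLX": "2.64", "MDX": "3.00"}},
    {"Number": "325", "brand": "Accura", "model": {"TLX": "2.64", "MDX": "3.00"}},
]

# One Counter per brand, counting how many records of that brand contain each model key.
counts = defaultdict(Counter)
for d in ds:
    counts[d["brand"]].update(d["model"].keys())

# Keep only the keys that occur in a single record of the brand.
for d in ds:
    seen = counts[d["brand"]]
    d["model"] = {k: v for k, v in d["model"].items() if seen[k] <= 1}

print(ds[0]["model"])  # {'Corolla': '1.00', 'Tundra': '1.64'}
print(ds[1]["model"])  # {'Sienna': '3.00', '4Runner': '1.00'}
print(ds[2]["model"])  # {}
```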

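    【Comments】:

    • I don't think the Counter approach would work here, because we need to count the keys inside "model" grouped by "brand". For example, if the brand is "Ford", the keys to remove from model are those common to the dictionaries whose "brand" value is Ford.
    【Solution 3】:

    I assume you will be using the standard JSON format. You need to check when the type of a value in the dictionary is dict. The isinstance() function can be used for this. A code snippet like the following can be used: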

    for key, value in your_json.items():
        if isinstance(value, dict):
            your_json[key] = {}
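
For reference, here is what this snippet does when applied to a single record. The record is a shortened Ford entry from the question, and your_json is the name used in the snippet. Every dict-valued entry is replaced with an empty dict, so on its own the loop does not compare keys across records of the same brand.

```python
# A shortened Ford record from the question.
your_json = {"Number": "123", "brand": "Ford",
             "model": {"Mustang1": "2.64", "Mustang2": "3.00"}}

# Replace every value that is itself a dict with an empty dict.
# Assigning to an existing key while iterating is safe because the dict size does not change.
for key, value in your_json.items():
    if isinstance(value, dict):
        your_json[key] = {}

print(your_json)  # {'Number': '123', 'brand': 'Ford', 'model': {}}
```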
    

    I hope this works. Cheers :)

    【Comments】:
