【Question title】: Comparing a list with JSON and counting occurrences
【Posted】: 2021-08-10 13:13:02
【Question】:

Given the following lists:

make = ['ford', 'fiat', 'nissan', 'suzuki', 'dacia']
model = ['x', 'y', 'z']
version = ['A', 'B', 'C']
typ = ['sedan', 'coupe', 'van', 'kombi']
infos = ['steering wheel problems', 'gearbox problems', 'broken engine', 'throttle problems', None]

total = []
total.append(make)
total.append(model)
total.append(version)
total.append(typ)
total.append(infos)

I need a list of lists containing every possible combination of these lists, so I did this:

import itertools

combos = list(itertools.product(*total))
all_combos = [list(elem) for elem in combos]

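Now I want to compare these with a JSON object: find the items whose values match an entry of all_combos, and count how many times each such combination occurs. My JSON is large and looks roughly like this: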

data = [
{  'make': 'dacia',
   'model': 'x',
   'version': 'A',
   'typ': 'sedan',
   'infos': 'steering wheel problems'
}, ...]

I would like to get output like this:

output = [
    {  'make': 'dacia',
       'model': 'x',
       'version': 'A',
       'typ': 'sedan',
       'infos': 'steering wheel problems',
       'number_of_occurences_of_such_combination_of_fields_with__such_values': 75
    }, ...]

How can I solve this task?

【Comments】:

    Tags: python json dictionary nested itertools


    【Solution 1】:

    If I understand you correctly, you want to add the key number_of_occurences_of_such_combination_of_fields_with__such_values to each dictionary in data:

    from operator import itemgetter
    from itertools import product
    
    make = ["ford", "fiat", "nissan", "suzuki", "dacia"]
    model = ["x", "y", "z"]
    version = ["A", "B", "C"]
    typ = ["sedan", "coupe", "van", "kombi"]
    infos = [
        "steering wheel problems",
        "gearbox problems",
        "broken engine",
        "throttle problems",
        None,
    ]
    
    total = [make, model, version, typ, infos]
    
    data = [
        {
            "make": "dacia",
            "model": "x",
            "version": "A",
            "typ": "sedan",
            "infos": "steering wheel problems",
        },
        {
            "make": "dacia",
            "model": "x",
            "version": "A",
            "typ": "sedan",
            "infos": "steering wheel problems",
        },
        {
            "make": "ford",
            "model": "x",
            "version": "A",
            "typ": "sedan",
            "infos": "steering wheel problems",
        },
    ]
    
    i = itemgetter("make", "model", "version", "typ", "infos")
    
    cnt = {}
    for c in product(*total):  # `product` is imported directly above; `itertools` itself is not
        for d in data:
            if i(d) == c:
                cnt.setdefault(c, []).append(d)
    
    for k, v in cnt.items():
        for d in v:
            d[
                "number_of_occurences_of_such_combination_of_fields_with__such_values"
            ] = len(v)
    
    print(data)
    

    Prints:

    [
        {
            "make": "dacia",
            "model": "x",
            "version": "A",
            "typ": "sedan",
            "infos": "steering wheel problems",
            "number_of_occurences_of_such_combination_of_fields_with__such_values": 2,
        },
        {
            "make": "dacia",
            "model": "x",
            "version": "A",
            "typ": "sedan",
            "infos": "steering wheel problems",
            "number_of_occurences_of_such_combination_of_fields_with__such_values": 2,
        },
        {
            "make": "ford",
            "model": "x",
            "version": "A",
            "typ": "sedan",
            "infos": "steering wheel problems",
            "number_of_occurences_of_such_combination_of_fields_with__such_values": 1,
        },
    ]
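
The nested loop above compares every record against each of the 900 combinations (5 × 3 × 3 × 4 × 5). A possible sketch of the same grouping with a set lookup instead, assuming every list value is hashable. The names `allowed`, `groups`, `get_fields`, and `COUNT_KEY`, and the extra "tesla" record, are made up for this illustration and do not appear in the original post:

```python
from itertools import product
from operator import itemgetter

make = ["ford", "fiat", "nissan", "suzuki", "dacia"]
model = ["x", "y", "z"]
version = ["A", "B", "C"]
typ = ["sedan", "coupe", "van", "kombi"]
infos = ["steering wheel problems", "gearbox problems",
         "broken engine", "throttle problems", None]

# Sample records; the last one is a hypothetical entry outside the lists.
base = {"model": "x", "version": "A", "typ": "sedan",
        "infos": "steering wheel problems"}
data = [
    {"make": "dacia", **base},
    {"make": "dacia", **base},
    {"make": "ford", **base},
    {"make": "tesla", **base},
]

COUNT_KEY = "number_of_occurences_of_such_combination_of_fields_with__such_values"
get_fields = itemgetter("make", "model", "version", "typ", "infos")

# Build the combinations once; membership tests on a set are O(1) on average.
allowed = set(product(make, model, version, typ, infos))

# Group records by their field tuple, skipping tuples outside `allowed`.
groups = {}
for record in data:
    combo = get_fields(record)
    if combo in allowed:
        groups.setdefault(combo, []).append(record)

# Write the group size back into every record of the group.
for records in groups.values():
    for record in records:
        record[COUNT_KEY] = len(records)

print(data)
```

Records whose combination is not in `allowed` are left without the count key, which matches what the nested loop does.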
    

    Version 2 (without itertools.product):

    from operator import itemgetter
    
    
    i = itemgetter("make", "model", "version", "typ", "infos")
    
    cnt = {}
    for d in data:
        c = i(d)
        cnt[c] = cnt.get(c, 0) + 1
    
    for d in data:
        d[
            "number_of_occurences_of_such_combination_of_fields_with__such_values"
        ] = cnt[i(d)]
    
    print(data)
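
Both versions annotate every record in place, so duplicates stay in `data`. The `output` in the question lists each combination only once; one way to get that shape, sketched here with `collections.Counter`. The names `FIELDS`, `COUNT_KEY`, `get_fields`, `counts`, and `output` are chosen for this illustration:

```python
from collections import Counter
from operator import itemgetter

FIELDS = ("make", "model", "version", "typ", "infos")
COUNT_KEY = "number_of_occurences_of_such_combination_of_fields_with__such_values"

# Same three sample records as in the answer above.
base = {"model": "x", "version": "A", "typ": "sedan",
        "infos": "steering wheel problems"}
data = [
    {"make": "dacia", **base},
    {"make": "dacia", **base},
    {"make": "ford", **base},
]

get_fields = itemgetter(*FIELDS)

# Count how often each tuple of field values occurs in data.
counts = Counter(get_fields(d) for d in data)

# Rebuild one dictionary per distinct combination and attach its count.
output = [
    {**dict(zip(FIELDS, combo)), COUNT_KEY: n}
    for combo, n in counts.items()
]

print(output)
```

If combinations that never occur should also appear with a count of 0, iterating over `itertools.product(*total)` and reading `counts[combo]` would work, since a `Counter` returns 0 for missing keys.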
    

    【Comments】:
