【Question title】: How to remove all duplicate values (under same or different key(s)) from a dictionary?
【Posted】: 2021-10-15 01:39:35
【Question description】:

I have a dictionary like the one below:

{"A": ["a", "b", "c"],
 "B": ["a", "d", "f"],
 "C": ["i", "i", "j"]}

I want to change it to the following:

{"A": ["b", "c"],
 "B": ["d", "f"],
 "C": ["j"]}

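That is, every value that occurs more than once is removed entirely, whether the repeats are under the same key or under different keys. How can this be done efficiently?

【Discussion】:

  • Please update your question with the code you have tried.
  • Use collections.Counter to count the values, then use it to filter each list.
  • "All duplicate values are removed" - why are all of the "a"s removed, but not all of the "i"s?
  • @sj95126 Understood, I have updated my question.

Tags: python python-3.x dictionary duplicates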


【Solution 1】:

This code removes every value that occurs more than once:

from collections import Counter

def remove_all_dupes(d):
    c = Counter()
    for v in d.values():
        c.update(v)

    for k,v in d.items():
        d[k] = [item for item in v if c[item] == 1]
    return d

d = {"A": ["a", "b", "c"],
 "B": ["a", "d", "f"],
 "C": ["i", "i", "j"]}

print(d)
remove_all_dupes(d)
print(d)

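The output is as requested.

I expect this to be fairly efficient, i.e. O(n) in the total number of values, because it only loops over all the values twice, and each lookup used to filter out duplicates should be O(1).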
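
Note that remove_all_dupes above rewrites d in place. If the original dictionary has to stay untouched, the same two passes can build a new dictionary instead. A minimal sketch, assuming the values are hashable (the name without_dupes is made up here and is not part of the answer):

```python
from collections import Counter

def without_dupes(d):
    # Pass 1: count every value across all of the lists.
    counts = Counter()
    for values in d.values():
        counts.update(values)
    # Pass 2: keep only values seen exactly once, building a new dict.
    return {key: [v for v in values if counts[v] == 1]
            for key, values in d.items()}

d = {"A": ["a", "b", "c"],
     "B": ["a", "d", "f"],
     "C": ["i", "i", "j"]}

result = without_dupes(d)
print(result)  # {'A': ['b', 'c'], 'B': ['d', 'f'], 'C': ['j']}
print(d["C"])  # ['i', 'i', 'j'] (the input is left unchanged)
```

The trade-off is one extra dictionary in memory in exchange for leaving the caller's data alone.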

【Discussion】:

    【Solution 2】:

    Try this:

    from itertools import chain
    
    d = {"A": ["a", "b", "c"],
         "B": ["a", "d", "f"],
         "C": ["i", "i", "j"]}
    new_dic = {}
    values = list(chain(*d.values()))
    for key, value in d.items():
        new_dic[key] = [x for x in value if values.count(x) == 1]
          
    print(new_dic)
    

    Using a dictionary comprehension:

    new_dic = {key: [x for x in value if list(chain(*d.values())).count(x)==1] for key, value in d.items()}
    

    Output:

    {'A': ['b', 'c'], 'B': ['d', 'f'], 'C': ['j']}
    

    【Discussion】:

    • The problem with this solution is that .count(x) has to scan the whole of values for every x.
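
One way to avoid the repeated scans is to count every value once up front and then look the counts up. A rough sketch using only a plain dict, assuming the values are hashable as the strings here are (the name counts is introduced for illustration):

```python
d = {"A": ["a", "b", "c"],
     "B": ["a", "d", "f"],
     "C": ["i", "i", "j"]}

# Count each value once, in a single pass over all of the lists.
counts = {}
for value in d.values():
    for x in value:
        counts[x] = counts.get(x, 0) + 1

# A dict lookup replaces the values.count(x) scan for every x.
new_dic = {key: [x for x in value if counts[x] == 1]
           for key, value in d.items()}

print(new_dic)  # {'A': ['b', 'c'], 'B': ['d', 'f'], 'C': ['j']}
```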
    【Solution 3】:

    This code does not use any imports.

    data = {"A": ["a", "b", "c"], "B": ["a", "d", "f"], "C": ["i", "i", "j"]}
    
    # vals = []
    # for ky in data.keys():
    #     vals = vals + data[ky]
    
    vals = sum(data.values(), [])    
    
    dups = set([val for val in vals if vals.count(val) > 1])
    
    data_deduped = {
        ky: [val for val in data[ky] if val not in dups] for ky in data.keys()
    }
    
    print(data_deduped)
    
    
    Sample Output
    {'A': ['b', 'c'], 'B': ['d', 'f'], 'C': ['j']}
    

    【Discussion】:

      【Solution 4】:

      You can use flatten from more-itertools (pip install more-itertools) to quickly build a list of all the values, and Counter from collections to count them.

      from more_itertools import flatten
      from collections import Counter
      
      def remove_duplicates(d):
          all_values = list(flatten(d.values()))
          count = Counter(all_values)
          filtered_dict = {key: [v for v in value if count[v] == 1] for key, value in d.items()}
          return filtered_dict
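
If installing more-itertools is not an option, itertools.chain.from_iterable from the standard library also flattens one level of nesting, which is all flatten is used for here. The sketch below swaps it in (remove_duplicates_stdlib is a hypothetical name) and calls it on the data from the question:

```python
from collections import Counter
from itertools import chain

def remove_duplicates_stdlib(d):
    # Counter accepts any iterable, so no intermediate list is needed.
    count = Counter(chain.from_iterable(d.values()))
    return {key: [v for v in value if count[v] == 1]
            for key, value in d.items()}

d = {"A": ["a", "b", "c"],
     "B": ["a", "d", "f"],
     "C": ["i", "i", "j"]}

filtered = remove_duplicates_stdlib(d)
print(filtered)  # {'A': ['b', 'c'], 'B': ['d', 'f'], 'C': ['j']}
```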
      

      【Discussion】:

        【Solution 5】:
        from collections import Counter
        
        d_input = {"A": ["a", "b", "c"], "B": ["a", "d", "f"], "C": ["i", "i", "j"]}
        d_correct_answer = {"A": ["b", "c"], "B": ["d", "f"], "C": ["j"]}
        
        # first make a giant list of all lists in dictionary
        giant_list = []
        for key in d_input:
            partial_list = d_input[key]
            # build a big list of all items
            giant_list.extend(partial_list)
        # use counter to find items
        counted = Counter(giant_list)
        # now build remove list
        remove_list = []
        for key in counted:
            if counted[key] > 1:
                remove_list.append(key)
        # now loop and remove by only adding proper results to new dict
        result = {}
        for key in d_input:
            partial_list = d_input[key]
            new_partial_list = []
            for item in partial_list:
                if item not in remove_list:
                    new_partial_list.append(item)
            result[key] = new_partial_list
        print(f'it is {result == d_correct_answer} that this code works')
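
One detail that bears on the efficiency asked about in the question: item not in remove_list searches a list, which takes time proportional to its length. Collecting the repeated values in a set keeps each check at roughly constant time. A condensed sketch of the same steps under that assumption (variable names are reused from the code above; remove_set is new):

```python
from collections import Counter

d_input = {"A": ["a", "b", "c"], "B": ["a", "d", "f"], "C": ["i", "i", "j"]}
d_correct_answer = {"A": ["b", "c"], "B": ["d", "f"], "C": ["j"]}

# Flatten all of the lists into one big list, as in the step-by-step version.
giant_list = []
for partial_list in d_input.values():
    giant_list.extend(partial_list)

counted = Counter(giant_list)

# A set instead of a list: membership tests no longer scan every element.
remove_set = {item for item, n in counted.items() if n > 1}

result = {}
for key, partial_list in d_input.items():
    result[key] = [item for item in partial_list if item not in remove_set]

print(result == d_correct_answer)  # True
```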
        

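        【Discussion】:

        • This solution seems long and complicated for a very simple problem.
        • It is just as complex as your solution, only not hidden under the hood. I wrote it out step by step in a way that I hope documents the thought process.
        • I disagree. Your code takes more lines and is harder to read.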