【Question title】: Isolate dictionary keys which have multiple duplicate values
【Posted】: 2017-10-26 09:08:17
【Question】:
d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'], 
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'], 
     'Name5': ['Female', '18']}

I am trying to find a way to isolate the keys that share duplicate values into lists (if there are any). For example:

['Name1', 'Name3']
['Name4', 'Name5']

How can I do this? Thanks

【Question comments】:

    Tags: python dictionary duplicates


    【Solution 1】:

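    An imperative solution is to simply iterate over the dictionary and add each item to another dictionary that uses the (gender, age) tuple as its key, for example: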

    # using a defaultdict, which automatically adds an empty list for a missing key when it is first accessed
    from collections import defaultdict
    by_data = defaultdict(list) 
    for name, data in d.items():
        # turn the data into something immutable, so it can be used as a dictionary key
        data_tuple = tuple(data)
        by_data[data_tuple].append(name)
    

    The result will be:

    {('Female', '18'): ['Name4', 'Name5'],
     ('Male', '16'): ['Name2'],
     ('Male', '18'): ['Name1', 'Name3']}
    

    If you are only interested in the duplicates, you can filter out the entries whose list contains only one name.
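
A minimal sketch of that filtering step, assuming the `d` from the question and the `by_data` grouping built above. The name `duplicates` is introduced here for illustration only:

```python
from collections import defaultdict

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# group the names by their (gender, age) tuple, as in the answer
by_data = defaultdict(list)
for name, data in d.items():
    by_data[tuple(data)].append(name)

# keep only the groups that contain more than one name
duplicates = [names for names in by_data.values() if len(names) > 1]

for names in duplicates:
    print(names)
```

With the sample data this leaves two groups, one for ('Male', '18') and one for ('Female', '18'), and drops the single-name group for ('Male', '16').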

    【Comments】:

      【Solution 2】:

      Try this:

      d = {'Name1': ['Male', '18'],
           'Name2': ['Male', '16'],
           'Name3': ['Male', '18'],
           'Name4': ['Female', '18'],
           'Name5': ['Female', '18']}
      
      ages = {}  # maps each age to the list of names that have that age
      
      # loop over all the keys in the dictionary
      for key in d.keys():
          age = d[key][1]
      
          # if the ages dictionary does not yet have an entry for this age,
          # create a list to hold the names that share it
          if age not in ages:
              ages[age] = []
      
          ages[age].append(key)  # add the name to the list for its age
      
      # loop over all the lists in the ages dictionary
      for value in ages.values():
          if len(value) > 1:  # more than one name shares this age
              print(value)
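
Note that the loop above groups by age only, so for the sample data it prints one list containing Name1, Name3, Name4 and Name5. A possible variant, assuming you want names grouped by the whole value (gender and age together) as in the question, keys on a tuple of the full list instead. The name `groups` is made up for this sketch:

```python
d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

groups = {}  # maps each (gender, age) tuple to the names that have it
for key, value in d.items():
    # lists are not hashable, so convert to a tuple before using it as a key
    groups.setdefault(tuple(value), []).append(key)

# print only the groups that contain more than one name
for names in groups.values():
    if len(names) > 1:
        print(names)
```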
      

      【Comments】:

        【Solution 3】:

        I guess you mean duplicate values rather than duplicate keys, in which case you can do this with pandas:

        import pandas as pd
        df = pd.DataFrame(d).T #load the data into a dataframe, and transpose it
        df.index[df.duplicated(keep = False)] 
        

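        df.duplicated(keep = False) gives you a Series of True/False values, where the value is True whenever that row has a duplicate and False otherwise. We use it to index the row labels, i.e. 'Name1', 'Name2', etc.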
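
The indexing above returns all duplicated names in one flat index. As a rough sketch of how they could be split into one list per distinct value, assuming the `d` from the question, the duplicated rows can be grouped on every column. The names `mask`, `dup_names` and `groups` are introduced here for illustration:

```python
import pandas as pd

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# rows are the names; columns 0 and 1 hold gender and age
df = pd.DataFrame(d).T

# True for every row that has at least one identical row elsewhere
mask = df.duplicated(keep=False)
dup_names = list(df.index[mask])

# group the duplicated rows on all columns and collect the names in each group
groups = [list(group.index)
          for _, group in df[mask].groupby(list(df.columns), sort=False)]

for names in groups:
    print(names)
```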

        【Comments】:
