【Question title】: Finding count of tuples with same first and third item in list of tuples
【Posted】: 2018-12-07 11:57:48
【Question】:

I have a list of tuples, each containing three items:

z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015), (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015), (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015), (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015), (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015), (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015), (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015), (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015), (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015), (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]

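I want to count the tuples in the list that share the same first and third items. For example, there are 4 tuples with first item 1 and third item 2015, and 4 tuples with first item 2 and third item 2015.

I tried: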

for tup in z:
    a=tup[0]
    b=tup[2]
    print(len(set({a:b})))

It does not give the desired result. How can I do this?

【Discussion】:

    Tags: python python-3.x pandas dictionary counting


    【Solution 1】:

    In pure Python, use Counter with a generator expression (thanks to @Felix):

    from collections import Counter
    
    out = Counter((x[0], x[2]) for x in z)
    print (out)
    Counter({(3, 2015): 5, 
             (5, 2015): 5, 
             (8, 2015): 5,
             (11, 2015): 5, 
             (1, 2016): 5, 
             (1, 2015): 4, 
             (2, 2015): 4, 
             (4, 2015): 4, 
             (6, 2015): 4, 
             (7, 2015): 4, 
             (9, 2015): 4, 
             (10, 2015): 4,
             (12, 2015): 4})
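
Once the counts are in a Counter, individual groups can be looked up by their (first item, third item) key. A minimal sketch, assuming a shortened excerpt of z that is used here only for illustration (it is not the full list from the question):

```python
from collections import Counter

# Shortened excerpt of the question's list, used only for illustration.
z = [(1, 4, 2015), (1, 11, 2015), (2, 1, 2015),
     (1, 3, 2016), (1, 10, 2016), (1, 17, 2016)]

out = Counter((x[0], x[2]) for x in z)

# Look up one group by its (first item, third item) key.
print(out[(1, 2015)])       # 2

# Keys that never occurred count as 0 instead of raising KeyError.
print(out[(7, 2015)])       # 0

# most_common(n) returns the n largest groups as (key, count) pairs.
print(out.most_common(1))   # [((1, 2016), 3)]
```

Looking up a missing key on a Counter does not add that key, so checks like the second one leave the counts unchanged.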
    

    With pandas, aggregate the counts using GroupBy.size; the output is a Series:

    import pandas as pd

    s = pd.DataFrame(z).groupby([0, 2]).size()
    print (s)
    0   2   
    1   2015    4
        2016    5
    2   2015    4
    3   2015    5
    4   2015    4
    5   2015    5
    6   2015    4
    7   2015    4
    8   2015    5
    9   2015    4
    10  2015    4
    11  2015    5
    12  2015    4
    dtype: int64
    

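    【Discussion】:

    • You could use a generator expression instead of a list comprehension in the Counter(...) call; otherwise a nice solution ;-)
    • @Felix - Thanks.
    • In this case you don't even need the extra parentheses. Counter((x[0], x[2]) for x in z) works just fine.
    【Solution 2】:

    Use collections.

    For example: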

    import collections
    d = collections.defaultdict(int)
    z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015), (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015), (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015), (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015), (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015), (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015), (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015), (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015), (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015), (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]
    for i in z:
        d[(i[0], i[2])] += 1
    print(d)
    

    Output:

    defaultdict(<type 'int'>, {(10, 2015): 4, (5, 2015): 5, (2, 2015): 4, (11, 2015): 5, (6, 2015): 4, (8, 2015): 5, (3, 2015): 5, (12, 2015): 4, (7, 2015): 4, (9, 2015): 4, (4, 2015): 4, (1, 2016): 5, (1, 2015): 4})
    

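    【Discussion】:

    • The OP wants to count the tuples whose first and third items are the same. This can be done with a dict whose keys are tuples of the first and third items.
    • No, if you are going to use collections, you should use Counter directly instead of reimplementing it.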
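
To illustrate the last comment: the defaultdict loop and Counter build the same mapping, so the two can be compared directly. A small sketch on a shortened excerpt of z, chosen only for illustration:

```python
from collections import Counter, defaultdict

# Shortened, illustrative excerpt of the question's list.
z = [(1, 4, 2015), (1, 11, 2015), (2, 1, 2015), (1, 3, 2016)]

# Manual counting, as in the answer above.
d = defaultdict(int)
for i in z:
    d[(i[0], i[2])] += 1

# The same counts via Counter.
c = Counter((i[0], i[2]) for i in z)

print(dict(d) == dict(c))  # True
```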
    【Solution 3】:

    Using itertools.groupby from the standard library:

    from itertools import groupby
    
    for grp, elmts in groupby(z, lambda x: (x[0], x[2])):
        print(grp, len(list(elmts)))
    

    Edit:

    A better solution using operator.itemgetter instead of a lambda:

    from operator import itemgetter
    from itertools import groupby
    
    for grp, elmts in groupby(z, itemgetter(0, 2)):
        print(grp, len(list(elmts)))
    

    Output:

    (1, 2015) 4
    (2, 2015) 4
    (3, 2015) 5
    (4, 2015) 4
    (5, 2015) 5
    (6, 2015) 4
    (7, 2015) 4
    (8, 2015) 5
    (9, 2015) 4
    (10, 2015) 4
    (11, 2015) 5
    (12, 2015) 4
    (1, 2016) 5
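
One caveat worth noting: itertools.groupby only merges consecutive items with equal keys. The list in the question is already ordered so that equal keys sit next to each other, which is why the code above gives the right counts. For input in arbitrary order, sorting by the same key first is one way to handle it. A sketch, assuming a small shuffled sample made up for illustration:

```python
from itertools import groupby
from operator import itemgetter

# Illustrative sample in which equal (first, third) keys are NOT adjacent.
z = [(1, 4, 2015), (2, 1, 2015), (1, 11, 2015), (1, 3, 2016), (2, 8, 2015)]

key = itemgetter(0, 2)

# Without sorting, the key (1, 2015) shows up as two separate runs.
unsorted_counts = [(k, len(list(g))) for k, g in groupby(z, key)]
print(unsorted_counts)

# Sorting by the same key makes equal keys adjacent before grouping.
sorted_counts = {k: len(list(g)) for k, g in groupby(sorted(z, key=key), key)}
print(sorted_counts)  # {(1, 2015): 2, (1, 2016): 1, (2, 2015): 2}
```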
    

    【Discussion】:

    • {k: len(list(v)) for k, v in groupby(z, lambda x: (x[0], x[-1]))}
    • for grp, elmts in groupby(z, key=operator.itemgetter(0, 2))
    【Solution 4】:

    Use collections.Counter together with operator.itemgetter:

    from collections import Counter
    from operator import itemgetter
    
    res = Counter(map(itemgetter(0, 2), z))
    
    print(res)
    
    Counter({(1, 2015): 4,
             (1, 2016): 5,
             (2, 2015): 4,
             (3, 2015): 5,
             (4, 2015): 4,
             (5, 2015): 5,
             (6, 2015): 4,
             (7, 2015): 4,
             (8, 2015): 5,
             (9, 2015): 4,
             (10, 2015): 4,
             (11, 2015): 5,
             (12, 2015): 4})
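
For readers unfamiliar with itemgetter: when called with several indexes, it returns a callable that picks those positions out of a sequence and returns them as a tuple, which is what makes it usable as the grouping key here. A brief sketch:

```python
from operator import itemgetter

pick = itemgetter(0, 2)  # callable that extracts positions 0 and 2

print(pick((1, 4, 2015)))    # (1, 2015)
print(pick((12, 27, 2015)))  # (12, 2015)

# Equivalent lambda, as used in the first groupby snippet.
same = lambda x: (x[0], x[2])
print(pick((1, 31, 2016)) == same((1, 31, 2016)))  # True
```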
    

    【Discussion】:

      【Solution 5】:

      You can store the counts in a dict keyed by a tuple made of the first and third items of each tuple in the original list, for example:

      import collections
      
      z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015),
           (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015),
           (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015),
           (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015),
           (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015),
           (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015),
           (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015),
           (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015),
           (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015),
           (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]
      
      counter = collections.defaultdict(int)  # Use a dict factory to save some time
      for element in z:  # iterate over the tuples
          counter[(element[0], element[2])] += 1  # increase the count for each match
      
      # finally, let's print the results
      for k, count in counter.items():
          print("{}: {}".format(k, count))
      

      This gives you:

      (1, 2015): 4
      (2, 2015): 4
      (3, 2015): 5
      (4, 2015): 4
      (5, 2015): 5
      (6, 2015): 4
      (7, 2015): 4
      (8, 2015): 5
      (9, 2015): 4
      (10, 2015): 4
      (11, 2015): 5
      (12, 2015): 4
      (1, 2016): 5
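
The loop above prints the groups in dict iteration order. If a chronological listing is wanted, the items can be sorted before printing. A sketch on an illustrative excerpt, assuming the third item is the year and the first item is the month (the question does not say so explicitly):

```python
import collections

# Illustrative excerpt, deliberately out of chronological order.
z = [(1, 3, 2016), (2, 1, 2015), (1, 4, 2015), (1, 11, 2015)]

counter = collections.defaultdict(int)
for element in z:
    counter[(element[0], element[2])] += 1

# Sort by the third item first, then by the first item.
ordered = sorted(counter.items(), key=lambda kv: (kv[0][1], kv[0][0]))
for k, count in ordered:
    print("{}: {}".format(k, count))
# (1, 2015): 2
# (2, 2015): 1
# (1, 2016): 1
```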

      【Discussion】:

        【Solution 6】:

        from collections import Counter
        tmp = [(x[0],x[2]) for x in z]
        print(Counter(tmp))

        The output will look like:

        Counter({(5, 2015): 5, (11, 2015): 5, (8, 2015): 5, (3, 2015): 5, (1, 2016): 5, (10, 2015): 4, (2, 2015): 4, (6, 2015): 4, (12, 2015): 4, (7, 2015): 4, (9, 2015): 4, (4, 2015): 4, (1, 2015): 4})

        【Discussion】:

          【Solution 7】:

          Try this:

          z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015), (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015), (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015), (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015), (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015), (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015), (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015), (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015), (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015), (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]
          newz = [(i[0],i[-1]) for i in z]
          for i in list(set(newz)):
             print(str(i)+' '+str(newz.count(i)))
          

          Output:

          (10, 2015) 4
          (5, 2015) 5
          (2, 2015) 4
          (11, 2015) 5
          (6, 2015) 4
          (8, 2015) 5
          (3, 2015) 5
          (12, 2015) 4
          (7, 2015) 4
          (9, 2015) 4
          (1, 2016) 5
          (4, 2015) 4
          (1, 2015) 4
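
Because a set has no defined order, the lines above come out in arbitrary order. One possible variation, assuming Python 3.7 or later (where dicts preserve insertion order), is to de-duplicate with dict.fromkeys so the groups are printed in first-seen order. Note that newz.count(i) still rescans the whole list once per distinct key. A sketch on an illustrative excerpt:

```python
# Illustrative excerpt of the question's list.
z = [(1, 4, 2015), (1, 11, 2015), (2, 1, 2015), (1, 3, 2016), (1, 10, 2016)]

newz = [(i[0], i[-1]) for i in z]

# dict.fromkeys keeps the first occurrence of each key, in order.
unique_keys = list(dict.fromkeys(newz))

for i in unique_keys:
    print(str(i) + ' ' + str(newz.count(i)))
# (1, 2015) 2
# (2, 2015) 1
# (1, 2016) 2
```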
          

          【Discussion】:

            【Solution 8】:

            A solution that does not use groupby:

            import pprint
            import random
            
            from collections import Counter
            
            z = [] # creating random dates as user has 2 years, won't work if year range increases
            
            num_dates = 20
            counts_by_month_and_year = Counter()
            
            while len(z) < num_dates:
                new = (random.randrange(1, 31), random.randrange(1, 12), random.randrange(2015, 2016))
            
                z.append(new)
                counts_by_month_and_year[(new[0], new[2])] += 1
            
            
            pprint.pprint(dict(counts_by_month_and_year)) # formatting the output 
            
            {(1, 2015): 1,
             (3, 2015): 1,
             (4, 2015): 1,
             (5, 2015): 1,
             (7, 2015): 1,
             (8, 2015): 2,
             (9, 2015): 1,
             (11, 2015): 1,
             (13, 2015): 1,
             (16, 2015): 1,
             (17, 2015): 1,
             (20, 2015): 1,
             (21, 2015): 2,
             (22, 2015): 1,
             (25, 2015): 1,
             (26, 2015): 1,
             (27, 2015): 2}
            
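
A note on the random generator above: random.randrange excludes its upper bound, so randrange(2015, 2016) always yields 2015, and the first field (1 to 30) does not match the month-first layout of the question's tuples. A hedged variant, assuming the tuples are laid out as (month, day, year) as the question's data suggests, and using a fixed seed only to make the run repeatable:

```python
import random
from collections import Counter

random.seed(0)  # fixed seed, only so that repeated runs match

num_dates = 20
z = []
counts_by_month_and_year = Counter()

while len(z) < num_dates:
    # randint includes both endpoints, unlike randrange.
    # Day is capped at 28 so every (month, day) pair is a real date.
    new = (random.randint(1, 12), random.randint(1, 28), random.randint(2015, 2016))
    z.append(new)
    counts_by_month_and_year[(new[0], new[2])] += 1

# The counts must add up to the number of generated tuples.
print(sum(counts_by_month_and_year.values()))  # 20
```

Counting while generating, as the answer does, avoids a second pass over z; the same totals could also be obtained afterwards with Counter((m, y) for m, _, y in z).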
            

            【Discussion】:
