【Question title】: Get weighted average of multiple IDs in pandas
【Posted】: 2020-03-06 00:11:02
【Question】:

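I have a pandas DataFrame with two ID columns, a count, and an average. How do I group by the two IDs and compute the count-weighted average, so that the following dataset: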

id1         id2       count   average
Person A    class 1   200     0.2
Person A    class 1   400     0.4
Person B    class 2   800     0.6
Person C    class 2   200     0.4
Person B    class 3   800     0.6
Person A    class 4   400     0.2
Person B    class 2   100     0.5

produces the following result (in any row order):

id1         id2       count   average
Person A    class 1   600     0.33
Person B    class 2   900     0.59
Person C    class 2   200     0.4
Person B    class 3   800     0.6
Person A    class 4   400     0.2

For reference:

import pandas as pd

df = pd.DataFrame({"id1" : ["Person A","Person A","Person B","Person C","Person B","Person A","Person B"],
                   "id2" : ["class 1","class 1","class 2","class 2","class 3","class 4","class 2"],
                   "count" : [200, 400, 800, 200, 800, 400, 100],
                   "average" : [0.2, 0.4, 0.6, 0.4, 0.6, 0.2, 0.5]})

【Comments】:

    Tags: python pandas grouping


    【Solution 1】:

    Use GroupBy.sum and GroupBy.apply:

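    # Note: this overwrites 'average' in df with count * average (the weighted terms)
    df['average'] = df['count'].mul(df['average'])
    grps = df.groupby(['id1', 'id2'], sort=False)  # sort=False keeps first-appearance order
    g1 = grps['count'].sum()
    # weighted average per group = sum(count * average) / sum(count)
    g2 = grps.apply(lambda x: x['average'].sum() / x['count'].sum())
    
    # combine both results and turn id1/id2 back into columns
    dfn = pd.concat([g1, g2.rename('average').round(2)], axis=1).reset_index()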
    
            id1      id2  count  average
    0  Person A  class 1    600     0.33
    1  Person B  class 2    900     0.59
    2  Person C  class 2    200     0.40
    3  Person B  class 3    800     0.60
    4  Person A  class 4    400     0.20
    

    【Comments】:

      【Solution 2】:
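      df.groupby(['id1','id2']).apply(lambda x: np.average(x['average'], weights=x['count']))
      

      The count column has to be accessed as x['count'] (or renamed, for example to countx), because x.count refers to the DataFrame.count method rather than the column. This also requires import numpy as np.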

      【Comments】:

      • Wouldn't you still need a second groupby to get the counts, and to fix the index?
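One possible way to address the point raised in this comment, as a sketch rather than the answer author's method: named aggregation can return the summed count and the weighted average in a single groupby, and reset_index restores id1 and id2 as ordinary columns. The variable name result is an assumption made for illustration; the lambda looks up the weights in df through each group's index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id1": ["Person A", "Person A", "Person B", "Person C",
            "Person B", "Person A", "Person B"],
    "id2": ["class 1", "class 1", "class 2", "class 2",
            "class 3", "class 4", "class 2"],
    "count": [200, 400, 800, 200, 800, 400, 100],
    "average": [0.2, 0.4, 0.6, 0.4, 0.6, 0.2, 0.5],
})

# 'result' is a hypothetical name. Each lambda call receives one group's
# 'average' values as a Series; its index is used to fetch the matching
# 'count' values from df, which serve as the weights.
result = (
    df.groupby(["id1", "id2"], sort=False)
      .agg(
          count=("count", "sum"),
          average=("average",
                   lambda s: np.average(s, weights=df.loc[s.index, "count"])),
      )
      .reset_index()
)

print(result)
```

Unlike Solution 1, this leaves the original average column in df unchanged.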
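      【Solution 3】:

      You can first replace the average column with count × average, then group, sum, and divide by the summed count: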

      df.assign(average=lambda x: x['count'].mul(x['average'])).groupby(['id1', 'id2']).sum().assign(average=lambda x: x['average'] / x['count']).reset_index()

              id1      id2  count   average
      0  Person A  class 1    600  0.333333
      1  Person A  class 4    400  0.200000
      2  Person B  class 2    900  0.588889
      3  Person B  class 3    800  0.600000
      4  Person C  class 2    200  0.400000
      
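As a quick sanity check on the figures in this thread, the weighted average of a single group can be worked out by hand. The sketch below uses the two (Person A, class 1) rows from the question; the variable names are chosen only for illustration.

```python
# Rows for (Person A, class 1) taken from the question's data
counts = [200, 400]
averages = [0.2, 0.4]

# weighted average = sum(count * average) / sum(count)
weighted_sum = sum(c * a for c, a in zip(counts, averages))
total_count = sum(counts)
weighted_average = weighted_sum / total_count

print(total_count, round(weighted_average, 2))
```

The total count is 600 and the weighted average rounds to 0.33, matching the expected row in the question.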

      【Comments】:
