【Question Title】: Get Positive and Negative Value Counts Using Groupby Multiple Columns Pandas
【Posted】: 2020-05-15 11:18:54
【Question】:

I have a pandas DataFrame that looks like this:

Person    Year    Weight Lost/Gained
Joe       2015          -5.7
Bryan     2015           7.8
Kelly     2015          -16.2
Frank     2016           10.3
Bill      2016          -22.1

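I want to get the counts of negative and positive values by year, along with the averages of the positive and negative values. The result can be in a new DataFrame or in the same one. If it is in the same one, I'd like it to look like this: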

Person    Year    Weight Lost/Gained    Pos Count    Neg Count      Pos Avg.     Neg Avg.
Joe       2015          -5.7                1           2             7.8         -10.95
Bryan     2015           7.8                1           2             7.8         -10.95
Kelly     2015          -16.2               1           2             7.8         -10.95
Frank     2016           10.3               1           1            10.3         -22.1
Bill      2016          -22.1               1           1            10.3         -22.1

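The closest answer I could find and tried to implement is here: How to sum negative and positive values separately when using groupby in pandas?

However, I'd rather not rearrange the whole DataFrame, because my actual DataFrame is much larger.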

【Comments】:

    Tags: python-3.x pandas count pandas-groupby


    【Solution 1】:

    Here is one way to do it:

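    import pandas as pd

    # This answer uses the column name 'WeightLost'; adjust it to match your
    # DataFrame (the question shows 'Weight Lost/Gained').
    col = 'WeightLost'

    # custom function: per-group counts and means of positive and negative values
    def func(f):
        pos = f[col].gt(0)   # boolean mask of gains
        neg = f[col].lt(0)   # boolean mask of losses
        pos_avg = f.loc[pos, col].mean()
        neg_avg = f.loc[neg, col].mean()
        return pd.Series([pos.sum(), neg.sum(), pos_avg, neg_avg],
                         index=['Pos Count', 'Neg Count', 'Pos Avg', 'Neg Avg'])

    f = df.groupby('Year').apply(func).reset_index()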
    
    print(f)
    
      Year  Pos Count  Neg Count  Pos Avg  Neg Avg
    0  2015        1.0        2.0      7.8   -10.95
    1  2016        1.0        1.0     10.3   -22.10
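The summary above has one row per year, while the question asks for the statistics repeated on every original row. A minimal sketch of one way to get there, assuming the column is named `Weight Lost/Gained` as in the question: build the same per-year summary with named aggregation on sign-masked copies of the column, then left-merge it onto the original DataFrame. The names `signed`, `summary`, and `merged` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Person": ["Joe", "Bryan", "Kelly", "Frank", "Bill"],
    "Year": [2015, 2015, 2015, 2016, 2016],
    "Weight Lost/Gained": [-5.7, 7.8, -16.2, 10.3, -22.1],
})
col = "Weight Lost/Gained"

# Mask the column twice so each copy keeps one sign and is NaN elsewhere;
# 'count' and 'mean' both skip NaN.
signed = pd.DataFrame({
    "Year": df["Year"],
    "pos": df[col].where(df[col] > 0),
    "neg": df[col].where(df[col] < 0),
})

# Named aggregation; the ** unpacking allows output names containing spaces.
summary = signed.groupby("Year").agg(
    **{
        "Pos Count": ("pos", "count"),
        "Neg Count": ("neg", "count"),
        "Pos Avg": ("pos", "mean"),
        "Neg Avg": ("neg", "mean"),
    }
).reset_index()

# A left merge keeps every original row, in its original order.
merged = df.merge(summary, on="Year", how="left")
print(merged)
```

The merge only adds columns, so the original rows are not rearranged, which was the concern raised in the question.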
    

    【Comments】:

      【Solution 2】:

      Since you want to keep your original df, we can use map.

      def map_year_stats(df):
          col = 'Weight_Lost/Gained'

          # boolean masks for gains and losses
          rule_pos = df[col] > 0
          rule_neg = df[col] < 0

          # per-year statistics, computed on the filtered rows only
          pos_count = df[rule_pos].groupby('Year')[col].count()
          neg_count = df[rule_neg].groupby('Year')[col].count()
          pos_avg = df[rule_pos].groupby('Year')[col].mean()
          neg_avg = df[rule_neg].groupby('Year')[col].mean()

          # map each row's Year to its group statistic (modifies df in place;
          # a year with no rows of a given sign maps to NaN)
          df['pos_count'] = df['Year'].map(pos_count)
          df['neg_count'] = df['Year'].map(neg_count)
          df['pos_avg'] = df['Year'].map(pos_avg)
          df['neg_avg'] = df['Year'].map(neg_avg)
          return df

      df_new = map_year_stats(df)
      
        Person  Year  Weight_Lost/Gained  pos_count  neg_count  pos_avg  neg_avg
      0    Joe  2015                -5.7          1          2      7.8   -10.95
      1  Bryan  2015                 7.8          1          2      7.8   -10.95
      2  Kelly  2015               -16.2          1          2      7.8   -10.95
      3  Frank  2016                10.3          1          1     10.3   -22.10
      4   Bill  2016               -22.1          1          1     10.3   -22.10
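As a possible alternative to building four lookup Series and mapping them, the same columns can be produced with `groupby(...).transform`, which returns a result aligned to the original rows. This is a sketch rather than part of the answer above; it assumes the column name `Weight_Lost/Gained` used in this solution, and the variable names `pos`, `neg`, and `result` are illustrative.

```python
import pandas as pd

# Sample data from the question, with the column name used in this solution.
df = pd.DataFrame({
    "Person": ["Joe", "Bryan", "Kelly", "Frank", "Bill"],
    "Year": [2015, 2015, 2015, 2016, 2016],
    "Weight_Lost/Gained": [-5.7, 7.8, -16.2, 10.3, -22.1],
})
col = "Weight_Lost/Gained"

# Keep only one sign in each copy; the other rows become NaN.
pos = df[col].where(df[col] > 0)
neg = df[col].where(df[col] < 0)

# transform broadcasts each per-year statistic back onto every row of that year.
# assign returns a new DataFrame, so df itself is left unchanged.
result = df.assign(
    pos_count=pos.groupby(df["Year"]).transform("count"),
    neg_count=neg.groupby(df["Year"]).transform("count"),
    pos_avg=pos.groupby(df["Year"]).transform("mean"),
    neg_avg=neg.groupby(df["Year"]).transform("mean"),
)
print(result)
```

One behavioral difference worth noting: for a year that has no values of a given sign, this version reports a count of 0 and a mean of NaN, whereas the map-based version yields NaN for both, because that year is missing from the filtered groupby result.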
      

      【Comments】:

      • Thanks for your answer!