【Question title】: How to aggregate one column based on another column in Pandas
【Posted】: 2022-01-20 09:56:25
【Question】:
   year   fruit  sales
0  2010   Apple     10
1  2011   Apple     20
2  2010  Banans  50000
3  2011  Banans     30

What I want is something like this:

    fruit   min_year  sales_2010  max_year  sales_2011
0   Apple       2010          10      2011          20
1   Banans      2010       50000      2011          30

【Comments】:

    Tags: python pandas dataframe pandas-groupby


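    【Solution 1】:

    First aggregate the min and max of year into df1 and rename the columns with DataFrame.add_suffix, then pivot with DataFrame.pivot and rename with DataFrame.add_prefix, and finally join the two results with concat: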

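    # Per-fruit earliest and latest year, as columns min_year / max_year
    df1 = df.groupby('fruit')['year'].agg(['min','max']).add_suffix('_year')
    # One sales column per year; pivot arguments are keyword-only in pandas 2.0+
    df2 = df.pivot(index='fruit', columns='year', values='sales').add_prefix('sales_')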
    
    df = pd.concat([df1, df2], axis=1)
    print (df)
            min_year  max_year  sales_2010  sales_2011
    fruit                                             
    Apple       2010      2011          10          20
    Banans      2010      2011       50000          30
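
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The DataFrame construction is copied from the sample in the question, and the final column reordering is an assumption about the layout the asker wanted (min_year, sales_2010, max_year, sales_2011); the names `years`, `wide`, and `result` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "year": [2010, 2011, 2010, 2011],
    "fruit": ["Apple", "Apple", "Banans", "Banans"],
    "sales": [10, 20, 50000, 30],
})

# Earliest and latest year for each fruit.
years = df.groupby("fruit")["year"].agg(["min", "max"]).add_suffix("_year")

# One sales column per year, using keyword arguments for pivot.
wide = df.pivot(index="fruit", columns="year", values="sales").add_prefix("sales_")

# Join side by side on the shared "fruit" index, then turn it into a column.
result = pd.concat([years, wide], axis=1).reset_index()

# Reorder to match the layout asked for in the question (assumed order).
result = result[["fruit", "min_year", "sales_2010", "max_year", "sales_2011"]]
print(result)
```

The explicit column list only works when the years are known in advance; with other data the sales columns would need to be ordered programmatically.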
    

    【Comments】:

      【Solution 2】:

      One option:

      (df
       .pivot(index="fruit", columns="year", values="sales")
       .assign(min_year=lambda df: df.columns.min(), 
               max_year=lambda df: df.columns[:-1].max())
       .rename(columns=lambda col: f"sales_{col}" 
                                   if isinstance(col, int) 
                                   else col)
       .rename_axis(columns=None)
       .reset_index()
      )
      
          fruit  sales_2010  sales_2011  min_year  max_year
      0   Apple          10          20      2010      2011
      1  Banans       50000          30      2010      2011
      
      

      Another option, possibly more efficient:

      grouper = df.groupby('fruit')
      (df
      .assign(min_year=grouper.year.transform("min"), 
              max_year=grouper.year.transform("max"))
      .pivot(index=["fruit", "min_year", "max_year"], columns="year", values="sales")
      .add_prefix("sales_")
      .rename_axis(columns=None)
      .reset_index()
      )
      
          fruit  min_year  max_year  sales_2010  sales_2011
      0   Apple      2010      2011          10          20
      1  Banans      2010      2011       50000          30
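
One difference between the two options is worth checking on your own data: taking the minimum of the pivoted column labels gives a single value shared by every fruit, while groupby computes it per fruit. On the question's sample the two agree. The sketch below adds one hypothetical row (a fruit called Cherry sold only in 2011, which is not in the original data) to show where they diverge.

```python
import pandas as pd

# Question's data plus one hypothetical row (Cherry, sold only in 2011).
df = pd.DataFrame({
    "year": [2010, 2011, 2010, 2011, 2011],
    "fruit": ["Apple", "Apple", "Banans", "Banans", "Cherry"],
    "sales": [10, 20, 50000, 30, 7],
})

wide = df.pivot(index="fruit", columns="year", values="sales")

# Column-label approach: one global value, identical for every fruit.
global_min = wide.columns.min()

# groupby approach: computed separately for each fruit.
per_fruit_min = df.groupby("fruit")["year"].min()

print(global_min)               # 2010
print(per_fruit_min["Cherry"])  # 2011
```

If every fruit has a row for every year, either option gives the same min_year and max_year columns.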
      
      

      【Comments】:
