Count to first column and sum to the rest of the columns pandas groupby: answers

【Question title】: Count to first column and sum to the rest of the columns pandas groupby
【Posted】: 2021-06-05 06:45:52
【Question description】:

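I have a pandas DataFrame df with 290 columns.

Is there a way to do a .groupby operation that follows these rules:

Sum for the second column.
Count for the third column.
Mean for all other columns.

I know I can write it like this: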

df.groupby("column1") \
    .agg({"column2":"sum", 
          "column3":"count",
          "column4":"mean",
          ...
          "column290":"mean"})

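But this approach is very inefficient, because I have to type out every other column.

Is there a way to set this up? Something like a default function for the columns I have not assigned an agg to?

【Question discussion】:

Please add sample data with the expected output.

Tags: python pandas dataframe group-by

【Solution 1】: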

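import pandas as pd

# Sum column2 and count column3 within each group
df1 = df.groupby("column1").agg({"column2": "sum", "column3": "count"})

# Mean of every remaining column (axis passed by keyword; positional axis was removed in pandas 2.0)
df2 = df.drop(columns=["column2", "column3"]).groupby("column1").mean()

# Join the two results side by side on the group key
df3 = pd.concat([df1, df2], axis=1)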
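
To make the split-and-concat approach above easier to try, here is a minimal sketch on a small made-up frame. The data and the names `special`, `rest`, and `result` are assumptions for illustration only; `column4` and `column5` stand in for the remaining columns of the real 290-column frame.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column1 is the group key, column4/column5
# stand in for all the "other" columns of the real frame.
df = pd.DataFrame({
    "column1": ["a", "a", "b", "b"],
    "column2": [1, 2, 3, 4],
    "column3": [10, np.nan, 30, 40],
    "column4": [1.0, 3.0, 5.0, 7.0],
    "column5": [2.0, 4.0, 6.0, 8.0],
})

# Special-cased columns: sum and count (count skips NaN values).
special = df.groupby("column1").agg({"column2": "sum", "column3": "count"})

# Everything else: mean per group.
rest = df.drop(columns=["column2", "column3"]).groupby("column1").mean()

# Both results share the same group index, so they align on concat.
result = pd.concat([special, rest], axis=1)
print(result)
```

Note that `count` ignores missing values, so group "a" reports a count of 1 for `column3` in this sample.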

【Discussion】:

【Solution 2】:

Let's use a dictionary:

import pandas as pd
import numpy as np

df=pd.DataFrame(np.arange(100).reshape(10,-1), columns=[*'ABCDEFGHIJ'])

# Define the aggregation for the first three columns
aggdict={'A':'sum',
         'B':'sum',
         'C':'count'}

# Use a for loop to add the rest of the columns to the dictionary,
# which gives them a default aggregation method
for i in df.columns[3:]:
    aggdict[i]='mean'

# Use agg with the dictionary
df.groupby(df.index%2).agg(aggdict)

【Discussion】:

The groupby only needs the first column.
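
Following up on that comment, here is a minimal sketch of the dictionary approach that groups by the first column and leaves it out of the aggregation dictionary. The names `key` and `overrides` are hypothetical, and overwriting column `A` with a two-valued key is an assumption made only so the sample frame has something to group on.

```python
import numpy as np
import pandas as pd

# Same 10x10 sample frame as in the answer above.
df = pd.DataFrame(np.arange(100).reshape(10, -1), columns=[*"ABCDEFGHIJ"])

# Assumption: turn the first column into a group key with two values.
df["A"] = df.index % 2

key = df.columns[0]                      # group by the first column only
overrides = {"B": "sum", "C": "count"}   # the special-cased columns

# Default every non-key column to "mean", then apply the overrides.
aggdict = {col: "mean" for col in df.columns if col != key}
aggdict.update(overrides)

result = df.groupby(key).agg(aggdict)
print(result)
```

Building the dictionary from `df.columns` keeps the code the same length no matter how many columns the frame has, and only the exceptions need to be spelled out.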