Pandas get percent value of groupby [duplicate]: answers

【Question title】: Pandas get percent value of groupby [duplicate]
【Posted】: 2019-05-21 11:57:06
【Question】:

I have done a pandas groupby:

grouped = df.groupby(['name','type'])['count'].count().reset_index()

which looks like this:

name  type    count
x     a       32
x     b       1111
x     c       4214

What I need to do is take this and produce percentages, so that I get something like this (I realize the percentages shown are not correct):

name  type  count
x     a     1%
x     b     49%
x     c     50%

I can think of some pseudocode that might make sense, but I haven't been able to get anything that actually works...

Something like:

def getPercentage(df):
    for name in df: 
        total = 0
        where df['name'] = name:
            total = total + df['count'] 
            type_percent = (df['type'] / total) * 100
            return type_percent

df.apply(getPercentage)

Is there a good way to do this with pandas?

【Comments】:

Can you provide a short sample of the input and the output you expect for that sample?

Tags: python pandas percentage

【Solution 1】:

Try:

df.loc[:,'grouped'] = df.groupby(['name','type'])['count'].count() / df.groupby(['name','type'])['count'].sum()
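
If the goal is each type's share within its name, one option is to divide the grouped counts by a per-name total. The line above may not do that as written: it divides a row count by a sum of values for each (name, type) pair, and the result is indexed by (name, type) rather than by the rows of df. A minimal sketch follows, assuming a frame shaped like `grouped` from the question; the rows for a second name `y` and the column name `percent` are made up for illustration.

```python
import pandas as pd

# Hypothetical data shaped like the `grouped` frame in the question,
# with a second name ("y") added to show the per-name behaviour.
grouped = pd.DataFrame({
    "name": ["x", "x", "x", "y", "y"],
    "type": ["a", "b", "c", "a", "b"],
    "count": [32, 1111, 4214, 1, 3],
})

# transform("sum") returns a Series aligned row-by-row with `grouped`,
# holding the total count of the name each row belongs to.
name_total = grouped.groupby("name")["count"].transform("sum")

# "percent" is an illustrative column name, not one from the question.
grouped["percent"] = grouped["count"] / name_total * 100
print(grouped)
```

Because the totals are aligned with the original rows, the division needs no merge or reindexing, and the percentages for each name add up to 100.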

【Discussion】:

【Solution 2】:

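Any Series can be normalized by passing the argument normalize=True to value_counts, as follows (it is cleaner than dividing by the count):

Series.value_counts(normalize=True, sort=True, ascending=False)

So it would look something like this (this operates on a Series, not a DataFrame):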

df['type'].value_counts(normalize=True) * 100
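
The same idea can probably be applied per name by calling value_counts on a groupby. This is only a sketch on made-up raw data (one row per observation, before any counting); the result column name `percent` is an assumption for illustration.

```python
import pandas as pd

# Hypothetical raw data: one row per observation, before any groupby.
df = pd.DataFrame({
    "name": ["x", "x", "x", "x", "y", "y"],
    "type": ["a", "b", "b", "c", "a", "a"],
})

percent = (
    df.groupby("name")["type"]
      .value_counts(normalize=True)  # share of each type within its name
      .mul(100)                      # fraction -> percentage
      .rename("percent")             # illustrative name; avoids a clash with "type"
      .reset_index()
)
print(percent)
```

Renaming the Series before reset_index keeps the value column from colliding with the `type` index level on pandas versions that name the result after the counted column.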

Or, if you are working with the groupby result, you can do this:

total = grouped['count'].sum()
grouped['count'] = grouped['count']/total * 100
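
The desired output in the question shows values such as 49%, so it may help to keep the numeric percentages and build the strings only for display. A small sketch using the three counts from the question; a single overall total is enough here because there is only one name, and `percent_label` is a made-up column name.

```python
import pandas as pd

# The grouped counts from the question (a single name, "x").
grouped = pd.DataFrame({
    "name": ["x", "x", "x"],
    "type": ["a", "b", "c"],
    "count": [32, 1111, 4214],
})

total = grouped["count"].sum()
percent = grouped["count"] / total * 100

# Keep `percent` numeric for further calculations; the string column is
# only for display. "percent_label" is an illustrative column name.
grouped["percent_label"] = percent.round(1).astype(str) + "%"
print(grouped)
```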

【Discussion】:

【Solution 3】:

Use crosstab + normalize:

pd.crosstab(df.name,df.type,normalize='index').stack().reset_index()
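
To see what that one-liner produces, here is a hedged sketch on made-up raw data. It uses bracket indexing instead of attribute access, scales the fractions to percentages, and names the value column `percent`; all three are choices made for illustration.

```python
import pandas as pd

# Hypothetical raw data: one row per observation.
df = pd.DataFrame({
    "name": ["x", "x", "x", "x", "y", "y"],
    "type": ["a", "b", "b", "c", "a", "b"],
})

# normalize="index" divides each row of the table (each name) by its row total.
shares = pd.crosstab(df["name"], df["type"], normalize="index")

# stack() turns the wide table into a long Series indexed by (name, type).
result = shares.stack().mul(100).rename("percent").reset_index()
print(result)
```

Note that crosstab builds a cell for every name/type combination, so a type that never occurs for a name (here `c` for `y`) appears with a share of 0 rather than being left out.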

【Discussion】: