Count the total of a groupby with pandas

【Question title】: Count the total of a groupby with pandas
【Posted】: 2020-11-11 18:22:58
【Question】:

import pandas as pd

# Small sample of the data: one row per product, not per order
details = {
    'order_number': ['#1', '#2', '#3', '#4', '#4'],
    'disc_code': ['no_discount', 'superman', 'hero', 'numero_uno', 'numero_uno'],
}
df = pd.DataFrame(details)

On the full data set, len(df) --> 6408
Each row corresponds to a product, not a transaction. If I group the rows by order number, there are 3560 groups: len(df.groupby('order_number')) --> 3560

I want to count how many discount codes were used in total. (If no discount code was used, the value is 'no_discount'.)

In SQL, the syntax would be roughly:

SELECT COUNT(*)
FROM transactions
WHERE discount_code != 'no_discount'
GROUP BY order_number

【Comments】:

groupby.nunique?

Tags: python pandas pandas-groupby

【Solution 1】:

If you need a count per order_number, use boolean indexing and GroupBy.size:

df1 = (df[df['disc_code'].ne('no_discount')]
           .groupby('order_number')
           .size()
           .reset_index(name='count'))
print (df1)
  order_number  count
0           #2      1
1           #3      1
2           #4      2
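
Note that filtering before grouping drops orders that have no discounted rows (here '#1'). If those orders should appear with a count of 0, one possible variant is to group the boolean mask itself and sum it. This is only a sketch on the sample data from the question; the names used_code and per_order are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'order_number': ['#1', '#2', '#3', '#4', '#4'],
    'disc_code': ['no_discount', 'superman', 'hero', 'numero_uno', 'numero_uno'],
})

# True where a discount code was used on that row
used_code = df['disc_code'].ne('no_discount')

# Summing booleans per order counts the True rows;
# orders without any discount code keep a count of 0
per_order = (used_code.groupby(df['order_number'])
                      .sum()
                      .reset_index(name='count'))
print(per_order)
```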

If you need the count over all rows, count only the True values of the condition, using Series.ne (not equal) and sum:

out = df['disc_code'].ne('no_discount').sum()
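
Because each row is a product rather than an order, the sum above counts rows, so an order with two discounted products contributes 2. If the total is instead meant to be the number of orders that used a code (an assumption about the intent, not something the question states), one option is to count the distinct order numbers among the discounted rows. The variable names below are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'order_number': ['#1', '#2', '#3', '#4', '#4'],
    'disc_code': ['no_discount', 'superman', 'hero', 'numero_uno', 'numero_uno'],
})

# True where a discount code was used on that row
mask = df['disc_code'].ne('no_discount')

# Row-level total: every discounted product row counts once
rows_with_code = int(mask.sum())

# Order-level total: each order counts once, however many rows it has
orders_with_code = df.loc[mask, 'order_number'].nunique()

print(rows_with_code, orders_with_code)
```

On the sample data the row-level total is 4 and the order-level total is 3, because order '#4' has two rows with the same code.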

【Discussion】:

How do I get the total from this DataFrame?
@Luc - Do you mean print(df1['count'].sum())?
@Luc - Can you add the expected output for the sample data?