【Posted】: 2017-02-02 09:05:18
【Question】:
My DataFrame (DF) looks like this:
Customer_number  Store_number  year  month  last_buying_date1  amount
1                20            2014  10     2015-10-07         100
1                20            2014  10     2015-10-09         200
2                20            2014  10     2015-10-20         100
2                10            2014  10     2015-10-13         500
I want to get output like this:
year  month  sum_purchase  count_purchases  distinct_customers
2014  10     900           4                3
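How can I get this output using agg and groupby? I am currently using a two-step groupby, but I am struggling to get the distinct customers. Here is my approach: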
#### Step 1 - Aggregating everything at customer_number, store_number level
aggregations = {
'amount': 'sum',
'last_buying_date1': 'count',
}
grouped_at_Cust = DF.groupby(['customer_number','store_number','month','year']).agg(aggregations).reset_index()
grouped_at_Cust.columns = ['customer_number','store_number','month','year','total_purchase','num_purchase']
#### Step2 - Aggregating at year month level
aggregations = {
'total_purchase': 'sum',
'num_purchase': 'sum',
size
}
Monthly_customers = grouped_at_Cust.groupby(['year','month']).agg(aggregations).reset_index()
Monthly_customers.columns = ['year','month','sum_purchase','count_purchase','distinct_customers']
My struggle is with the second step. How can I include size in the second aggregation step?
【Comments】:
Tags: python pandas group-by aggregate
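One possible way to get the row count into the second step is sketched below. It assumes pandas 0.25 or later (for named aggregation) and assumes lowercase column names such as `customer_number`, matching the groupby calls in the question rather than the capitalized table header. The sample frame is rebuilt from the four rows shown in the question. Here "size" counts the rows produced by step 1, so `distinct_customers` is really the number of distinct customer/store pairs, which is what yields 3 for this data.

```python
import pandas as pd

# Sample data copied from the question; lowercase column names are an
# assumption so that they match the names used in the groupby calls.
DF = pd.DataFrame({
    "customer_number": [1, 1, 2, 2],
    "store_number": [20, 20, 20, 10],
    "year": [2014, 2014, 2014, 2014],
    "month": [10, 10, 10, 10],
    "last_buying_date1": pd.to_datetime(
        ["2015-10-07", "2015-10-09", "2015-10-20", "2015-10-13"]
    ),
    "amount": [100, 200, 100, 500],
})

# Step 1: one row per (customer, store, year, month).
grouped_at_cust = (
    DF.groupby(["customer_number", "store_number", "year", "month"])
    .agg(
        total_purchase=("amount", "sum"),
        num_purchase=("last_buying_date1", "count"),
    )
    .reset_index()
)

# Step 2: roll up to (year, month). "size" counts the step-1 rows in each
# group, i.e. the number of distinct customer/store pairs.
monthly_customers = (
    grouped_at_cust.groupby(["year", "month"])
    .agg(
        sum_purchase=("total_purchase", "sum"),
        count_purchases=("num_purchase", "sum"),
        distinct_customers=("customer_number", "size"),
    )
    .reset_index()
)

print(monthly_customers)
```

If each customer should be counted once regardless of store, swapping `"size"` for `"nunique"` in the last aggregation would be the natural change; for the sample rows that would give 2 instead of 3.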