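[Posted]: 2020-05-17 20:36:28
[Question]:
I have a sales table containing customers' purchase history. I want to build a new dataframe grouped by customer. It should also include a column holding a value_counts dictionary of every product the customer has bought, mapping each product to the number of times it was purchased.
I did the following: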
categories = data.groupby(by=['CustomerID']).Description.value_counts().to_frame().rename(columns={'Description':'Counts'}).reset_index(level='Description')
which produces this:
            Description       Counts
CustomerID
3004000304  MAJOR APPLIANCES  3
3004000304  HOME OFFICE       2
3004000304  ACCESSORIES       1
3004002756  MAJOR APPLIANCES  1
3004002946  HOME OFFICE       2
3004002946  ACCESSORIES       1
3004002946  MAJOR APPLIANCES  1
I then tried to see whether I could fix up the dataframe above like this:
categories['Merged'] = categories.apply(lambda x: {x['Description']:x['Counts']}, axis=1)
which gives me this:
            Description       Counts  Merged
CustomerID
3004000304  MAJOR APPLIANCES  3       {'MAJOR APPLIANCES': 3}
3004000304  HOME OFFICE       2       {'HOME OFFICE': 2}
3004000304  ACCESSORIES       1       {'ACCESSORIES': 1}
3004002756  MAJOR APPLIANCES  1       {'MAJOR APPLIANCES': 1}
3004002946  HOME OFFICE       2       {'HOME OFFICE': 2}
3004002946  ACCESSORIES       1       {'ACCESSORIES': 1}
3004002946  MAJOR APPLIANCES  1       {'MAJOR APPLIANCES': 1}
But what I want is this:
            Counts
CustomerID
3004000304  {'MAJOR APPLIANCES': 3, 'HOME OFFICE': 2, 'ACCESSORIES': 1}
3004002756  {'MAJOR APPLIANCES': 1}
3004002946  {'HOME OFFICE': 2, 'ACCESSORIES': 1, 'MAJOR APPLIANCES': 1}
Any help producing the dataframe above would be much appreciated.
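For reference, here is a minimal sketch of one way the desired one-row-per-customer result might be built. It assumes the raw table has one row per purchased item with `CustomerID` and `Description` columns. The sample `data` frame below is hypothetical, made up to match the counts shown above. The idea is to aggregate each customer's `Description` values with a function that returns the `value_counts` as a dict, instead of building one dict per row. It uses `agg` rather than `apply` because `apply` may expand dict results into a stacked Series instead of keeping them as single cells.

```python
import pandas as pd

# Hypothetical sample data: one row per purchased item, chosen so the
# per-customer counts match the output shown in the question.
data = pd.DataFrame({
    "CustomerID": [3004000304] * 6 + [3004002756] + [3004002946] * 4,
    "Description": [
        "MAJOR APPLIANCES", "MAJOR APPLIANCES", "MAJOR APPLIANCES",
        "HOME OFFICE", "HOME OFFICE", "ACCESSORIES",
        "MAJOR APPLIANCES",
        "HOME OFFICE", "HOME OFFICE", "ACCESSORIES", "MAJOR APPLIANCES",
    ],
})

# For each customer, count the products and turn the counts into a dict.
# The aggregation returns one object (a dict) per group, so the result
# has exactly one row per CustomerID.
counts = (
    data.groupby("CustomerID")["Description"]
        .agg(lambda s: s.value_counts().to_dict())
        .to_frame("Counts")
)

print(counts)
```

Within each dict the keys follow `value_counts` ordering (most frequent first); the order among products with equal counts is not something I would rely on.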
[Comments]:
Tags: python-3.x pandas numpy dataframe jupyter-notebook