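【Posted】: 2022-01-18 18:19:11
【Question】:
I have a dataframe with a multi-level column index. I would like to get a cross-tab of the g1 and g2 columns (for each level-1 index value, 1 and 2), grouped by group (a, b). I thought I could get away with just selecting the top-level column, but I'm a bit stuck. The dataframe I would like to end up with is d2 below. All comments are welcome, many thanks.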
import pandas as pd

# the dataframe that I have
d1 = pd.DataFrame((['i1', 'a', 'dog', 'mouse','cat','mouse'],['i2','a','cat','mouse','dog','dog'],['i3', 'a', 'dog', 'dog','cat','dog'],['i4','b','cat','dog','dog','cat']), columns = pd.MultiIndex.from_tuples(list(zip(*[['id','group','g1','g1','g2','g2'], ['-','-','1','2','1','2']]))))
# what I thought would work...
d1 = d1.set_index('id')
d1.groupby(['group'])['g1'].value_counts()
# the dataframe that I would like to have
d2 = pd.DataFrame((['a', 'dog', 2,1,1,2],['a','mouse',0,2,0,1],['a','cat',1,0,2,0],['b','cat',1,0,1,1],['b','dog',0,1,1,1]), columns = pd.MultiIndex.from_tuples(list(zip(*[['group','category','g1','g1','g2','g2'], ['-','-','1','2','1','2']]))))
【Discussion】:
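One possible approach, offered as a sketch rather than the only way to do it: reshape the g1/g2 columns into long form (one row per observed value) and then call `pd.crosstab` with group and category as the row keys and the two column levels as the column keys. The names `value_cols`, `pieces`, `long`, `top`, `sub` and `counts` are made up for this example and are not from the question. It assumes the top-level labels of interest are exactly `g1` and `g2`.

```python
import pandas as pd

# Rebuild d1 from the question so the example is self-contained.
d1 = pd.DataFrame(
    [
        ["i1", "a", "dog", "mouse", "cat", "mouse"],
        ["i2", "a", "cat", "mouse", "dog", "dog"],
        ["i3", "a", "dog", "dog", "cat", "dog"],
        ["i4", "b", "cat", "dog", "dog", "cat"],
    ],
    columns=pd.MultiIndex.from_tuples(
        list(zip(["id", "group", "g1", "g1", "g2", "g2"],
                 ["-", "-", "1", "2", "1", "2"]))
    ),
)

# Columns to count: every (top, sub) pair whose top level is g1 or g2.
value_cols = [col for col in d1.columns if col[0] in ("g1", "g2")]

# Long form: one row per (group, category, top, sub) observation.
pieces = []
for col in value_cols:
    pieces.append(
        pd.DataFrame(
            {
                "group": d1[("group", "-")].to_numpy(),
                "category": d1[col].to_numpy(),
                "top": col[0],  # scalar, broadcast to every row
                "sub": col[1],
            }
        )
    )
long = pd.concat(pieces, ignore_index=True)

# Count occurrences; combinations that never occur are filled with 0.
counts = pd.crosstab(
    [long["group"], long["category"]],
    [long["top"], long["sub"]],
)
print(counts)
```

Here `counts` has (group, category) as a row MultiIndex; `counts.reset_index()` turns those into ordinary columns if a layout closer to d2 is wanted. Note that group b contains only the single row i4, so this gives 1, 0, 0, 1 for cat and 0, 1, 1, 0 for dog, which differs from the d2 in the question in two cells.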
Tags: python pandas dataframe group-by