How to pivot_table with a duplicated index: answers

【Question title】: How to pivot_table with duplicated index
【Posted】: 2018-12-27 11:14:09
【Question description】:

I have a DataFrame df_ like this:

name  level  status
yes   high   open
no    high   closed
no    med    closed
yes   low    open
no    med    rejected
no    high   open

I am trying to create a pivot table with index='level', columns='status', and values equal to the number of occurrences of each level/status combination.

My code:

df_['temp']=df_['level'].astype(bool).astype(int)
df_.pivot(index='level',columns='status',values='temp')

But it gives me: ValueError: Index contains duplicate entries, cannot reshape

My expected output is:

      open closed rejected
high  2    1      0
med   0    1      1
low   1    0      0

Please check and let me know if there is another simple way to do this.
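A note on the error above: DataFrame.pivot only reshapes and does not aggregate, so each (index, columns) pair may appear at most once. In the sample data the pair (high, open) occurs twice. The sketch below rebuilds the sample frame and confirms both points. The constructor call is an assumption, since the question only shows the printed table, and temp is set to 1 directly as a stand-in for the question's temp column.

```python
import pandas as pd

# Sample data retyped from the question; the constructor call is assumed.
df_ = pd.DataFrame({
    "name":   ["yes", "no", "no", "yes", "no", "no"],
    "level":  ["high", "high", "med", "low", "med", "high"],
    "status": ["open", "closed", "closed", "open", "rejected", "open"],
})
df_["temp"] = 1  # stand-in for the question's temp column

# Rows whose (level, status) pair occurs more than once.
dupes = df_[df_.duplicated(subset=["level", "status"], keep=False)]
print(dupes)

# pivot cannot place two values in the same cell, so it raises.
try:
    df_.pivot(index="level", columns="status", values="temp")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print("pivot failed:", exc)
```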

【Question discussion】:

This works for me: df = pd.crosstab(df['level'], df['status'])
I did it with df_.pivot_table(index='level',columns='status',values='temp',aggfunc='sum'); I will check your code, @jezrael
You need size, not sum
Works fine, thanks @jezrael
If I use df_.pivot_table(index='level',columns='status',values='temp',aggfunc='size'), why do I get KeyError 'temp'?
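The two suggestions in the comments can be sketched as follows. The frame construction is assumed from the table in the question. pd.crosstab counts co-occurrences directly. pivot_table with aggfunc='size' is called here without values, since size counts rows per group rather than aggregating a particular column. The final reindex is only there to match the row and column order of the expected output.

```python
import pandas as pd

# Sample data retyped from the question; the constructor call is assumed.
df_ = pd.DataFrame({
    "name":   ["yes", "no", "no", "yes", "no", "no"],
    "level":  ["high", "high", "med", "low", "med", "high"],
    "status": ["open", "closed", "closed", "open", "rejected", "open"],
})

# Suggestion 1: cross-tabulate the two columns.
by_crosstab = pd.crosstab(df_["level"], df_["status"])

# Suggestion 2: count rows per (level, status) group; no values column needed.
by_size = df_.pivot_table(index="level", columns="status",
                          aggfunc="size", fill_value=0)

# Both results are sorted alphabetically; reorder to match the question.
ordered = by_crosstab.reindex(index=["high", "med", "low"],
                              columns=["open", "closed", "rejected"],
                              fill_value=0)
print(ordered)
```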

Tags: python pandas dataframe data-analysis

【Solution 1】:

A simpler way is to count the occurrences of name:

df_.pivot_table(values='name',
                index='level',
                columns='status',
                aggfunc='count',
                fill_value=0)
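For a version that can be run end to end, here is the same call with the sample data filled in. The DataFrame construction is assumed from the table in the question. aggfunc='count' counts the non-null name values in each level/status group, and fill_value=0 fills the combinations that never occur.

```python
import pandas as pd

# Sample data retyped from the question; the constructor call is assumed.
df_ = pd.DataFrame({
    "name":   ["yes", "no", "no", "yes", "no", "no"],
    "level":  ["high", "high", "med", "low", "med", "high"],
    "status": ["open", "closed", "closed", "open", "rejected", "open"],
})

# Unlike pivot, pivot_table aggregates duplicate (level, status) pairs.
counts = df_.pivot_table(values="name",
                         index="level",
                         columns="status",
                         aggfunc="count",
                         fill_value=0)
print(counts)
```

Note that count skips missing values in name, whereas size counts every row. The two agree here because name has no missing values.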

【Discussion】: