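【Posted】: 2021-04-10 05:25:06
【Question】:
I have a df like the one below.
Target output:
I tried the code below, but it only produces the result for a single column, so I would have to wrap it in a for loop to get the full result.
My data is large. Is there a faster solution?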
import pandas as pd

data = {
    'item': ["y1", "y2", "y3", "y4", "y5", "y6", "y7", "y8", "y9", "y10"],
    'X1': [1, 1, 1, 1, 1, 7, 7, 7, 5, 4],
    'X2': [8, 9, 10, 10, 10, 8, 8, 10, 8, 9],
    'X3': [11, 12, 13, 11, 11, 11, 11, 11, 1, 2],
}
df = pd.DataFrame(data, columns=['item', 'X1', 'X2', 'X3'])
# number of unique values
df['X1'].nunique()
# "max value": the most frequent value in the column
df['X1'].value_counts().idxmax()
# share of rows that hold the most frequent value
df['X1'].value_counts().max() / df['X1'].size
# "second value": the most common value among the two rows with the largest X1
(df.nlargest(2, ['X1'])['X1']).value_counts().idxmax()
# share of rows that hold that second value
df['X1'][df['X1'] == (df.nlargest(2, ['X1'])['X1']).value_counts().idxmax()].size / df['X1'].size
【Discussion】:
Tags: python pandas dataframe bigdata
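One possible way to get these statistics for every column without writing the per-column code by hand is sketched below. It is not a confirmed answer from the thread. It assumes that "second value" means the second most frequent value (which is what the question's code returns for X1), and that every column has at least one non-null value. The helper name `column_summary` and the row labels of the result are made up for illustration.

```python
import pandas as pd

data = {
    "item": ["y1", "y2", "y3", "y4", "y5", "y6", "y7", "y8", "y9", "y10"],
    "X1": [1, 1, 1, 1, 1, 7, 7, 7, 5, 4],
    "X2": [8, 9, 10, 10, 10, 8, 8, 10, 8, 9],
    "X3": [11, 12, 13, 11, 11, 11, 11, 11, 1, 2],
}
df = pd.DataFrame(data)


def column_summary(s: pd.Series) -> pd.Series:
    """Frequency summary of one column (hypothetical helper)."""
    # Relative frequency of each distinct value, most frequent first.
    # Assumption: the column has at least one non-null value.
    freq = s.value_counts(normalize=True)
    has_second = len(freq) > 1
    return pd.Series({
        "n_unique": len(freq),
        "top_value": freq.index[0],
        "top_pct": freq.iloc[0],
        # Assumption: "second value" = second most frequent value.
        "second_value": freq.index[1] if has_second else None,
        "second_pct": freq.iloc[1] if has_second else None,
    })


# One value_counts pass per column; the result has one column per input
# column (X1, X2, X3) and one row per statistic.
summary = df.drop(columns="item").apply(column_summary)
print(summary)
```

For X1 this gives 4 unique values, top value 1 with share 0.5, and second value 7 with share 0.3, the same numbers the single-column code in the question produces. `apply` still visits the columns one at a time, but the work inside each column is a single vectorized `value_counts` call, so the cost grows with the number of columns rather than the number of rows. When two values are equally frequent (8 and 10 in X2, for example), which one is reported first depends on how `value_counts` orders ties, so treat the reported value as arbitrary in that case.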