【Posted】: 2020-03-03 14:34:28
【Question】:
I am working with a large csv file that contains information like this:
id year decade code type
3366 2014 2010 EM Chemical
3366 2014 2010 EM Chemical
3366 2014 2010 EM Chemical
3366 2014 2010 EM Chemical
3366 2014 2010 EM Chemical
427 1972 1970 DR Coastal Storm
337 1972 1970 DR Coastal Storm
337 1972 1970 DR Coastal Storm
I want to count how many times each unique value occurs in the "id" column. The result I want looks like this:
id year decade code type count
3366 2014 2010 EM Chemical 5
427 1972 1970 DR Coastal Storm 1
337 1972 1970 DR Coastal Storm 2
But I would settle for something like this:
id year decade code type count
3366 2014 2010 EM Chemical 5
3366 2014 2010 EM Chemical 5
3366 2014 2010 EM Chemical 5
3366 2014 2010 EM Chemical 5
3366 2014 2010 EM Chemical 5
427 1972 1970 DR Coastal Storm 1
337 1972 1970 DR Coastal Storm 1
337 1972 1970 DR Coastal Storm 2
I tried to do this with
df['count']=df.groupby('id').transform('count')
but I keep getting the error
ValueError: Wrong number of items passed 18, placement implies 1
Is there a better way to achieve this?
【Comments】:
- df["count"] = df.groupby("id")["type"].transform("count")?
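The suggestion in the comment works because selecting a single column before calling transform yields a Series aligned to the original index, whereas calling transform on the whole grouped frame returns one column per non-key column, which cannot be assigned to the single "count" column. A minimal sketch follows, assuming the sample rows from the question (the real file apparently has more columns); the names whole and collapsed are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question; the real csv is assumed to be larger.
df = pd.DataFrame({
    "id": [3366] * 5 + [427, 337, 337],
    "year": [2014] * 5 + [1972] * 3,
    "decade": [2010] * 5 + [1970] * 3,
    "code": ["EM"] * 5 + ["DR"] * 3,
    "type": ["Chemical"] * 5 + ["Coastal Storm"] * 3,
})

# transform on the whole grouped frame returns a DataFrame with one column
# per non-key column, so it cannot be stored in a single column.
whole = df.groupby("id").transform("count")

# Selecting one column first gives a Series with one count per original row.
df["count"] = df.groupby("id")["type"].transform("count")

# For the collapsed form (one row per id), keep the first row of each id.
collapsed = df.drop_duplicates(subset="id").reset_index(drop=True)
print(collapsed)
```

Note that "count" skips missing values in the selected column; if "type" can contain NaN, transform("size") counts every row in the group instead.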
Tags: python pandas dataframe data-science