【Posted】: 2017-03-14 18:34:53
【Question】:
I have the following DataFrame, my_df:
name timestamp color
---------------------------
John 2017-01-01 blue
John 2017-01-02 blue
John 2017-01-03 blue
John 2017-01-04 yellow
John 2017-01-05 red
John 2017-01-06 red
Ann 2017-01-04 green
Ann 2017-01-05 orange
Ann 2017-01-06 orange
Ann 2017-01-07 red
Ann 2017-01-08 black
Dan 2017-01-11 blue
Dan 2017-01-12 blue
Dan 2017-01-13 green
Dan 2017-01-14 yellow
I then use the following code to collect each person's colors in order:
new_df = my_df.groupby(['name'], as_index=False).color \
.agg({"color_list": lambda x: list(x)})
new_df then looks like this:
name color_list
-----------------------------------------------
John blue, blue, blue, yellow, red, red
Ann green, orange, orange, red, black
Dan blue, blue, green, yellow
However, what if I want a color_seq (with no repeated colors) instead of the color_list, as shown below? How should I modify the code above? Thanks!
name color_seq
-----------------------------------------------
John blue, yellow, red
Ann green, orange, red, black
Dan blue, green, yellow
【Comments】:
-
lambda x: set(x)?
-
That won't work; set does not guarantee order.
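One possible order-preserving alternative to set, offered as a minimal sketch rather than a definitive answer: dict.fromkeys keeps the first occurrence of each key in insertion order (guaranteed from Python 3.7 onward). The sketch assumes the rows of my_df are already sorted by timestamp within each name, as in the question, and uses sort=False so the groups stay in order of first appearance. The timestamp column is left out because it is not needed for the grouping. Note that the dict-style renaming in the original .agg call was removed in pandas 1.0, so the column is named through reset_index here.

```python
import pandas as pd

# Rebuild the question's data (timestamp omitted; rows are assumed
# to be in timestamp order within each name already).
my_df = pd.DataFrame({
    "name": ["John"] * 6 + ["Ann"] * 5 + ["Dan"] * 4,
    "color": [
        "blue", "blue", "blue", "yellow", "red", "red",
        "green", "orange", "orange", "red", "black",
        "blue", "blue", "green", "yellow",
    ],
})

# dict.fromkeys drops later duplicates but keeps first-seen order,
# unlike set, whose iteration order is not guaranteed.
new_df = (
    my_df.groupby("name", sort=False)["color"]
    .agg(lambda s: list(dict.fromkeys(s)))
    .reset_index(name="color_seq")
)

print(new_df)
```

Series.unique() also returns values in order of appearance, so lambda s: list(s.unique()) should behave the same way here. If only consecutive repeats should be collapsed (so that blue, red, blue keeps both blues), a different approach such as itertools.groupby would be needed; the sample data does not distinguish the two cases.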
Tags: python-3.x pandas dataframe lambda