[Posted]: 2021-12-11 08:05:44
[Question]:
Hello, I want to drop records based on cancel date, keeping only the most recent record for each id.
Sample data
| id | cancel_date | type_of_fruit |
|---|---|---|
| 1 | 2021-03-02 | apple |
| 1 | 2021-01-01 | apple |
| 2 | 2021-02-01 | orange |
Expected output
| id | cancel_date | type_of_fruit |
|---|---|---|
| 1 | 2021-03-02 | apple |
| 2 | 2021-02-01 | orange |
I wrote this in SQL, but I have to implement the same logic in pandas. Please help.
SELECT *
FROM (
    SELECT *,
           RANK() OVER (PARTITION BY id ORDER BY cancel_date DESC) AS rank
    FROM df
    ORDER BY id, cancel_date DESC
) a
WHERE rank = 1
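For reference, here is a minimal sketch of how the query above might translate to pandas. It assumes the data is in a DataFrame named `df` with the columns from the sample table and that `cancel_date` can be parsed as a datetime; the helper column `rnk` and the result names `latest_by_rank` and `latest` are made up for illustration.

```python
import pandas as pd

# Sample data from the question; cancel_date is parsed as datetime (assumption).
df = pd.DataFrame({
    "id": [1, 1, 2],
    "cancel_date": pd.to_datetime(["2021-03-02", "2021-01-01", "2021-02-01"]),
    "type_of_fruit": ["apple", "apple", "orange"],
})

# Close translation of the SQL: rank rows within each id by cancel_date,
# newest first, then keep the rows whose rank is 1.
ranked = df.assign(
    rnk=df.groupby("id")["cancel_date"].rank(method="min", ascending=False)
)
latest_by_rank = (
    ranked[ranked["rnk"] == 1]
    .drop(columns="rnk")
    .reset_index(drop=True)
)

# Shorter alternative: sort newest first within each id and keep the first row.
latest = (
    df.sort_values(["id", "cancel_date"], ascending=[True, False])
      .drop_duplicates(subset="id", keep="first")
      .reset_index(drop=True)
)

print(latest)
```

The two variants differ when an id has several rows with the same latest `cancel_date`: the rank-based version keeps all of them, as `RANK()` does in SQL, while `drop_duplicates` keeps exactly one row per id.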
[Comments]:
Tags: python sql pandas database dataframe