【Posted】: 2020-12-04 17:23:33
【Question】:
I have several date columns in a DataFrame, as shown below.
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
| | order_id | dateColumn1 | dateColumn2 | dateColumn3 | dateColumn4 | dateColumn5 | dateColumn6 | dateColumn7 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
| 0 | 22161577 | 2019-03-26 | | | 2019-03-24 | | 2019-03-23 | 2019-03-26 |
| 1 | 22719649 | 2019-05-22 | | 2019-05-21 | 2019-05-28 | | 2019-05-24 | 2019-05-28 |
| 2 | 23693247 | 2019-08-13 | | 2019-08-02 | 2019-08-13 | | 2019-08-10 | 2019-08-07 |
| 3 | 24269635 | 2019-09-23 | | 2019-10-04 | 2019-09-25 | | 2019-10-04 | |
| 4 | 24887829 | 2019-11-12 | | 2020-11-27 | 2020-11-27 | | 2020-11-27 | |
| 5 | 25325867 | 2019-12-19 | | 2019-12-19 | 2019-12-19 | | 2019-12-19 | |
| 6 | 26096549 | 2020-03-18 | | 2020-06-25 | 2020-04-20 | | 2020-06-25 | 2020-06-25 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
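I am trying to find the latest date among the dates in each row and write it to a new column, as shown below. In a few cases, dateColumn2 and dateColumn5 also contain dates.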
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
| | order_id | dateColumn1 | dateColumn2 | dateColumn3 | dateColumn4 | dateColumn5 | dateColumn6 | dateColumn7 | latestDate |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
| 0 | 22161577 | 2019-03-26 | | | 2019-03-24 | | 2019-03-23 | 2019-03-26 | 2019-03-26 |
| 1 | 22719649 | 2019-05-22 | | 2019-05-21 | 2019-05-28 | | 2019-05-24 | 2019-05-28 | 2019-05-28 |
| 2 | 23693247 | 2019-08-13 | | 2019-08-02 | 2019-08-13 | | 2019-08-10 | 2019-08-07 | 2019-08-13 |
| 3 | 24269635 | 2019-09-23 | | 2019-10-04 | 2019-09-25 | | 2019-10-04 | | 2019-10-04 |
| 4 | 24887829 | 2019-11-12 | | 2020-11-27 | 2020-11-27 | | 2020-11-27 | | 2020-11-27 |
| 5 | 25325867 | 2019-12-19 | | 2019-12-19 | 2019-12-19 | | 2019-12-19 | | 2019-12-19 |
| 6 | 26096549 | 2020-03-18 | | 2020-06-25 | 2020-04-20 | | 2020-06-25 | 2020-06-25 | 2020-06-25 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
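Does Pandas have a built-in function for finding the latest date across multiple date columns? I have looked at this question. However, I am looking for an alternative method or function that compares the values of multiple date columns directly and returns the latest date. The DataFrame contains about 25K rows.
【Comments】: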
- df.filter(like='date').max(axis=1)?
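The one-liner suggested in the comment can be sketched as follows. This is a minimal sketch, not a definitive answer: the sample frame is a hypothetical subset of the question's table (three rows, five of the seven date columns), and it assumes the empty cells are missing values and the dates are stored as strings, so they are converted with pd.to_datetime first. If the columns are already datetime64, the conversion step is unnecessary.

```python
import pandas as pd

# Hypothetical sample mirroring rows 0, 3 and 6 of the question's table.
# Assumption: empty cells are missing values (None) and dates are strings.
df = pd.DataFrame({
    "order_id": [22161577, 24269635, 26096549],
    "dateColumn1": ["2019-03-26", "2019-09-23", "2020-03-18"],
    "dateColumn3": [None, "2019-10-04", "2020-06-25"],
    "dateColumn4": ["2019-03-24", "2019-09-25", "2020-04-20"],
    "dateColumn6": ["2019-03-23", "2019-10-04", "2020-06-25"],
    "dateColumn7": ["2019-03-26", None, "2020-06-25"],
})

# Select every column whose name contains "date" (the match is case-sensitive),
# then parse the strings; unparseable or missing values become NaT.
dates = df.filter(like="date").apply(pd.to_datetime, errors="coerce")

# Row-wise maximum; NaT entries are skipped by default (skipna=True).
df["latestDate"] = dates.max(axis=1)

print(df[["order_id", "latestDate"]])
```

For these three rows the result matches the latestDate values in the expected table (2019-03-26, 2019-10-04, 2020-06-25). Because the maximum is computed column-wise in a vectorized way rather than row by row in Python, a frame of about 25K rows should not be a problem.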
Tags: python pandas dataframe date-comparison