【Posted】: 2021-03-07 17:01:00
【Question】:
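I have a data frame with 6 columns, and I want to use the 5 of them that contain dates (FIRST TRAVEL, SECOND TRAVEL, THIRD TRAVEL, FOURTH TRAVEL, FIFTH TRAVEL). For each row, I want to take the maximum date across these 5 columns and compare it with a given date, "2020-09-25 00:00:00", under the following condition:
- If the maximum date is later than the given date, nothing needs to be done. Otherwise, "Offer Expired" has to be written in a new column named RESULT.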
Customer Name FIRST TRAVEL SECOND TRAVEL THIRD TRAVEL FOURTH TRAVEL FIFTH TRAVEL RESULT
0 USER1 NaT 2020-09-02 08:21:59 NaT NaT NaT Offer Expired
1 USER2 NaT 2014-11-05 15:23:38 NaT NaT NaT Offer Expired
2 USER3 NaT NaT NaT NaT NaT NaN
3 USER4 NaT NaT NaT NaT NaT NaN
4 USER5 NaT NaT NaT NaT NaT NaN
5 USER6 NaT NaT NaT NaT NaT NaN
6 USER7 NaT NaT NaT NaT NaT NaN
7 USER8 NaT NaT NaT NaT NaT NaN
8 USER9 NaT 2020-09-02 10:07:11 NaT NaT NaT Offer Expired
9 USER10 2020-03-16 00:00:00 NaT NaT NaT NaT Offer Expired
10 USER11 2019-12-11 00:00:00 NaT NaT NaT NaT Offer Expired
11 USER12 2020-09-26 00:00:00 2020-04-14 00:00:00 NaT NaT NaT NaN
12 USER13 2020-04-20 00:00:00 2019-10-18 00:00:00 NaT NaT NaT Offer Expired
13 USER14 2020-02-21 00:00:00 2020-04-20 00:00:00 NaT NaT NaT Offer Expired
14 USER15 2020-01-17 00:00:00 2019-10-17 00:00:00 NaT NaT NaT Offer Expired
15 USER16 NaT 2020-04-20 00:00:00 NaT NaT NaT Offer Expired
16 USER17 NaT 2019-08-24 00:00:00 NaT NaT NaT Offer Expired
17 USER18 NaT 2019-11-01 00:00:00 NaT NaT NaT Offer Expired
18 USER19 NaT 2019-09-13 00:00:00 NaT NaT NaT Offer Expired
19 USER20 NaT 2020-01-13 00:00:00 NaT NaT NaT Offer Expired
20 USER21 NaT 2019-09-13 00:00:00 NaT NaT NaT Offer Expired
21 USER22 NaT 2020-04-20 00:00:00 NaT NaT NaT Offer Expired
22 USER23 NaT 2020-02-12 00:00:00 NaT NaT NaT Offer Expired
23 USER24 NaT 2019-10-18 00:00:00 NaT NaT NaT Offer Expired
24 USER25 2020-09-06 22:09:22 2020-04-07 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
25 USER26 2020-09-06 22:09:22 2020-04-21 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
26 USER27 NaT NaT NaT 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
27 USER28 NaT NaT NaT 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
28 USER29 2020-09-06 22:09:22 2020-04-17 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
29 USER30 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
30 USER31 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
31 USER32 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
32 USER33 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
33 USER34 2020-09-06 22:09:22 2020-10-27 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 NaN
34 USER35 2020-09-06 22:09:22 2019-06-18 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
35 USER36 2020-09-06 22:09:22 2020-04-15 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
36 USER37 2020-09-06 22:09:22 2020-09-04 15:29:45 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
37 USER38 2020-09-06 22:09:22 NaT NaT 2020-09-25 17:03:20 2020-06-03 19:45:36 NaN
38 USER39 NaT NaT NaT 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
Note: this is simpler in Excel, where the following formula can be used. However, I could not find a pandas way to do it.
=IF(COUNTBLANK($B2:$F2)=5,"", IF(MAX($B2:$F2)>$H$1,"","Offer Expired"))
Any help is appreciated.
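For reference, the logic of the Excel formula can be sketched in pandas as a row-wise maximum followed by a comparison. This is a minimal sketch, not a confirmed answer from the thread: the four-row `df` is a hypothetical miniature of the table above, and the names `date_cols`, `cutoff`, `latest` and `expired` are introduced only for illustration. It assumes the five travel columns already have a datetime dtype.

```python
import pandas as pd

# Hypothetical miniature of the frame above (rows for USER1, USER3, USER12, USER38).
df = pd.DataFrame({
    "Customer Name": ["USER1", "USER3", "USER12", "USER38"],
    "FIRST TRAVEL": pd.to_datetime(
        [None, None, "2020-09-26 00:00:00", "2020-09-06 22:09:22"]),
    "SECOND TRAVEL": pd.to_datetime(
        ["2020-09-02 08:21:59", None, "2020-04-14 00:00:00", None]),
    "THIRD TRAVEL": pd.to_datetime([None, None, None, None]),
    "FOURTH TRAVEL": pd.to_datetime(
        [None, None, None, "2020-09-25 17:03:20"]),
    "FIFTH TRAVEL": pd.to_datetime(
        [None, None, None, "2020-06-03 19:45:36"]),
})

date_cols = ["FIRST TRAVEL", "SECOND TRAVEL", "THIRD TRAVEL",
             "FOURTH TRAVEL", "FIFTH TRAVEL"]
cutoff = pd.Timestamp("2020-09-25 00:00:00")

# Row-wise maximum; NaT values are skipped, and a row with no dates yields NaT.
latest = df[date_cols].max(axis=1)

# Mirrors the formula: a row with no dates stays blank, otherwise it is
# expired when its latest date is not after the cutoff.
expired = latest.notna() & (latest <= cutoff)

# Keep the label where `expired` is True; everything else becomes NaN.
df["RESULT"] = pd.Series("Offer Expired", index=df.index).where(expired)
```

In this sketch the `notna()` check plays the role of `COUNTBLANK(...)=5`. Comparisons against NaT already evaluate to False, so the check is there mainly to make the intent explicit.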
【Comments】:
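- I would first apply the pandas melt function to get a new data frame with only two columns, "user" and "travel date"; see the approach in Pandas Melt with Multiple Value Vars. You can then sort by date and group by user to easily find each user's travel dates and determine whether they fall within your time range.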
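The melt-based suggestion could look roughly like the sketch below. It is only an illustration under stated assumptions: the three-row `df` is hypothetical sample data, the names `long_df`, `trip`, `travel_date` and `latest_by_user` are made up for this example, and a grouped `max()` is used in place of the sort-then-group step described in the comment.

```python
import pandas as pd

# Hypothetical sample in the shape of the question's table.
df = pd.DataFrame({
    "Customer Name": ["USER1", "USER3", "USER12"],
    "FIRST TRAVEL": pd.to_datetime([None, None, "2020-09-26 00:00:00"]),
    "SECOND TRAVEL": pd.to_datetime(
        ["2020-09-02 08:21:59", None, "2020-04-14 00:00:00"]),
})
date_cols = ["FIRST TRAVEL", "SECOND TRAVEL"]
cutoff = pd.Timestamp("2020-09-25 00:00:00")

# Wide to long: one row per (customer, trip) pair.
long_df = df.melt(id_vars="Customer Name", value_vars=date_cols,
                  var_name="trip", value_name="travel_date")

# Latest travel date per customer; a customer with no dates gets NaT.
latest_by_user = long_df.groupby("Customer Name")["travel_date"].max()

# Bring the per-customer result back onto the original rows.
latest = df["Customer Name"].map(latest_by_user)
expired = latest.notna() & (latest <= cutoff)
df["RESULT"] = pd.Series("Offer Expired", index=df.index).where(expired)
```

The long format is mostly useful when further per-trip analysis is needed. For the single comparison asked about here, a row-wise maximum over the wide frame gives the same result with less reshaping.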
Tags: python-3.x pandas dataframe