【Posted】: 2021-05-02 05:22:12
【Question】:
I have a dataframe like the following:
payeeId amount createdAt TrxnID
1001 2.30 2021-04-24 01:40:11.156000+00:00 100AA
1001 35 2021-04-24 02:10:11.146000+00:00 100AB
1001 600 2021-04-24 02:12:14.309000+00:00 100AC
1002 100 2021-04-24 02:59:51.127000+00:00 110BD
1003 1900 2021-04-24 04:09:15.113000+00:00 120AC
1003 10 2021-04-24 04:19:40.132000+00:00 120AM
I want to add a flag with the following logic:
If, for a given 'payeeId', the difference between two consecutive 'createdAt' values is less than 300 seconds, the flag is set to 'No Settlement'; otherwise it is 'Approved'.
So the resulting dataframe would look like:
payeeId amount createdAt TrxnID Flag
1001 2.30 2021-04-24 01:40:11.156000+00:00 100AA Approved
1001 35 2021-04-24 02:10:11.146000+00:00 100AB Approved
1001 600 2021-04-24 02:12:14.309000+00:00 100AC No Settlement
1002 100 2021-04-24 02:59:51.127000+00:00 110BD Approved
1003 1900 2021-04-24 04:09:15.113000+00:00 120AC Approved
1003 10 2021-04-24 04:19:40.132000+00:00 120AM Approved
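The rule stated above can be sketched directly on the sample rows. This is a minimal sketch, assuming pandas and NumPy are available and that `createdAt` parses as timezone-aware timestamps. The name `gap` is illustrative, the sort step is an assumption that rows should be compared in chronological order within each payee, and the flagged label is taken from the rule as worded ('No Settlement').

```python
import numpy as np
import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "payeeId": [1001, 1001, 1001, 1002, 1003, 1003],
    "amount": [2.30, 35, 600, 100, 1900, 10],
    "createdAt": pd.to_datetime([
        "2021-04-24 01:40:11.156000+00:00",
        "2021-04-24 02:10:11.146000+00:00",
        "2021-04-24 02:12:14.309000+00:00",
        "2021-04-24 02:59:51.127000+00:00",
        "2021-04-24 04:09:15.113000+00:00",
        "2021-04-24 04:19:40.132000+00:00",
    ], utc=True),
    "TrxnID": ["100AA", "100AB", "100AC", "110BD", "120AC", "120AM"],
})

# Assumption: consecutive means chronological order within each payee.
df = df.sort_values(["payeeId", "createdAt"]).reset_index(drop=True)

# Gap to the previous transaction of the same payee; NaT for the first one.
gap = df.groupby("payeeId")["createdAt"].diff()

# NaT < Timedelta evaluates to False, so first transactions stay 'Approved'.
df["Flag"] = np.where(gap < pd.Timedelta(seconds=300), "No Settlement", "Approved")

print(df[["payeeId", "TrxnID", "Flag"]])
```

Only transaction 100AC follows its predecessor by less than 300 seconds (about 123 seconds), so it is the only row flagged.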
So I am trying the following code snippet:
gs = df.groupby(['payeeId'])['createdAt']
df['Time_Diff'] = gs.diff().fillna(pd.Timedelta(seconds=0))/pd.Timedelta(seconds=300)
df['Flag'] = np.where(df_sub_count['Time_Diff']>0,'Approved','No Settlement')
But the above does not produce the expected result. I see 'No Settlement' for payeeId 1002, which is not what I want.
What am I missing here?
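One likely reading of why the attempt misbehaves, sketched on the sample timestamps: `fillna(pd.Timedelta(seconds=0))` turns the first transaction of every payee into a gap of 0, and the `> 0` test then rejects exactly those rows, which would explain 'No Settlement' for payee 1002 (a single transaction). Dividing by 300 seconds also expresses the gap in units of 300-second windows, so the threshold would be 1 rather than 0. The sketch assumes `df_sub_count` in the snippet refers to the same frame as `df`; the names `attempt` and `ratio` are illustrative.

```python
import numpy as np
import pandas as pd

# Only the two columns the flag depends on, taken from the question's sample.
df = pd.DataFrame({
    "payeeId": [1001, 1001, 1001, 1002, 1003, 1003],
    "createdAt": pd.to_datetime([
        "2021-04-24 01:40:11.156000+00:00",
        "2021-04-24 02:10:11.146000+00:00",
        "2021-04-24 02:12:14.309000+00:00",
        "2021-04-24 02:59:51.127000+00:00",
        "2021-04-24 04:09:15.113000+00:00",
        "2021-04-24 04:19:40.132000+00:00",
    ], utc=True),
})

gs = df.groupby("payeeId")["createdAt"]

# The attempt: first rows are filled with 0, and `> 0` is False for them.
filled = gs.diff().fillna(pd.Timedelta(seconds=0)) / pd.Timedelta(seconds=300)
attempt = np.where(filled > 0, "Approved", "No Settlement")

# Minimal change: leave first rows as NaN and compare the ratio against 1.
ratio = gs.diff() / pd.Timedelta(seconds=300)
df["Time_Diff"] = ratio
# NaN < 1 is False, so a payee's first transaction falls through to 'Approved'.
df["Flag"] = np.where(ratio < 1, "No Settlement", "Approved")

print(list(attempt))
print(df["Flag"].tolist())
```

With the original test, every first transaction is flagged and the 123-second gap is not, which is the reverse of the intended rule.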
【Discussion】: