【Posted】: 2017-03-22 01:10:14
【Question】:
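I am transforming some applicant transaction data and need to create a new flag column (labeled "DESIRED FLAG" in my example). However, I can't work out the right loop/apply approach, because the logic below can vary in many ways.
In a perfect world, the sequential application history would look like this, with every "Event Status" set to "Completed":
- On-Site Interview Kick Off --> Schedule Interviews --> Decision; or
- Phone Interview Kick Off --> Schedule Interviews --> Decision
Of course, an applicant can go through several phone and on-site interviews during the application process.
As the example below shows, a "Schedule Interviews" step is sometimes canceled. In those cases I need to remove that step together with the subsequent steps tied to it. These include "Schedule Interviews", "Decision", and "On-Site Interview Kick Off" or "Phone Interview Kick Off". In addition, other "events" may sometimes appear in between, as we see with the event that was manually skipped.
I need to create flags for other kinds of scenarios as well, so I need to keep the original DataFrame and add the new column to it.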
import pandas as pd
data = {'Employee ID': ["100","100", "100", "100","100","100","100","100","100","100","200", "200", "200","200","200","200","200","300","300", "300", "300","300","300","300"],
'Completed On Date': ["2009-01-01","2010-01-01","2011-06-05","2012-07-01","2013-01-01","2014-01-01","2015-01-01","2016-01-01","2017-01-01","2018-01-01","2010-01-01","2011-06-05","2012-07-01","2012-08-15","2013-01-01","2014-01-01","2015-01-01","2009-01-01","2010-01-01","2011-06-05","2012-07-01","2013-01-01","2014-01-01","2015-01-01"],
'Event': ["Decision","On-Site Interview Kick Off","Schedule Interviews","Decision","On-Site Interview Kick Off","Schedule Interviews","Decision","Phone Interview Kick Off","Schedule Interviews","Decision","On-Site Interview Kick Off","Schedule Interviews","Decision","Decision","Phone Interview Kick Off","Schedule Interviews","Decision","Job Apply","Phone Interview Kick Off","Schedule Interviews","Decision","On-Site Interview Kick Off","Schedule Interviews","Decision"],
'Event Status': ["Completed","Completed","CANCELED","Completed","Completed","Completed","Completed","Completed","Completed","Completed","Completed","CANCELED","Manually Skipped","Completed","Completed","Completed","Completed","Completed","Completed","CANCELED","Completed","Completed","Completed","Completed"],
'DESIRED FLAG': ["Keep","Keep","Remove","Remove","Remove","Keep","Keep","Keep","Keep","Keep","Keep","Remove","Remove","Remove","Remove","Keep","Keep","Keep","Keep","Remove","Remove","Remove","Keep","Keep"]}
df = pd.DataFrame(data, columns=['Employee ID','Completed On Date','Event','Event Status','DESIRED FLAG'])
df = df.sort_values(by=['Employee ID', 'Completed On Date'])
df
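One possible reading of the rule, inferred only from the "DESIRED FLAG" column above, is: within each applicant's history, a canceled "Schedule Interviews" row starts a stretch to remove, and that stretch runs through the next "... Interview Kick Off" row, inclusive. The sketch below implements that reading. The names `flag_events`, `add_flag_column`, and the "COMPUTED FLAG" column are hypothetical, and the sketch assumes that a canceled step with no later kick-off removes everything after it.

```python
def flag_events(events, statuses):
    """Return a "Keep"/"Remove" flag for each event of ONE applicant.

    Assumes the events are already in chronological order.
    """
    flags = []
    removing = False  # True while we are inside a canceled stretch
    for event, status in zip(events, statuses):
        # A canceled "Schedule Interviews" opens a stretch to remove.
        if event == "Schedule Interviews" and status == "CANCELED":
            removing = True
        flags.append("Remove" if removing else "Keep")
        # The next kick-off (phone or on-site) is the last row removed.
        if removing and event.endswith("Interview Kick Off"):
            removing = False
    return flags


def add_flag_column(df, flag_col="COMPUTED FLAG"):
    """Return a sorted copy of a pandas DataFrame with the flag column added."""
    out = df.sort_values(["Employee ID", "Completed On Date"]).copy()
    flags = []
    # Rows of each applicant are contiguous after sorting, so the per-group
    # flags can be concatenated in iteration order.
    for _, group in out.groupby("Employee ID", sort=False):
        flags.extend(flag_events(group["Event"], group["Event Status"]))
    out[flag_col] = flags
    return out


# Plain-list check using the rows of Employee ID "200" from the question.
events_200 = ["On-Site Interview Kick Off", "Schedule Interviews", "Decision",
              "Decision", "Phone Interview Kick Off", "Schedule Interviews",
              "Decision"]
statuses_200 = ["Completed", "CANCELED", "Manually Skipped", "Completed",
                "Completed", "Completed", "Completed"]
print(flag_events(events_200, statuses_200))
# → ['Keep', 'Remove', 'Remove', 'Remove', 'Remove', 'Keep', 'Keep']
```

With the DataFrame from the question, `add_flag_column(df)` returns a copy that keeps every original column, and comparing its "COMPUTED FLAG" against "DESIRED FLAG" is a quick way to check whether this reading of the rule matches what is wanted. A plain loop is used instead of a vectorized expression because the flag depends on state carried from earlier rows.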
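【Comments】:
-
It would be very helpful if you could post the desired output.
-
See the "DESIRED FLAG" column. That is what the output should look like. Thanks!
-
Got it. It helps to visualize it as a DataFrame, but maybe that's just me.
-
Np. I've never figured out how to output a DF on this forum! :O
Tags: python python-3.x loops pandas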