【Question title】: Create a dummy variable based on the rolling values of a categorical variable over time (i.e., date)
【Posted】: 2022-12-21 14:04:08
【Question description】:

Suppose I have the following data:

date        name  rolename        firmname
2011-12-01  John  helper          A
2012-12-01  John  helper          A
2013-12-01  John  helper          A
2014-12-01  John  helper          B
2014-12-01  John  senior manager  C
2015-12-01  John  helper          B
2015-12-01  John  senior manager  C
2016-12-01  John  senior manager  C
2016-12-01  John  senior manager  D
2017-12-01  John  helper          E
2011-12-01  Will  senior manager  A
2012-12-01  Will  senior manager  A
2013-12-01  Will  senior manager  Z

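I am trying to create a dummy variable for prior senior manager experience (dummy_sm_exp). That is, dummy_sm_exp equals 1 when the person has previously worked as a senior manager at another firm, and 0 otherwise. For example, for the data above, the fifth column would take the following values: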

date        name  rolename        firmname  dummy_sm_exp
2011-12-01  John  helper          A         0
2012-12-01  John  helper          A         0
2013-12-01  John  helper          A         0
2014-12-01  John  helper          B         0
2014-12-01  John  senior manager  C         0
2015-12-01  John  helper          B         1
2015-12-01  John  senior manager  C         1
2016-12-01  John  senior manager  C         1
2016-12-01  John  senior manager  D         1
2017-12-01  John  helper          E         1
2011-12-01  Will  senior manager  A         0
2012-12-01  Will  senior manager  A         0
2013-12-01  Will  senior manager  Z         1

Note that dummy_sm_exp equals 1 only if the person has prior senior manager work experience at other firms. Any hints? Thanks.

【Comments】:

    Tags: r dplyr rolling-computation


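    【Solution 1】:

    If your data frame is sorted by date, you can use cumany() from dplyr:

    library(dplyr)

    data %>%
      group_by(name) %>%
      # TRUE from each person's first "senior manager" row onward (that row included)
      mutate(dummy = cumany(rolename == "senior manager"))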
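Note that cumany() on its own also flags a person's first senior manager row and never looks at firmname, so it does not reproduce the 0s shown for John's 2014 senior manager row or Will's 2011 and 2012 rows. Below is a rough sketch of one way to get the table from the question. It rests on an assumption that is not stated in the question: a row gets 1 when the person held a senior manager role on a strictly earlier date at a firm that differs from at least one firm the person is at on the row's date. The helper name prior_sm_elsewhere is hypothetical, and the sample data is retyped from the question.

```r
library(dplyr)

# Sample data retyped from the question
data <- data.frame(
  date = as.Date(c("2011-12-01", "2012-12-01", "2013-12-01", "2014-12-01",
                   "2014-12-01", "2015-12-01", "2015-12-01", "2016-12-01",
                   "2016-12-01", "2017-12-01", "2011-12-01", "2012-12-01",
                   "2013-12-01")),
  name = c(rep("John", 10), rep("Will", 3)),
  rolename = c("helper", "helper", "helper", "helper", "senior manager",
               "helper", "senior manager", "senior manager", "senior manager",
               "helper", "senior manager", "senior manager", "senior manager"),
  firmname = c("A", "A", "A", "B", "C", "B", "C", "C", "D", "E", "A", "A", "Z")
)

# Hypothetical helper, applied to one person's rows at a time.
# Assumed rule: 1 if some senior manager spell on a strictly earlier date
# was at a firm different from at least one firm held on the row's date.
prior_sm_elsewhere <- function(date, rolename, firmname) {
  vapply(seq_along(date), function(i) {
    prior_sm_firms <- firmname[rolename == "senior manager" & date < date[i]]
    current_firms  <- firmname[date == date[i]]
    # Compare every earlier senior-manager firm with every current firm;
    # an empty set of earlier firms yields no comparisons, hence 0.
    as.integer(any(outer(prior_sm_firms, current_firms, `!=`)))
  }, integer(1))
}

result <- data %>%
  group_by(name) %>%
  mutate(dummy_sm_exp = prior_sm_elsewhere(date, rolename, firmname)) %>%
  ungroup()

print(result)
```

Because the helper compares dates directly instead of accumulating down the rows, it does not depend on the data frame being sorted. It does a pass over a person's rows for every row, which should be fine for small panels but may be slow for very long histories.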
    

    【Comments】:
