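【Posted】: 2021-02-15 08:16:30
【Question】:
I have a dataset (df1) of crises in hundreds of countries, where each observation is a country-level crisis event with a start and end year. I also have the date on which each crisis was announced (yyyy-mm-dd format), along with a range of other crisis characteristics.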
df1 <- data.frame(eventID=c(1,2,3,4), country=c("ALB","ALB","ARG","ARG"), start=c(1994, 1998, 1998, 1991), end=c(1996,1999,1999,1993), announcement=c("1994-11-01","1998-03-01","1998-07-01","1992-01-01"), x1=c(6,2,8,7), x2=c("a","q","k","b"))
eventID country start end announcement x1 x2
1 ALB 1994 1996 1994-11-01 6 a
2 ALB 1998 1999 1998-03-01 2 q
3 ARG 1998 1999 1998-07-01 8 k
4 ARG 1991 1993 1992-01-01 7 b
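I need to build df2, a country-year panel with one observation per country per year, running from the earliest "start" year to the latest "end" year. I want a dummy variable "crisis" that equals 1 for years between "start" and "end" in df1 and 0 otherwise. I want "announcement" to hold the announcement date from df1 in the year the announcement was made, and "NA" otherwise. I want the additional crisis characteristics x1 and x2 to show their values in the corresponding crisis years, and "NA" otherwise.
I also need an observation for every country in years when no country had a crisis (in df2: 1997).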
df2 <- data.frame(year=c(1991,1992,1993,1994,1995,1996,1997,1998,1999,1991,1992,1993,1994,1995,1996,1997,1998,1999), country=c("ALB","ALB","ALB","ALB","ALB","ALB","ALB","ALB","ALB","ARG","ARG","ARG","ARG","ARG","ARG","ARG","ARG","ARG"), crisis=c(0,0,0,1,1,1,0,1,1,1,1,1,0,0,0,0,1,1), announcement=c(NA,NA,NA,"1994-11-01",NA,NA,NA,"1998-03-01",NA,NA,"1992-01-01",NA,NA,NA,NA,NA,"1998-07-01",NA), x1=c(NA,NA,NA,6,6,6,NA,2,2,7,7,7,NA,NA,NA,NA,8,8), x2=c(NA,NA,NA,"a","a","a",NA,"q","q","b","b","b",NA,NA,NA,NA,"k","k"))
year country crisis announcement x1 x2
1991 ALB 0 NA NA NA
1992 ALB 0 NA NA NA
1993 ALB 0 NA NA NA
1994 ALB 1 1994-11-01 6 a
1995 ALB 1 NA 6 a
1996 ALB 1 NA 6 a
1997 ALB 0 NA NA NA
1998 ALB 1 1998-03-01 2 q
1999 ALB 1 NA 2 q
1991 ARG 1 NA 7 b
1992 ARG 1 1992-01-01 7 b
1993 ARG 1 NA 7 b
1994 ARG 0 NA NA NA
1995 ARG 0 NA NA NA
1996 ARG 0 NA NA NA
1997 ARG 0 NA NA NA
1998 ARG 1 1998-07-01 8 k
1999 ARG 1 NA 8 k
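I would appreciate any suggestions! I am stuck on how to replicate the observations for each year while only including the x1 and x2 values when my new "crisis" dummy equals 1.
Thanks!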
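For reference, the reshaping described above could be sketched in base R roughly as follows. This is only one possible approach, not a definitive solution. It assumes that crises within a country never overlap in years, that each country-year has at most one announcement, and that every announcement year falls inside its own crisis window. The names event_years, ann and grid are hypothetical helpers introduced for illustration.

```r
# Example data from the question (built without cbind() so numeric columns stay numeric)
df1 <- data.frame(
  eventID = c(1, 2, 3, 4),
  country = c("ALB", "ALB", "ARG", "ARG"),
  start = c(1994, 1998, 1998, 1991),
  end = c(1996, 1999, 1999, 1993),
  announcement = c("1994-11-01", "1998-03-01", "1998-07-01", "1992-01-01"),
  x1 = c(6, 2, 8, 7),
  x2 = c("a", "q", "k", "b"),
  stringsAsFactors = FALSE
)

# 1. Expand each event into one row per crisis year, carrying x1 and x2 along
event_years <- do.call(rbind, lapply(seq_len(nrow(df1)), function(i) {
  data.frame(
    country = df1$country[i],
    year = seq(df1$start[i], df1$end[i]),
    x1 = df1$x1[i],
    x2 = df1$x2[i],
    stringsAsFactors = FALSE
  )
}))
event_years$crisis <- 1

# 2. Announcement dates keyed by country and the year of the announcement
ann <- data.frame(
  country = df1$country,
  year = as.integer(substr(df1$announcement, 1, 4)),
  announcement = df1$announcement,
  stringsAsFactors = FALSE
)

# 3. Full country-year grid from the earliest start to the latest end
grid <- expand.grid(
  year = seq(min(df1$start), max(df1$end)),
  country = unique(df1$country),
  stringsAsFactors = FALSE
)

# 4. Left-join crisis years and announcements onto the grid;
#    country-years without a match keep NA in x1, x2 and announcement
df2 <- merge(grid, event_years, by = c("country", "year"), all.x = TRUE)
df2 <- merge(df2, ann, by = c("country", "year"), all.x = TRUE)
df2$crisis[is.na(df2$crisis)] <- 0

# 5. Order rows and columns as in the desired output
df2 <- df2[order(df2$country, df2$year),
           c("year", "country", "crisis", "announcement", "x1", "x2")]
rownames(df2) <- NULL
```

If crises can overlap within a country, the first merge would produce duplicate country-year rows, so the events would need to be collapsed or prioritised before joining.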
【Comments】:
- Could you provide your example data using dput() or data.frame()?
Tags: r data-wrangling replicate