【Question title】: Interval in months between two columns in R
【Posted】: 2020-02-18 13:59:59
【Question description】:

I have this data:

"code";"min";"max"
"CM106";2016-12-01;2018-08-01
"CM107";2017-10-01;2019-11-01
"CM109";2017-01-01;2019-02-01
"CM113";2018-02-01;2019-03-01
"CM114";2016-10-01;2017-12-01
"CM118";2018-04-01;2018-11-01
"CM121";2018-05-01;2020-02-01
"CM126";2018-08-01;2018-11-01
"CM129";2017-01-01;2018-04-01
"CM131";2018-09-01;2020-05-01
"CM144";2018-02-01;2019-11-01
"CM150";2018-10-01;2019-04-01
"CM153";2018-05-01;2018-09-01
"CM154";2016-05-01;2019-06-01

Date format: year-month-day

I want to create a new column containing the interval between the "min" and "max" columns, in months.

I tried to follow this answer, but it did not work: Count the months between two dates in a data.table

This is what I ran, and the error I get:

intervalos[, 2:3 := lapply(.SD, as.IDate, format = "%Y.%m.%d"), .SDcols = 2:3]

Error in [.tbl_df(intervalos, , :=(2:3, lapply(.SD, as.IDate, format = "%Y.%m.%d")) : unused argument (.SDcols = 2:3)

【Comments】:

  • Show the command you used

Tags: r intervals


【Solution 1】:

1. Create a minimal reproducible example

df <- structure(list(c = c("CM106", "CM107", "CM109", "CM113", "CM114", "CM118", "CM121", "CM126", "CM129", "CM131", "CM144", "CM150", "CM153", "CM154"), 
                     min = c("2016-12-01", "2017-10-01", "2017-01-01", "2018-02-01", "2016-10-01", "2018-04-01", "2018-05-01", "2018-08-01", "2017-01-01", "2018-09-01", "2018-02-01", "2018-10-01", "2018-05-01", "2016-05-01"),
                     max = c("2018-08-01", "2019-11-01", "2019-02-01", "2019-03-01", "2017-12-01", "2018-11-01", "2020-02-01", "2018-11-01", "2018-04-01", "2020-05-01", "2019-11-01", "2019-04-01", "2018-09-01", "2019-06-01")),
                class = "data.frame", row.names = c(NA, -14L))

2. Solution using base R:

Convert both columns with as.Date:

df$min <- as.Date(df$min, "%Y-%m-%d")
df$max <- as.Date(df$max, "%Y-%m-%d")
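The format string has to match the separators actually present in the data. As a small check (the variable names `x`, `ok` and `bad` are illustrative only), the hyphenated format used here parses a value from the question's data, while the dotted "%Y.%m.%d" format from the question's attempt yields NA:

```r
x <- "2016-12-01"                        # one value from the question's data

ok  <- as.Date(x, format = "%Y-%m-%d")   # separators match the data
bad <- as.Date(x, format = "%Y.%m.%d")   # dots do not match the hyphens

is.na(ok)    # FALSE
is.na(bad)   # TRUE
```

So even with the `.SDcols` error resolved, that format would probably have produced missing dates for this data.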

Compute the difference:

df$diff_days <- df$max - df$min
df$diff_months <- as.numeric(df$diff_days) /(365.25/12)

df$diff_days is now:

Time differences in days
 [1]  608  761  761  393  426  214
 [7]  641   92  455  608  638  182
[13]  123 1126

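df$diff_months is (an approximation based on an average month length of 365.25/12 days, so the values are fractional):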

 [1] 19.975359 25.002053 25.002053 12.911704 13.995893
 [6]  7.030801 21.059548  3.022587 14.948665 19.975359
[11] 20.960986  5.979466  4.041068 36.993840
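Because the division by 365.25/12 only approximates month lengths, the results come out as 19.975 rather than 20, and so on. If whole calendar months are wanted, one possible sketch is to compare the year and month fields directly. The helper `months_between` below is hypothetical (it is not part of the original answer) and ignores the day of the month, which is harmless here since every date falls on the 1st:

```r
# Hypothetical helper: whole calendar months between two Date vectors,
# computed from the year and month fields only (day of month is ignored).
months_between <- function(from, to) {
  from <- as.POSIXlt(from)
  to   <- as.POSIXlt(to)
  (to$year - from$year) * 12 + (to$mon - from$mon)
}

# Three rows taken from the question's data (CM106, CM107, CM126)
d_min <- as.Date(c("2016-12-01", "2017-10-01", "2018-08-01"))
d_max <- as.Date(c("2018-08-01", "2019-11-01", "2018-11-01"))

months_between(d_min, d_max)   # 20 25 3
```

Applied to the full data frame, this would be `df$diff_months <- months_between(df$min, df$max)`, assuming both columns have already been converted with as.Date.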

【Comments】:

  • Using the answer from the SO question referenced above: df$diff_months <- lengths(Map(seq, df$min, df$max, by = "months")) - 1
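To illustrate the one-liner from this comment, here is a minimal sketch on three rows of the question's data (the small data frame `d` is made up for the example). `seq` builds a monthly sequence from each start date to its end date, and the number of steps is the sequence length minus one:

```r
# Small illustrative data frame with three rows from the question's data
d <- data.frame(
  min = as.Date(c("2016-12-01", "2017-10-01", "2018-08-01")),
  max = as.Date(c("2018-08-01", "2019-11-01", "2018-11-01"))
)

# One monthly sequence per row; its length minus 1 is the number of month steps
d$diff_months <- lengths(Map(seq, d$min, d$max, by = "months")) - 1

d$diff_months   # 20 25 3
```

This returns whole months rather than the fractional values produced by dividing a day count by 365.25/12.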