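【Question title】: Replace all values in WHOLE dataframe if it matches an item in a vector
【Posted】: 2021-03-29 16:34:29
【Question description】:

I have a data frame that contains only dates. I want to replace every date that matches a date in my vector, across the whole data frame and not just in one column: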

holidays
[1] "2022-01-01" "2022-04-15" "2022-04-17" "2022-04-18" "2022-04-27" "2022-05-05" "2022-05-26" "2022-06-05" "2022-06-06" "2022-12-25" "2022-12-26" "2021-04-04"
[13] "2021-04-05" "2021-04-27" "2021-05-05" "2021-05-13" "2021-05-23" "2021-05-24" "2021-12-25" "2021-12-26"

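The only solutions I have found remove the entire row when a date in my data frame matches a date in my vector, but I want to replace just that value with NA and not delete anything.

In addition, I want to replace every value in the whole data frame that falls on a weekend. I know how to check a single column, but I can't seem to manage it for the whole data frame.

Hope you can help! Thanks a lot.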

>dput(T0range)
    structure(list(V1 = structure(c(18708, 18708, 18708, 18708, 18708,
    18708, 18709, 18709, 18709, 18709, 18715, 18715, 18715, 18715,
    18715), class = "Date"), V2 = structure(c(18709, 18709, 18709,
    18709, 18709, 18709, 18710, 18710, 18710, 18710, 18716, 18716,
    18716, 18716, 18716), class = "Date"), V3 = structure(c(18710,
    18710, 18710, 18710, 18710, 18710, 18711, 18711, 18711, 18711,
    18717, 18717, 18717, 18717, 18717), class = "Date"), V4 = structure(c(18711,
    18711, 18711, 18711, 18711, 18711, 18712, 18712, 18712, 18712,
    18718, 18718, 18718, 18718, 18718), class = "Date"), V5 = structure(c(18712,
    18712, 18712, 18712, 18712, 18712, 18713, 18713, 18713, 18713,
    18719, 18719, 18719, 18719, 18719), class = "Date"), V6 = structure(c(18713,
    18713, 18713, 18713, 18713, 18713, 18714, 18714, 18714, 18714,
    18720, 18720, 18720, 18720, 18720), class = "Date"), V7 = structure(c(18714,
    18714, 18714, 18714, 18714, 18714, 18715, 18715, 18715, 18715,
    18721, 18721, 18721, 18721, 18721), class = "Date"), V8 = structure(c(18715,
    18715, 18715, 18715, 18715, 18715, 18716, 18716, 18716, 18716,
    18722, 18722, 18722, 18722, 18722), class = "Date"), V9 = structure(c(18716,
    18716, 18716, 18716, 18716, 18716, 18717, 18717, 18717, 18717,
    18723, 18723, 18723, 18723, 18723), class = "Date"), V10 = structure(c(18717,
    18717, 18717, 18717, 18717, 18717, 18718, 18718, 18718, 18718,
    18724, 18724, 18724, 18724, 18724), class = "Date"), V11 = structure(c(18718,
    18718, 18718, 18718, 18718, 18718, 18719, 18719, 18719, 18719,
    18725, 18725, 18725, 18725, 18725), class = "Date"), V12 = structure(c(18719,
    18719, 18719, 18719, 18719, 18719, 18720, 18720, 18720, 18720,
    18726, 18726, 18726, 18726, 18726), class = "Date"), V13 = structure(c(18720,
    18720, 18720, 18720, 18720, 18720, 18721, 18721, 18721, 18721,
    18727, 18727, 18727, 18727, 18727), class = "Date"), V14 = structure(c(18721,
    18721, 18721, 18721, 18721, 18721, 18722, 18722, 18722, 18722,
    18728, 18728, 18728, 18728, 18728), class = "Date"), V15 = structure(c(18722,
    18722, 18722, 18722, 18722, 18722, 18723, 18723, 18723, 18723,
    18729, 18729, 18729, 18729, 18729), class = "Date")), row.names = c(NA,
    -15L), class = "data.frame")
    > dput(holidays)
    structure(c(18993, 19097, 19099, 19100, 19109, 19117, 19138,
    19148, 19149, 19351, 19352, 18721, 18722, 18744, 18752, 18760,
    18770, 18771, 18986, 18987), class = "Date")
     

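【Comments】:

  • Could you share your sample data with dput()? It is very useful to know whether your dates are Date class, POSIX class, or something else. In general you can replace directly: df$x[df$x == y] <- NA replaces the values of x in df with NA wherever they match y. The idea is no different with dates: df$date_column[df$date_column %in% holidays] <- NA. But it requires a class match between your date column and your holidays vector... which is why I am asking for more details.
  • Hi Gregor, I added the dput to the original question. In your df$date_column[df$date_column %in% holiday]
  • Since they are all date columns, T0range[T0range %in% holidays] <- NA should do it. Your sample data doesn't have any matches...
  • Hi Gregor, I tried this as well. The date 18722 (or 05-04-2021) appears several times, mainly in column V8.. Am I missing something? Thanks for your help!
  • Hmm, it looks like the data frame structure is messing up the %in% operator for dates. Hmm...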
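
To illustrate the behaviour described in the last comment, here is a minimal sketch on made-up data (df and hol are hypothetical stand-ins for T0range and holidays). Applied to a whole data frame, %in% appears to treat it as a list of columns, so it returns one result per column rather than one per cell, while the same test on a single Date column works value by value.

```r
# Hypothetical toy data standing in for T0range and holidays
df <- data.frame(
  V1 = as.Date(c("2021-04-04", "2021-04-06")),
  V2 = as.Date(c("2021-04-05", "2021-04-07"))
)
hol <- as.Date(c("2021-04-04", "2021-04-05"))

# On the whole data frame: one logical per COLUMN, not per cell,
# so it cannot be used to index individual values
whole <- df %in% hol
length(whole)  # equals ncol(df)

# On a single Date column: one logical per value, as intended
per_value <- df$V1 %in% hol
per_value
```

This is why the per-column form from the first comment works, but the whole-data-frame shortcut does not.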

Tags: r dataframe replace


【Solution 1】:

Unfortunately, I think the simplest way is to lapply over the columns:

T0range[] = lapply(T0range, function(x){ x[x %in% holidays] <- NA; x})
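
As a small self-contained sketch of the same lapply pattern, here it is on made-up data (df and hol are hypothetical stand-ins for T0range and holidays). Assigning into df[] rather than df keeps the result a data frame instead of a plain list, and each column stays of class Date.

```r
# Hypothetical toy data standing in for T0range and holidays
df <- data.frame(
  V1 = as.Date(c("2021-04-04", "2021-04-06")),
  V2 = as.Date(c("2021-04-05", "2021-04-07"))
)
hol <- as.Date(c("2021-04-04", "2021-04-05"))

# Replace matching dates with NA, one column at a time
df[] <- lapply(df, function(x) {
  x[x %in% hol] <- NA
  x
})

df             # first row is NA in both columns, second row is untouched
sapply(df, class)  # columns are still Date
```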

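【Comments】:

  • This is perfect! As long as it works, I'm happy. Thanks a lot :) Do you happen to know the answer to my second question as well? The one about the weekdays function and also replacing all values that fall on a Saturday or Sunday?
  • DebbieOomen, take a look at the weekdays function, which returns the day of the week for a given date. For example, x[weekdays(x) %in% c("Sunday","Saturday")].
  • Same approach, e.g. T0range[] = lapply(T0range, function(x){ x[weekdays(x) %in% c("Saturday", "Sunday")] <- NA; x})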
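
The two replacements can also be combined in one pass. The sketch below is only one possible way to do it, on made-up data: df, hol, and the helper blank_non_working are hypothetical names, not from the thread. One caveat worth noting is that weekdays() returns day names in the current locale, so comparing against "Saturday" and "Sunday" may silently match nothing on a non-English system. The sketch swaps in as.POSIXlt(x)$wday, which is numeric (0 = Sunday, 6 = Saturday) and does not depend on the locale.

```r
# Hypothetical toy data: Fri 2021-04-02, Sat 2021-04-03, Sun 2021-04-04, Mon 2021-04-05
df <- data.frame(
  V1 = as.Date(c("2021-04-02", "2021-04-03")),
  V2 = as.Date(c("2021-04-04", "2021-04-05"))
)
hol <- as.Date("2021-04-05")  # assumed holiday, for illustration only

# Hypothetical helper: blank out weekends and holidays in one Date vector
blank_non_working <- function(x, holidays) {
  # $wday is 0 for Sunday through 6 for Saturday, independent of locale
  is_weekend <- as.POSIXlt(x)$wday %in% c(0, 6)
  x[is_weekend | x %in% holidays] <- NA
  x
}

# Apply column by column, keeping the data frame structure
df[] <- lapply(df, blank_non_working, holidays = hol)
df  # only the Friday survives
```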