[Posted]: 2020-06-16 15:51:16
[Question]:
I have a data frame called df1:
df1 <- structure(list(startTime = structure(c(1519903920, 1519905060,
1519913740, 1519919880), class = c("POSIXct", "POSIXt"), tzone = "America/New_York"),
endTime = structure(c(1519904880, 1519912200, 1519913940,
1522142880), class = c("POSIXct", "POSIXt"), tzone = "America/New_York"),
impact = c(92.17, 616.43, 63.69, 14.69), impactPercent = c(184.15,
1495.17, 138.69, 19.97), impactSpeedDiff = c(3587.72, 25726.22,
2616.01, 474.11), maxQueueLength = c(5.76053, 5.76053, 4.829511,
2.447619), tmcs = list(c("110N04623", "110-04623", "110N04624",
"110-04624", "110N04625", "110-04625", "110N04626", "110-04626",
"110N04627"), c("110N04623", "110-04623", "110N04624", "110-04624",
"110N04625", "110-04625", "110N04626", "110-04626", "110N04627"
), c("110N04623", "110-04623", "110N04624", "110-04624",
"110N04625", "110-04625", "110N04626", "110-04626"), c("110N04623",
"110-04623", "110N04624", "110-04624", "110N04625")), early_startTime = structure(c(1519903620,
1519904760, 1519913740, 1522133400), class = c("POSIXct",
"POSIXt"), tzone = "America/New_York")), row.names = c(NA,
4L), class = "data.frame")
I need to match the rows of this data frame against the following data frame (df2):
df2 <- structure(list(created_tstamp = structure(c(1519926899, 1519913840,
1519913840, 1519927924, 1522141200, 1522152619, 1522152708, 1522152728,
1519928416, 1519928785, 1519929080, 1519929306, 1519929964, 1519930050,
1522154148, 1519930311, 1519930139, 1519930470, 1519930660, 1519929579
), class = c("POSIXct", "POSIXt"), tzone = "America/New_York"),
closed_tstamp = structure(c(1519929764, 1519926987, 1519927686,
1519928360, 1522152738, 1522152779, 1522154882, 1522152819,
1519928464, 1519928914, 1519929266, 1519929741, 1519939420,
1519930622, 1522155300, 1519930334, 1519931054, 1519951230,
1519930766, 1519930830), class = c("POSIXct", "POSIXt"), tzone = "America/New_York"),
code = c("110-04508", "110N04623", "110N04623", "110P05583",
"", "", "110N04485", "110N04357", "110-05066", "110-04421",
"110N04421", "110P04577", "110-04204", "110-04269", "110+04673",
"110-04445", "", "110P05797", "110N04269", "110+04520")), row.names = c(NA,
20L), class = "data.frame")
A match is defined by two conditions taken together:
- created_tstamp in df2 falls between early_startTime and endTime in df1
- code in df2 is present in the tmcs cell of that same df1 row
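Both conditions must hold for a pair of rows to count as a match. Ultimately, I want to create an identifier that links each row of df2 to its corresponding match in df1. This could probably be done with some kind of loop, but I'm not sure how to write it. Note: this is a subset of the data.
If a row in df2 does not match any row in df1, it should have NA in the identifier column. In the end, both data frames should have an ID column.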
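To make the intended result concrete, here is a minimal base R sketch of one way the two conditions could be checked row by row. The helper name match_ids and the toy data are hypothetical, made up for illustration. The sketch assumes that "between" is inclusive at both ends, that the row number of df1 serves as its ID, and that when several df1 rows qualify the first one wins; none of these choices is stated in the question.

```r
# Hypothetical helper: for each row of df2, return the row index of the first
# df1 row that satisfies both conditions, or NA if no df1 row does.
match_ids <- function(df1, df2) {
  vapply(seq_len(nrow(df2)), function(j) {
    # Condition 1: created_tstamp lies in [early_startTime, endTime]
    # (inclusive bounds are an assumption).
    in_window <- df2$created_tstamp[j] >= df1$early_startTime &
      df2$created_tstamp[j] <= df1$endTime
    # Condition 2: code appears in the tmcs list cell of the same df1 row.
    has_code <- vapply(df1$tmcs, function(codes) df2$code[j] %in% codes,
                       logical(1))
    hit <- which(in_window & has_code)
    # Assumption: keep only the first match when there are several.
    if (length(hit) > 0) hit[1] else NA_integer_
  }, integer(1))
}

# Toy data, invented for this example (not the data from the question).
toy1 <- data.frame(
  early_startTime = as.POSIXct(c(0, 100), origin = "1970-01-01", tz = "UTC"),
  endTime         = as.POSIXct(c(50, 150), origin = "1970-01-01", tz = "UTC")
)
toy1$tmcs <- list(c("x", "y"), c("z"))
toy2 <- data.frame(
  created_tstamp = as.POSIXct(c(10, 120, 120, 60),
                              origin = "1970-01-01", tz = "UTC"),
  code = c("y", "z", "x", "x"),
  stringsAsFactors = FALSE
)

toy1$ID <- seq_len(nrow(toy1))     # ID of a df1 row is its row number
toy2$ID <- match_ids(toy1, toy2)   # ID of the matched df1 row, or NA
toy2$ID                            # 1 2 NA NA
```

With the data frames from the question, the equivalent calls would be df1$ID <- seq_len(nrow(df1)) followed by df2$ID <- match_ids(df1, df2). Under the assumptions above, only rows 2 and 3 of df2 should receive an ID (both pointing at row 3 of df1), and every other row should be NA. For much larger data, a non-equi join would likely scale better than this row-wise loop.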
[Discussion]:
Tags: r loops datetime for-loop dplyr