【Posted】: 2018-03-30 20:17:38
【Question】:
I am having trouble working with the following data structures:
Attributes data frame:
ID Begin_A End_A Interval Value
5 2017-03-01 2017-03-10 2017-03-01 UTC--2017-03-10 UTC Cat1
10 2017-12-01 2017-12-02 2017-12-01 UTC--2017-12-02 UTC Cat2
5 2017-03-01 2017-03-03 2017-03-01 UTC--2017-03-03 UTC Cat3
10 2017-12-05 2017-12-10 2017-12-05 UTC--2017-12-10 UTC Cat4
Bookings data frame:
ID Begin_A End_A Interval
5 2017-03-03 2017-03-05 2017-03-03 UTC--2017-03-05 UTC
6 2017-05-03 2017-05-05 2017-05-03 UTC--2017-05-05 UTC
8 2017-03-03 2017-03-05 2017-03-03 UTC--2017-03-05 UTC
10 2017-12-05 2017-12-06 2017-12-05 UTC--2017-12-06 UTC
Desired result data frame (bookings):
ID Begin_A End_A Interval Attribute_value
5 2017-03-03 2017-03-05 2017-03-03 UTC--2017-03-05 UTC Cat1,Cat3
6 2017-05-03 2017-05-05 2017-05-03 UTC--2017-05-05 UTC NA
8 2017-03-03 2017-03-05 2017-03-03 UTC--2017-03-05 UTC NA
10 2017-12-05 2017-12-06 2017-12-05 UTC--2017-12-06 UTC Cat4
Code to build the data frames:
library(lubridate)
# Attributes data frame:
date1 <- as.Date(c('2017-3-1','2017-12-1','2017-3-1','2017-12-5'))
date2 <- as.Date(c('2017-3-10','2017-12-2','2017-3-3','2017-12-10'))
attributes <- data.frame(matrix(NA,nrow=4, ncol = 5))
names(attributes) <- c("ID","Begin_A", "End_A", "Interval", "Value")
attributes$ID <- as.numeric(c(5,10,5,10))
attributes$Begin_A <-date1
attributes$End_A <-date2
attributes$Interval <-attributes$Begin_A %--% attributes$End_A
attributes$Value<- as.character(c("Cat1","Cat2","Cat3","Cat4"))
### Bookings data frame:
date1 <- as.Date(c('2017-3-3','2017-5-3','2017-3-3','2017-12-5'))
date2 <- as.Date(c('2017-3-5','2017-5-5','2017-3-5','2017-12-6'))
bookings <- data.frame(matrix(NA,nrow=4, ncol = 4))
names(bookings) <- c("ID","Begin_A", "End_A", "Interval")
bookings$ID <- as.numeric(c(5,6,8,10))
bookings$Begin_A <-date1
bookings$End_A <-date2
bookings$Interval <-bookings$Begin_A %--% bookings$End_A
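The procedure to get to my result data frame should be as follows: for each row of bookings, take its ID and filter the attributes data frame down to the rows whose ID matches. Among those rows, check which ones also have a time interval that overlaps the booking's interval (int_overlaps from lubridate). Then take the corresponding entries from the Value column and write all of them, comma-separated, into the Attribute_value column. Bookings with no matching attribute row should get NA.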
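The steps described above could be sketched roughly as follows. This is only a minimal illustration, not a tuned solution: `match_values` is a hypothetical helper name, the attributes data frame is rebuilt as `attributes_df` (an assumed name, chosen to avoid masking `base::attributes`), and the intervals are created on the fly from the begin and end dates instead of being stored in an Interval column.

```r
library(lubridate)

# Rebuild the sample data from the question in compact form.
# `attributes_df` is an assumed name (avoids masking base::attributes).
attributes_df <- data.frame(
  ID      = c(5, 10, 5, 10),
  Begin_A = as.Date(c("2017-03-01", "2017-12-01", "2017-03-01", "2017-12-05")),
  End_A   = as.Date(c("2017-03-10", "2017-12-02", "2017-03-03", "2017-12-10")),
  Value   = c("Cat1", "Cat2", "Cat3", "Cat4"),
  stringsAsFactors = FALSE
)
bookings <- data.frame(
  ID      = c(5, 6, 8, 10),
  Begin_A = as.Date(c("2017-03-03", "2017-05-03", "2017-03-03", "2017-12-05")),
  End_A   = as.Date(c("2017-03-05", "2017-05-05", "2017-03-05", "2017-12-06"))
)

# Hypothetical helper: for one booking, collect the Value of every
# attribute row with the same ID and an overlapping interval.
match_values <- function(id, begin, end) {
  # Step 1: keep only attribute rows whose ID matches the booking ID.
  candidates <- attributes_df[attributes_df$ID == id, ]
  if (nrow(candidates) == 0) return(NA_character_)

  # Step 2: test the booking interval against each candidate interval.
  hit <- int_overlaps(
    interval(begin, end),
    interval(candidates$Begin_A, candidates$End_A)
  )
  if (!any(hit)) return(NA_character_)

  # Step 3: join the matching values into one comma-separated string.
  paste(candidates$Value[hit], collapse = ",")
}

# Apply the helper row by row; indexing by row keeps the Date class intact.
bookings$Attribute_value <- vapply(
  seq_len(nrow(bookings)),
  function(i) match_values(bookings$ID[i], bookings$Begin_A[i], bookings$End_A[i]),
  character(1)
)

print(bookings)
```

On the sample data this should fill Attribute_value with Cat1,Cat3 for ID 5, NA for IDs 6 and 8, and Cat4 for ID 10. Note that int_overlaps treats intervals that merely share an endpoint as overlapping, which is why Cat3 (ending on 2017-03-03) matches the booking that starts on that day. The row-by-row loop is fine for small tables; for large ones a join-based approach would probably scale better.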
【Comments】: