【Posted】: 2016-01-24 07:53:30
【Question】:
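Suppose I have two tables: one for appointments and one for receptions. Each table has a filial ID, a medic ID, a start and end time (the planned time for appointments, the actual time for receptions), and some other data. I want to count how many appointments have receptions within the appointment's time interval. A reception can start before the appointment's start time, after it, or inside the appointment interval, and so on.
Below I have built two tables, one for appointments and one for receptions. I wrote a nested loop, but it runs very slowly. My tables each contain about 50 rows. I need a fast solution to this problem. How can I do this without loops? Thanks in advance!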
library(data.table)
date <- as.POSIXct('2015-01-01 14:30:00')
# appointments data table
app <- data.table(med.id = 1:10,
                  filial.id = rep(c(100, 200), each = 5),
                  start.time = rep(seq(date, length.out = 5, by = "hours"), 2),
                  end.time = rep(seq(date + 3599, length.out = 5, by = "hours"), 2),
                  A = rnorm(10))
# receptions data table
re <- data.table(med.id = c(1, 11, 3, 4, 15, 6, 7),
                 filial.id = c(rep(100, 5), 200, 200),
                 start.time = as.POSIXct(paste('2015-01-01', c('14:25:00', '14:25:00', '16:32:00', '17:25:00', '16:10:00', '15:35:00', '15:50:00'))),
                 end.time = as.POSIXct(paste('2015-01-01', c('15:25:00', '15:20:00', '17:36:00', '18:40:00', '16:10:00', '15:49:00', '16:12:00'))),
                 B = rnorm(7))
app$count <- 0
for (i in seq_len(nrow(app))) {
  for (j in seq_len(nrow(re))) {
    # med.id and filial.id must both be equal
    if (app$med.id[i] == re$med.id[j] &&
        app$filial.id[i] == re$filial.id[j]) {
      if (re$start.time[j] < app$start.time[i] && re$end.time[j] > app$start.time[i]) {
        # reception starts before the appointment start time and ends after it
        app$count[i] <- app$count[i] + 1
      } else if (re$start.time[j] < app$end.time[i] && re$start.time[j] > app$start.time[i]) {
        # reception starts after the appointment start time and before its end time
        app$count[i] <- app$count[i] + 1
      }
    }
  }
}
【Discussion】:
- Try ?foverlaps. Check here.
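
The foverlaps suggestion could look roughly like the sketch below. It is only an illustration under some assumptions: the toy tables are made up (they just reuse the question's column names), the time zone is fixed to UTC for reproducibility, and foverlaps treats intervals as closed, so a reception that merely touches an appointment boundary counts as an overlap, which differs slightly from the strict comparisons in the loop.

```r
library(data.table)

# Toy data with the question's column names (values are illustrative only).
app <- data.table(
  med.id     = c(1, 2),
  filial.id  = c(100, 100),
  start.time = as.POSIXct(c("2015-01-01 14:30:00", "2015-01-01 15:30:00"), tz = "UTC"),
  end.time   = as.POSIXct(c("2015-01-01 15:29:59", "2015-01-01 16:29:59"), tz = "UTC")
)
re <- data.table(
  med.id     = c(1, 1, 2),
  filial.id  = c(100, 100, 100),
  start.time = as.POSIXct(c("2015-01-01 14:25:00", "2015-01-01 15:00:00",
                            "2015-01-01 17:00:00"), tz = "UTC"),
  end.time   = as.POSIXct(c("2015-01-01 15:25:00", "2015-01-01 15:10:00",
                            "2015-01-01 17:30:00"), tz = "UTC")
)

# foverlaps() needs the second table keyed, with the two interval
# columns as the last two key columns.
setkey(re, med.id, filial.id, start.time, end.time)

# which = TRUE returns row-index pairs (xid, yid); with the default
# nomatch = NA, appointments without any reception get yid = NA.
ov <- foverlaps(app, re,
                by.x = c("med.id", "filial.id", "start.time", "end.time"),
                type = "any", which = TRUE)

# Count the non-missing matches per appointment row and write them back.
cnt <- ov[, .(N = sum(!is.na(yid))), by = xid]
app[, count := 0L]
app[cnt$xid, count := cnt$N]

print(app)
```

Because the matching is done by a keyed overlap join rather than a row-by-row double loop, this should scale much better than the nested for loops; the exact boundary semantics should be checked against the intended definition of "reception inside the appointment interval" before relying on the counts.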
Tags: r merge group-by data.table dplyr