【Question Title】: Find overlapping timelines for each group in R
【Posted】: 2020-05-26 07:07:48
【Question】:

R gurus,

I need your help identifying multiple overlapping timelines in R using tidyverse/dplyr.

Here is the dataset:

library(tidyverse)
library(googleVis)

df <- data.frame(Student = structure(c(rep("Allan",5), rep("Joan",5), rep("Kat", 5)), class = "character"),
                 Course = c(LETTERS[1:5], LETTERS[1:5], LETTERS[1:5]), 
                 Start = structure(c(16713,16768,16725,16758,16780,
                                     16714,16754,16765,16729,16785,
                                     16724,16730,16755,16760,16759), class = "Date"), 
                 End = structure(c(16733,16775,16755,16779,16790,
                                   16744,16762,16780,16760,16795,
                                   16744,16750,16758,16784,16798), class = "Date"))

plot(gvisTimeline(data=df, rowlabel = "Course", 
                  start = "Start", end = "End", 
                  options=list(width=600, height=1000) ))

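I would like to compute the following result in an overlap column: for each student, the courses whose date ranges overlap.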

df$overlap <- c("AC","BD","AC","BD","",
                "AD","BD","","ABD","",
                "AB","AB","","DE","DE")

    df


   Student Course      Start        End overlap
1    Allan      A 2015-10-05 2015-10-25      AC
2    Allan      B 2015-11-29 2015-12-06      BD
3    Allan      C 2015-10-17 2015-11-16      AC
4    Allan      D 2015-11-19 2015-12-10      BD
5    Allan      E 2015-12-11 2015-12-21        
6     Joan      A 2015-10-06 2015-11-05      AD
7     Joan      B 2015-11-15 2015-11-23      BD
8     Joan      C 2015-11-26 2015-12-11        
9     Joan      D 2015-10-21 2015-11-21      ABD
10    Joan      E 2015-12-16 2015-12-26        
11     Kat      A 2015-10-16 2015-11-05      AB
12     Kat      B 2015-10-22 2015-11-11      AB
13     Kat      C 2015-11-16 2015-11-19        
14     Kat      D 2015-11-21 2015-12-15      DE
15     Kat      E 2015-11-20 2015-12-29      DE

Many thanks for your time and help!

【Comments】:

  • Joan D also matches A.

Tags: r dplyr tidyverse


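【Solution 1】:

You can build Interval objects with lubridate's interval() and use int_overlaps() to test whether two intervals overlap. Every interval overlaps itself, so each row's own course always appears in the result.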

library(tidyverse)
library(lubridate)

df %>%
  group_by(Student) %>% 
  mutate(overlap = map_chr(interval(Start, End),
                           ~ toString(Course[int_overlaps(., interval(Start, End))])))

#    Student Course Start      End        overlap
#    <fct>   <fct>  <date>     <date>     <chr>  
#  1 Allan   A      2015-10-05 2015-10-25 A, C   
#  2 Allan   B      2015-11-29 2015-12-06 B, D   
#  3 Allan   C      2015-10-17 2015-11-16 A, C   
#  4 Allan   D      2015-11-19 2015-12-10 B, D   
#  5 Allan   E      2015-12-11 2015-12-21 E      
#  6 Joan    A      2015-10-06 2015-11-05 A, D   
#  7 Joan    B      2015-11-15 2015-11-23 B, D   
#  8 Joan    C      2015-11-26 2015-12-11 C      
#  9 Joan    D      2015-10-21 2015-11-21 A, B, D
# 10 Joan    E      2015-12-16 2015-12-26 E      
# 11 Kat     A      2015-10-16 2015-11-05 A, B   
# 12 Kat     B      2015-10-22 2015-11-11 A, B   
# 13 Kat     C      2015-11-16 2015-11-19 C      
# 14 Kat     D      2015-11-21 2015-12-15 D, E   
# 15 Kat     E      2015-11-20 2015-12-29 D, E  
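
For readers who want to see the comparison behind the overlap test, here is a minimal base-R sketch of the same idea without lubridate. It assumes closed ranges (both endpoints count); this data has no ranges that merely touch, so that choice does not affect the result here. The helper name overlap_labels is hypothetical and not part of any package.

```r
# Same data as in the question; dates are day counts since 1970-01-01.
df <- data.frame(
  Student = rep(c("Allan", "Joan", "Kat"), each = 5),
  Course  = rep(LETTERS[1:5], times = 3),
  Start   = as.Date(c(16713, 16768, 16725, 16758, 16780,
                      16714, 16754, 16765, 16729, 16785,
                      16724, 16730, 16755, 16760, 16759),
                    origin = "1970-01-01"),
  End     = as.Date(c(16733, 16775, 16755, 16779, 16790,
                      16744, 16762, 16780, 16760, 16795,
                      16744, 16750, 16758, 16784, 16798),
                    origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each row of ONE student's data, list the courses
# whose [start, end] range intersects that row's range.
# Two ranges intersect when each one starts no later than the other ends.
overlap_labels <- function(course, start, end) {
  vapply(seq_along(start), function(i) {
    hit <- start[i] <= end & end[i] >= start
    toString(course[hit])
  }, character(1))
}

# Apply the helper separately within each student.
df$overlap <- ""
for (s in unique(df$Student)) {
  rows <- df$Student == s
  df$overlap[rows] <- overlap_labels(df$Course[rows],
                                     df$Start[rows],
                                     df$End[rows])
}

print(df)
```

The explicit loop over students plays the role of group_by(Student) in the answer above; the comparison itself is vectorised over all courses of a student.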

【Comments】:

【Solution 2】:

A solution using map2_chr. Two date ranges overlap when each one starts no later than the other ends, which also covers the case where one range lies entirely inside another:

library(dplyr)
df %>%
  group_by(Student) %>%
  mutate(overlap = purrr::map2_chr(Start, End,
                     # keep courses whose range intersects [.x, .y] (endpoints inclusive)
                     ~ toString(Course[.x <= End & .y >= Start])))

#  Student    Course Start      End        overlap
#   <charactr> <chr>  <date>     <date>     <chr>  
# 1 Allan      A      2015-10-05 2015-10-25 A, C   
# 2 Allan      B      2015-11-29 2015-12-06 B, D   
# 3 Allan      C      2015-10-17 2015-11-16 A, C   
# 4 Allan      D      2015-11-19 2015-12-10 B, D   
# 5 Allan      E      2015-12-11 2015-12-21 E      
# 6 Joan       A      2015-10-06 2015-11-05 A, D   
# 7 Joan       B      2015-11-15 2015-11-23 B, D   
# 8 Joan       C      2015-11-26 2015-12-11 C      
# 9 Joan       D      2015-10-21 2015-11-21 A, B, D
#10 Joan       E      2015-12-16 2015-12-26 E      
#11 Kat        A      2015-10-16 2015-11-05 A, B   
#12 Kat        B      2015-10-22 2015-11-11 A, B   
#13 Kat        C      2015-11-16 2015-11-19 C      
#14 Kat        D      2015-11-21 2015-12-15 D, E   
#15 Kat        E      2015-11-20 2015-12-29 D, E   
    

If needed, you can replace the single-course entries in the overlap column (rows that overlap only with themselves) with blanks.
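
One possible way to do that clean-up, sketched in base R on a few sample values rather than the full result: an entry without a comma names only the row's own course, so it can be blanked, and the separators can then be dropped to get the compact "AC" style shown in the question. The vector names overlap and compact are illustrative only.

```r
# Sample values in the format produced by toString().
overlap <- c("A, C", "B, D", "E", "A, B, D")

# Entries without a comma list only the row's own course: blank them.
overlap <- ifelse(grepl(",", overlap, fixed = TRUE), overlap, "")

# Optionally remove the ", " separators to match the question's format.
compact <- gsub(", ", "", overlap, fixed = TRUE)

print(overlap)
print(compact)
```

Inside a dplyr pipeline the same two expressions could be applied to the overlap column in a further mutate() step.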

【Comments】:
