【Question title】: Filter variable based on prior date value in R
【Posted】: 2020-05-25 13:31:15
【Question description】:

I have a data frame like the following:

df <- structure(list(
  id = c("555900339", "555900339", "555900339", "555900339", "555900339",
         "555900327", "555900327", "555900327", "555703505", "555703379",
         "555703379", "555703379", "555703379", "555703379", "555703366",
         "555702668", "555702668", "555702668", "555702668", "555702668"),
  date = c("20200207", "20200207", "20200207", "20200208", "20200208",
           "20200207", "20200207", "20200207", "20200207", "20200207",
           "20200207", "20200207", "20200207", "20200207", "20200207",
           "20200207", "20200208", "20200208", "20200208", "20200208"),
  flag_code = c("SLEP", "NCHG", "MOTN", "CIHB", "NCON",
                "SLEP", "NCHG", "MOTN", "INMC", "SLEP",
                "NCHG", "MOTN", "COFF", "NCON", "NCHG",
                "SLEP", "NOMO", "NCON", "MOTN", "CIHB")),
  row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))

I want to see how many records (unique ids here) have an NCHG flag code the day before an NCON flag. I need something like this:

  id         date      flag_code
  555900339  20200207  NCHG
  555900339  20200208  NCON
  555703366  20200207  NCHG
  555702668  20200208  NCON

【Question discussion】:

    Tags: r filter dplyr tidyverse


    【Solution 1】:

    I suspect there is a simpler way, but here is one approach:

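    library(dplyr)
    
    df %>%
      # Parse the yyyymmdd strings so that date differences are measured in days
      mutate(date = as.Date(date, format = '%Y%m%d')) %>%
      # Keep only the two flags of interest; each row is then compared with
      # its adjacent rows (the data is not grouped by id)
      filter(flag_code %in% c('NCHG', 'NCON')) %>%
      filter(
        # an NCON row dated one day after the NCHG row directly above it
        (c(0, diff(date)) == 1 & flag_code == 'NCON' & lag(flag_code) == 'NCHG') |
          # an NCHG row dated one day before the NCON row directly below it
          (lead(c(0, diff(date))) == 1 & flag_code == 'NCHG' & lead(flag_code) == 'NCON'))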
    

    Output:

    # A tibble: 4 x 3
      id        date       flag_code
      <chr>     <date>     <chr>    
    1 555900339 2020-02-07 NCHG     
    2 555900339 2020-02-08 NCON     
    3 555703366 2020-02-07 NCHG     
    4 555702668 2020-02-08 NCON 
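
The filter in the answer packs several comparisons into one expression. As a minimal sketch of the same logic with each comparison named, the version below adds helper columns before filtering and then counts the distinct ids among the matching NCON rows. The names `days_since_prev`, `days_until_next`, `closes_pair`, `opens_pair`, `pairs` and `n_ncon_ids` are illustrative and not part of the original answer, and the data frame is rebuilt with `rep()` only to keep the example self-contained.

```r
library(dplyr)

# Same 20 rows as in the question, rebuilt compactly.
df <- tibble(
  id = rep(c("555900339", "555900327", "555703505",
             "555703379", "555703366", "555702668"),
           times = c(5, 3, 1, 5, 1, 5)),
  date = rep(c("20200207", "20200208", "20200207", "20200208"),
             times = c(3, 2, 11, 4)),
  flag_code = c("SLEP", "NCHG", "MOTN", "CIHB", "NCON",
                "SLEP", "NCHG", "MOTN", "INMC", "SLEP",
                "NCHG", "MOTN", "COFF", "NCON", "NCHG",
                "SLEP", "NOMO", "NCON", "MOTN", "CIHB")
)

pairs <- df %>%
  mutate(date = as.Date(date, format = "%Y%m%d")) %>%
  filter(flag_code %in% c("NCHG", "NCON")) %>%
  mutate(
    # Day gaps to the adjacent rows (illustrative helper columns)
    days_since_prev = as.numeric(date - lag(date)),   # NA on the first row
    days_until_next = as.numeric(lead(date) - date),  # NA on the last row
    # NCON row one day after the NCHG row directly above it
    closes_pair = flag_code == "NCON" & lag(flag_code) == "NCHG" &
      days_since_prev == 1,
    # NCHG row one day before the NCON row directly below it
    opens_pair = flag_code == "NCHG" & lead(flag_code) == "NCON" &
      days_until_next == 1
  ) %>%
  # coalesce() turns any NA produced at the edges into FALSE
  filter(coalesce(closes_pair, FALSE) | coalesce(opens_pair, FALSE)) %>%
  select(id, date, flag_code)

# Number of distinct ids among the NCON rows that close a pair
n_ncon_ids <- pairs %>%
  filter(flag_code == "NCON") %>%
  summarise(n = n_distinct(id)) %>%
  pull(n)
```

For the sample data, `pairs` holds the same four rows as the output above. Like the answer, this compares adjacent rows regardless of id. Inserting `group_by(id)` before the helper-column `mutate()` would restrict the comparison to rows sharing an id, but that is a different reading of the task, since the expected output in the question pairs 555703366 with 555702668.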
    

    【Discussion】:
