【问题标题】:convert one variable into two variables by multiple conditions in R通过R中的多个条件将一个变量转换为两个变量
【发布时间】:2021-03-02 08:52:14
【问题描述】:

我的数据框如下所示:

df:
  patID   D1   D2   D3  Code Day Month Year
1    11 V581   G8 V581  5468   9     7 2017
2    12 V581 V588 V588 0905F   8     8 2017
3    13 V582 B678 B678 J8907   9     5 2016
4    14  G67 A568 A568 J8907   7     9 2016
5    15 V583 V584 V584 J9056   8     5 2016
6    16 V584 V586 V586 G0890   5     4 2017
7    17  G69 G789 A567 A0891   4    10 2017

如果从 D1、D2、D3 组合匹配以“V58”开头,我必须处理这些情况。我还需要从 Code 中获取值,但根据特定条件将它们拆分为变量,即 CodeA if ("J8907","J9056") 和 CodeB if ("0905F", "G0890")。

输出应如下所示:

  patID Diags CodeA CodeB     Date
1    11  V581  <NA>  <NA> 7/9/2017
2    12  V581 0905F  <NA> 8/8/2017
3    13  V582  <NA> J8907 5/9/2016
4    14  <NA>  <NA> J8907 9/7/2016
5    15  V583  <NA> J9056 5/8/2016
6    16  V584 G0890  <NA> 4/5/2017
7    12  V588 0905F  <NA> 8/8/2017
8    13  <NA>  <NA> J8907 5/9/2016
9    14  <NA>  <NA> J8907 9/7/2016
10   15  V584  <NA> J9056 5/8/2016
11   16  V586 G0890  <NA> 4/5/2017
12   11  V581  <NA>  <NA> 7/9/2017
13   12  V588 0905F  <NA> 8/8/2017
14   13  <NA> J8907 J8907 5/9/2016
15   14  <NA>  <NA> J8907 9/7/2016
16   15  V584  <NA> J9056 5/8/2016
17   16  V586 G0890  <NA> 4/5/2017

我试过了,但这不起作用(我采用了这种长格式方法,因为在原始数据中我有类似的变量集,如 D1:D3 并且需要类似地处理)

datnew <- bind_rows(
  dat %>% select(patID,Code:Year,ends_with("1")) %>% rename_all(str_remove, "1"),
  dat %>% select(patID,Code:Year,ends_with("2")) %>% rename_all(str_remove, "2"),
  dat %>% select(patID,Code:Year,ends_with("3")) %>% rename_all(str_remove, "3")
) %>%
  transmute(
    patID,
    diags = D,
    CodeA =Code,
    CodeB = Code,
    Dates = as.Date(paste0(Day,"/",Month, "/", Year),format="%d/%m/%Y")
  )%>%
  filter(
     grepl("V58",diags)|
     CodeA  %in%    c("0905F","G0890")|
     CodeB %in% c("J8907","J9056")
  )

Thanks in Advance!!

【问题讨论】:

    标签: r filter dplyr tidyverse


    【解决方案1】:

    你可以试试:

    library(dplyr)
    library(tidyr)
    
    df %>%
      unite(Date, Day, Month, Year, sep = '/') %>%
      pivot_longer(cols = D1:D3, values_to = 'Diags') %>%
      mutate(Diags = replace(Diags, -grep('V58', Diags), NA), 
             CodeA = ifelse(Code %in% c("J8907","J9056"), Code, NA),
             CodeB = ifelse(Code %in% c("0905F", "G0890"), Code, NA)) %>%
      filter(!(is.na(Diags) & is.na(CodeA) & is.na(CodeB))) %>%
      select(-Code, -name)
    
    # A tibble: 17 x 5
    #   patID Date     Diags CodeA CodeB
    #   <int> <chr>    <chr> <chr> <chr>
    # 1    11 9/7/2017 V581  NA    NA   
    # 2    11 9/7/2017 V581  NA    NA   
    # 3    12 8/8/2017 V581  NA    0905F
    # 4    12 8/8/2017 V588  NA    0905F
    # 5    12 8/8/2017 V588  NA    0905F
    # 6    13 9/5/2016 V582  J8907 NA   
    # 7    13 9/5/2016 NA    J8907 NA   
    # 8    13 9/5/2016 NA    J8907 NA   
    # 9    14 7/9/2016 NA    J8907 NA   
    #10    14 7/9/2016 NA    J8907 NA   
    #11    14 7/9/2016 NA    J8907 NA   
    #12    15 8/5/2016 V583  J9056 NA   
    #13    15 8/5/2016 V584  J9056 NA   
    #14    15 8/5/2016 V584  J9056 NA   
    #15    16 5/4/2017 V584  NA    G0890
    #16    16 5/4/2017 V586  NA    G0890
    #17    16 5/4/2017 V586  NA    G0890
    

    【讨论】:

      猜你喜欢
      • 2020-11-27
      • 2018-05-13
      • 2018-03-22
      • 2020-08-02
      • 2022-11-30
      • 1970-01-01
      • 2016-10-21
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多