【Question title】: Subtracting rows nested in a data.frame based on column value correspondence
【Posted】: 2021-11-07 08:47:43
【Question description】:

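For each unique value of study, I would like to know how to subtract the yi of the rows with group == "C" at each interval_id from the yi of their corresponding rows with group != "C".

For example, in study == 1, yi == .4 for group == "C" at interval_id == 0 should be subtracted from yi == .1 for group == "T1" at interval_id == 0.

Similarly, in study == 1, yi == .5 for group == "C" at interval_id == 1 should be subtracted from yi == .3 for group == "T1" at interval_id == 1.

The final output should be a data.frame with the group == "C" rows removed (shown below).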

m = "
study group  yi  vi interval_id obs
1      T1    .1  1  0           1
1      T1    .3  2  1           2
1      C     .4  3  0           3
1      C     .5  4  1           4
2      T2    .6  5  0           5
2      C     .9  6  1           6
"

data <- read.table(text = m, header = TRUE)

# DESIRED OUTPUT:
"
study group  yi  vi interval_id obs
1      T1    -.3  .  0           1
1      T1    -.2  .  1           2
2      T2    -.3  .  0           5
2      C      .9  .  1           6
"

【Question discussion】:

    Tags: r dataframe function dplyr tidyverse


    【Solution 1】:

    Within each study, we can subtract the yi values where group == 'C' from the yi values where group != 'C'. Finally, drop the rows where group == 'C'.

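    library(dplyr)
    
    data %>%
      group_by(study) %>%
      # rep(..., 2) assumes each study has equally many C and non-C rows, each in a contiguous block
      mutate(yi = rep(yi[group != 'C'] - yi[group == 'C'], 2)) %>%
      ungroup() %>%
      filter(group != 'C')  # drop the control rows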
    
    #  study group    yi    vi interval_id   obs
    #  <int> <chr> <dbl> <int>       <int> <int>
    #1     1 T1     -0.3     1           0     1
    #2     1 T1     -0.2     2           1     2
    #3     2 T2     -0.3     5           0     5 
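
The `rep(..., 2)` step relies on each study holding equally many control and non-control rows, each in one contiguous block. A minimal base R sketch that pairs the k-th non-control row with the k-th control row of the same study, without depending on that layout, might look like the following. The helper name `subtract_control` is hypothetical, and the equal-count check is an assumption about what a valid study looks like, not something stated in the question.

```r
# Example data from the question
data <- read.table(text = "
study group  yi  vi interval_id obs
1      T1    .1  1  0           1
1      T1    .3  2  1           2
1      C     .4  3  0           3
1      C     .5  4  1           4
2      T2    .6  5  0           5
2      C     .9  6  1           6
", header = TRUE)

# Hypothetical helper: subtract the k-th control yi from the k-th non-control yi
subtract_control <- function(d) {
  trt <- d[d$group != "C", ]
  ctl <- d[d$group == "C", ]
  # Assumption: every study has as many control rows as non-control rows
  stopifnot(nrow(trt) == nrow(ctl))
  trt$yi <- trt$yi - ctl$yi
  trt
}

# Apply the helper study by study and stack the results
res <- do.call(rbind, lapply(split(data, data$study), subtract_control))
rownames(res) <- NULL
res
```

Because the control rows are selected by a logical index rather than by position in the group, this version still pairs rows in order when control and non-control rows are interleaved.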
    

    【Discussion】:

    • Yes, we can add vi = rep(vi[group != 'C'] + vi[group == 'C'], 2) inside mutate.
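
For reference, the comment's suggestion combined with the answer's pipeline could be sketched as below. It assumes the same block layout of rows as the answer does, and that vi values are meant to be summed for each pair, as the comment proposes; the name `res` is only for illustration.

```r
library(dplyr)

# Example data from the question
data <- read.table(text = "
study group  yi  vi interval_id obs
1      T1    .1  1  0           1
1      T1    .3  2  1           2
1      C     .4  3  0           3
1      C     .5  4  1           4
2      T2    .6  5  0           5
2      C     .9  6  1           6
", header = TRUE)

res <- data %>%
  group_by(study) %>%
  mutate(
    # difference of yi: non-control minus control
    yi = rep(yi[group != "C"] - yi[group == "C"], 2),
    # sum of vi for each pair, as suggested in the comment
    vi = rep(vi[group != "C"] + vi[group == "C"], 2)
  ) %>%
  ungroup() %>%
  filter(group != "C")

res
```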
    【Solution 2】:

    We can filter the data, do a join, and then subtract.

    library(dplyr)
    library(data.table)  # for rowid()
    data %>%
        filter(group == 'C') %>% 
        select(study, yi2 = yi) %>%
        mutate(rn = rowid(study)) %>%   # position of each C row within its study
        right_join(data %>% 
             filter(group != 'C') %>%
             mutate(rn = rowid(study)),
             by = c("study", "rn")) %>%   # pair rows by study and position
        mutate(yi = yi - yi2, yi2 = NULL)
    

    -output

     study rn group   yi vi interval_id obs
    1     1  1    T1 -0.3  1           0   1
    2     1  2    T1 -0.2  2           1   2
    3     2  1    T2 -0.3  5           0   5
    

    Or we can reshape to 'wide' format and then do the subtraction

    library(tidyr)
    # also uses dplyr and data.table::rowid(), loaded above
    data %>%
        mutate(new = c('NotC', 'C')[1 + (group == 'C')], 
          rn = rowid(study, new)) %>% 
       select(study, rn, new, yi) %>%
       pivot_wider(names_from = new, values_from = yi) %>% 
       transmute(yi = NotC - C) %>% 
       pull(yi) %>%
       # write the differences back onto the non-C rows
       mutate(data %>% 
          filter(group != 'C'), yi = .)
    

    -output

     study group   yi vi interval_id obs
    1     1    T1 -0.3  1           0   1
    2     1    T1 -0.2  2           1   2
    3     2    T2 -0.3  5           0   5
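
Both variants in this answer pair rows by their position within a study. If pairing should instead follow interval_id, as the question's wording suggests, one possible sketch is a join keyed on study and interval_id. The names `ctrl`, `yi_c`, and `res` are illustrative only. With the example data this leaves yi as NA for study 2, because its control row sits at a different interval_id than its T2 row, so the result differs from the desired output shown in the question.

```r
library(dplyr)

# Example data from the question
data <- read.table(text = "
study group  yi  vi interval_id obs
1      T1    .1  1  0           1
1      T1    .3  2  1           2
1      C     .4  3  0           3
1      C     .5  4  1           4
2      T2    .6  5  0           5
2      C     .9  6  1           6
", header = TRUE)

# Control rows, reduced to the join keys and the value to subtract
ctrl <- data %>%
  filter(group == "C") %>%
  select(study, interval_id, yi_c = yi)

res <- data %>%
  filter(group != "C") %>%
  left_join(ctrl, by = c("study", "interval_id")) %>%
  mutate(yi = yi - yi_c) %>%   # NA where no control row shares the interval_id
  select(-yi_c)

res
```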
    

    【Discussion】:
