[Question title]: Subtract one row from another with conditions
[Posted]: 2021-03-09 12:07:42
[Question]:

I have a table like this:

  treatment individual phase dist_mean  track
1   control          1   pre     13.33 569.99
2   control          1  post     10.95 624.65
3   control          2   pre      9.93 363.35
4   control          2  post     10.11 339.88
5   control          3   pre     12.00 676.42
6   control          3  post     12.80 939.15

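In principle, the rows always come in pairs. For each sample I need to subtract the dist_mean of the post phase from the dist_mean of the pre phase. The simple approach would be to subtract row 2 from row 1, and so on. But the row order could get shuffled at any time, and then the whole calculation would be wrong. That is why I want to do the calculation on the condition that treatment and individual match across the two phases. Note: the treatment varies. It is not always control.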

[Comments]:

    Tags: r row subtraction


    [Solution 1]:

    A data.table option

    library(data.table)

    # Sorting by phase puts "post" before "pre", so diff() returns pre - post
    setDT(df)[
      order(treatment, individual, phase)
    ][
      ,
      setNames(lapply(.SD, diff), paste0("diff_", names(.SD))),
      by = .(treatment, individual),
      .SDcols = c("dist_mean", "track")
    ]
    

    gives

       treatment individual diff_dist_mean diff_track
    1:   control          1           2.38     -54.66
    2:   control          2          -0.18      23.47
    3:   control          3          -0.80    -262.73
    

    A base R option using reshape

    transform(
      reshape(
        df,
        direction = "wide",
        idvar = c("treatment", "individual"),
        timevar = "phase"
      ),
      diff_dist_mean = dist_mean.pre - dist_mean.post,
      diff_track = track.pre - track.post
    )
    

    gives

      treatment individual dist_mean.pre track.pre dist_mean.post track.post
    1   control          1         13.33    569.99          10.95     624.65
    3   control          2          9.93    363.35          10.11     339.88
    5   control          3         12.00    676.42          12.80     939.15
      diff_dist_mean diff_track
    1           2.38     -54.66
    3          -0.18      23.47
    5          -0.80    -262.73
    

    [Comments]:

      [Solution 2]:

      Using aggregate:

      aggregate(dist_mean ~ treatment + individual, df1, function(x) diff(rev(x)))
      #  treatment individual dist_mean
      #1   control          1      2.38
      #2   control          2     -0.18
      #3   control          3     -0.80
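
The aggregate call above applies diff(rev(x)) to the rows in whatever order they appear inside each group, so it returns pre minus post only while the pre row precedes the post row. As a minimal sketch of an order-independent alternative (not part of the original answer), the two phases can be split and joined on treatment and individual with base R's merge. The names shuffled, pre, post, and paired are illustrative, and the sketch assumes exactly one pre and one post row per treatment/individual pair.

```r
df1 <- read.table(text = "
  treatment individual phase dist_mean  track
1   control          1   pre     13.33 569.99
2   control          1  post     10.95 624.65
3   control          2   pre      9.93 363.35
4   control          2  post     10.11 339.88
5   control          3   pre     12.00 676.42
6   control          3  post     12.80 939.15
", header = TRUE)

# Swap the rows within each pair to mimic an unreliable row order
shuffled <- df1[c(2, 1, 4, 3, 6, 5), ]

# Split by phase; the phase column is no longer needed after the split
pre  <- subset(shuffled, phase == "pre",  select = -phase)
post <- subset(shuffled, phase == "post", select = -phase)

# Join on the matching keys, so row order no longer matters
paired <- merge(pre, post, by = c("treatment", "individual"),
                suffixes = c("_pre", "_post"))

# pre minus post, as the question asks
paired$diff_dist_mean <- paired$dist_mean_pre - paired$dist_mean_post
paired$diff_track     <- paired$track_pre - paired$track_post

paired[, c("treatment", "individual", "diff_dist_mean", "diff_track")]
```

Because merge matches rows by key rather than by position, the result is the same for any ordering of the input rows.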
      

      Data

      df1 <- read.table(text = "
        treatment individual phase dist_mean  track
      1   control          1   pre     13.33 569.99
      2   control          1  post     10.95 624.65
      3   control          2   pre      9.93 363.35
      4   control          2  post     10.11 339.88
      5   control          3   pre     12.00 676.42
      6   control          3  post     12.80 939.15
      ", header = TRUE)
      

      [Comments]:

        [Solution 3]:
        df <- read.table(text = "  treatment individual phase dist_mean  track
        1   control          1   pre     13.33 569.99
        2   control          1  post     10.95 624.65
        3   control          2   pre      9.93 363.35
        4   control          2  post     10.11 339.88
        5   control          3   pre     12.00 676.42
        6   control          3  post     12.80 939.15", header = TRUE)
        
        library(tidyverse)
        df %>% 
          # name id_cols explicitly rather than relying on argument position
          pivot_wider(id_cols = c(treatment, individual), names_from = phase, values_from = dist_mean) %>% 
          mutate(d = post - pre)  # post minus pre
        #> # A tibble: 3 x 5
        #>   treatment individual   pre  post      d
        #>   <chr>          <int> <dbl> <dbl>  <dbl>
        #> 1 control            1 13.3   11.0 -2.38 
        #> 2 control            2  9.93  10.1  0.180
        #> 3 control            3 12     12.8  0.8
        

        Created on 2021-03-09 by the reprex package (v1.0.0)

        With data.table:

        df <- read.table(text = "  treatment individual phase dist_mean  track
        1   control          1   pre     13.33 569.99
        2   control          1  post     10.95 624.65
        3   control          2   pre      9.93 363.35
        4   control          2  post     10.11 339.88
        5   control          3   pre     12.00 676.42
        6   control          3  post     12.80 939.15", header = TRUE)
        library(data.table)
        setDT(df)
        res <- dcast(data = df, formula = treatment + individual ~ phase, value.var = "dist_mean")[, d := post - pre]
        head(res)
        #>    treatment individual  post   pre     d
        #> 1:   control          1 10.95 13.33 -2.38
        #> 2:   control          2 10.11  9.93  0.18
        #> 3:   control          3 12.80 12.00  0.80
        

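        Created on 2021-03-09 by the reprex package (v1.0.0)

        [Comments]:

          [Solution 4]:

          Using data.table, reshape long-to-wide, then take the difference between the post and pre columns. Note that this computes post minus pre; swap the operands to get pre minus post, as the question asks: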

          library(data.table)
          
          setDT(df1)
          
          dcast(df1, treatment + individual ~ phase, value.var = c("dist_mean", "track")
                )[, .(treatment, individual,
                      diff_dist_mean = dist_mean_post - dist_mean_pre,
                      diff_track = track_post - track_pre)]
          #    treatment individual diff_dist_mean diff_track
          # 1:   control          1          -2.38      54.66
          # 2:   control          2           0.18     -23.47
          # 3:   control          3           0.80     262.73
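
Wide reshaping assumes that each treatment/individual pair has exactly one pre row and one post row. When a pair is duplicated, dcast falls back to aggregating with length, and the differences then no longer mean what they should. As a minimal base R sketch of a check to run beforehand (check_pairs is a hypothetical helper name, not part of data.table or of the original answer):

```r
df1 <- read.table(text = "
  treatment individual phase dist_mean  track
1   control          1   pre     13.33 569.99
2   control          1  post     10.95 624.65
3   control          2   pre      9.93 363.35
4   control          2  post     10.11 339.88
5   control          3   pre     12.00 676.42
6   control          3  post     12.80 939.15
", header = TRUE)

# Hypothetical helper: TRUE when every treatment/individual combination
# present in the data has exactly one "pre" row and one "post" row.
check_pairs <- function(d) {
  key    <- interaction(d$treatment, d$individual, drop = TRUE)
  n_pre  <- tapply(d$phase == "pre",  key, sum, na.rm = TRUE)
  n_post <- tapply(d$phase == "post", key, sum, na.rm = TRUE)
  all(n_pre == 1 & n_post == 1)
}

complete_ok  <- check_pairs(df1)                    # complete pairs
missing_post <- check_pairs(df1[-2, ])              # individual 1 lacks a post row
dup_pre      <- check_pairs(rbind(df1, df1[1, ]))   # individual 1 has two pre rows
```

Running the check before reshaping makes a missing or duplicated row fail loudly instead of quietly producing NA or counts.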
          

          [Comments]:
