【Question title】: Mutate multiple cumsum in dplyr
【Posted】: 2021-06-21 04:40:15
【Question】:

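I am trying to compute cumulative sums with mutate. The challenge is that I have 10 columns to process, and I only know how to do them one at a time. Is there a way to do something like mutate(across(all_of(c(3:4)), ~cumsum(c(3:4))))?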

cat %>% 
  group_by(animals) %>%
  mutate(weight1 = cumsum(weight1),
         weight2 = cumsum(weight2))

Data:

structure(list(
  animals = c("E1", "E1", "E1", "E2", "E2", "E2"),
  period = structure(c(18690, 18697, 18704, 18690, 18697, 18704), class = "Date"),
  weight1 = c(704, 734, 653, 851, 911, 829),
  weight2 = c(0, 235, 325, 0, 148, 200)
), row.names = c(NA, -6L), class = c("data.table", "data.frame"))

Expected output:

  animals period     weight1 weight2
  <chr>   <date>       <dbl>   <dbl>
1 E1      2021-03-04     704       0
2 E1      2021-03-11    1438     235
3 E1      2021-03-18    2091     560
4 E2      2021-03-04     851       0
5 E2      2021-03-11    1762     148
6 E2      2021-03-18    2591     348

【Comments】:

    Tags: r dplyr cumsum


    【Solution 1】:

    Try this:

    df <- structure(list(
      animals = c("E1", "E1", "E1", "E2", "E2", "E2"),
      period = structure(c(18690, 18697, 18704, 18690, 18697, 18704), class = "Date"),
      weight1 = c(704, 734, 653, 851, 911, 829),
      weight2 = c(0, 235, 325, 0, 148, 200)
    ), row.names = c(NA, -6L), class = c("data.table", "data.frame"))
    
    library(dplyr)
    
    df %>% 
      group_by(animals) %>% 
      mutate(across(starts_with("weight"), cumsum))
    #> # A tibble: 6 x 4
    #> # Groups:   animals [2]
    #>   animals period     weight1 weight2
    #>   <chr>   <date>       <dbl>   <dbl>
    #> 1 E1      2021-03-04     704       0
    #> 2 E1      2021-03-11    1438     235
    #> 3 E1      2021-03-18    2091     560
    #> 4 E2      2021-03-04     851       0
    #> 5 E2      2021-03-11    1762     148
    #> 6 E2      2021-03-18    2591     348
    

    Created on 2021-03-24 by the reprex package (v1.0.0)

    vars <- names(df)[3:4]

    df %>% group_by(animals) %>% mutate(across(all_of(vars), cumsum))
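
With ten columns, the name vector passed to all_of() can also be built from a naming pattern instead of being looked up by position. A minimal sketch, assuming the columns are named weight1, weight2, and so on (only weight1 and weight2 exist in the sample data; `n_weights` and `weight_cols` are hypothetical names used for illustration):

```r
library(dplyr)

# Sample data from the question, rebuilt as a plain data frame.
df <- data.frame(
  animals = c("E1", "E1", "E1", "E2", "E2", "E2"),
  period  = structure(c(18690, 18697, 18704, 18690, 18697, 18704), class = "Date"),
  weight1 = c(704, 734, 653, 851, 911, 829),
  weight2 = c(0, 235, 325, 0, 148, 200)
)

# Assumption: the target columns follow the pattern weight1..weightN.
n_weights   <- 2
weight_cols <- paste0("weight", seq_len(n_weights))

# all_of() takes a character vector, so the selection is by name, not position.
result <- df %>%
  group_by(animals) %>%
  mutate(across(all_of(weight_cols), cumsum)) %>%
  ungroup()
```

Because all_of() raises an error when a listed name is missing, a typo in the pattern fails loudly rather than silently skipping a column.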

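    【Comments】:

    • Thanks, I like the all_of approach.
    【Solution 2】:

    What you are trying to do will throw an error. Once you group_by(animals), the grouping column is excluded from the selection, so mutate can only operate on the three remaining columns and their positions shift by one. So you can use: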

    cat %>% 
      group_by(animals) %>%
      mutate(across(2:3, cumsum))
    # A tibble: 6 x 4
    # Groups:   animals [2]
      animals period     weight1 weight2
      <chr>   <date>       <dbl>   <dbl>
    1 E1      2021-03-04     704       0
    2 E1      2021-03-11    1438     235
    3 E1      2021-03-18    2091     560
    4 E2      2021-03-04     851       0
    5 E2      2021-03-11    1762     148
    6 E2      2021-03-18    2591     348
    

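    But this approach requires you to know what the new indices are. It is better to select the columns programmatically. If all the target columns start with "weight", you can use: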

    cat %>% 
      group_by(animals) %>%
      mutate(across(starts_with("weight"), cumsum))
    

    Or, if you simply want to operate on all numeric columns:

    cat %>% 
      group_by(animals) %>%
      mutate(across(where(is.numeric), cumsum))
    

    Both of the latter two approaches give the output you want.
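
The index shift described in this answer can be checked directly. The following is a small sketch, not part of the original answer, that compares positional and by-name selection on the grouped data; `grouped`, `by_position`, `by_name`, `same_result`, `attempt`, and `position_4_fails` are names introduced here for illustration:

```r
library(dplyr)

# Sample data from the question, rebuilt as a plain data frame.
cat <- data.frame(
  animals = c("E1", "E1", "E1", "E2", "E2", "E2"),
  period  = structure(c(18690, 18697, 18704, 18690, 18697, 18704), class = "Date"),
  weight1 = c(704, 734, 653, 851, 911, 829),
  weight2 = c(0, 235, 325, 0, 148, 200)
)

grouped <- group_by(cat, animals)

# Inside a grouped mutate(), across() does not see the grouping column,
# so position 1 is `period`, 2 is `weight1`, and 3 is `weight2`.
by_position <- grouped %>% mutate(across(2:3, cumsum)) %>% ungroup()
by_name     <- grouped %>% mutate(across(c(weight1, weight2), cumsum)) %>% ungroup()

# Both selections should touch the same two columns.
same_result <- isTRUE(all.equal(by_position, by_name))

# Positions 3:4 reach past the three visible columns; try() records the
# failure instead of stopping the script.
attempt <- try(grouped %>% mutate(across(3:4, cumsum)), silent = TRUE)
position_4_fails <- inherits(attempt, "try-error")
```

Selecting by name or by predicate avoids having to reason about which positions remain after grouping.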

    【Comments】:

      【Solution 3】:

      Base R solution:

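      # Logical index of the numeric columns (weight1, weight2)
      num_col_idx <- vapply(df, is.numeric, logical(1))
      
      # Split the numeric columns by animal, take cumulative sums per group,
      # stack the groups again, and bind them to the non-numeric columns.
      # Note: split() orders the groups, so this assumes df is sorted by animals.
      cbind(df[,!num_col_idx],
            data.frame(do.call(rbind, lapply(
              split(df[, num_col_idx], df$animals), cumsum)), row.names = NULL))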
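
A variation on the same base R idea, offered as a sketch rather than as part of the original answer, uses ave() to apply cumsum within each group while leaving the rows in their original order. `num_cols` and `result` are names introduced here for illustration:

```r
# Sample data from the question, rebuilt as a plain data frame.
df <- data.frame(
  animals = c("E1", "E1", "E1", "E2", "E2", "E2"),
  period  = structure(c(18690, 18697, 18704, 18690, 18697, 18704), class = "Date"),
  weight1 = c(704, 734, 653, 851, 911, 829),
  weight2 = c(0, 235, 325, 0, 148, 200)
)

# Names of the numeric columns (Date columns are not numeric for is.numeric).
num_cols <- names(df)[vapply(df, is.numeric, logical(1))]

# ave() applies FUN within each level of the grouping variable and returns
# a vector aligned with the original rows, so no re-binding is needed.
result <- df
result[num_cols] <- lapply(df[num_cols], function(x) ave(x, df$animals, FUN = cumsum))
```

Since ave() writes each group's values back to the rows they came from, this version does not depend on the data being sorted by animals.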
      

      【Comments】:
