【Question title】: R - Create a column with consecutive numbers and reset based on other column using mutate
【Posted】: 2020-11-25 13:15:50
【Question】:

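I am trying to use mutate to create a new variable game_plus in my df that counts the days since an event occurred in another column, game, and resets every time the event occurs. For example, the variable game in my df is binary and can take the values "Game" or "Training". I was thinking of using nested ifelse statements to produce the following output: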

game         game_plus
Game             0
Training         1
Training         2
Training         3
Game             0
Training         1
Training         2
Game             0
Training         1
Training         2
Training         3
Training         4

I would also like the opposite, game_minus, which counts down the days before an event, as shown below.

game         game_plus      game_minus
Game             0              0
Training         1              3
Training         2              2
Training         3              1
Game             0              0
Training         1              2
Training         2              1 
Game             0              0
Training         1              4
Training         2              3
Training         3              2
Training         4              1

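Can anyone help? I know I can use ifelse(game == "Game", 0, but I am struggling to work out how to incorporate the "before or after the event" element into this ifelse statement. Any help would be much appreciated!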

【Comments】:

    Tags: r if-statement multiple-columns reset dplyr


    【Solution 1】:
    library(tibble)
    library(dplyr)
    
    game_tbl <- 
    tibble(game = c("Game",rep("Training", 3), "Game",rep("Training", 2),"Game",rep("Training", 4)))
    
    
    game_tbl  %>% 
      mutate(period = cumsum(game == "Game")) %>% ## which rows belong to one game period 
      group_by(period, game) %>% 
      mutate(game_plus = case_when(game == "Game" ~ 0L , TRUE ~ row_number())) %>%
      group_by(period) %>%  
      mutate(units = n() ) %>% ## how many rows per  game period
      mutate(game_minus =  case_when(game == "Game" ~ 0L, TRUE ~  units - row_number() + 1L )) %>%
      ungroup() %>%
      select(game, game_plus, game_minus)
    

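    Basically, you need group_by and row_number. row_number adds the number of each row; combined with group_by, it numbers the rows within each group. I added a helper variable period to identify all rows that belong to one game period (from a game to the last training before the next game). row_number therefore counts rows separately for game == "Game" and game == "Training" within each game period. I also added a helper variable units that counts the number of rows in each game period.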
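The same grouping idea can probably be condensed into a single grouped mutate. The following is only a sketch, not the answerer's code: the name `compact` is made up for illustration, and it assumes the data starts with a "Game" row, so that every period opens with a game and `row_number() - 1L` is 0 on game days.

```r
library(tibble)
library(dplyr)

game_tbl <- tibble(
  game = c("Game", rep("Training", 3), "Game", rep("Training", 2),
           "Game", rep("Training", 4))
)

# `compact` is a hypothetical name; assumes the first row is a "Game" row
compact <- game_tbl %>%
  # a new period starts at every "Game" row
  group_by(period = cumsum(game == "Game")) %>%
  mutate(
    # position within the period, shifted so the game day itself is 0
    game_plus  = row_number() - 1L,
    # rows remaining in the period, counted from the end; game days stay 0
    game_minus = if_else(game == "Game", 0L, n() - row_number() + 1L)
  ) %>%
  ungroup() %>%
  select(-period)

compact
```

Grouping by `period` alone is enough here because the "Game" row is always the first row of its period, so no second `group_by(period, game)` pass is needed.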

    【Comments】:

      【Solution 2】:

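      Here is a solution using data.table. The main point is to use n_games to break the data into sections. data.table has a built-in way to get row numbers with .I, so once the data is split into sections we can take the row numbers in reverse to count down. The last remaining step is to assign each game day a value of 0 instead of the number of days until the next game.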

      library(data.table)
      
      dt = data.table(game = c("Game",rep("Training", 3), "Game",rep("Training", 2),"Game",rep("Training", 4)))
      ## Create an indicator for if a day has a game
      dt[,game_ind := ifelse(game == 'Game', 1, 0)]
      ## Use the indicators to break up the data into groups by taking the cumulative sum of games
      dt[,n_games := cumsum(game_ind)]
      ## .SD[,.I] gets the row number for each group of n_games, rev makes it so that it's 
      ## counting down instead of up
      dt[,game_minus := rev(unlist(.SD[,.I])), by = n_games]
      ## Set game days to 0
      dt[game == 'Game', game_minus := 0]
      
      dt
      #>         game game_ind n_games game_minus
      #>  1:     Game        1       1          0
      #>  2: Training        0       1          3
      #>  3: Training        0       1          2
      #>  4: Training        0       1          1
      #>  5:     Game        1       2          0
      #>  6: Training        0       2          2
      #>  7: Training        0       2          1
      #>  8:     Game        1       3          0
      #>  9: Training        0       3          4
      #> 10: Training        0       3          3
      #> 11: Training        0       3          2
      #> 12: Training        0       3          1
      
      ## If you want to clean up
      dt[,c('game_ind', 'n_games') := NULL]
      
      head(dt)
      #>        game game_minus
      #> 1:     Game          0
      #> 2: Training          3
      #> 3: Training          2
      #> 4: Training          1
      #> 5:     Game          0
      #> 6: Training          2
      

      Created on 2020-11-25 by the reprex package (v0.3.0)
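The data.table answer above only produces game_minus. As a hedged sketch of how the same cumulative-sum grouping could also yield game_plus, the snippet below uses `.N` (the number of rows in the current group) instead of `.SD[, .I]`. The name `dt2` is hypothetical, and the code assumes the data starts with a "Game" row.

```r
library(data.table)

# `dt2` is a hypothetical name; assumes the first row is a "Game" row
dt2 <- data.table(
  game = c("Game", rep("Training", 3), "Game", rep("Training", 2),
           "Game", rep("Training", 4))
)

# Days since the last game: 0 on the game day, then 1, 2, ... within each period
dt2[, game_plus := seq_len(.N) - 1L, by = cumsum(game == "Game")]

# Rows remaining in the period, counted from the end of each period
dt2[, game_minus := rev(seq_len(.N)), by = cumsum(game == "Game")]

# Game days get 0 instead of a countdown value
dt2[game == "Game", game_minus := 0L]

dt2
```

Passing the `cumsum()` expression directly to `by` avoids creating the helper columns game_ind and n_games, so there is nothing to clean up afterwards.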

      【Comments】:
