【Question Title】: R Lag Variable And Skip Value Between
【Posted】: 2022-11-24 04:13:01
【Question Description】:
DATA = data.frame(STUDENT = c(1,1,1,2,2,2,3,3,4,4),
                  SCORE = c(6,4,8,10,9,0,2,3,3,7),
                  CLASS = c('A', 'B', 'C', 'A', 'B', 'C', 'B', 'C', 'A', 'B'),
                  WANT = c(NA, NA, 2, NA, NA, -10, NA, NA, NA, NA))

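I have DATA and want to create the column "WANT", computed as follows:

For each STUDENT, WANT equals SCORE(CLASS = C) - SCORE(CLASS = A). The value goes on that student's CLASS = C row; every other row is NA, and a student who is missing class A or class C gets NA throughout.

Example: SCORE(STUDENT = 1, CLASS = C) - SCORE(STUDENT = 1, CLASS = A) = 8 - 6 = 2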
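The rule above can be checked by hand for one student at a time. The following is a minimal base R sketch, not part of the original question; `c_minus_a` is a hypothetical helper name introduced only for illustration.

```r
DATA <- data.frame(STUDENT = c(1,1,1,2,2,2,3,3,4,4),
                   SCORE = c(6,4,8,10,9,0,2,3,3,7),
                   CLASS = c('A', 'B', 'C', 'A', 'B', 'C', 'B', 'C', 'A', 'B'))

# Hypothetical helper (not from the question): C-minus-A difference for one student.
# Indexing with [1] turns an empty match into NA, so a missing class yields NA.
c_minus_a <- function(student) {
  rows <- DATA[DATA$STUDENT == student, ]
  rows$SCORE[rows$CLASS == "C"][1] - rows$SCORE[rows$CLASS == "A"][1]
}

c_minus_a(1)  # 8 - 6
c_minus_a(2)  # 0 - 10
c_minus_a(3)  # student 3 has no CLASS = A row
```

Students 1 and 2 have both classes, so they get a numeric difference; students 3 and 4 each lack one of the two classes, which is why their WANT rows stay NA.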

【Question Discussion】:

    Tags: r dplyr case lag


    【Solution 1】:

    Try

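    library(dplyr)
    DATA <- DATA %>%
       group_by(STUDENT) %>% 
       # [1] yields NA when a student has no 'C' or no 'A' row;
       # NA^(CLASS != "C") is 1 on 'C' rows and NA on all other rows
       mutate(WANT2 = (SCORE[CLASS == 'C'][1] - SCORE[CLASS == 'A'][1]) * 
           NA^(CLASS != "C")) %>%
       ungroup()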
    

    -Output

    # A tibble: 10 × 5
       STUDENT SCORE CLASS  WANT WANT2
         <dbl> <dbl> <chr> <dbl> <dbl>
     1       1     6 A        NA    NA
     2       1     4 B        NA    NA
     3       1     8 C         2     2
     4       2    10 A        NA    NA
     5       2     9 B        NA    NA
     6       2     0 C       -10   -10
     7       3     2 B        NA    NA
     8       3     3 C        NA    NA
     9       4     3 A        NA    NA
    10       4     7 B        NA    NA
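
The `NA^(CLASS != "C")` factor in the answer is a compact masking trick: in R, `NA^0` is 1 and `NA^1` is NA, so multiplying by it keeps the value on CLASS = C rows and blanks the rest. A small sketch on a three-row toy vector (the variable names `mask`, `masked`, and `explicit` are illustrative, and the `ifelse()` spelling is an alternative, not the answerer's code):

```r
CLASS <- c("A", "B", "C")

# TRUE coerces to 1 and FALSE to 0, so this is c(NA^1, NA^1, NA^0)
mask <- NA^(CLASS != "C")
print(mask)

# Multiplying by the mask keeps the difference only on the 'C' position
masked <- (8 - 6) * mask
print(masked)

# A more explicit spelling of the same masking step
explicit <- ifelse(CLASS == "C", 8 - 6, NA)
print(explicit)
```

The `ifelse()` form is longer but may be easier to read for anyone unfamiliar with the `NA^0` special case.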
    

    【Discussion】:
