【Question Title】: Replace NA based on a condition
【Posted】: 2022-01-23 10:28:29
【Question Description】:

I need help tidying up a data frame. A sample of the data is provided below:

> dput(data_1)
structure(list(subject = c("E1", "E1", "E1", "E1", "E1", "E1", 
"E1", "E1"), block = c(3, 3, 4, 4, 5, 5, 6, 6), condition = c("EI", 
"I", "EI", "I", "EI", "I", "EI", "I"), prev_total_RT = c("963", 
"NA", "963", "NA", "963", "NA", "963", "NA"), total_RT = c(271, 
1042, 409, 406, 544, 490, 645, 465), Item_number = c(17, 46, 
17, 46, 17, 46, 17, 46)), row.names = c(NA, -8L), class = c("tbl_df", 
"tbl", "data.frame"))
> data_1
# A tibble: 8 x 6
  subject block condition prev_total_RT total_RT Item_number
  <chr>   <dbl> <chr>     <chr>            <dbl>       <dbl>
1 E1          3 EI        963                271          17
2 E1          3 I         NA                1042          46
3 E1          4 EI        963                409          17
4 E1          4 I         NA                 406          46
5 E1          5 EI        963                544          17
6 E1          5 I         NA                 490          46
7 E1          6 EI        963                645          17
8 E1          6 I         NA                 465          46

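The "prev_total_RT" values are provided for condition "EI", but not for condition "I" (there the column holds "NA"). I need code that generates the "prev_total_RT" values for condition "I".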

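The "prev_total_RT" value for condition "I" should be the sum of "total_RT" for condition "I" over "block" = 3, 4, and 5. This should be computed per "subject" and "Item_number". For example, for subject "E1" and Item_number "46" in condition "I", the value of "prev_total_RT" should be the sum of the "total_RT" values in blocks 3, 4, and 5: 1042 + 406 + 490 = 1938.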
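
The arithmetic in that example can be checked with a few lines of base R. This is only a sketch: the vectors below are copied by hand from the condition "I" rows of the sample, and the names `block_I`, `total_RT_I`, and `prev_total_RT_I` are made up for illustration.

```r
# Condition "I" rows for subject E1, Item_number 46 (values copied from the sample)
block_I    <- c(3, 4, 5, 6)
total_RT_I <- c(1042, 406, 490, 465)

# Sum total_RT over blocks 3, 4 and 5 only; block 6 is excluded
prev_total_RT_I <- sum(total_RT_I[block_I %in% c(3, 4, 5)])
prev_total_RT_I
# [1] 1938
```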

The desired output is shown below:

> dput(data_2)
structure(list(subject = c("E1", "E1", "E1", "E1", "E1", "E1", 
"E1", "E1"), block = c(3, 3, 4, 4, 5, 5, 6, 6), condition = c("EI", 
"I", "EI", "I", "EI", "I", "EI", "I"), prev_total_RT = c(963, 
1938, 963, 1938, 963, 1938, 963, 1938), total_RT = c(271, 1042, 
409, 406, 544, 490, 645, 465), Item_number = c(17, 46, 17, 46, 
17, 46, 17, 46)), row.names = c(NA, -8L), class = c("tbl_df", 
"tbl", "data.frame"))

> data_2
# A tibble: 8 x 6
  subject block condition prev_total_RT total_RT Item_number
  <chr>   <dbl> <chr>             <dbl>    <dbl>       <dbl>
1 E1          3 EI                  963      271          17
2 E1          3 I                  1938     1042          46
3 E1          4 EI                  963      409          17
4 E1          4 I                  1938      406          46
5 E1          5 EI                  963      544          17
6 E1          5 I                  1938      490          46
7 E1          6 EI                  963      645          17
8 E1          6 I                  1938      465          46

Any help with this would be greatly appreciated.

【Question Discussion】:

    Tags: r dplyr


    【Solution 1】:

    A straightforward approach,

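    library(dplyr)
    
    # df is the question's data frame (data_1)
    df %>% 
     group_by(subject, Item_number) %>% 
     # For condition 'I' rows, replace prev_total_RT with the group's total_RT summed over blocks 3-5.
     # prev_total_RT is character in the sample, so the sum is stored as the string "1938".
     mutate(prev_total_RT = replace(prev_total_RT, condition == 'I', sum(total_RT[block %in% c(3, 4, 5)])))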
    
    # subject block condition prev_total_RT total_RT Item_number
    #  <chr>   <dbl> <chr>     <chr>            <dbl>       <dbl>
    #1 E1          3 EI        963                271          17
    #2 E1          3 I         1938              1042          46
    #3 E1          4 EI        963                409          17
    #4 E1          4 I         1938               406          46
    #5 E1          5 EI        963                544          17
    #6 E1          5 I         1938               490          46
    #7 E1          6 EI        963                645          17
    #8 E1          6 I         1938               465          46
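
The output above keeps "prev_total_RT" as `<chr>`, while the desired `data_2` has it as `<dbl>`. One possible variant, sketched here under the assumption that the string "NA" in the sample should be treated as a missing value, converts the column to numeric first and then fills the condition "I" rows with `if_else()`. The sample is rebuilt as a plain `data.frame` so the snippet runs on its own; the name `result` is chosen for illustration.

```r
library(dplyr)

# Rebuild the question's sample; prev_total_RT is character and holds the string "NA"
data_1 <- data.frame(
  subject       = rep("E1", 8),
  block         = c(3, 3, 4, 4, 5, 5, 6, 6),
  condition     = rep(c("EI", "I"), 4),
  prev_total_RT = rep(c("963", "NA"), 4),
  total_RT      = c(271, 1042, 409, 406, 544, 490, 645, 465),
  Item_number   = rep(c(17, 46), 4),
  stringsAsFactors = FALSE
)

result <- data_1 %>%
  # Convert to numeric so the filled column matches the desired <dbl> type
  mutate(prev_total_RT = as.numeric(prev_total_RT)) %>%
  group_by(subject, Item_number) %>%
  # Within each subject/item group, condition "I" rows get the sum of
  # total_RT over blocks 3, 4 and 5; other rows keep their existing value
  mutate(prev_total_RT = if_else(
    condition == "I",
    sum(total_RT[block %in% c(3, 4, 5)]),
    prev_total_RT
  )) %>%
  ungroup()

result
```

The `ungroup()` at the end is a design choice, so that later steps on `result` are not silently computed per group.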
    

    【Discussion】:

    • This is very simple, clever, and efficient. Thank you very much, Sotos, for your help and time.