【Question title】: How do I get accuracy values by group? [duplicate]
【Posted】: 2019-10-25 21:36:39
【Question description】:

I can't get the mean accuracy (the proportion of TRUE values) in the Correct_answer column, grouped by chart type and by condition.

Data

structure(list(Element = structure(c(1L, 1L, 1L, 1L, 1L), .Label = c("1", 
"2", "3", "4", "5", "6"), class = "factor"), Correct_answer = structure(c(2L, 
2L, 2L, 1L, 2L), .Label = c("FALSE", "TRUE"), class = "factor"), 
    Response_time = c(25.155, 6.74, 28.649, 16.112, 105.5906238
    ), Chart_type = structure(c(2L, 2L, 1L, 1L, 1L), .Label = c("Box", 
    "Violin"), class = "factor"), Condition = structure(c(1L, 
    2L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor")), row.names = c(NA, 
5L), class = "data.frame")

Averaging by chart type

av_data_chartType <- data %>% group_by(Chart_type) %>% summarise_each(funs(mean, sd))

Averaging by condition

av_data_condition <- data %>% group_by(Condition) %>% summarise_each(funs(mean, sd))

Neither call produces a mean for accuracy.

NA values appear where the accuracy should be.

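【Comments】:

  • What do you mean by accuracy? A metric? Is this what you need? with(df,table(Correct_answer==T))
  • @NelsonGon The "Correct_answer" column contains a logical vector: if Correct_answer is TRUE, the participant answered the question correctly. I want to see what proportion/percentage of participants answered correctly in each of the groups I mentioned.

Tags: r aggregate summary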


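【Solution 1】:

Reproducing your code, I got a warning that led me to the answer: you should not compute statistics on factor variables. If you know what you are doing, you can convert them to numeric: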

data <- structure(list(Element = structure(c(1L, 1L, 1L, 1L, 1L), 
                                         .Label = c("1", "2", "3", "4", "5", "6"), 
                                         class = "factor"), 
                     Correct_answer = structure(c(2L, 2L, 2L, 1L, 2L), 
                                                .Label = c("FALSE", "TRUE"), 
                                                class = "factor"), 
                     Response_time = c(25.155, 6.74, 28.649, 16.112, 105.5906238
                     ), 
                     Chart_type = structure(c(2L, 2L, 1L, 1L, 1L), 
                                            .Label = c("Box", 
                                                       "Violin"), 
                                            class = "factor"), 
                     Condition = structure(c(1L, 2L, 1L, 2L, 1L), 
                                           .Label = c("0", "1"), 
                                           class = "factor")),
                row.names = c(NA, 5L), class = "data.frame")

library("dplyr", warn.conflicts = FALSE)
data <- data %>% as_tibble

# av_data_chartType 
data %>% 
        group_by(Chart_type) %>%
        mutate_if(.predicate = is.factor, .funs = as.numeric) %>% 
        summarise_each(list( ~mean, ~sd))
#> `mutate_if()` ignored the following grouping variables:
#> Column `Chart_type`
#> # A tibble: 2 x 9
#>   Chart_type Element_mean Correct_answer_~ Response_time_m~ Condition_mean
#>   <fct>             <dbl>            <dbl>            <dbl>          <dbl>
#> 1 Box                   1             1.67             50.1           1.33
#> 2 Violin                1             2                15.9           1.5 
#> # ... with 4 more variables: Element_sd <dbl>, Correct_answer_sd <dbl>,
#> #   Response_time_sd <dbl>, Condition_sd <dbl>

# av_data_condition
data %>% 
        group_by(Condition) %>%
        mutate_if(.predicate = is.factor, .funs = as.numeric) %>% 
        summarise_each(list( ~mean, ~sd))
#> `mutate_if()` ignored the following grouping variables:
#> Column `Condition`
#> # A tibble: 2 x 9
#>   Condition Element_mean Correct_answer_~ Response_time_m~ Chart_type_mean
#>   <fct>            <dbl>            <dbl>            <dbl>           <dbl>
#> 1 0                    1              2               53.1            1.33
#> 2 1                    1              1.5             11.4            1.5 
#> # ... with 4 more variables: Element_sd <dbl>, Correct_answer_sd <dbl>,
#> #   Response_time_sd <dbl>, Chart_type_sd <dbl>

Created on 11 June 2019 by the reprex package (v0.2.1)
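One caveat about the output above: `as.numeric()` on a factor returns the internal level codes, so with levels `c("FALSE", "TRUE")` the `Correct_answer` means are on a 1-to-2 scale (1.67 and 2) rather than proportions. The following is a minimal base-R sketch, assuming the two-level factor from the question and using the three `Box` rows as input, of how the codes relate to the proportion of TRUE values. The variable names are illustrative only.

```r
# Correct_answer values for the three "Box" rows of the question's data
correct <- factor(c("TRUE", "FALSE", "TRUE"), levels = c("FALSE", "TRUE"))

# as.numeric() on a factor returns the level codes: FALSE -> 1, TRUE -> 2
codes <- as.numeric(correct)

# The mean of the codes is the proportion of TRUE shifted up by 1
mean_codes <- mean(codes)

# Going through character yields real logicals, whose mean is the proportion of TRUE
accuracy <- mean(as.logical(as.character(correct)))
```

Here `mean_codes - 1` equals `accuracy`, so the factor-code means can be read as accuracy only after subtracting 1.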

【Comments】:

  • Thanks, I've got it running now :)
【Solution 2】:

This should work:

# `a` is the data frame from the question
a$Correct_answer <- as.logical(a$Correct_answer)

av_data_chartType <- a %>% select(Chart_type, Correct_answer) %>% group_by(Chart_type) %>% summarise_each(funs(mean, sd))

av_data_condition <- a %>% select(Condition, Correct_answer) %>% group_by(Condition) %>% summarise_each(funs(mean, sd))

You have two problems:

  1. Your Correct_answer is a factor.

  2. You are trying to apply the functions to every column.
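The two fixes listed above can also be sketched without dplyr. This is a base-R illustration, not the answerer's code: the data frame `dat` is rebuilt by hand from the question's `dput()` output (only the three relevant columns), and `acc_by_chart` and `acc_by_condition` are names made up for the example.

```r
# Relevant columns of the question's data, rebuilt with plain constructors
dat <- data.frame(
  Correct_answer = factor(c("TRUE", "TRUE", "TRUE", "FALSE", "TRUE"),
                          levels = c("FALSE", "TRUE")),
  Chart_type = factor(c("Violin", "Violin", "Box", "Box", "Box")),
  Condition = factor(c("0", "1", "0", "1", "0"))
)

# Fix 1: turn the factor into a logical vector
dat$Correct_answer <- as.logical(as.character(dat$Correct_answer))

# Fix 2: summarise only Correct_answer instead of every column
acc_by_chart <- aggregate(Correct_answer ~ Chart_type, data = dat, FUN = mean)
acc_by_condition <- aggregate(Correct_answer ~ Condition, data = dat, FUN = mean)
```

Each result has one row per group, with the proportion of TRUE values in the `Correct_answer` column.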

【Comments】:

    【Solution 3】:

    You may need

    library(dplyr)
    
    data %>%
      mutate(Correct_answer = as.logical(Correct_answer)) %>%
      group_by(Chart_type, Condition) %>%
      summarise(avg = mean(Correct_answer))
    

    Or, if you need them separately:

    data %>%
      mutate(Correct_answer = as.logical(Correct_answer)) %>%
      group_by(Chart_type) %>%
      summarise(avg = mean(Correct_answer))
    
    data %>%
      mutate(Correct_answer = as.logical(Correct_answer)) %>%
      group_by(Condition) %>%
      summarise(avg = mean(Correct_answer))
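If the standard deviation is wanted alongside the mean, as in the question's original `summarise_each(funs(mean, sd))` call, one option is `across()`. This is a hedged sketch that assumes dplyr 1.0.0 or later, where `across()` is available and `summarise_each()` and `funs()` are deprecated. The data frame `dat` is rebuilt by hand from the question's data, and `acc_summary` is an illustrative name.

```r
library(dplyr)

# Relevant columns of the question's data, rebuilt with plain constructors
dat <- data.frame(
  Correct_answer = factor(c("TRUE", "TRUE", "TRUE", "FALSE", "TRUE"),
                          levels = c("FALSE", "TRUE")),
  Chart_type = factor(c("Violin", "Violin", "Box", "Box", "Box")),
  Condition = factor(c("0", "1", "0", "1", "0"))
)

# Mean and sd of the logical Correct_answer column, per chart type
acc_summary <- dat %>%
  mutate(Correct_answer = as.logical(as.character(Correct_answer))) %>%
  group_by(Chart_type) %>%
  summarise(across(Correct_answer, list(mean = mean, sd = sd)), .groups = "drop")
```

The result has the columns `Chart_type`, `Correct_answer_mean` and `Correct_answer_sd`. Swapping `Chart_type` for `Condition` in `group_by()` gives the per-condition summary.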
    

    【Comments】:
