【Question title】: Calculating confidence interval for group proportions in dplyr
【Posted】: 2020-12-30 21:27:53
【Question】:

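I want to calculate confidence intervals for group proportions in dplyr. I tried some code based on this website, but I couldn't get it to work.

Sample data: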

lebanon <- structure(list(sect = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L), .Label = c("Maronite", "Orthodox", "Catholic", "Armenian", 
"Sunni", "Shia", "Druze", "Just a Muslim", "Other", "Don't know"
), class = "factor"), social_trust = structure(c(1L, 1L, 1L, 
1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L), .Label = c("I must be very careful in dealing with people", 
"Most people can be trusted", "Don't know"), class = "factor")), row.names = c(NA, 
200L), class = "data.frame")

I used this code to get the table below:

lebanon %>%
  filter(!is.na(social_trust), !is.na(sect), sect != "Armenian", sect != "Just a Muslim",
         sect != "Other") %>%
  group_by(sect) %>%
  count(social_trust) %>%
  mutate(prop = n / sum(n))


   sect     social_trust                                      n    prop
   <fct>    <fct>                                         <int>   <dbl>
 1 Maronite I must be very careful in dealing with people   613 0.968  
 2 Maronite Most people can be trusted                       19 0.0300 
 3 Maronite Don't know                                        1 0.00158
 4 Orthodox I must be very careful in dealing with people   152 0.944  
 5 Orthodox Most people can be trusted                        6 0.0373 
 6 Orthodox Don't know                                        3 0.0186 
 7 Catholic I must be very careful in dealing with people   107 0.915  
 8 Catholic Most people can be trusted                        9 0.0769 
 9 Catholic Don't know                                        1 0.00855
10 Sunni    I must be very careful in dealing with people   639 0.980  
11 Sunni    Most people can be trusted                        3 0.00460
12 Sunni    Don't know                                       10 0.0153 
13 Shia     I must be very careful in dealing with people   549 0.918  
14 Shia     Most people can be trusted                       32 0.0535 
15 Shia     Don't know                                       17 0.0284 
16 Druze    I must be very careful in dealing with people   175 0.921  
17 Druze    Most people can be trusted                       15 0.0789

Ideally, I'd like a lower and an upper confidence bound for each group proportion, bound as columns next to the prop column. Any ideas?

【Comments】:

    Tags: r dplyr


    【Solution 1】:

    You can call prop.test through lapply inside mutate, then pull the two bounds out of each test result:

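    library(dplyr)

    lebanon %>%
      filter(!is.na(social_trust), 
             !is.na(sect), 
             sect != "Armenian", 
             sect != "Just a Muslim",
             sect != "Other") %>%
      group_by(sect) %>%
      count(social_trust) %>% 
      mutate(prop = n / sum(n), 
             # one prop.test per row against the group total;
             # `lower` temporarily holds the list of test results
             lower = lapply(n, prop.test, n = sum(n)), 
             # extract the upper bound while `lower` is still that list
             upper = sapply(lower, function(x) x$conf.int[2]), 
             # then overwrite `lower` with the numeric lower bound
             lower = sapply(lower, function(x) x$conf.int[1]))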
    
    #># A tibble: 2 x 6
    #># Groups:   sect [1]
    #>  sect  social_trust                                      n  prop   lower  upper
    #>  <fct> <fct>                                         <int> <dbl>   <dbl>  <dbl>
    #>1 Sunni I must be very careful in dealing with people   198  0.99 0.961   0.998 
    #>2 Sunni Don't know                                        2  0.01 0.00173 0.0395
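
To make it easier to see what the lapply/sapply steps do, here is a minimal base-R sketch that works on plain counts instead of a grouped data frame. The counts are the two Sunni rows from the output above. The names `counts`, `tests`, `ci`, and `result` are illustrative, not part of the original answer.

```r
# Counts for one group, copied from the sample-data output above
counts <- c(careful = 198L, dont_know = 2L)
total <- sum(counts)

# One prop.test per category; each call returns an "htest" object
tests <- lapply(counts, prop.test, n = total)

# conf.int is a length-2 numeric vector (lower, upper) on each test
ci <- t(sapply(tests, function(x) x$conf.int))
colnames(ci) <- c("lower", "upper")

result <- data.frame(
  category = names(counts),
  n = as.integer(counts),
  prop = as.numeric(counts) / total,
  lower = ci[, "lower"],
  upper = ci[, "upper"]
)
print(result)
```

By default prop.test reports a 95% Wilson score interval with continuity correction. Pass `conf.level` or `correct = FALSE` as extra arguments to change that. For very small counts, the exact interval from `binom.test` is a possible alternative, though it is a different method from the one used in the answer.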
    
    

    【Comments】:

    • Thanks, this is exactly what I was looking for. :)