【Question title】: tidyverse: Cross tables of one variable with all other variables in data.frame
【Posted】: 2019-06-19 23:47:52
【Question】:

I want to make cross tables of one variable with all the other variables in a data.frame.

library(tidyverse)
library(janitor)

humans <- starwars %>%
  filter(species == "Human")

humans %>%
  janitor::tabyl(gender, eye_color)



gender blue blue-gray brown dark hazel yellow
 female    3         0     5    0     1      0
   male    9         1    12    1     1      2

humans %>%
  dplyr::select_if(is.character) %>%
  dplyr::select(-name, -gender) %>%
  purrr::map(.f = ~janitor::tabyl(dat = humans, gender, .x))

Error: Unknown columns `blond`, `none`, `brown`, `brown, grey`, `brown` and ... 
Call `rlang::last_error()` to see a backtrace

【Comments】:

    Tags: r tidyverse crosstab purrr janitor


    【Solution 1】:

    Assuming we need pairwise tables with 'gender':

    humans %>%
      dplyr::select_if(is.character) %>%
      dplyr::select(-name, -gender) %>%
      imap(~ tibble(!! .y := .x) %>% 
                 mutate(gender = humans[['gender']]) %>% 
                 janitor::tabyl(!!rlang::sym(names(.)[1]), gender))
    #$hair_color
    #    hair_color female male
    #        auburn      1    0
    #  auburn, grey      0    1
    # auburn, white      0    1
    #         black      1    7
    #         blond      0    3
    #         brown      6    8
    #   brown, grey      0    1
    #          grey      0    1
    #          none      0    3
    #         white      1    1
    
    #$skin_color
    # skin_color female male
    #       dark      0    4
    #       fair      3   13
    #      light      6    5
    #...
    

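    Update

    xtable::xtableList requires the column names to be the same across the list elements. To achieve that, rename the first column of every list element to a common name, then add an identifier column: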

    library(xtable)
    humans %>%
     dplyr::select_if(is.character) %>%
     dplyr::select(-name, -gender) %>%
     imap(~ tibble(!! .y := .x) %>% 
             mutate(gender = humans[['gender']]) %>% 
             janitor::tabyl(!!rlang::sym(names(.)[1]), gender) %>%  
             mutate(colNname = .y) %>% 
             rename_at(1, ~ 'Variable')) %>%
     xtableList
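
The renaming step can be seen in isolation on a small hand-built list. This is only a sketch in base R: the two data frames are excerpts of the counts shown earlier in this answer, `tabs`, `harmonised` and `combined` are illustrative names, and xtable itself is deliberately left out.

```r
# Two cross tables whose first columns have different names
# (counts copied from the output shown earlier in this answer).
tabs <- list(
  hair_color = data.frame(hair_color = c("black", "brown"),
                          female = c(1, 6), male = c(7, 8)),
  skin_color = data.frame(skin_color = c("fair", "light"),
                          female = c(3, 6), male = c(13, 5))
)

# Give every table the same first-column name and record where it came from.
harmonised <- Map(function(tab, nm) {
  names(tab)[1] <- "Variable"
  tab$colNname <- nm
  tab
}, tabs, names(tabs))

# Every element now shares one set of column names,
# so the tables can also be stacked into a single data frame.
lapply(harmonised, names)
combined <- do.call(rbind, harmonised)
```

Once the names agree, the list has the shape the update describes, and the identifier column keeps track of which variable each row belongs to.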
    

    【Comments】:

    • Thanks @akrun for the very helpful answer. However, I could not render the output in an .Rnw file using xtable::xtableList. Any thoughts?
    • @MYaseen208 I think the issue is that the names are not common. You can make the names common and create a new column as an identifier, i.e. humans %>% dplyr::select_if(is.character) %>% dplyr::select(-name, -gender) %>% imap(~ tibble(!! .y := .x) %>% mutate(gender = humans[['gender']]) %>% janitor::tabyl(!!rlang::sym(names(.)[1]), gender) %>% mutate(colNname = .y) %>% rename_at(1, ~ 'Variable')) %>% xtableList
    【Solution 2】:

    Using only data.table (and one %>%):

    library(data.table)
    swDT <- data.table(starwars)
    setkey(swDT, gender, hair_color)
    
    
    swDT[species == "Human"
         ][CJ(gender, hair_color, unique =TRUE), .N, .EACHI] %>% 
      dcast(hair_color ~ gender, value.var = "N")
    
    
           hair_color female male
     1:        auburn      1    0
     2:  auburn, grey      0    1
     3: auburn, white      0    1
     4:         black      1    7
     5:         blond      0    3
     6:         brown      6    8
     7:   brown, grey      0    1
     8:          grey      0    1
     9:          none      0    3
    10:         white      1    1
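
The role of CJ() is easier to see on a toy table: it builds every gender/hair_color combination, so pairs that never occur still get a row with a count of 0 instead of dropping out. A minimal sketch, assuming made-up data in a hypothetical `toy` table and calling dcast() directly rather than through a pipe:

```r
library(data.table)

# Made-up data: no female has blond hair.
toy <- data.table(
  gender     = c("female", "male", "male"),
  hair_color = c("black", "black", "blond")
)
setkey(toy, gender, hair_color)

# Join against all combinations; by = .EACHI counts the matches per combination.
counts <- toy[CJ(gender, hair_color, unique = TRUE), .N, by = .EACHI]

# Reshape to one row per hair colour and one column per gender.
wide <- dcast(counts, hair_color ~ gender, value.var = "N")
print(wide)
```

Without the cross join, the female/blond pair would simply be absent from the counts, and the reshaped table would show NA rather than 0 in that cell.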
    

    【Comments】:

      【Solution 3】:

      The list-columns in starwars add complexity, but here is an example with mtcars: cross-tabulating cyl against all the other variables.

      mtcars %>%
        tidyr::gather(var, value, -cyl) %>%
        janitor::tabyl(cyl, value, var, show_missing_levels = FALSE) %>%
        purrr::map2(.x = ., .y = names(.), ~ janitor::adorn_title(.x, col_name = .y))
      

      This returns a list of cross tables (cyl x am, cyl x carb, and so on):

      $`am`
           am  
       cyl  0 1
         4  3 8
         6  4 3
         8 12 2
      
      $carb
           carb          
       cyl    1 2 3 4 6 8
         4    5 6 0 0 0 0
         6    2 0 0 4 1 0
         8    0 4 3 6 0 1
      
      ...
      

      If you plan to manipulate these data.frames further, you may find this title option friendlier:

      purrr::map2(.x = ., .y = names(.), ~ janitor::adorn_title(.x, col_name = .y, placement = "combined"))
      

      This gives you:

      $vs
       cyl/vs  0  1
            4  1 10
            6  3  4
            8 14  0
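
tidyr::gather() has since been superseded by tidyr::pivot_longer(). A minimal sketch of the same reshaping with the newer function, assuming a tidyr version that provides pivot_longer() (1.0.0 or later); `long` and `tabs` are illustrative names, and the title step from the answer is omitted:

```r
library(tidyr)
library(janitor)

# One row per (car, variable) pair; cyl stays as its own column.
long <- pivot_longer(mtcars, cols = -cyl, names_to = "var", values_to = "value")

# Three-way tabyl: cyl by value, split into one table per original variable.
tabs <- tabyl(long, cyl, value, var, show_missing_levels = FALSE)

names(tabs)  # one element per variable other than cyl
tabs$am      # the cyl x am table
```

The long format is what makes the three-way tabyl work: the third argument, var, decides which list element each row's count ends up in.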
      

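      【Comments】:

        【Solution 4】:

        tabyl takes column names as arguments, whereas you are passing vectors to it.

        If you use imap, you get access to the column names, which you can convert to symbols; since janitor supports quasiquotation, you can write: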

        humans %>%
          select_if(is.character) %>%
          select(-name, -gender) %>%
          imap(.f = ~janitor::tabyl(dat = humans, !!sym(.y), gender))
        #$`hair_color`
        #     hair_color female male
        #         auburn      1    0
        #   auburn, grey      0    1
        #  auburn, white      0    1
        #          black      1    7
        #          blond      0    3
        #          brown      6    8
        #    brown, grey      0    1
        #           grey      0    1
        #           none      0    3
        #          white      1    1
        # 
        # $skin_color
        #  skin_color female male
        #        dark      0    4
        #        fair      3   13
        

        Interestingly, tabyl.data.frame calls an unexported function that works on symbols, so by calling it directly we can skip the unquoting and use base R.

        cols <- setdiff(names(Filter(is.character,humans)), c("name","gender"))
        lapply(cols, function(x) janitor:::tabyl_2way(humans, as.name(x), quote(gender)))
        # [[1]]
        #     hair_color female male
        #         auburn      1    0
        #   auburn, grey      0    1
        #  auburn, white      0    1
        #          black      1    7
        #          blond      0    3
        #          brown      6    8
        #    brown, grey      0    1
        #           grey      0    1
        #           none      0    3
        #          white      1    1
        # 
        # [[2]]
        #  skin_color female male
        #        dark      0    4
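
Because tabyl_2way is unexported, the triple-colon call depends on janitor internals. If that is a concern, a comparable list can be sketched with base R's table() alone. This is a swapped-in technique rather than the answer's method, it returns plain table objects instead of tabyls, and `toy` is made-up data:

```r
# Made-up data for illustration only.
toy <- data.frame(
  name       = c("A", "B", "C", "D"),
  gender     = c("female", "male", "male", "female"),
  eye_color  = c("blue", "blue", "brown", "brown"),
  hair_color = c("black", "black", "blond", "black"),
  stringsAsFactors = FALSE
)

# Character columns other than the identifier and the grouping variable.
cols <- setdiff(names(Filter(is.character, toy)), c("name", "gender"))

# One two-way table per column; setNames() keeps the column names on the list.
tabs <- lapply(setNames(cols, cols), function(x) {
  table(toy[[x]], toy$gender, dnn = c(x, "gender"))
})

tabs$hair_color
```

Naming the input vector with setNames() is what turns the [[1]], [[2]] labels of a plain lapply() result into $eye_color, $hair_color.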
        

        To make this work with xtable, @akrun's suggestion applies here as well:

        humans %>%
          select_if(is.character) %>%
          select(-name, -gender) %>%
          imap(.f = ~tabyl(dat = humans, !!sym(.y), gender) %>% rename_at(1,~"x")) %>%
          xtableList
        

        cols <- setdiff(names(Filter(is.character,humans)), c("name","gender"))
        l <- lapply(cols, function(x) {
          res <- janitor:::tabyl_2way(humans, as.name(x), quote(gender))
          names(res)[1] <- "x"
          res
        })
        xtableList(l)
        

        【Comments】:

        • Thanks @Moody_Mudskipper for the answer. However, I still cannot use xtable, as in humans %>% select_if(is.character) %>% select(-name, -gender) %>% imap(.f = ~janitor::tabyl(dat = humans, !!sym(.y), gender)) %>% xtableList