Generate and store frequency tables for every variable in R

【Question title】: Generate and store freq tables for every variable in R
【Posted】: 2018-08-18 21:00:54
【Question】:

I have the following dataset:

set.seed(6)
df <- data.frame(a=floor(runif(100)*5),b=floor(runif(100)*4),c=floor(runif(100)*3))

I would like to generate a summary frequency table for each variable and store them all in a single dataset. For example:

outexample <- rbind(table(df$a),c(table(df$b),0),c(table(df$c),0,0))
rownames(outexample) <- letters[1:3]
outexample

   0  1  2  3  4
a 19 18 20 18 25
b 30 23 19 28  0
c 28 33 39  0  0

The real data has hundreds of variables, each with an unknown number of classes. Is there a neater way to do this?

Tags: r

【Solution 1】:

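You can use stack() and table(), followed by t(), to get the desired output. stack() puts all columns into one long data frame (values, ind), table() cross-tabulates the two columns, and t() turns the variables into rows.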

t(table(stack(df)))
#   values
#ind  0  1  2  3  4
#  a 19 18 20 18 25
#  b 30 23 19 28  0
#  c 28 33 39  0  0
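Since the question asks for the counts to be stored in a dataset, here is a minimal sketch of turning that table into an ordinary data frame. The names `freq` and `freq_df` are only illustrative and do not come from the original answer.

```r
set.seed(6)
df <- data.frame(a = floor(runif(100) * 5),
                 b = floor(runif(100) * 4),
                 c = floor(runif(100) * 3))

# Rows are variables, columns are the observed values
freq <- t(table(stack(df)))

# Drop the "table" class so as.data.frame() keeps the wide layout
# (calling as.data.frame() on a table directly would give a long format)
freq_df <- as.data.frame(unclass(freq))

str(freq_df)
```

Each row of `freq_df` should sum to `nrow(df)`, which is a quick sanity check when there are hundreds of variables.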

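data.table works as well. Naming the measure columns and the aggregate function explicitly avoids the warning and message that melt() and dcast() otherwise print when they have to guess:

library(data.table)
setDT(df)  # converts df to a data.table by reference
# reshape to long form, then count the rows for each variable/value pair
dcast(melt(df, measure.vars = names(df)), variable ~ value, fun.aggregate = length)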

【Solution 2】:

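This reshapes the data to long format, counts each variable/score pair, and then spreads the counts back to wide format: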

library(magrittr)
df %>% 
  tidyr::gather(variable, score) %>% 
  dplyr::count(variable, score) %>% 
  tidyr::spread(score, n, fill=0)

Result:

# A tibble: 3 x 6
  variable   `0`   `1`   `2`   `3`   `4`
  <chr>    <dbl> <dbl> <dbl> <dbl> <dbl>
1 a           19    18    20    18    25
2 b           30    23    19    28     0
3 c           28    33    39     0     0
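The same long, count, wide idea can probably also be written with the newer tidyr reshaping verbs. This is only a sketch, assuming dplyr and a tidyr version that provides pivot_longer() and pivot_wider() with the `names_sort` argument; the name `freq_wide` is illustrative.

```r
library(dplyr)
library(tidyr)

set.seed(6)
df <- data.frame(a = floor(runif(100) * 5),
                 b = floor(runif(100) * 4),
                 c = floor(runif(100) * 3))

freq_wide <- df %>%
  # long format: one row per (variable, score) observation
  pivot_longer(everything(), names_to = "variable", values_to = "score") %>%
  # count how often each score occurs within each variable
  count(variable, score) %>%
  # wide format again: one column per score, missing combinations become 0
  pivot_wider(names_from = score, values_from = n,
              values_fill = 0L, names_sort = TRUE)

freq_wide
```

Using `0L` as the fill value keeps the count columns as integers rather than converting them to doubles.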

【Solution 3】:

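We can also unlist the dataset into a single vector and apply table to it together with the column names, each replicated once per row: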

table(rep(names(df), each = nrow(df)), unlist(df))

#     0  1  2  3  4    
#  a 19 18 20 18 25
#  b 30 23 19 28  0
#  c 28 33 39  0  0
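To see why the two vectors line up, here is a small sketch on a made-up three-row data frame. `toy`, `var_names`, and `values` are hypothetical names used only for illustration.

```r
toy <- data.frame(x = c(0, 1, 1), y = c(2, 2, 0))

# unlist() concatenates the columns one after another: 0 1 1 2 2 0
values <- unlist(toy)

# Each column name is repeated once per row: "x" "x" "x" "y" "y" "y"
var_names <- rep(names(toy), each = nrow(toy))

# Cross-tabulating the two vectors gives one row per variable;
# values that never occur in a variable get a count of 0
freq <- table(var_names, values)
freq
```

Because both vectors follow the same column-by-column order, every value is paired with the name of the column it came from.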
