【Posted】: 2018-11-29 05:56:50
【Question】:
I have the following data frame:
df2 <-
structure(list(A = c(4, 5, 3, 3, 4, 4, 4, 5, 5, 4),
B = c(4, 5, 4, 4, 4, 4, 3, 5, 5, 4),
C = c(4, 5, 3, 4, 2, 4, 2, 5, 5, 4),
D = c(4, 5, 0, 0, 1, 4, 0, 0, 0, 0),
E = c(4, 5, 4, 4, 4, 4, 2, 5, 5, 5),
F = c(5, 5, 4, 4, 4, 4, 2, 5, 4),
G = c(5, 5, 4, 4, 2, 4, 2, 5, 5, 5),
H = c(5, 5, 4, 4, 3, 4, 3, 5, 5, 4),
K = c(5, 5, 4, 4, 3, 4, 2, 5, 5, 5),
L = c(5, 5, 4, 4, 3, 4, 2, 5, 5, 5)),
.Names = c("A", "B", "C", "D", "E", "F", "G", "H", "K", "L"),
row.names = c(NA, -10L),
class = c("tbl_df", "tbl", "data.frame"))
But for some reason the NA is not taken into account when I run this:
library(dplyr)
library(tidyr)
df2 %>%
  gather(Type) %>%
  group_by(Type) %>%
  summarise_all(funs(mean(., na.rm = TRUE), sd(., na.rm = TRUE), n(),
                     n1 = sum(!is.na(.)), n2 = sum(is.na(.))))
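The result, with the NA not accounted for:
None of n(), sum(!is.na(.)), or sum(is.na(.)) gives the correct result (I know the last two are complements of each other; I included both just to be sure).
【Comments】: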
-
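It looks like the mean does not account for the NA either. I calculated column "F" by hand: the sum is 37, which divided by 9 should give 4.111, but the computed result is 3.7.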
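The arithmetic in this comment can be checked directly in base R. This is only a sketch using the nine values of F from the question. The last line shows that the reported 3.7 equals 37 / 10, which is consistent with the sum being divided by ten rows rather than nine values; whether that is what happens internally is an assumption, not something the thread confirms.

```r
# Column F exactly as given in the question (only 9 values)
f <- c(5, 5, 4, 4, 4, 4, 2, 5, 4)

sum(f)         # sum of the 9 values
length(f)      # number of values actually present
mean(f)        # sum divided by 9, the value expected in the comment
sum(f) / 10    # sum divided by 10 rows, matching the reported 3.7
```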
-
Try defining F like this in structure(): F = c(5, 5, 4, 4, 4, 4, 2, 5, 4, NA)
-
If your columns have unequal lengths, as @ANG mentioned, strange things will happen.
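To illustrate the suggestion, here is a minimal base R sketch with F padded by an explicit NA so that it has ten values. It uses only columns A and F from the question, and the names df_fixed and summary_stats are made up for this example. It computes the same five statistics as the question's pipeline, but with sapply() instead of dplyr, so it is a stand-in rather than the asker's exact code.

```r
# Two columns from the question; F now has an explicit NA as its 10th value
df_fixed <- data.frame(
  A = c(4, 5, 3, 3, 4, 4, 4, 5, 5, 4),
  F = c(5, 5, 4, 4, 4, 4, 2, 5, 4, NA)
)

# One row per column, with the same statistics the question asks for
summary_stats <- t(sapply(df_fixed, function(x) c(
  mean = mean(x, na.rm = TRUE),  # mean over non-missing values only
  sd   = sd(x, na.rm = TRUE),    # standard deviation over non-missing values
  n    = length(x),              # all rows, including the NA
  n1   = sum(!is.na(x)),         # non-missing count
  n2   = sum(is.na(x))           # missing count
)))

summary_stats
```

With equal-length columns, the row for F should report 10 rows, 9 non-missing values, and 1 missing value.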
-
Looks like a bug to me. It fails to compute the row dimension correctly, but it computes the column dimension fine. Try:
colMeans(df2[1:10,]) and colMeans(df2[,1:10])
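Before concluding that this is a bug, it may help to check the column lengths. structure() only attaches attributes and does not check that all columns have the same length, so a short column goes unnoticed. The sketch below uses base R and two columns from the question; the names cols, col_lengths, short_cols, padded and df_padded are hypothetical. It finds the short column and pads it with NA before the data frame is built.

```r
# Raw columns as in the question: A has 10 values, F has only 9
cols <- list(
  A = c(4, 5, 3, 3, 4, 4, 4, 5, 5, 4),
  F = c(5, 5, 4, 4, 4, 4, 2, 5, 4)
)

# Length of every column, and the names of those shorter than the longest
col_lengths <- lengths(cols)
short_cols <- names(col_lengths)[col_lengths != max(col_lengths)]
short_cols

# Pad each column to the longest length; assigning length() fills with NA
padded <- lapply(cols, function(x) {
  length(x) <- max(col_lengths)
  x
})

df_padded <- as.data.frame(padded)
colMeans(df_padded, na.rm = TRUE)
```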