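【Posted】: 2021-08-14 15:34:34
【Question】:
I need to run chi-squared tests on several contingency tables and store the results in a data frame. I was thinking of using the tabyl and chisq.test functions. My original dataset consists of patient symptom reports.
A made-up example:
Data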
df <- structure(list(Race = c("White", "Asian", "White", "Asian", "Black",
"Asian", "Black", "White"), Headache = c("No", "No", "Yes", "Yes", "No",
"No", "Yes", "Yes"), Paraesthesias = c("No", "Yes", "Yes", "Yes", "Yes", "No", "No", "No"
), Heartburn = c("Yes", "No", "No", "Yes", "No", "Yes", "Yes", "Yes")), row.names = c(NA,
-8L), class = "data.frame")
print(df)
Getting the desired result by brute force
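library(janitor)   # tabyl(), plus the chisq.test() method for tabyl objects
library(magrittr)  # %>% pipe
headache_p <- chisq.test(df[c(1,2)] %>% tabyl(Race, Headache))$p.value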
paraesthesias_p <- chisq.test(df[c(1,3)] %>% tabyl(Race, Paraesthesias))$p.value
heartburn_p <- chisq.test(df[c(1,4)] %>% tabyl(Race, Heartburn))$p.value
data.frame("Headache" = headache_p, "Paraesthesias" = paraesthesias_p, "Heartburn" = heartburn_p, row.names = "p.value")
Attempt to get the desired result with a loop
y <- list()
for (i in 2:4) {
z <- chisq.test(df[c(1, i)] %>% tabyl(Race, colnames(df[i]), show_na = FALSE))
y <- c(y, z)
}
setNames(data.frame(y, row.names = "p.value"), colnames(df)[-1])
Error message
Error: Can't extract columns that don't exist.
x Column `Headache` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.
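Question
How can I write a for loop for this process? My original dataset has more than 60 symptoms, so a loop is needed. I don't know how to pass the column name into the pipe, because it is treated as a character string rather than an object.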
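One possible way to loop, sketched under an assumption: it swaps tabyl for base R's table(), which takes plain vectors, so each column can be looked up from its name as a string with df[[col]]. The names symptoms, p_values and result are made up for this illustration. For these tables the p-values should match the brute-force version, but that is worth checking on the real data. Small expected counts make chisq.test() warn about the approximation, so the warning is suppressed here.

```r
# Same toy data as in the question.
df <- data.frame(
  Race = c("White", "Asian", "White", "Asian", "Black", "Asian", "Black", "White"),
  Headache = c("No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"),
  Paraesthesias = c("No", "Yes", "Yes", "Yes", "Yes", "No", "No", "No"),
  Heartburn = c("Yes", "No", "No", "Yes", "No", "Yes", "Yes", "Yes")
)

# Every column except Race is treated as a symptom (hypothetical helper name).
symptoms <- setdiff(names(df), "Race")

# Pre-allocate a named numeric vector, one slot per symptom.
p_values <- setNames(numeric(length(symptoms)), symptoms)

for (col in symptoms) {
  # df[[col]] extracts a column from its name given as a character string.
  tab <- table(df$Race, df[[col]])
  # Assumption: the small-sample approximation warning is acceptable here.
  p_values[col] <- suppressWarnings(chisq.test(tab))$p.value
}

# One-row data frame with a column per symptom, as in the brute-force version.
result <- as.data.frame(as.list(p_values))
rownames(result) <- "p.value"
print(result)
```

With 60+ symptoms only the symptoms vector changes. table() drops NA values by default, which roughly corresponds to show_na = FALSE in the original attempt.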
【Comments】:
Tags: r for-loop binary-data janitor