【Posted】: 2017-09-21 10:47:40
【Question】:
I have the following data frame:
species <- c("a","a","a","b","b","b","c","c","c","d","d","d","e","e","e","f","f","f","g","h","h","h","i","i","i")
category <- c("h","l","m","h","l","m","h","l","m","h","l","m","h","l","m","h","l","m","l","h","l","m","h","l","m")
minus <- c(31,14,260,100,70,200,91,152,842,16,25,75,60,97,300,125,80,701,104,70,7,124,24,47,251)
plus <- c(2,0,5,0,1,1,4,4,30,1,0,0,2,0,5,0,0,3,0,0,0,0,0,0,4)
# data.frame() keeps minus and plus numeric; cbind() would coerce every column to character
df <- data.frame(species, category, minus, plus)
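I would like to run a chisq.test for every pair of categories within each species, like this:
species a, categories h and l: p-value
species a, categories h and m: p-value
species a, categories l and m: p-value
species b, ... and so on
using the following chisq.test (dummy code):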
chisq.test(c(minus(cat1, cat2),plus(cat1, cat2)))$p.value
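For one concrete pair, the dummy call above might look like the sketch below. It assumes the intended test is a 2 x 2 contingency test (rows = the two categories, columns = minus and plus); passing a plain vector to chisq.test would run a goodness-of-fit test instead. The names minus_pair, plus_pair and tab are made up for illustration.

```r
# Counts for species "a", categories "h" and "l", copied from the data above
minus_pair <- c(h = 31, l = 14)
plus_pair  <- c(h = 2,  l = 0)

# Assumed layout: one row per category, one column per count type
tab <- cbind(minus = minus_pair, plus = plus_pair)

# Small expected counts trigger an approximation warning, silenced here
p <- suppressWarnings(chisq.test(tab)$p.value)
p
```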
I would like to end up with a table showing the chisq.test p-value for each comparison, like this:
Species Category1 Category2 p-value
a h l 0.05
a h m 0.2
a l m 0.1
b...
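where Category1 and Category2 are the two categories compared in the chisq.test.
Is this possible with dplyr? I have tried adapting what is mentioned here and here, but as far as I can see they do not really apply to this problem.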
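To make the target concrete, here is a minimal sketch of one way to build that table. It uses base R (split, combn, rbind) rather than dplyr, and it assumes each test is a 2 x 2 contingency test on the minus/plus counts of the two categories. pairwise_within is a hypothetical helper, not something from a package. Pairs where both plus counts are 0 are expected to give NaN instead of a p-value, and species g is skipped because it has only one category.

```r
species  <- c("a","a","a","b","b","b","c","c","c","d","d","d","e","e","e",
              "f","f","f","g","h","h","h","i","i","i")
category <- c("h","l","m","h","l","m","h","l","m","h","l","m","h","l","m",
              "h","l","m","l","h","l","m","h","l","m")
minus <- c(31,14,260,100,70,200,91,152,842,16,25,75,60,97,300,125,80,701,
           104,70,7,124,24,47,251)
plus  <- c(2,0,5,0,1,1,4,4,30,1,0,0,2,0,5,0,0,3,0,0,0,0,0,0,4)
df <- data.frame(species, category, minus, plus, stringsAsFactors = FALSE)

# Hypothetical helper: every pairwise test among the rows of one species
pairwise_within <- function(d) {
  if (nrow(d) < 2) return(NULL)            # nothing to compare
  pairs <- combn(nrow(d), 2)               # each column holds two row indices
  do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
    rows <- d[pairs[, k], ]
    tab  <- as.matrix(rows[, c("minus", "plus")])
    data.frame(Species   = rows$species[1],
               Category1 = rows$category[1],
               Category2 = rows$category[2],
               p.value   = suppressWarnings(chisq.test(tab)$p.value),
               stringsAsFactors = FALSE)
  }))
}

per_species <- lapply(split(df, df$species), pairwise_within)
result <- do.call(rbind, Filter(Negate(is.null), per_species))
rownames(result) <- NULL
result
```

The same idea should carry over to dplyr by grouping on species and applying the helper to each group.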
Edit: I would also like to see how this can be done for the following dataset:
species <- c(1:11)
minus <- c(132,78,254,12,45,76,89,90,100,42,120)
plus <- c(1,2,0,0,0,3,2,5,6,4,0)
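I would like to run a chisq.test comparing each species in the table with every other species (pairwise comparisons between all species). I would like to get a result like this: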
species1 species2 p-value
1 2 0.5
1 3 0.7
1 4 0.2
...
11 10 0.02
I tried changing the code above to the following:
species_chisq %>%
do(data_frame(species1 = first(.$species),
species2 = last(.$species),
data = list(matrix(c(.$minus, .$plus), ncol = 2)))) %>%
mutate(chi_test = map(data, chisq.test, correct = FALSE)) %>%
mutate(p.value = map_dbl(chi_test, "p.value")) %>%
ungroup() %>%
select(species1, species2, p.value)
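However, this only produced a table in which each species is compared with itself rather than with the other species. I do not quite understand where the original code given by @ycw specifies which comparisons are made.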
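For what it is worth, the pairs have to be enumerated explicitly somewhere. A minimal sketch of that step, in base R with combn rather than the dplyr pipeline above, could look like this. It assumes a 2 x 2 contingency test per pair and keeps correct = FALSE from the attempt above. It yields each unordered pair once (55 rows), so 10 vs 11 appears but not 11 vs 10, and pairs where both plus counts are 0 are expected to give NaN.

```r
species <- 1:11
minus <- c(132, 78, 254, 12, 45, 76, 89, 90, 100, 42, 120)
plus  <- c(1, 2, 0, 0, 0, 3, 2, 5, 6, 4, 0)

# Every unordered pair of positions in the vectors, one pair per row
pairs <- t(combn(seq_along(species), 2))

# One test per pair: rows = the two species, columns = minus and plus
p.value <- apply(pairs, 1, function(idx) {
  tab <- cbind(minus[idx], plus[idx])
  suppressWarnings(chisq.test(tab, correct = FALSE)$p.value)
})

result <- data.frame(species1 = species[pairs[, 1]],
                     species2 = species[pairs[, 2]],
                     p.value  = p.value)
head(result)
```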
Edit 2:
I managed to do this with code I found here.
【Discussion】:
Tags: r dataframe chi-squared