【Posted】:2018-01-07 22:39:34
【Problem description】:
library(tidyverse)
library(purrr)
Using the sample data below, I can create the following function:
Funs <- function(DF, One, Two){
  One <- enquo(One)
  Two <- enquo(Two)
  DF %>% filter(School == (!!One) & Code == (!!Two)) %>%
    group_by(Code, School) %>%
    summarise(Count = sum(Question1))
}
I can then use the function to filter on two variables, School and Code, like this:
Funs(DF, "School1", "B344")
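This works fine, but my real data has many variables, so instead of repeatedly typing the "School" and "Code" values into the function, I want to use the tidyverse and purrr packages to loop over two lists (one of schools, one of codes) and feed them into the filter. I want the output to be a list of results.
For simplicity, the two lists fed into dplyr::filter each have only two values: School2 goes with S300 and School1 goes with B344, just like the example above.
Some of the things I have tried: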
map2(c(“School2”, ”School1”),
c(“S300”, ”B344”),
function(x,y) {
DF %>% filter(School == .x & Code == .y) %>%
group_by(Code, School) %>%
summarise(Count = sum(Question1))
}
And also...
map2(c("School2", "School1")),
c("S300","B344"),
~filter(School == .x & Code == .y) %>%
group_by(Code, School)%>%
summarise(Count = sum(Question1))
And this one...
list(c("School2", "School1"), c("S300", "B344")) %>%
map2( ~ filter(School == .x & Code == .y) %>%
group_by(Code, School) %>%
summarise(Count = sum(Question1)))
None of these seem to work, so any help would be appreciated!
Sample data:
Code <- c("B344","B555","S300","T220","B888","B888","B555","B344","B344","T220","B555","B555","S300","B555","S300","S300","S300","S300","B344","B344","B888","B888","B888")
School <- c("School1","School1","School2","School3","School4","School4","School1","School1","School3","School3","School4","School1","School1","School3","School2","School2","School4","School2","School3","School4","School3","School1","School2")
Question1 <- c(3,4,5,4,5,5,5,4,5,3,4,5,4,5,4,3,3,3,4,5,4,3,3)
Question2 <- c(5,4,3,4,3,5,4,3,2,3,4,5,4,5,4,3,4,4,5,4,3,3,4)
DF <- data_frame(Code, School, Question1, Question2)
【Discussion】:
-
You can do something like
map2(c("School2", "School1"), c("S300", "B344"), ~DF %>% filter(School == .x, Code == .y) %>% group_by(Code, School) %>% summarise(Count = sum(Question1)))
but that seems really pointless; something like
DF %>% filter(paste(School, Code) %in% paste(c("School2", "School1"), c("S300", "B344"))) %>% group_by(Code, School) %>% summarise(Count = sum(Question1))
is much easier.
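For readers who want to run the two suggestions in this comment, here is a minimal sketch. It rebuilds the three relevant columns of the question's sample data with tibble() so the block is self-contained; the names schools, codes, res_list and res_df are illustrative and do not appear in the thread, and the .groups argument assumes dplyr 1.0.0 or later. It also shows what the attempts in the question were missing: filter() has to be given DF as its first input, and .x / .y only exist inside a ~ formula, not inside function(x, y).

```r
library(dplyr)
library(purrr)

# Sample data from the question (Question2 omitted; it is not used here)
DF <- tibble(
  Code = c("B344","B555","S300","T220","B888","B888","B555","B344","B344","T220","B555","B555",
           "S300","B555","S300","S300","S300","S300","B344","B344","B888","B888","B888"),
  School = c("School1","School1","School2","School3","School4","School4","School1","School1",
             "School3","School3","School4","School1","School1","School3","School2","School2",
             "School4","School2","School3","School4","School3","School1","School2"),
  Question1 = c(3,4,5,4,5,5,5,4,5,3,4,5,4,5,4,3,3,3,4,5,4,3,3)
)

schools <- c("School2", "School1")  # illustrative names, not from the thread
codes   <- c("S300", "B344")

# Suggestion 1: one summary per (school, code) pair, returned as a list.
# DF is piped into filter(); .x and .y are the paired elements of schools and codes.
res_list <- map2(schools, codes, ~ DF %>%
  filter(School == .x, Code == .y) %>%
  group_by(Code, School) %>%
  summarise(Count = sum(Question1), .groups = "drop"))

# Suggestion 2: a single filter on the pasted pair, returning one tibble.
res_df <- DF %>%
  filter(paste(School, Code) %in% paste(schools, codes)) %>%
  group_by(Code, School) %>%
  summarise(Count = sum(Question1), .groups = "drop")
```

With the sample data, both forms give a Count of 15 for School2/S300 and 7 for School1/B344; the first returns a list of two one-row tibbles, the second a single two-row tibble.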
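It looks like your first suggestion is what I was after. I now realise that using map2_df would probably be better. Also, I'm better off creating two lists, e.g. list1…%filter(School==.x,Code==.y)%>%group_by(Code,School)%>%summarise(Count=sum(Question1)))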
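The map2_df idea mentioned in this comment could look roughly like the sketch below. The comment's own code is partly lost, so this is a reconstruction under assumptions rather than the commenter's exact code: list1 is the name used in the comment, while list2 and out are assumed names. map2_df() row-binds the per-pair results into one tibble; in purrr 1.0.0 and later it still works but is superseded in favour of map2() followed by list_rbind().

```r
library(dplyr)
library(purrr)

# Sample data from the question (Question2 omitted; it is not used here)
DF <- tibble(
  Code = c("B344","B555","S300","T220","B888","B888","B555","B344","B344","T220","B555","B555",
           "S300","B555","S300","S300","S300","S300","B344","B344","B888","B888","B888"),
  School = c("School1","School1","School2","School3","School4","School4","School1","School1",
             "School3","School3","School4","School1","School1","School3","School2","School2",
             "School4","School2","School3","School4","School3","School1","School2"),
  Question1 = c(3,4,5,4,5,5,5,4,5,3,4,5,4,5,4,3,3,3,4,5,4,3,3)
)

list1 <- c("School2", "School1")  # name taken from the comment
list2 <- c("S300", "B344")        # assumed name; the original is lost

# Same per-pair pipeline as map2(), but the results are row-bound into one tibble
out <- map2_df(list1, list2, ~ DF %>%
  filter(School == .x, Code == .y) %>%
  group_by(Code, School) %>%
  summarise(Count = sum(Question1), .groups = "drop"))  # .groups needs dplyr >= 1.0.0
```

Rows come back in the order of the input pairs, so School2/S300 is first and School1/B344 second.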
-
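If you post it as an official answer, I can accept your first reply. Also, I'm curious why you think the second suggestion is better? The output may be nicer, but I could use map2_df...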
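The thread does not record a reply to this question. One common argument for the single-filter form is that it makes one pass over DF and returns one tibble, instead of filtering once per pair. A related option, not proposed anywhere in the thread, is to keep the wanted pairs in a small lookup table and use dplyr::semi_join(), which avoids building pasted keys. A minimal sketch under those assumptions (pairs and res are hypothetical names):

```r
library(dplyr)

# Sample data from the question (Question2 omitted; it is not used here)
DF <- tibble(
  Code = c("B344","B555","S300","T220","B888","B888","B555","B344","B344","T220","B555","B555",
           "S300","B555","S300","S300","S300","S300","B344","B344","B888","B888","B888"),
  School = c("School1","School1","School2","School3","School4","School4","School1","School1",
             "School3","School3","School4","School1","School1","School3","School2","School2",
             "School4","School2","School3","School4","School3","School1","School2"),
  Question1 = c(3,4,5,4,5,5,5,4,5,3,4,5,4,5,4,3,3,3,4,5,4,3,3)
)

# Hypothetical lookup table: one row per wanted (School, Code) pair
pairs <- tibble(
  School = c("School2", "School1"),
  Code   = c("S300", "B344")
)

# semi_join() keeps only the rows of DF whose (School, Code) appears in pairs
res <- DF %>%
  semi_join(pairs, by = c("School", "Code")) %>%
  group_by(Code, School) %>%
  summarise(Count = sum(Question1), .groups = "drop")  # .groups needs dplyr >= 1.0.0
```

Whether this is preferable to map2_df depends on whether a single table or a list of separate results is wanted.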
Tags: r dplyr tidyverse purrr rlang