【Posted】:2016-11-28 04:12:42
【Question】:
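I have the two simple data frames below. I want to use dplyr and the tidyverse to find the categories in "Task2" of the second data frame (Df2) that do not appear in "Task" of the first data frame (Df). I would like to use dplyr's "setdiff" function for this. I also want to keep the corresponding times from the "Time" column of the second data frame (Df2).
So the final result should contain two rows: one for client "Chris" with "Iron shirt" and a total time of 30, and one for client "Eric" with "Buy groceries" and a corresponding time of 8.
I would also like to drop the Date column.
One approach I was considering is to use dplyr's "setdiff" function (I realize the Task and Task2 column names would have to be changed so that they match) to isolate the two rows, and then use a join function to bring the time totals back in.
Finally, I would like this to be a custom function, since I will have to repeat this task. I want a function like "Differences(Df1,Df2)"... so I can pass in two data frames and get the result.
I hope this is not asking too much! I am new to custom functions, especially ones that involve dplyr and pipes.
Hope someone can help me!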
CaseWorker<-c("John","John","Kim")
Client<-c("Chris","Chris","Eric")
Task<-c("Feed cat","Make dinner","Do homework")
Date<-c("10/27/2016","09/22/2016","10/11/2016")
Df<-data.frame(CaseWorker,Client,Date,Task)
The second data frame...
CaseWorker<-c("John","John","John","John","John","John","John","John","John",
"John","Kim","Kim","Kim")
Client<-c("Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Eric","Eric","Eric")
Date<-c("11/10/2016","10/10/2016","11/13/2016","09/18/2016","11/11/2016","09/19/2016","08/08/2016","10/10/2016","08/05/2016","11/12/2016","09/09/2016","11/11/2016","09/10/2016")
Task2<-c("Feed cat","Feed cat","Feed cat","Feed cat","Feed cat","Make dinner","Make dinner","Make dinner","Iron shirt","Iron shirt","Do homework",
"Do homework","Buy groceries")
Time<-c(20,34,11,10,5,6,55,30,20,10,12,10,8)
Df2<-data.frame(CaseWorker,Client,Date,Task2,Time)
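For reference, here is a rough sketch of what such a function might look like. It is only one possible approach, not a definitive answer: it swaps the setdiff-then-join idea for dplyr's anti_join(), which does both steps at once, and it assumes that matching on the task name alone (ignoring Client) is what is wanted. The function name Differences() comes from the question; the argument names df1 and df2 are made up for illustration. The `.groups` argument assumes dplyr 1.0.0 or later. The data frames are rebuilt inside the block so that it runs on its own.

```r
library(dplyr)

# Rebuild the two example data frames (same values as above).
Df <- data.frame(
  CaseWorker = c("John", "John", "Kim"),
  Client     = c("Chris", "Chris", "Eric"),
  Date       = c("10/27/2016", "09/22/2016", "10/11/2016"),
  Task       = c("Feed cat", "Make dinner", "Do homework"),
  stringsAsFactors = FALSE
)

Df2 <- data.frame(
  CaseWorker = rep(c("John", "Kim"), times = c(10, 3)),
  Client     = rep(c("Chris", "Eric"), times = c(10, 3)),
  Date       = c("11/10/2016", "10/10/2016", "11/13/2016", "09/18/2016",
                 "11/11/2016", "09/19/2016", "08/08/2016", "10/10/2016",
                 "08/05/2016", "11/12/2016", "09/09/2016", "11/11/2016",
                 "09/10/2016"),
  Task2      = rep(c("Feed cat", "Make dinner", "Iron shirt",
                     "Do homework", "Buy groceries"),
                   times = c(5, 3, 2, 2, 1)),
  Time       = c(20, 34, 11, 10, 5, 6, 55, 30, 20, 10, 12, 10, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: df1 must have a Task column; df2 must have
# CaseWorker, Client, Task2 and Time columns.
Differences <- function(df1, df2) {
  df2 %>%
    # Keep only rows of df2 whose Task2 has no matching Task in df1.
    anti_join(df1, by = c("Task2" = "Task")) %>%
    # Sum the time per case worker, client and task; Date is dropped
    # because it is neither a grouping column nor summarised.
    group_by(CaseWorker, Client, Task2) %>%
    summarise(Time = sum(Time), .groups = "drop")
}

result <- Differences(Df, Df2)
# Expect two rows: Chris / Iron shirt / 30 and Eric / Buy groceries / 8.
print(result)
```

If only the task names are needed, setdiff(Df2$Task2, Df$Task) returns them as a plain vector; anti_join() is used here because it keeps the other columns of Df2, so the times do not have to be joined back in afterwards.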
【Comments】: