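【Posted】: 2018-06-25 20:11:49
【Question】:
I am trying to match two columns of one data frame against another data frame, and I want the returned value to be the name of the first column in the second data frame where the two initial columns have the same value.
For example, I want to take the following data frame: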
Fasta<-c("X1","X1","X2","X2","X3","X3")
Species<-c("Kiwi","Chicken","Weta","Cricket","Tuatara","Gecko")
testdata<-as.data.frame(cbind(Fasta,Species))
testdata<-aggregate(Species ~ Fasta, testdata, I)
Fasta  Species1  Species2
X1     Kiwi      Chicken
X2     Weta      Cricket
X3     Tuatara   Gecko
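One detail worth noting: `aggregate(..., I)` stores the grouped values as a single two-column matrix named `Species`, so columns literally called `Species1` and `Species2` do not exist yet. The following is a minimal sketch of one way to flatten that matrix into ordinary columns; the names `long`, `agg`, and `wide` are hypothetical and only used for illustration.

```r
# Rebuild the long-format data (character columns, not factors).
Fasta   <- c("X1", "X1", "X2", "X2", "X3", "X3")
Species <- c("Kiwi", "Chicken", "Weta", "Cricket", "Tuatara", "Gecko")
long    <- data.frame(Fasta, Species, stringsAsFactors = FALSE)

# Group by Fasta; Species becomes a matrix column with one row per Fasta.
agg <- aggregate(Species ~ Fasta, long, I)

# Pull the matrix columns out into ordinary data frame columns.
# Assumes every Fasta has exactly two species, as in the example.
wide <- data.frame(
  Fasta    = agg$Fasta,
  Species1 = agg$Species[, 1],
  Species2 = agg$Species[, 2],
  stringsAsFactors = FALSE
)
```

With the flattened version, `wide$Species1` and `wide$Species2` can be passed directly to vectorised helpers such as `mapply`.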
My second data frame is as follows:
Species<-c("Kiwi","Chicken","Weta","Cricket","Frog","Gecko")
Genus<-c("Orn","Norn","Genus2","Genus2","Spec","NoSpec")
Order<-c("Bird","Bird","Order2","Order2","Norder","Geckn")
Kingdom<-rep("Animal",each=6)
lookup<-data.frame(cbind(Species,Genus,Order,Kingdom))
Species  Genus   Order   Kingdom
Kiwi     Orn     Bird    Animal
Chicken  Norn    Bird    Animal
Weta     Genus2  Order2  Animal
Cricket  Genus2  Order2  Animal
Frog     Spec    Norder  Animal
Gecko    NoSpec  Geckn   Animal
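I want to find the first column in the second data frame in which Species1 and Species2 have the same value, and return that column's name. Ideally, this would give me the following output: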
Fasta  Species1  Species2  MatchLevel
X1     Kiwi      Chicken   Order
X2     Weta      Cricket   Genus
X3     Tuatara   Gecko     Kingdom
I am open to having the data in a different format.
【Comments】:
-
testdata$MatchLevel <- mapply(function(s1, s2){names(lookup)[which(unlist(lookup[s1 == lookup$Species, ]) == unlist(lookup[s2 == lookup$Species, ]))[1]]}, testdata$Species1, testdata$Species2), though I suspect there are more elegant options
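The one-liner in the comment can be unpacked into a small helper for readability. This is a minimal sketch rather than the commenter's exact code: the helper name `match_level` is hypothetical, it assumes `Species1` and `Species2` are ordinary character columns, and it returns `NA` when either species is missing from `lookup` (as Tuatara is here) instead of the `Kingdom` shown in the desired output.

```r
# Example data in wide form with ordinary Species1/Species2 columns.
testdata <- data.frame(
  Fasta    = c("X1", "X2", "X3"),
  Species1 = c("Kiwi", "Weta", "Tuatara"),
  Species2 = c("Chicken", "Cricket", "Gecko"),
  stringsAsFactors = FALSE
)

lookup <- data.frame(
  Species = c("Kiwi", "Chicken", "Weta", "Cricket", "Frog", "Gecko"),
  Genus   = c("Orn", "Norn", "Genus2", "Genus2", "Spec", "NoSpec"),
  Order   = c("Bird", "Bird", "Order2", "Order2", "Norder", "Geckn"),
  Kingdom = rep("Animal", 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: name of the first lookup column (left to right)
# in which both species share a value. Returns NA if either species is
# absent from lookup or if no column matches.
match_level <- function(s1, s2, lookup) {
  i <- match(s1, lookup$Species)
  j <- match(s2, lookup$Species)
  if (is.na(i) || is.na(j)) return(NA_character_)
  same <- unlist(lookup[i, ]) == unlist(lookup[j, ])
  names(lookup)[which(same)[1]]
}

# Apply the helper row by row.
testdata$MatchLevel <- mapply(
  match_level, testdata$Species1, testdata$Species2,
  MoreArgs = list(lookup = lookup), USE.NAMES = FALSE
)
```

If every species is known to belong to the same kingdom, returning `"Kingdom"` instead of `NA_character_` for unknown species would reproduce the desired result for X3, but that is an assumption about the data, not something `lookup` can confirm.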
Tags: r statistics bioinformatics