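【Posted】: 2014-09-14 18:59:17
【Question】:
Using the data frame below, called "data", I can subset directly on fixed values of the two variables "state" and "measure" and identify the school with the lowest score in that subset:
Create the data frame "data":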
school<-c("NYU", "BYU", "USC", "FIT", "Oswego","UCLA","USF","Columbia")
state<-c("NY","UT","CA","NY","NY","CA", "CA","NY")
measure<-c("MSAT","MSAT","GPA","MSAT","MSAT","GPA","GPA","GPA")
score<-c(590, 490, 2.9, 759, 550, 1.2, 3.1, 3.2)
data<-data.frame(school,state, measure,score)
Subset on "state" and "measure":
answer<-subset(data,subset=(state=="NY" & measure=="MSAT"))
order.answer<-order(answer$score,answer$school) #answer$school is tie-breaker
answer1<-as.matrix(answer[order.answer,])
answer1[1,1]
This gives the correct answer:
[1] "Oswego"
My problem is that when I write a function to do the same thing, I get an incorrect result:
lowest <- function(state, measure){
answer<-subset(data,subset=(state==state & measure==measure))
order.answer<-order(answer$score,answer$school)
answer1<-as.matrix(answer[order.answer,])
answer1[1,1]
}
lowest("NY","MSAT")
Incorrect answer:
[1] "UCLA"
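The problem seems to be that, in the subset line of the function, the variables "state" and "measure" do not take the values of the arguments "NY" and "MSAT". I have tried using '=' instead of '==', and I have also tried subset(data,subset=(state=="state" & measure=="measure")), but I cannot find a solution.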
【Discussion】:
Tags: r function subset argument-passing
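One likely explanation, offered as a sketch rather than a definitive answer: inside subset(), names in the condition are looked up among the columns of the data frame first, so state==state compares the column with itself and is TRUE for every row. The function arguments are never consulted, which is why the lowest score in the whole data frame ("UCLA") comes back. Giving the arguments names that differ from the column names, and indexing with data$... explicitly, avoids the clash. The argument names target_state and target_measure below are hypothetical, chosen only for illustration, and stringsAsFactors = FALSE is an added assumption so that "school" is stored as character.

```r
# Same data frame as in the question.
school  <- c("NYU", "BYU", "USC", "FIT", "Oswego", "UCLA", "USF", "Columbia")
state   <- c("NY", "UT", "CA", "NY", "NY", "CA", "CA", "NY")
measure <- c("MSAT", "MSAT", "GPA", "MSAT", "MSAT", "GPA", "GPA", "GPA")
score   <- c(590, 490, 2.9, 759, 550, 1.2, 3.1, 3.2)
# stringsAsFactors = FALSE is an assumption added here, not part of the question.
data <- data.frame(school, state, measure, score, stringsAsFactors = FALSE)

# Inside subset(), both sides of `state == state` resolve to the column,
# so the condition is TRUE everywhere and no row is dropped.
nrow(subset(data, state == state & measure == measure))

# Hypothetical argument names that do not clash with any column name.
lowest <- function(target_state, target_measure) {
  # Explicit data$... makes it unambiguous which side is the column.
  keep <- data$state == target_state & data$measure == target_measure
  answer <- data[keep, ]
  # Sort by score; school name is the tie-breaker.
  answer <- answer[order(answer$score, answer$school), ]
  as.character(answer$school[1])
}

lowest("NY", "MSAT")
```

With the question's data, lowest("NY", "MSAT") should return "Oswego", matching the direct subset. If no row matches the given state and measure, this sketch returns NA rather than raising an error.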