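[Posted]: 2016-04-16 13:38:26
[Question]:
I want to subset the data frame project that I am working with, using logical conditions, and I am getting a contradictory result. The logical part before the ROLL.NO. condition is not relevant to the problem. Sorry that I cannot give a reproducible example. Please tell me how to make this question reproducible without showing all 393 entries of the relevant columns of my data frame. D14 and DC31 are simple integer values, some of which are NA.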
culprits<-project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)]
culprits
[1] 3138 3129 3129 3135 3135 3136 3120 3126 3133 3125 3125 3125 3132 3132 3123 3123 3131
project$HOUSE.NO[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3131]
[1] "14/132" "14/176" "16/133" "14/111" "14/252"
> project$HOUSE.NO[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3129]
[1] "14/132" "15/162" "14/176" "16/133" "14/111"
> project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3136]
[1] 3129 3136 3120 3123 3123
project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3125]
[1] 3129 3120 3125 3125 3125 3123 3123
project$ROLL.NO.[project$ROLL.NO.==3136]
[1] 3136 3136 3136 3136 3136 3136 3136 3136 3136
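I am trying to understand what is wrong with my code, and I have included the results of these queries. Since project$ROLL.NO.==3136 is FALSE for every other ROLL.NO., I do not see why other ROLL.NO. values are returned when that condition is combined with the others using &. Also, the same three entries are wrongly repeated whichever ROLL.NO. is requested. There are no NA values in the ROLL.NO. column, and the logical vectors for each condition have the same length, so there is no recycling. Please let me know if any other information is needed.
Appendix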
project <- structure(list(ROLL.NO. = c(3138L, 3138L, 3138L, 3138L, 3138L,
3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L,
3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3129L, 3129L,
3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L,
3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L,
3129L, 3129L, 3129L, 3121L, 3121L, 3121L, 3121L, 3121L, 3121L
), DC31 = c(2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L,
1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L,
2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
1L, 2L, 2L, 2L, 2L), D14 = c(2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L,
1L, 2L, 1L, 2L, 0L, 1L, 2L, 2L, 0L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L), HOUSE.NO = c("14/274",
"14/259", "14/217", "14/258", "14/306", "14/300", "14/96", "14/166",
"14/69", "14/68", "14/16", "14/93", "14/130", "14/321", "14/324",
"14/139", "14/314", "14/323", "14/208", "14/78", "14/150", "14/155",
"14/102", "14/132", "14/159", "14/163", "14/165", "14/146", "14/148",
"14/104", "14/56", "14/53", "14/99", "14/48", "15/164", "15/148",
"15/158", "15/107", "15/160", "15/162", "15/243", "15/66", "15/249",
"15/86", "14/388", "14/396", "14/431", "14/401", "14/103", "15/36"
)), .Names = c("ROLL.NO.", "DC31", "D14", "HOUSE.NO"), row.names = c(NA,
50L), class = "data.frame")
[Comments]:
-
@rawr Added to the question. Not sure whether it includes the anomaly.
-
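So basically you are doing TRUE | TRUE & FALSE and expecting it to be FALSE (but it is TRUE), whereas what you really want is (TRUE | TRUE) & FALSE, which is FALSE?
-
Why would the former be TRUE?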
-
From ?`|`: "See ?Syntax for the precedence of these operators: unlike many other languages (including S) the AND and OR operators do not have the same precedence (the AND operators have higher precedence than the OR operators)." So for TRUE | TRUE & FALSE, TRUE & FALSE is evaluated first and the expression becomes TRUE | FALSE.
-
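The precedence rule quoted in the comments can be checked directly. The following is only a minimal sketch: the data frame toy and its four rows are made up for illustration and are not the asker's data. It shows how a trailing & ROLL.NO. == ... condition attaches only to the second branch of the OR unless the OR is grouped first.

```r
# Scalar case from the comments: & binds tighter than |
a <- TRUE | TRUE & FALSE     # parsed as TRUE | (TRUE & FALSE)
b <- (TRUE | TRUE) & FALSE   # explicit grouping
print(c(a, b))               # TRUE FALSE

# Hypothetical toy data (assumption: stands in for the real 393-row frame)
toy <- data.frame(
  ROLL.NO. = c(3136L, 3129L, 3125L, 3136L),
  DC31     = c(1L, 1L, 2L, 2L),
  D14      = c(2L, 2L, 1L, 2L)
)

# Without outer parentheses the roll filter only restricts the second branch,
# so every row with DC31 == 1 & D14 == 2 is returned whatever its roll number
loose <- toy$ROLL.NO.[(toy$DC31 == 1 & toy$D14 == 2) |
                      (toy$DC31 == 2 & toy$D14 == 1) & toy$ROLL.NO. == 3136]
print(loose)                 # 3136 3129

# Grouping the OR first makes the roll filter apply to every row
mismatch <- (toy$DC31 == 1 & toy$D14 == 2) | (toy$DC31 == 2 & toy$D14 == 1)
strict <- toy$ROLL.NO.[mismatch & toy$ROLL.NO. == 3136]
print(strict)                # 3136
```

Storing the OR in its own variable (mismatch here, a name chosen for this sketch) is one way to avoid relying on precedence at all; wrapping the whole OR in an extra pair of parentheses would have the same effect.
-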
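It does not look like that is the cause here, but just pointing it out since you said you have NAs: you need to test for NA with is.na, because == will not work as expected.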
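-
The remark about NA can be illustrated with a small sketch. The vectors x and roll below are invented for this example; the point is only that a comparison with == yields NA for missing values, and that an NA in a logical index produces an NA element in the result.

```r
# Hypothetical vectors (assumption: not taken from the asker's data)
x    <- c(1L, NA, 2L)
roll <- c(10L, 20L, 30L)

print(x == 1)      # TRUE NA FALSE : the comparison itself is NA
print(is.na(x))    # FALSE TRUE FALSE

# An NA in a logical index yields an NA element
with_na <- roll[x == 1]
print(with_na)     # 10 NA

# Guarding with is.na, or using which(), drops the NA positions
guarded   <- roll[!is.na(x) & x == 1]
via_which <- roll[which(x == 1)]
print(guarded)     # 10
print(via_which)   # 10
```

Note that a guard such as !is.na(x) is itself subject to the precedence rule discussed in the comments, so it only helps if it is combined with the already grouped condition.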