Logical subscript too long: answers

【Question title】: Logical Subscript too long
【Posted】: 2015-04-25 05:36:00
【Question description】:

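I realise this question has been asked before, but after looking through all the answers, each addresses a specific problem, and I could not find one that fits my particular situation.

I entered the following in R. It works for the first example but not for the second, and I do not understand why.

Setting up the data for the glm: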

setwd("P:/STAT319")
ucb2<-read.table('Berkeley.PoissonTwo.txt',header=TRUE)
attach(ucb2)

ucb2 is as follows:

Count   Admit Department    Gender     
313 FALSE     A     Female     
512 TRUE      A     Female     
19  FALSE     A     Male       
89  TRUE      A     Male       
207 FALSE     B     Female     
353 TRUE      B     Female     
8   FALSE     B     Male       
17  TRUE      B     Male       
205 FALSE     C     Female     
120 TRUE      C     Female     
391 FALSE     C     Male       
202 TRUE      C     Male       
279 FALSE     D     Female     
138 TRUE      D     Female     
244 FALSE     D     Male       
131 TRUE      D     Male       
138 FALSE     E     Female     
53  TRUE      E     Female     
299 FALSE     E     Male       
94  TRUE      E     Male       
351 FALSE     F     Female     
22  TRUE      F     Female     
317 FALSE     F     Male       
24  TRUE      F     Male

Using the factor variables TRUE and FALSE for Admit and NotAdmit:

Admit<-c(0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1)
fAdmit<-factor(Admit)
rAdmit<-factor(Admit,labels=c("FALSE","TRUE"))
glm2<-glm(Count~Admit+Department+Gender,family=poisson)
glm2

Preparing for leave-one-out cross-validation:

library(car)
vif(glm2)
# GVIF Df GVIF^(1/(2*Df))
# Admit         1  1               1
# Department    1  5               1
# Gender        1  1               1
step(glm2)
# Start:  AIC=2272.73
# Count ~ Admit + Department + Gender
# 
# Df Deviance    AIC
# <none>            2097.7 2272.7
# - Department  5   2257.2 2422.2
# - Gender      1   2260.6 2433.6
# - Admit       1   2327.7 2500.8
# 
# Call:  glm(formula = Count ~ Admit + Department + Gender, family = poisson)
# 
# Coefficients:
#   (Intercept)        Admit  DepartmentB  DepartmentC  
# 5.82785     -0.45674     -0.46679     -0.01621  
# DepartmentD  DepartmentE  DepartmentF   GenderMale  
# -0.16384     -0.46850     -0.26752     -0.38287  

# Degrees of Freedom: 23 Total (i.e. Null);  16 Residual
# Null Deviance:        2650 
# Residual Deviance: 2098   AIC: 2273

library(ipred)
errorest(Count~Admit+Department+Gender,data=ucb2,model=glm,est.para=control.errorest(k=24))

# Call:
#   errorest.data.frame(formula = Count ~ Admit + Department + Gender, 
#                       data = ucb2, model = glm, est.para = control.errorest(k = 24))
# 
# 24-fold cross-validation estimator of root mean squared error
# 
# Root mean squared error:  180.5741

So the first one processes the data as shown. Now, to do the same study, we have to rearrange the data and perform a logistic regression:

ucb1<-read.table('Monday.Late.txt',header=TRUE)
attach(ucb1)
# The following object is masked _by_ .GlobalEnv:
#   
#   Admit

# The following objects are masked from ucb2:
#   
#   Admit, Department, Gender

y<-cbind(ucb1[,1],ucb1[,2])
glm1<-glm(y~Gender+Department,family=binomial)

The relevant data are as follows:

Admit   NotAdmit    Gender  Department     
512 313 female  a      
353 207 female  b      
120 205 female  c      
138 279 female  d      
53  138 female  e      
22  351 female  f      
89  19  male    a      
17  8   male    b      
202 391 male    c      
131 244 male    d      
94  299 male    e      
24  317 male    f

Setting this new data up for leave-one-out:

vif(glm1)
# GVIF Df GVIF^(1/(2*Df))
# Gender     1.384903  1        1.176819
# Department 1.384903  5        1.033099
step(glm1)
# Start:  AIC=103.14
# y ~ Gender + Department

# Df Deviance    AIC
# - Gender      1    21.74 102.68
# <none>             20.20 103.14
# - Department  5   783.61 856.55
# 
# Step:  AIC=102.68
# y ~ Department
# 
# Df Deviance    AIC
# <none>             21.74 102.68
# - Department  5   877.06 948.00
# 
# Call:  glm(formula = y ~ Department, family = binomial)
# 
# Coefficients:
#   (Intercept)  Departmentb  Departmentc  Departmentd  
# 0.59346     -0.05059     -1.20915     -1.25833  
# Departmente  Departmentf  
# -1.68296     -3.26911  
# 
# Degrees of Freedom: 11 Total (i.e. Null);  6 Residual
# Null Deviance:        877.1 
# Residual Deviance: 21.74  AIC: 102.7

So far, so good, but now the problem appears:

errorest(y~Gender+Department,data=ucb1,model=glm,est.para=control.errorest(k=12))
Error in xj[i, , drop = FALSE] : (subscript) logical subscript too long

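So why is this happening? I tried other values for k, as I am not sure what value k is meant to take. I thought it was meant to be the number of rows.

I then tried the same data, arranged differently: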

ucb1a<-read.table('Berkeley.Rearranged.txt',header=TRUE)
attach(ucb1a)
ucb1a

This is a rearrangement of the previous data:

Admitted Not_Admit Depart Genders
1       512       313      A  Female
2        89        19      A    Male
3       353       207      B  Female
4        17         8      B    Male
5       120       205      C  Female
6       202       391      C    Male
7       138       279      D  Female
8       131       244      D    Male
9        53       138      E  Female
10       94       299      E    Male
11       22       351      F  Female
12       24       317      F    Male

Then:

y<-cbind(ucb1[,1],ucb1[,2])
glm1a<-glm(y~Genders+Depart,family=binomial)
vif(glm1a)
# GVIF Df GVIF^(1/(2*Df))
# Gender     1.384903  1        1.176819
# Department 1.384903  5        1.033099

step(glm1a)
# Start:  AIC=103.14
# y ~ Gender + Department
# 
# Df Deviance    AIC
# - Gender      1    21.74 102.68
# <none>             20.20 103.14
# - Department  5   783.61 856.55
# 
# Step:  AIC=102.68
# y ~ Department
# 
# Df Deviance    AIC
# <none>             21.74 102.68
# - Department  5   877.06 948.00
# 
# Call:  glm(formula = y ~ Department, family = binomial)
# 
# Coefficients:
#   (Intercept)  Departmentb  Departmentc  Departmentd  
# 0.59346     -0.05059     -1.20915     -1.25833  
# Departmente  Departmentf  
# -1.68296     -3.26911  
# 
# Degrees of Freedom: 11 Total (i.e. Null);  6 Residual
# Null Deviance:        877.1 
# Residual Deviance: 21.74  AIC: 102.7

Once again, so far so good, but once again this happens:

errorest(y~Gender+Department,data=ucb1a,model=glm,est.para=control.errorest(k=12))
Error in xj[i, , drop = FALSE] : (subscript) logical subscript too long

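Believe me, I again tried other numbers for k, but I cannot see why this one goes wrong. So if anyone has any ideas about this specific instance of the "(subscript) logical subscript too long" error, please reply.

【Question discussion】:

What is the reason for attaching your datasets?
This is the way we were told to do it in Stats class. I have heard of using the command 'with'. What else would you suggest? Thanks. Chris Lilly.
@ChristopherBrentLilly Did you find a solution? I am facing the same problem too!

Tags: r logical-operators subscript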

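【Solution 1】:

This problem occurs when your objects are of different sizes. I think your problem comes from attach(), but I am not sure. Try the code without it, or try with() instead. As nicola pointed out, you should first check why you need attach() at all. Also, I am not sure what you want to achieve with it.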
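As a minimal sketch of the size mismatch described above (base R only, using a made-up matrix `m` and index `idx` rather than the asker's data), indexing the rows of a 12-row matrix with a 24-element logical vector raises an error of this kind. Whether this is exactly what happens inside errorest() is an assumption and has not been verified here.

```r
# Hypothetical 12 x 2 matrix standing in for a two-column response
m <- matrix(1:24, nrow = 12, ncol = 2)

# A logical row index with 24 entries, twice as many as m has rows
idx <- rep(TRUE, 24)

# Row-subsetting a matrix with an index longer than nrow(m) is an error;
# tryCatch() captures the condition so the script keeps running
res <- tryCatch(m[idx, , drop = FALSE], error = function(e) e)

# Show the captured error message
print(conditionMessage(res))
```

An index of the right length, for example `rep(TRUE, nrow(m))`, subsets without complaint, so comparing `length(idx)` with `nrow(m)` is a quick first check.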

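The function's help page says the following under "Good practice":

attach has the side effect of altering the search path and this can easily lead to the wrong object of a particular name being found. People do often forget to detach databases.

In interactive use, with is usually preferable to the use of attach/detach, unless what is a save()-produced file in which case attach() is a (safe) wrapper for load().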
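To illustrate the help text, here is a small sketch of both points: a global variable masking an attached column, and passing `data =` (or using `with()`) so that the search path is not involved. The data frame is retyped from the 12-row table in the question. The names `ucb`, `fit`, `n_admit`, `n_notadmit` and `total_applicants` are made up for illustration. Whether this alone resolves the errorest() error has not been checked here.

```r
# 12-row table retyped from the question (column names as in ucb1)
ucb <- data.frame(
  Admit      = c(512, 353, 120, 138, 53, 22, 89, 17, 202, 131, 94, 24),
  NotAdmit   = c(313, 207, 205, 279, 138, 351, 19, 8, 391, 244, 299, 317),
  Gender     = factor(rep(c("female", "male"), each = 6)),
  Department = factor(rep(c("a", "b", "c", "d", "e", "f"), times = 2))
)

# A leftover global vector with the same name as a column
Admit <- rep(c(0, 1), times = 12)

attach(ucb)                       # R reports that Admit is masked by .GlobalEnv
n_admit    <- length(Admit)       # finds the 24-element global, not the column
n_notadmit <- length(NotAdmit)    # finds the 12-row column from ucb
detach(ucb)
rm(Admit)                         # remove the leftover global again

# Without attach(): variables in the formula are looked up in ucb first
fit <- glm(cbind(Admit, NotAdmit) ~ Gender + Department,
           family = binomial, data = ucb)

# with() gives the same scoping for one-off expressions
total_applicants <- with(ucb, sum(Admit + NotAdmit))
```

Building the response with `cbind(Admit, NotAdmit)` inside the formula keeps it tied to the rows of `ucb`, instead of relying on a separate `y` object in the global environment.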

【Discussion】: