[Posted]: 2021-02-22 23:58:48
[Question]:
I'm trying to work through this code from my book (the book keeps giving me code that doesn't work):
#### Table 10.6
delays.df <- read.csv("/Users/CHAPTER 10 ASSIGNMENT 6/FlightDelays.csv")
head(delays.df)
# transform variables and create bins
delays.df$DAY_WEEK <- factor(delays.df$DAY_WEEK, levels = c(1:7),
labels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"))
delays.df$CRS_DEP_TIME <- factor(round(delays.df$CRS_DEP_TIME/100))
head(delays.df)
DAY_WEEK CRS_DEP_TIME
<fctr> <fctr>
# create reference categories
delays.df$ORIGIN <- relevel(delays.df$ORIGIN, ref = "IAD")
delays.df$DEST <- relevel(delays.df$DEST, ref = "LGA")
delays.df$CARRIER <- relevel(delays.df$CARRIER, ref = "US")
delays.df$DAY_WEEK <- relevel(delays.df$DAY_WEEK, ref = "Wed")
delays.df$isDelay <- 1 * (delays.df$Flight.Status == "delayed")
# create training and validation sets
selected.var <- c(10, 1, 8, 4, 2, 9, 14)
train.index <- sample(c(1:dim(delays.df)[1]), dim(delays.df)[1]*0.6)
train.df <- delays.df[train.index, selected.var]
valid.df <- delays.df[-train.index, selected.var]
# run logistic model, and show coefficients and odds
lm.fit <- glm(isDelay ~ ., data = train.df, family = "binomial")
data.frame(summary(lm.fit)$coefficients, odds = exp(coef(lm.fit)))
round(data.frame(summary(lm.fit)$coefficients, odds = exp(coef(lm.fit))), 5)
I can't get past any of the relevel commands because of the following error:
delays.df$ORIGIN <- relevel(delays.df$ORIGIN, ref = "IAD")
Error in relevel.default(delays.df$ORIGIN, ref = "IAD") : 'relevel' only for (unordered) factors
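My understanding is that my variables are factors and that the relevel command is not working on them. If so, I don't understand how to fix it. If that's not what is happening, I don't understand what is wrong either.
As always, any help or insight is appreciated.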
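The error message itself can be reproduced in isolation, which helps narrow down what `delays.df$ORIGIN` actually is. The sketch below is only an illustration: `origin_chr` is a made-up stand-in for the column, not data from FlightDelays.csv. It shows that `relevel()` fails in the same way on a character vector and on `NULL` (which is what `$` returns for a column that does not exist), and succeeds once the values are wrapped in `factor()`.

```r
# Hypothetical stand-in for one column of delays.df; the values are made up.
origin_chr <- c("BWI", "DCA", "IAD", "DCA")

class(origin_chr)  # not a factor, so relevel() falls through to relevel.default

# Capture the error message instead of stopping the script.
msg_chr <- tryCatch(relevel(origin_chr, ref = "IAD"),
                    error = function(e) conditionMessage(e))

# A column that does not exist evaluates to NULL and fails the same way.
msg_null <- tryCatch(relevel(NULL, ref = "IAD"),
                     error = function(e) conditionMessage(e))

# Converting to a factor first lets relevel() do its job.
origin_fct <- relevel(factor(origin_chr), ref = "IAD")
levels(origin_fct)  # "IAD" is now the first (reference) level
```

Running `class(delays.df$ORIGIN)` or `str(delays.df)` on the real data would show which of these cases applies.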
[Comments]:
-
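Well, for delays.df$DAY_WEEK you specified the levels (i.e. levels = c(1:7)), so you don't need to "relevel" it. Maybe try commenting out the delays.df$DAY_WEEK <- relevel(delays.df$DAY_WEEK, ref = "Wed") line, or removing levels = c(1:7) from the original command?
-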
Judging from your head call, delays.df$ORIGIN doesn't exist. So you are probably calling relevel on NULL.
-
@AlvaroMartinez - Thanks
-
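There is a bunch of stuff going on here that can't be reproduced. The first line reads the file with read.csv(), which should produce a plain base-R data frame, not a tibble. The second head() call prints the data types (e.g. <fctr>), which means the object is now a tibble. I don't think we are being shown the complete, reproducible workflow here.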
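One possible explanation, offered as an assumption rather than a diagnosis: since R 4.0.0, read.csv() defaults to stringsAsFactors = FALSE, so text columns arrive as character vectors and relevel() rejects them. The sketch below shows two ways to get factors before releveling. The rows in `csv_text` are made up for illustration, and `delays_a`, `delays_b` and `cat_cols` are hypothetical names; only the column names ORIGIN, DEST and CARRIER come from the question.

```r
# Made-up rows standing in for FlightDelays.csv.
csv_text <- "ORIGIN,DEST,CARRIER
BWI,JFK,DL
IAD,LGA,US
DCA,EWR,US"

# Option 1: ask read.csv() to create factors while reading.
delays_a <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Option 2: read the columns as character and convert them afterwards.
delays_b <- read.csv(text = csv_text, stringsAsFactors = FALSE)
cat_cols <- c("ORIGIN", "DEST", "CARRIER")
delays_b[cat_cols] <- lapply(delays_b[cat_cols], factor)

# relevel() now works on either data frame.
delays_a$ORIGIN <- relevel(delays_a$ORIGIN, ref = "IAD")
delays_b$ORIGIN <- relevel(delays_b$ORIGIN, ref = "IAD")
levels(delays_a$ORIGIN)  # the reference level "IAD" comes first
```

Option 1 converts every text column, which is convenient for a modeling data set like this one. Option 2 is more selective when some text columns should stay as plain strings.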
Tags: r logistic-regression