【Posted】: 2015-09-17 06:09:21
【Question】:
The do() function in dplyr makes it quick and easy to fit many models at once, but I am struggling to use those models for a good rolling forecast.
# Data illustration
require(dplyr)
require(forecast)
df <- data.frame(
Date = seq.POSIXt(from = as.POSIXct("2015-01-01 00:00:00"),
to = as.POSIXct("2015-06-30 00:00:00"), by = "hour"))
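# n() replaces a hard-coded 4320: the row count depends on the time zone
# (4320 where a DST switch falls in this period, 4321 in UTC)
df <- df %>% mutate(Hour = as.numeric(format(Date, "%H")) + 1,
                    Wind = runif(n(), min = 1, max = 5000),
                    Temp = runif(n(), min = -20, max = 25),
                    Price = runif(n(), min = -15, max = 45)
)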
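My factor variable is Hour, my exogenous variables are Wind and Temp, and the variable I want to forecast is Price. So, basically, I have 24 models (one per hour of the day) that I want to use for rolling forecasts.
My data frame covers 180 days. I want to go back 100 days, make a rolling one-day-ahead forecast for each of those days, and then compare the forecasts with the actual Price.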
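To make the windowing concrete, the row ranges involved can be sketched in base R as follows. This assumes exactly 24 rows per day with no missing or duplicated hours; `test_rows()` is a hypothetical helper introduced only for illustration.

```r
n_days  <- 180  # days in the data frame
n_back  <- 100  # days to forecast, one at a time
per_day <- 24   # hourly rows per day (assumed complete)

# The first forecast day is day 81, so the first fit uses 80 days of history
first_test_day <- n_days - n_back + 1

# Rows of the k-th forecast day (k = 1 is day 81, k = 100 is day 180)
test_rows <- function(k) {
  day <- first_test_day + k - 1
  ((day - 1) * per_day + 1):(day * per_day)
}

range(test_rows(1))    # 1921 1944
range(test_rows(100))  # 4297 4320
```

Each forecast day therefore covers exactly 24 rows, and the estimation period for step k is every row before `min(test_rows(k))`.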
Done by brute force, a single step looks like this:
# First I trim the data frame to exactly the right length
# 100 days to start with (2015-03-21 or so), then 99, then 98.., etc.
n <- 100 * 24
# Set the prices of the forecast day to NA so they can be replaced with a forecast
# (the + 1 makes sure exactly 24 rows are blanked, not 25)
df$Price[(nrow(df) - n + 1):(nrow(df) - n + 24)] <- NA
# Now I make df just 81 days long: the estimation period + the first forecast day
df <- df[1:(nrow(df) - n + 24), ]
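# The actual do & fit, later termed fx(df)
result <- df %>% group_by(Hour) %>% do({
  historical <- .[!is.na(.$Price), ]
  forecasted <- .[is.na(.$Price), c("Date", "Hour", "Wind", "Temp")]
  # Arima() expects a numeric matrix for xreg, so the regressors are selected by name
  fit <- Arima(historical$Price, order = c(1, 1, 0),
               xreg = as.matrix(historical[, c("Wind", "Temp")]))
  # forecast() dispatches to the Arima method (forecast.Arima() is not exported in forecast >= 8.0)
  fc <- forecast(fit, xreg = as.matrix(forecasted[, c("Wind", "Temp")]))
  data.frame(forecasted, Price = as.numeric(fc$mean))
})
result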
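Next I would change n to 99 * 24 and run it again. It would be great to put this in a loop or an apply call, but I simply don't know how to do that while also saving each new forecast.
I tried a loop like this, but no luck so far: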
# 100 days ago, forecast that day, then the next, etc.
for (n in 1:100) {
nx <- n * 24 * 80 # Because I want to start after 80 days
df[nx:(nx + 23), 5] <- NA # Set prices to NA so I can forecast them
fx(df) # do the function
df.results[n] <- # Write the results into a vector / data frame to save them
# and now rinse and repeat for n + 1
}
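One way the loop above might be completed is sketched below. Note that `n * 24 * 80` already exceeds the number of rows at n = 3, and that setting prices to NA inside `df` itself discards the actual values needed for the later comparison. The sketch instead filters by day and keeps `df` untouched. It is not the dplyr/forecast code from the question: to stay runnable without extra packages it swaps in base R's `stats::arima()` and `predict()` for `forecast::Arima()`. The helper `forecast_one_day()`, the `Day` and `Forecast` columns, the UTC time zone, and the end point 2015-06-29 23:00 (giving exactly 180 full days) are all assumptions made for illustration.

```r
# Simulated data shaped like the question's (UTC and the end point are assumptions)
set.seed(1)
dates <- seq(as.POSIXct("2015-01-01 00:00:00", tz = "UTC"),
             as.POSIXct("2015-06-29 23:00:00", tz = "UTC"), by = "hour")
df <- data.frame(Date  = dates,
                 Hour  = as.numeric(format(dates, "%H")) + 1,
                 Wind  = runif(length(dates), min = 1, max = 5000),
                 Temp  = runif(length(dates), min = -20, max = 25),
                 Price = runif(length(dates), min = -15, max = 45))
df$Day <- as.Date(df$Date, tz = "UTC")

# Hypothetical helper: for each Hour, fit on all days before target_day
# and predict that hour of target_day (one step ahead).
forecast_one_day <- function(data, target_day) {
  per_hour <- lapply(split(data, data$Hour), function(d) {
    hist <- d[d$Day < target_day, ]
    new  <- d[d$Day == target_day, ]
    x_hist <- as.matrix(hist[, c("Wind", "Temp")])
    x_new  <- as.matrix(new[, c("Wind", "Temp")])
    # Fit and predict in the same frame: predict() re-evaluates the xreg argument
    fit <- arima(hist$Price, order = c(1, 1, 0), xreg = x_hist)
    new$Forecast <- as.numeric(predict(fit, n.ahead = 1, newxreg = x_new)$pred)
    new
  })
  do.call(rbind, per_hour)
}

# Rolling backtest over the last n_back days. n_back is kept small here so the
# sketch runs quickly; the setup in the question corresponds to n_back <- 100.
n_back    <- 10
test_days <- tail(sort(unique(df$Day)), n_back)

forecasts <- vector("list", length(test_days))
for (i in seq_along(test_days)) {
  forecasts[[i]] <- forecast_one_day(df, test_days[i])  # save each day's result
}
results <- do.call(rbind, forecasts)
rownames(results) <- NULL

# Actual prices are still in df, so the comparison is a simple difference
results$Error <- results$Price - results$Forecast
head(results[, c("Date", "Hour", "Price", "Forecast", "Error")])
```

Collecting each day's output in a list and binding once at the end avoids growing a data frame inside the loop. This is an expanding-window backtest, since each fit uses all earlier days. A fixed-length window would only require changing the `hist` filter.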
Bonus points for a really elegant broom-like solution :)
【Discussion】:
Tags: r dplyr apply forecasting