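【Posted】: 2019-11-02 03:31:15
【Question】:
I am reading the book "Hands-On Time Series Analysis with R" and I am stuck on the example that uses the h2o machine learning package. I don't understand how to use the h2o.predict function. In the example it requires the newdata argument, which in this case is the test data. But how do you predict future values of a time series when you don't actually know those values?
If I simply omit the newdata argument, I get: predictions with a missing `newdata` argument is not implemented yet.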
library(h2o)
h2o.init(max_mem_size = "16G")
train_h <- as.h2o(train_df)
test_h <- as.h2o(test_df)
forecast_h <- as.h2o(forecast_df)
x <- c("month", "lag12", "trend", "trend_sqr")
y <- "y"
rf_md <- h2o.randomForest(training_frame = train_h,
                          nfolds = 5,
                          x = x,
                          y = y,
                          ntrees = 500,
                          stopping_rounds = 10,
                          stopping_metric = "RMSE",
                          score_each_iteration = TRUE,
                          stopping_tolerance = 0.0001,
                          seed = 1234)
h2o.varimp_plot(rf_md)
rf_md@model$model_summary
library(plotly)
tree_score <- rf_md@model$scoring_history$training_rmse
# Plot training RMSE against the number of trees
plot_ly(x = seq_along(tree_score), y = tree_score,
        type = "scatter", mode = "lines") %>%
  layout(title = "Random Forest Model - Trained Score History",
         yaxis = list(title = "RMSE"),
         xaxis = list(title = "Num. of Trees"))
test_h$pred_rf <- h2o.predict(rf_md, test_h)
test_1 <- as.data.frame(test_h)
mape_rf <- mean(abs(test_1$y - test_1$pred_rf) / test_1$y)
mape_rf
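The point of confusion above is what newdata has to contain. A minimal sketch, under stated assumptions, suggests that it only needs the predictor columns, and that all four predictors in the question can be computed for future dates without knowing future values of y. To keep it runnable without an H2O cluster, the sketch swaps in base R's lm() for the random forest and uses the built-in AirPassengers series as stand-in data; make_features is a hypothetical helper, not something from the book or from h2o.

```r
# Stand-in sketch: lm() replaces h2o.randomForest() so no H2O cluster is needed.
# AirPassengers is used as example data; it is not the series from the question.
y <- as.numeric(AirPassengers)   # 144 monthly observations starting in January
n <- length(y)
h <- 12                          # forecast horizon; must be <= 12 so lag12 is always known

# Hypothetical helper: builds the same four predictors as in the question.
make_features <- function(idx, series) {
  data.frame(
    month     = factor((idx - 1) %% 12 + 1, levels = 1:12),
    lag12     = series[idx - 12],   # value observed 12 steps earlier
    trend     = idx,
    trend_sqr = idx^2
  )
}

# Training rows start at 13 because the first 12 rows have no lag12 value.
train_idx <- 13:n
train_df  <- cbind(y = y[train_idx], make_features(train_idx, y))

# Future rows: every predictor is known today, and there is no y column.
future_idx  <- (n + 1):(n + h)
forecast_df <- make_features(future_idx, y)

md <- lm(y ~ month + lag12 + trend + trend_sqr, data = train_df)
forecast_df$pred <- predict(md, newdata = forecast_df)
```

With the objects from the question, the analogous call would presumably be h2o.predict(rf_md, forecast_h), assuming forecast_df holds the month, lag12, trend and trend_sqr columns for the future dates. Note that lag12 is only available directly for horizons of up to 12 steps; beyond that, earlier predictions would have to be fed back in as lag values.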
【Comments】:
Tags: r machine-learning h2o