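【Posted】:2018-08-14 17:44:03
【Question】:
R version: R version 3.5.1 (2018-07-02)
H2O cluster version: 3.20.0.2
The data set used here is available on Kaggle (Home Credit Default Risk). Before running h2o automl, the missing values were handled and the relevant categorical variables were selected. Could you help me figure out the root cause of this error? Thanks.
Code: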
library(h2o)    # H2O R client
library(dplyr)  # provides bind_cols()
h2o.init()
h2o.no_progress()
# y_train_processed_tbl holds the target variable
# x_train_processed_tbl holds the remaining columns after missing-value
# handling
data_h2o <- as.h2o(bind_cols(y_train_processed_tbl, x_train_processed_tbl))
splits_h2o <- h2o.splitFrame(data_h2o, ratios = c(0.7, 0.15), seed = 1234)
train_h2o <- splits_h2o[[1]]
valid_h2o <- splits_h2o[[2]]
test_h2o <- splits_h2o[[3]]
y <- "TARGET"
x <- setdiff(names(train_h2o), y)
automl_models_h2o <- h2o.automl(
  x = x,
  y = y,
  training_frame = train_h2o,
  validation_frame = valid_h2o,
  leaderboard_frame = test_h2o,
  max_runtime_secs = 90
)
automl_leader <- automl_models_h2o@leader
# The next call fails with the error shown below
performance_h2o <- h2o.performance(automl_leader, newdata = test_h2o)
ERROR: Unexpected HTTP Status code: 404 Not Found
water.exceptions.H2OKeyNotFoundArgumentException
[1] "water.exceptions.H2OKeyNotFoundArgumentException: Object 'dummy' not found in function: predict for argument: model"
[2] " water.api.ModelMetricsHandler.score(ModelMetricsHandler.java:235)"
[3] " sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)"
[4] " sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)"
[5] " sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)"
[6] " java.lang.reflect.Method.invoke(Unknown Source)"
[7] " water.api.Handler.handle(Handler.java:63)"
[8] " water.api.RequestServer.serve(RequestServer.java:451)"
[9] " water.api.RequestServer.doGeneric(RequestServer.java:296)"
[10] " water.api.RequestServer.doPost(RequestServer.java:222)"
[11] " javax.servlet.http.HttpServlet.service(HttpServlet.java:755)"
[12] " javax.servlet.http.HttpServlet.service(HttpServlet.java:848)"
[13] " org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)"
[14] " org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)"
[15] " org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)"
[16] " org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)"
[17] " org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)"
[18] " org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)"
[19] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"
[20] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"
[21] " water.JettyHTTPD$LoginHandler.handle(JettyHTTPD.java:197)"
[22] " org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)"
[23] " org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)"
[24] " org.eclipse.jetty.server.Server.handle(Server.java:370)"
[25] " org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)"
[26] " org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)"
[27] " org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)"
[28] " org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)"
[29] " org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)"
[30] " org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)"
[31] " org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)"
[32] " org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)"
[33] " org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)"
[34] " org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)"
[35] " java.lang.Thread.run(Unknown Source)"
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
Object 'dummy' not found in function: predict for argument: model
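The object name 'dummy' in the message suggests that the leader slot may hold a placeholder rather than a trained model, which could happen if no model finished within max_runtime_secs = 90. This is a hypothesis, not something the post confirms. A minimal sketch for checking it before calling h2o.performance, where has_trained_leader is a hypothetical helper and the @leaderboard and @model_id slots are assumed to be available as in the H2O R package:

```r
# Hypothetical helper: TRUE when the leaderboard has at least one row and
# the leader id is not the placeholder name seen in the error message.
has_trained_leader <- function(leaderboard, leader_id) {
  nrow(leaderboard) > 0 && !identical(leader_id, "dummy")
}

# Only runs in a session where the AutoML object from the question exists.
if (exists("automl_models_h2o")) {
  lb <- automl_models_h2o@leaderboard
  leader_id <- automl_models_h2o@leader@model_id
  print(lb)  # an empty leaderboard would mean no model was trained
  if (has_trained_leader(lb, leader_id)) {
    performance_h2o <- h2o.performance(automl_models_h2o@leader,
                                       newdata = test_h2o)
  } else {
    message("No trained leader found; consider a larger max_runtime_secs.")
  }
}
```

If the leaderboard turns out to be empty, rerunning h2o.automl with a larger time budget would be the first thing to try.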
【Discussion】:
-
Are you able to run the following automl example code from the documentation without any problems: docs.h2o.ai/h2o/latest-stable/h2o-docs/…?
-
@Lauren, yes, the automl example code from the documentation runs without any problems.
-
@divibisan, I understand there may be a typo in the URL, but if that is the case, the link is one that H2O generated internally.
-
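I just wanted to comment that when using H2O you do not need to impute missing data or encode categorical variables. That is done automatically. If you manually dummy/one-hot encode the categorical columns, performance may be worse. I would not recommend it.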
-
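@ErinLeDell Thanks for the guidance. Just to clarify, I only removed categorical variables with very few levels. These preprocessing steps were originally put in place when I was trying some other models, before I started using H2O.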
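To illustrate the advice above about skipping manual imputation and one-hot encoding, here is a minimal sketch in base R. The data frame and its column names (income, contract_type) are hypothetical stand-ins, not columns from the poster's data. It leaves the NA values in place and converts character columns to factors, so that H2O can treat them as categorical once the frame is uploaded:

```r
# Hypothetical toy data standing in for the real training table.
raw_tbl <- data.frame(
  TARGET = c(0, 1, 0, 1),
  income = c(50000, NA, 72000, 31000),                 # NA left as-is
  contract_type = c("cash", "revolving", NA, "cash"),  # not one-hot encoded
  stringsAsFactors = FALSE
)

# Convert character columns to factors instead of dummy columns.
prepared_tbl <- raw_tbl
is_chr <- vapply(prepared_tbl, is.character, logical(1))
prepared_tbl[is_chr] <- lapply(prepared_tbl[is_chr], factor)

# A factor response makes H2O treat the task as classification.
prepared_tbl$TARGET <- factor(prepared_tbl$TARGET)

# With a running cluster, the frame could then be uploaded unchanged:
# data_h2o <- as.h2o(prepared_tbl)
```

The as.h2o call is left commented out because it needs a running H2O cluster.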