【Question title】: Deploy PredictionIO with a Spark standalone cluster
【Posted】: 2018-05-09 14:58:12
【Question】:

I am using the official Recommendation template as a test. I completed these steps successfully:

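  1. Installed the event server in a Docker container. (succeeded)
  2. Configured event data, metadata, and everything else to be stored in MySQL. (succeeded)
  3. Set up the train and deploy server in another Docker container. (succeeded)
  4. Installed a Spark standalone cluster in yet another container. (succeeded)
  5. Created a new app. (succeeded)
  6. Imported enough event data. (succeeded)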

When I train and deploy as follows, everything works just as the documentation describes:

pio train
pio deploy

But when I train and deploy against the Spark cluster as follows, train succeeds (the new model is stored in MySQL), but deploy fails.

pio train -v engine.json -- --master spark://predictionspark:7077 --executor-memory 2G --driver-memory 2G --total-executor-cores 1
pio deploy -v engine.json --feedback --event-server-ip predictionevent --event-server-port 7070 --accesskey Th7k5gE5yEu9ZdTdM6KdAj0InDrLNJQ1U3qEBy7dbMnYgTxWx5ALNAa2hKjqaHSK -- --master spark://predictionspark:7077 --executor-memory 2G --driver-memory 2G --total-executor-cores 1

Full log:

[INFO] [Runner$] Submission command: /spark/bin/spark-submit --master spark://predictionspark:7077 --executor-memory 2G --driver-memory 2G --total-executor-cores 1 --class org.apache.predictionio.workflow.CreateWorkflow --jars file:/PredictionIO/lib/mysql-connector-java-5.1.46.jar,file:/ebsa/app/cf/target/scala-2.11/template-scala-parallel-recommendation_2.11-0.1-SNAPSHOT.jar,file:/ebsa/app/cf/target/scala-2.11/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar,file:/PredictionIO/lib/spark/pio-data-elasticsearch-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-hbase-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-jdbc-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-localfs-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-s3-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-hdfs-assembly-0.12.1.jar --files file:/PredictionIO/conf/log4j.properties --driver-class-path /PredictionIO/conf:/PredictionIO/lib/mysql-connector-java-5.1.46.jar --driver-java-options -Dpio.log.dir=/root file:/PredictionIO/lib/pio-assembly-0.12.1.jar --engine-id org.example.recommendation.RecommendationEngine --engine-version 0387c097c02018fa29109a8990b03d163249be00 --engine-variant file:/ebsa/app/cf/engine.json --verbosity 0 --json-extractor Both --env PIO_ENV_LOADED=1,PIO_STORAGE_SOURCES_MYSQL_PASSWORD=***,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/root/.pio_store,PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://***:3306/predictionio,PIO_HOME=/PredictionIO,PIO_FS_ENGINESDIR=/root/.pio_store/engines,PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=MYSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=MYSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_MYSQL_USERNAME=***,PIO_FS_TMPDIR=/root/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=MYSQL,PIO_CONF_DIR=/PredictionIO/conf
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(cf,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[INFO] [log] Logging initialized @6069ms
[INFO] [Server] jetty-9.3.z-SNAPSHOT
[INFO] [Server] Started @6184ms
[WARN] [Utils] Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
[INFO] [AbstractConnector] Started ServerConnector@2b53840a{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@422ad5e2{/jobs,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1b3ab4f9{/jobs/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1c8f6c66{/jobs/job,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@151732fb{/jobs/job/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@40ed1802{/stages,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@feb098f{/stages/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@31e739bf{/stages/stage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7f42e06e{/stages/stage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2efd2f21{/stages/pool,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@316cda31{/stages/pool/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@17d2b075{/storage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@310b2b6f{/storage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@6b5ab2f2{/storage/rdd,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@6b2dd3df{/storage/rdd/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@73c48264{/environment,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@5bcec67e{/environment/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7a2fce12{/executors,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4bb1b96b{/executors/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1f66d8e1{/executors/threadDump,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@3421debd{/executors/threadDump/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@68b7d0ef{/static,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@319642db{/,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@35bfa1bb{/api,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2eda4eeb{/jobs/job/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@309dcdf3{/stages/stage/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@6a2d867d{/metrics/json,null,AVAILABLE,@Spark}
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: org.example.recommendation.DataSource@5db3d57c
[INFO] [Engine$] Preparator: org.example.recommendation.Preparator@395f52ed
[INFO] [Engine$] AlgorithmList: List(org.example.recommendation.ALSAlgorithm@26e0d39c)
[INFO] [Engine$] Data sanity check is on.
[INFO] [Engine$] org.example.recommendation.TrainingData does not support data sanity check. Skipping check.
[INFO] [Engine$] org.example.recommendation.PreparedData does not support data sanity check. Skipping check.
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
[INFO] [Engine$] org.apache.spark.mllib.recommendation.ALSModel does not support data sanity check. Skipping check.
[INFO] [Engine$] EngineWorkflow.train completed
[INFO] [Engine] engineInstanceId=0ac606dc-9959-40f8-9f40-d32354ebf221
[WARN] [TaskSetManager] Stage 1403 contains a task of very large size (1217 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 1404 contains a task of very large size (1767 KB). The maximum recommended task size is 100 KB.
[INFO] [CoreWorkflow$] Inserting persistent model
[INFO] [CoreWorkflow$] Updating engine instance
[INFO] [CoreWorkflow$] Training completed successfully.
[INFO] [AbstractConnector] Stopped Spark@2b53840a{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
[INFO] [Runner$] Submission command: /spark/bin/spark-submit --master spark://predictionspark:7077 --executor-memory 2G --driver-memory 2G --total-executor-cores 1 --class org.apache.predictionio.workflow.CreateServer --jars file:/PredictionIO/lib/mysql-connector-java-5.1.46.jar,file:/ebsa/app/cf/target/scala-2.11/template-scala-parallel-recommendation_2.11-0.1-SNAPSHOT.jar,file:/ebsa/app/cf/target/scala-2.11/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar,file:/PredictionIO/lib/spark/pio-data-elasticsearch-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-hbase-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-jdbc-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-localfs-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-s3-assembly-0.12.1.jar,file:/PredictionIO/lib/spark/pio-data-hdfs-assembly-0.12.1.jar --files file:/PredictionIO/conf/log4j.properties --driver-class-path /PredictionIO/conf:/PredictionIO/lib/mysql-connector-java-5.1.46.jar --driver-java-options -Dpio.log.dir=/root file:/PredictionIO/lib/pio-assembly-0.12.1.jar --engineInstanceId 0ac606dc-9959-40f8-9f40-d32354ebf221 --engine-variant file:/ebsa/app/cf/engine.json --ip 0.0.0.0 --port 8000 --event-server-ip predictionevent --event-server-port 7070 --accesskey Th7k5gE5yEu9ZdTdM6KdAj0InDrLNJQ1U3qEBy7dbMnYgTxWx5ALNAa2hKjqaHSK --feedback --json-extractor Both --env 
PIO_ENV_LOADED=1,PIO_STORAGE_SOURCES_MYSQL_PASSWORD=***,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/root/.pio_store,PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://***:3306/predictionio,PIO_HOME=/PredictionIO,PIO_FS_ENGINESDIR=/root/.pio_store/engines,PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=MYSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=MYSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_MYSQL_USERNAME=***,PIO_FS_TMPDIR=/root/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=MYSQL,PIO_CONF_DIR=/PredictionIO/conf
[WARN] [WorkflowUtils$] Non-empty parameters supplied to org.example.recommendation.Preparator, but its constructor does not accept any arguments. Stubbing with empty parameters.
[WARN] [WorkflowUtils$] Non-empty parameters supplied to org.example.recommendation.Serving, but its constructor does not accept any arguments. Stubbing with empty parameters.
[INFO] [log] Logging initialized @6953ms
[INFO] [Server] jetty-9.3.z-SNAPSHOT
[INFO] [Server] Started @7086ms
[WARN] [Utils] Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
[INFO] [AbstractConnector] Started ServerConnector@d8ed4d9{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@307b5956{/jobs,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@62d50094{/jobs/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@8a644df{/jobs/job,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@5a9054e7{/jobs/job/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@15402e55{/stages,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@295b1de5{/stages/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@e7ac843{/stages/stage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@da15f73{/stages/stage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@506a8fc2{/stages/pool,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@fc4cf4d{/stages/pool/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@255cef05{/storage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4e8f6bce{/storage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4e6427d4{/storage/rdd,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@5fca5109{/storage/rdd/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2acbd47f{/environment,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@39004878{/environment/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@785b7109{/executors,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@f0fce80{/executors/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@19ab67fc{/executors/threadDump,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@644558a3{/executors/threadDump/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@40fa6a20{/static,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@238b2adb{/,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4bbba0ce{/api,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@3d1e4c06{/jobs/job/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@70f8bf47{/stages/stage/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@558311ee{/metrics/json,null,AVAILABLE,@Spark}
[INFO] [Engine] Using persisted model
[INFO] [Engine] Custom-persisted model detected for algorithm org.example.recommendation.ALSAlgorithm
[ERROR] [OneForOneStrategy] empty collection

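I don't know why.

Update: Instead of the standalone cluster installed in a separate Docker container, I started a local cluster using the Spark installation in the same container as the train & deploy server (as @user2906838 mentioned), and it succeeded. I don't understand why. Using a local Spark is not an option for me, so this is strange.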


Update:

In the /tmp folder, the file sizes differ between the two cases.

[Images: /tmp contents in the successful case and in the failed case]

Update:

This is interesting. I found the model data in the spark-worker container.

[Image: model data in the spark-worker container]
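
The /tmp observations above can be checked from the command line. The following is a minimal sketch, assuming the template persists its model as Spark part files under `/tmp/<engineInstanceId>` (an assumption about the Recommendation template, not something confirmed in this thread). The function name `model_dir_status` and the container names in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Report whether a directory holds any non-empty Spark part files.
# Usage: model_dir_status /tmp/<engineInstanceId>
# Assumption: the model is persisted as part-NNNNN files below this directory.
model_dir_status() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing"
    return 1
  fi
  # Count non-empty part files at any depth below the directory.
  local count
  count=$(find "$dir" -type f -name 'part-*' -size +0c | wc -l | tr -d ' ')
  if [ "$count" -eq 0 ]; then
    echo "empty"
    return 1
  fi
  echo "ok: $count part file(s)"
}

# Example (container names are hypothetical; the ID is the engineInstanceId from the log):
#   docker exec pio-engine bash -c "$(declare -f model_dir_status); \
#     model_dir_status /tmp/0ac606dc-9959-40f8-9f40-d32354ebf221"
#   docker exec spark-worker bash -c "$(declare -f model_dir_status); \
#     model_dir_status /tmp/0ac606dc-9959-40f8-9f40-d32354ebf221"
```

If the deploy container reports `missing` or `empty` while the worker container reports part files, the model files were written on the worker's filesystem rather than where the deploy process looks for them.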

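【Comments】:

  • Could you provide more logs? Please post the full stack trace from the deploy console. If it helps, you can also look at: medium.freecodecamp.org/…. I included Spark examples there too.
  • @user2906838 I have pasted all the logs. Looking forward to your reply, thanks.
  • I think there is a problem with the train step itself. Are you sure the location where the model is saved during training has the right permissions? [ERROR] [OneForOneStrategy] empty collection doesn't say much, but in my experience it means something went wrong while saving the model.
  • @user2906838 I configured the model to be stored in MySQL, and after training I can find a new model record in the table pio_model_models.
  • This isn't a fix for your current problem, but why not try a simple installation, i.e. without Docker? As far as I know, the model is actually stored in files, in a specified directory. I also had problems with pio deploy; in my case, the model saved in the /tmp folder had been created with incorrect permissions, and that caused the problem.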

Tags: apache-spark predictionio


【Answer 1】:

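The deploy step depends on data that the train step writes to file://tmp or hdfs://tmp. A file:// path resolves on the local filesystem of whichever machine does the writing, so when the executors run in a separate container the data can end up on the Spark worker instead of where deploy reads it. That would be consistent with the model data turning up in the spark-worker container.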
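
One way to make the train output visible to deploy is to give both containers the same /tmp. The sketch below is only an illustration of that idea and was not part of the original answer. The function name `run_with_shared_tmp` and the volume, image, and container names are hypothetical placeholders, and networking and other `docker run` options are left out. An HDFS location reachable from every container would be the other route the answer mentions.

```shell
#!/usr/bin/env bash
# Sketch: mount one named volume at /tmp in both the Spark worker container
# and the train/deploy container, so files written by executors during
# `pio train` are readable at the same path during `pio deploy`.
# All names below are hypothetical placeholders.
run_with_shared_tmp() {
  local volume="$1" worker_image="$2" pio_image="$3"

  # Named volume that both containers mount at the same path.
  docker volume create "$volume"

  # Spark worker: executors write their output under /tmp.
  docker run -d --name spark-worker -v "$volume:/tmp" "$worker_image"

  # Train/deploy container: the driver sees the same /tmp.
  docker run -d --name pio-engine -v "$volume:/tmp" "$pio_image"
}

# Example call (all three arguments are placeholders):
#   run_with_shared_tmp pio-model-tmp my/spark-worker my/predictionio
```

After starting the containers this way, `pio train` and `pio deploy` would be run as in the question, with the same `--master spark://predictionspark:7077` arguments.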
