【发布时间】:2019-09-03 00:08:34
【问题描述】:
我无法使用驱动程序类路径中所需的依赖 JAR 创建生动的交互式会话,使用以下命令:
curl -H "Content-Type: application/json" -X POST -d '{"kind":"pyspark","conf":{"spark.driver.extraClassPath":"/data/XXX-0.0.1-SNAPSHOT.jar"}}' -i http://<LIVY_SERVER_IP:PORT>/sessions
此处的 JAR 文件存在于本地驱动程序路径中。还尝试通过以下方式使用 HDFS 路径hdfs://<NM_IP>:<NM_Port>/data/XXX-0.0.1-SNAPSHOT.jar
以下是尝试创建交互式会话时的实时服务器日志
19/08/12 17:26:56 INFO sessions.InteractiveSessionManager: Registering new session 0
19/08/12 17:26:58 INFO utils.LineBufferedStream: Exception in thread "main" java.lang.NullPointerException
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.livy.rsc.driver.JobWrapper.cancel(JobWrapper.java:90)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.livy.rsc.driver.RSCDriver.shutdown(RSCDriver.java:127)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.livy.rsc.driver.RSCDriver.run(RSCDriver.java:356)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.livy.rsc.driver.RSCDriverBootstrapper.main(RSCDriverBootstrapper.java:93)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at java.lang.reflect.Method.invoke(Method.java:498)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
19/08/12 17:26:58 INFO utils.LineBufferedStream: at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
但是,当使用相同的 JAR 运行带有批处理 livy 会话的 python 作业时,它会成功完成。下面提到的是命令:
curl -H "Content-Type: application/json" -X POST --data '{"file": "/data/test.py", "conf": {"spark.driver.extraClassPath":"/data/XXX-0.0.1-SNAPSHOT.jar"}}' http://<LIVY_SERVER_IP:PORT>/batches
python 文件 /data/test.py 存在于 HDFS 路径上。
我已尝试通过修改livy.conf 文件将目录路径列入白名单,并进行以下更改livy.file.local-dir-whitelist = /data/
【问题讨论】:
标签: livy