【Question title】: LivyClient uploadJar failing with py4j.Py4JException: Error while obtaining a new communication channel
【Posted】: 2019-10-17 16:15:39
【Question】:

I am trying to submit a Spark job through Apache Livy, but LivyClient's uploadJar method fails.

Here is the code (very similar to the PiJob example):

        LivyClientBuilder builder = new LivyClientBuilder();
        LivyClient client = builder.setURI(new URI("http://server:8998")).build();
        client.uploadJar(new File("/path/to/file")).get();

Here is the full stack trace:

py4j.Py4JException: Error while obtaining a new communication channel
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
    at com.sun.proxy.$Proxy24.getLocalTmpDirPath(Unknown Source)
    at org.apache.livy.repl.PythonInterpreter.addPyFile(PythonInterpreter.scala:294)
    at org.apache.livy.repl.ReplDriver$$anonfun$addJarOrPyFile$1.apply(ReplDriver.scala:114)
    at org.apache.livy.repl.ReplDriver$$anonfun$addJarOrPyFile$1.apply(ReplDriver.scala:114)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.livy.repl.ReplDriver.addJarOrPyFile(ReplDriver.scala:114)
    at org.apache.livy.rsc.driver.JobContextImpl.addJarOrPyFile(JobContextImpl.java:151)
    at org.apache.livy.rsc.driver.AddJarJob.call(AddJarJob.java:39)
    at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:64)
    at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:31)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at java.net.Socket.<init>(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:244)
    at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
    at py4j.CallbackConnection.start(CallbackConnection.java:226)
    at py4j.CallbackClient.getConnection(CallbackClient.java:238)
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
    ... 17 more

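I can submit code snippets to the Livy server through the REST API, and they run fine. Livy/Spark is set up on YARN, and I have tried both client and cluster mode. Any ideas?

【Comments】:

  • I am facing the same problem after upgrading to the Livy 0.5.0 client. Is there a fix?

Tags: apache-spark hadoop livy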


【Solution 1】:

From your code, it looks like the jar path is not set. Try pointing it at your actual jar file.

Example

client.uploadJar(new File("tmp/myjar.jar")).get();
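A small pre-flight check can rule out a bad path before the upload is attempted. The sketch below uses only the Java standard library; `JarPreflight` and `requireReadableJar` are hypothetical helper names for illustration, not part of the Livy API, and the `.jar` suffix check is an assumption about what you intend to upload.

```java
import java.io.File;
import java.io.FileNotFoundException;

public class JarPreflight {

    // Hypothetical helper: returns the same File if it looks like an uploadable jar,
    // otherwise throws with a message that names the offending path.
    static File requireReadableJar(File jar) throws FileNotFoundException {
        if (!jar.isFile()) {
            throw new FileNotFoundException("Not a regular file: " + jar.getAbsolutePath());
        }
        if (!jar.canRead()) {
            throw new FileNotFoundException("Not readable: " + jar.getAbsolutePath());
        }
        if (!jar.getName().endsWith(".jar")) {
            throw new IllegalArgumentException("Expected a .jar file: " + jar.getName());
        }
        return jar;
    }

    public static void main(String[] args) throws Exception {
        // Falls back to the path used in the example above when no argument is given.
        File jar = requireReadableJar(new File(args.length > 0 ? args[0] : "tmp/myjar.jar"));
        System.out.println("OK to upload: " + jar.getAbsolutePath());
        // With a LivyClient in scope (built as in the question), the upload would then be:
        // client.uploadJar(jar).get();
    }
}
```

If this check passes and the upload still fails with the same trace, the path is probably not the cause.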

You can also check the environment paths of your pyspark setup to make sure the nodes are communicating within the same environment.
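Since the root cause in the trace is `java.net.ConnectException: Connection refused`, one way to narrow things down is to test whether a given host and port accept TCP connections from the machine where the driver runs. This is a minimal sketch using only `java.net`; `PortProbe` and `canConnect` are hypothetical names, and which host and port to probe depends on your deployment (the question does not say which port the py4j callback listens on).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    // Connection refused, timeouts, and unresolved hosts all yield false.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: PortProbe <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

A `false` result only shows that nothing accepted the connection from that machine; it does not say why, so it is a starting point rather than a diagnosis.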

【Discussion】:
