[Posted]: 2021-06-03 16:47:36
[Problem description]:
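I am using Zeppelin 0.8.2 in a Docker container running Ubuntu 16.04.
Long-running pyspark paragraphs all stop after exactly 1 hour and show a message like this:
Took 1 hrs 0 min 0 sec. Last updated by anonymous at February 09 2021, 10:24:25 PM.
I have confirmed that Spark is still running my code in the background, so it appears that Zeppelin's connection to Spark is being cut at the 1-hour mark.
Following the documentation, I tried editing the zeppelin.interpreter.lifecyclemanager.timeout.threshold property in zeppelin-site.xml, but this had no effect on the problem, even though the change is clearly visible in the web UI (screenshot). Other values set in the same XML file are read and applied by Zeppelin. I also verified that the variables defined in zeppelin-env.sh do not conflict with the values set in the XML file.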
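For anyone who wants to double-check the value on disk independently of the web UI, here is a minimal sketch. It assumes zeppelin-site.xml uses the usual `<configuration>`/`<property>`/`<name>`/`<value>` layout; the helper name `read_property` and the sample value `86400000` are made up for illustration and are not my actual setting.

```python
import xml.etree.ElementTree as ET

# Illustrative content only: the value below is a made-up example,
# not the setting from the original post.
SAMPLE_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>zeppelin.interpreter.lifecyclemanager.timeout.threshold</name>
    <value>86400000</value>
  </property>
</configuration>
"""


def read_property(xml_text, name):
    """Return the <value> of the named property as a string, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


if __name__ == "__main__":
    key = "zeppelin.interpreter.lifecyclemanager.timeout.threshold"
    print(key, "=", read_property(SAMPLE_SITE_XML, key))
```

To run it against a real installation, pass the contents of the actual zeppelin-site.xml (wherever it lives in the container) instead of `SAMPLE_SITE_XML`.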
Sometimes the paragraph output also contains the following error message:
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:274)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:258)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:233)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:229)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:228)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:449)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:315)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I have also posted this on the official Zeppelin issue tracker: https://issues.apache.org/jira/browse/ZEPPELIN-5279
[Discussion]:
Tags: apache-spark pyspark apache-zeppelin