【Question title】: Azkaban flow running a Selenium automation Python script fails after about twenty minutes, and the system becomes very slow
【Posted】: 2020-07-28 12:50:54
【Question description】:

I am running a Python script in Azkaban.

Environment:
CentOS 8.1
Azkaban 3.90.0
Python 3.6.8
ChromeDriver 84.0.4147.30

In the test.flow file:

nodes:
  - name: job_test
    type: command
    config:
      command: python3 /home/azkaban/python_codes/pyib/activity/pickgoods.py

About twenty minutes after I start this flow, the system becomes very slow and the execution fails. The job log shows:

28-07-2020 18:30:40 CST job_test INFO - Process with id 1403 completed unsuccessfully in 1727 seconds.
28-07-2020 18:30:40 CST job_test ERROR - Job run failed!
java.lang.RuntimeException: azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1
    at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:312)
    at azkaban.execapp.JobRunner.runJob(JobRunner.java:830)
    at azkaban.execapp.JobRunner.doRun(JobRunner.java:607)
    at azkaban.execapp.JobRunner.run(JobRunner.java:568)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1
    at azkaban.jobExecutor.utils.process.AzkabanProcess.run(AzkabanProcess.java:125)
    at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:304)
    ... 8 more
28-07-2020 18:30:40 CST job_test ERROR - azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1 cause: azkaban.jobExecutor.utils.process.ProcessFailureException: Process exited with code 1
28-07-2020 18:30:40 CST job_test INFO - Finishing job job_test at 1595932240480 with status FAILED

And azkaban-webserver.log under azkaban-web-server shows:

2020/07/28 21:00:34.127 +0800  INFO [ExecutorManager] [AzkabanWebServer-QueueProcessor-Thread] [Azkaban] Successfully refreshed executor: iZbp1hb3esnbp3levrcg05Z:36037 (id: 16), active=true with executor info : ExecutorInfo{remainingMemoryPercent=45.705342424456234, remainingMemoryInMB=835, remainingFlowCapacity=30, numberOfAssignedFlows=0, lastDispatchedTime=1595936723440, cpuUsage=0.01}
2020/07/28 21:00:34.128 +0800 ERROR [ExecutorManager] [AzkabanWebServer-QueueProcessor-Thread] [Azkaban] Failed to update ExecutorInfo for executor : iZbp1hb3esnbp3levrcg05Z:44085 (id: 17), active=true
java.util.concurrent.ExecutionException: org.apache.http.conn.HttpHostConnectException: Connect to iZbp1hb3esnbp3levrcg05Z:44085 [iZbp1hb3esnbp3levrcg05Z/172.16.184.105] failed: Connection refused (Connection refused)

Can anyone help me solve this?

【Question discussion】:

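  • Your job process is crashing. There is probably some log output somewhere that shows a Python traceback or similar.
  • @AKX, I will add the error log to the question description.
  • No, I mean look at the job's error log: azkaban.readthedocs.io/en/latest/useAzkaban.html#job-logs
  • @AKX, thanks. Following your answer, I found the error log for my Python code.
  • Glad I could help. I'll post it as an answer that you can accept.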

Tags: python azkaban


【Solution 1】:

Your job process is crashing. You can find its error log in the web UI for further debugging; see https://azkaban.readthedocs.io/en/latest/useAzkaban.html#job-logs
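
To make the cause easier to spot in that job log, the script itself can print the full traceback before it exits. The following is only a minimal sketch, not the asker's actual code: run_job and pick_goods are hypothetical names, and pick_goods stands in for whatever pickgoods.py really does. It assumes that Azkaban's command job type captures the process's stderr in the job log.

```python
import sys
import traceback


def run_job(main):
    """Run main() and return a process exit code.

    Any uncaught exception is printed to stderr with its full traceback,
    so the failure reason appears in the captured job output.
    """
    try:
        main()
    except Exception:
        traceback.print_exc(file=sys.stderr)
        sys.stderr.flush()
        return 1  # non-zero exit code: the job is reported as failed
    return 0


def pick_goods():
    # Hypothetical placeholder for the Selenium logic in pickgoods.py.
    # If a browser driver is created here, calling driver.quit() in a
    # try/finally block keeps leftover Chrome processes from piling up.
    pass
```

At the bottom of the script, calling sys.exit(run_job(pick_goods)) passes the exit code back to Azkaban, so a crash still marks the job as FAILED while the traceback explains why.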
