【Question title】: monit status is failed but process works fine
【Posted】: 2016-07-09 17:09:57
【Question description】:

I am trying to deploy gunicorn using a BOSH release. The deployment fails intermittently: sometimes it works, sometimes it fails.

monit summary

Process 'gunicorn'                  Execution failed
Process 'nginx'                     running
Process 'consul'                    running

The monit log is

[UTC Jul  9 11:56:06] error    : 'gunicorn' process is not running
[UTC Jul  9 11:56:06] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:56:06] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:56:11] info     : start service 'consul' on user request
[UTC Jul  9 11:56:11] info     : monit daemon at 1383 awakened
[UTC Jul  9 11:56:11] info     : start service 'nginx' on user request
[UTC Jul  9 11:56:11] info     : monit daemon at 1383 awakened
[UTC Jul  9 11:56:11] info     : start service 'gunicorn' on user request
[UTC Jul  9 11:56:11] info     : monit daemon at 1383 awakened
[UTC Jul  9 11:56:36] error    : 'gunicorn' failed to start
[UTC Jul  9 11:56:36] info     : 'nginx' start: /var/vcap/jobs/nginx/bin/monit_debugger
[UTC Jul  9 11:56:37] info     : 'nginx' start action done
[UTC Jul  9 11:56:37] info     : 'consul' start: /var/vcap/jobs/consul/bin/monit_debugger
[UTC Jul  9 11:56:38] info     : 'consul' start action done
[UTC Jul  9 11:56:38] info     : Awakened by User defined signal 1
[UTC Jul  9 11:56:38] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:57:08] error    : 'gunicorn' failed to start
[UTC Jul  9 11:57:08] info     : 'gunicorn' start action done
[UTC Jul  9 11:57:18] error    : 'gunicorn' process is not running
[UTC Jul  9 11:57:18] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:57:18] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:57:48] error    : 'gunicorn' failed to start
[UTC Jul  9 11:57:58] error    : 'gunicorn' process is not running
[UTC Jul  9 11:57:58] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:57:58] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:58:28] error    : 'gunicorn' failed to start
[UTC Jul  9 11:58:38] error    : 'gunicorn' process is not running
[UTC Jul  9 11:58:38] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:58:38] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:59:08] error    : 'gunicorn' failed to start
[UTC Jul  9 11:59:18] info     : 'gunicorn' process is running with pid 5670
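
One detail worth reading out of the log above is the time between each 'start:' line and the 'failed to start' line that follows it. The Python sketch below measures that gap on an excerpt copied from the log. The function name `start_to_failure_gaps` is hypothetical, and the year 2016 is an assumption taken from the post date, since the log lines carry no year. A constant 30-second gap would be consistent with monit giving up on a start timeout rather than gunicorn crashing, but the log alone does not prove that.

```python
import re
from datetime import datetime

# Excerpt copied from the monit log in the question.
LOG = """\
[UTC Jul  9 11:56:38] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:57:08] error    : 'gunicorn' failed to start
[UTC Jul  9 11:57:08] info     : 'gunicorn' start action done
[UTC Jul  9 11:57:18] error    : 'gunicorn' process is not running
[UTC Jul  9 11:57:18] info     : 'gunicorn' trying to restart
[UTC Jul  9 11:57:18] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  9 11:57:48] error    : 'gunicorn' failed to start
"""

# Timestamp, quoted service name, then the message text.
LINE = re.compile(r"\[UTC (\w{3}\s+\d+ \d\d:\d\d:\d\d)\] \w+\s+: '([\w-]+)' (.*)")


def start_to_failure_gaps(log_text, service, year=2016):
    """Return the seconds between each 'start:' line and the next
    'failed to start' line for the given service.

    The year is an assumption; it only matters for parsing, not for
    the differences computed here.
    """
    gaps = []
    started = None
    for line in log_text.splitlines():
        match = LINE.match(line)
        if not match or match.group(2) != service:
            continue
        stamp = " ".join(match.group(1).split())  # collapse the padded day
        ts = datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S")
        message = match.group(3)
        if message.startswith("start:"):
            started = ts
        elif message == "failed to start" and started is not None:
            gaps.append(int((ts - started).total_seconds()))
            started = None
    return gaps


if __name__ == "__main__":
    print(start_to_failure_gaps(LOG, "gunicorn"))
```

Running the same function over the full log file should show whether every failure sits at the same interval after its start attempt.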

The process itself is also running fine
ps -ef

root      5670     1  0 11:59 ?        00:00:02 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application
root      5682  5670  0 11:59 ?        00:00:00 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application
root      5685  5670  0 11:59 ?        00:00:00 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application
root      5686  5670  0 11:59 ?        00:00:00 /usr/bin/python /usr/local/bin/gunicorn --workers 3 --bind 0.0.0.0:8000 idmapi.wsgi:application

This happens randomly.

When gunicorn starts successfully, I get the following log

[UTC Jul  8 22:32:31] error    : 'gunicorn' process is not running
[UTC Jul  8 22:32:31] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:32:31] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:32:36] info     : start service 'consul' on user request
[UTC Jul  8 22:32:36] info     : monit daemon at 1375 awakened
[UTC Jul  8 22:32:36] info     : start service 'nginx' on user request
[UTC Jul  8 22:32:36] info     : monit daemon at 1375 awakened
[UTC Jul  8 22:32:36] info     : start service 'gunicorn' on user request
[UTC Jul  8 22:32:36] info     : monit daemon at 1375 awakened
[UTC Jul  8 22:33:01] error    : 'gunicorn' failed to start
[UTC Jul  8 22:33:01] info     : 'nginx' start: /var/vcap/jobs/nginx/bin/monit_debugger
[UTC Jul  8 22:33:02] info     : 'nginx' start action done
[UTC Jul  8 22:33:02] info     : 'consul' start: /var/vcap/jobs/consul/bin/monit_debugger
[UTC Jul  8 22:33:03] info     : 'consul' start action done
[UTC Jul  8 22:33:03] info     : Awakened by User defined signal 1
[UTC Jul  8 22:33:03] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:33:33] error    : 'gunicorn' failed to start
[UTC Jul  8 22:33:33] info     : 'gunicorn' start action done
[UTC Jul  8 22:33:43] error    : 'gunicorn' process is not running
[UTC Jul  8 22:33:43] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:33:43] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:34:13] error    : 'gunicorn' failed to start
[UTC Jul  8 22:34:23] error    : 'gunicorn' process is not running
[UTC Jul  8 22:34:23] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:34:23] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:34:53] error    : 'gunicorn' failed to start
[UTC Jul  8 22:35:03] error    : 'gunicorn' process is not running
[UTC Jul  8 22:35:03] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:35:03] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:35:33] error    : 'gunicorn' failed to start
[UTC Jul  8 22:35:43] error    : 'gunicorn' process is not running
[UTC Jul  8 22:35:43] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:35:43] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:36:13] error    : 'gunicorn' failed to start
[UTC Jul  8 22:36:23] error    : 'gunicorn' process is not running
[UTC Jul  8 22:36:23] info     : 'gunicorn' trying to restart
[UTC Jul  8 22:36:23] info     : 'gunicorn' start: /var/vcap/jobs/gunicorn/bin/monit_debugger
[UTC Jul  8 22:36:25] info     : 'gunicorn' started
[UTC Jul  8 22:36:35] info     : 'gunicorn' process is running with pid 5780

Update

check process gunicorn
  with pidfile /var/vcap/sys/run/gunicorn/gunicorn.pid
  start program "/var/vcap/jobs/gunicorn/bin/monit_debugger gunicorn_ctl '/var/vcap/jobs/gunicorn/bin/gunicorn_ctl start'"
  stop program "/var/vcap/jobs/gunicorn/bin/monit_debugger gunicorn_ctl '/var/vcap/jobs/gunicorn/bin/gunicorn_ctl stop'"
  group vcap

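【Question discussion】:

  • Can you provide a link to the BOSH release you are using? What does the monit file define for the gunicorn process?
  • I created it manually. This BOSH release is not available online. I have updated the question.
  • When the process is running but monit says it has failed, you should check that the process id in /var/vcap/sys/run/gunicorn/gunicorn.pid matches the id of the running process itself.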
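
The check suggested in the last comment can be sketched in Python as follows. This is only an illustration, assuming a Linux host with /proc mounted. The function name `check_pidfile` and the status strings it returns are hypothetical; the pidfile path is the one from the monit configuration in the question.

```python
def check_pidfile(pidfile, expected="gunicorn"):
    """Compare the pid recorded in pidfile with the live process table.

    Returns one of these (hypothetical) status strings:
      'missing'  - the pidfile does not exist
      'invalid'  - the pidfile does not contain a number
      'stale'    - no process with that pid is running
      'mismatch' - the pid belongs to a process whose command line
                   does not contain `expected`
      'ok'       - the pid belongs to a matching process
    """
    try:
        with open(pidfile) as fh:
            raw = fh.read().strip()
    except FileNotFoundError:
        return "missing"
    if not raw.isdigit():
        return "invalid"
    try:
        # On Linux, /proc/<pid>/cmdline holds the NUL-separated arguments.
        with open("/proc/%s/cmdline" % raw, "rb") as fh:
            cmdline = fh.read().replace(b"\0", b" ").decode(errors="replace")
    except FileNotFoundError:
        return "stale"
    return "ok" if expected in cmdline else "mismatch"


if __name__ == "__main__":
    print(check_pidfile("/var/vcap/sys/run/gunicorn/gunicorn.pid"))
```

Any result other than 'ok' while `ps -ef` shows gunicorn running would suggest that monit is watching a pid different from the real master process.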

Tags: monit


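【Solution 1】:

You may want to check whether the pid file for your process exists. It is usually stored in the /var/run/ folder. If the pid file is missing, you should kill the process manually and then start it again.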

【Discussion】:
