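【Published】: 2013-08-31 15:50:44
【Problem description】:
I am using monit to monitor several custom Rails daemons. They start slowly on a Raspberry Pi (no surprise there). However, the output of the monit summary command keeps alternating between "waiting" and "execution failed", and so do the alert emails, even though the log shows that the daemons are running. The daemons are not being restarted continuously.
My monit configuration file looks like this: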
check process setpoint_manager with pidfile /opt/thermyos.com/server/current/tmp/pids/setpoint_manager.pid every 2 cycles
    start program = "/etc/init.d/setpoint_manager start" as uid thermyos and gid thermyos
    stop program = "/etc/init.d/setpoint_manager stop"
    if 5 restarts within 5 cycles then timeout
The monit daemon cycle time is 60 seconds. The log file shows:
[EDT Aug 30 17:38:35] info : 'setpoint_manager' process is running with pid 2984
The monit alert email says:
Exists Service setpoint_manager
Date: Fri, 30 Aug 2013 17:38:35
Action: alert
Host: thermdev
Description: process is running with pid 2984
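I have verified that the pid file and the output of ps ax match. If I restart the daemon through monit, the status becomes correct.
Why doesn't the monit status correct itself?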
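For reference, the manual check of the pid file against the process list described above can be sketched as a small shell script. This is only an illustration of that check, not something monit provides: the function name pid_is_running is hypothetical, and the default path is simply the pidfile from the configuration above.

```shell
#!/bin/sh
# Hypothetical helper (not part of monit): succeeds only if the given
# pid file is readable, contains a numeric pid, and that pid is alive.
pid_is_running() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric contents.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # ps -p exits non-zero when no process with that pid exists.
    ps -p "$pid" >/dev/null 2>&1
}

# Default to the pidfile path from the monit configuration in the question.
PIDFILE="${1:-/opt/thermyos.com/server/current/tmp/pids/setpoint_manager.pid}"
if pid_is_running "$PIDFILE"; then
    echo "running"
else
    echo "not running"
fi
```

Running this next to monit summary would show whether the process itself is up at the moment monit reports "execution failed".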
【Discussion】:
Tags: ruby-on-rails status monit