【发布时间】:2011-07-24 01:48:01
【问题描述】:
我正在尝试了解如何通过上帝来监视 travis-ci 的 resque 工作程序,以使通过上帝停止 resque 监视不会留下陈旧的工作程序进程。
下面我说的是工作进程,而不是分叉的作业子进程(即队列一直是空的)。
当我像这样手动启动 resque worker 时:
$ QUEUE=builds rake resque:work
我会得到一个进程:
$ ps x | grep resque
7041 s001 S+ 0:05.04 resque-1.13.0: Waiting for builds
一旦我停止工作任务,这个过程就会消失。
但是当我用上帝(exact configuration is here,基本上和resque/god example一样的东西)开始同样的事情时......
$ RAILS_ENV=development god -c config/resque.god -D
I [2011-03-27 22:49:15] INFO: Loading config/resque.god
I [2011-03-27 22:49:15] INFO: Syslog enabled.
I [2011-03-27 22:49:15] INFO: Using pid file directory: /Volumes/Users/sven/.god/pids
I [2011-03-27 22:49:15] INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-03-27 22:49:15] INFO: resque-0 move 'unmonitored' to 'init'
I [2011-03-27 22:49:15] INFO: resque-0 moved 'unmonitored' to 'init'
I [2011-03-27 22:49:15] INFO: resque-0 [trigger] process is not running (ProcessRunning)
I [2011-03-27 22:49:15] INFO: resque-0 move 'init' to 'start'
I [2011-03-27 22:49:15] INFO: resque-0 start: cd /Volumes/Users/sven/Development/projects/travis && rake resque:work
I [2011-03-27 22:49:15] INFO: resque-0 moved 'init' to 'start'
I [2011-03-27 22:49:15] INFO: resque-0 [trigger] process is running (ProcessRunning)
I [2011-03-27 22:49:15] INFO: resque-0 move 'start' to 'up'
I [2011-03-27 22:49:15] INFO: resque-0 moved 'start' to 'up'
I [2011-03-27 22:49:15] INFO: resque-0 [ok] memory within bounds [784kb] (MemoryUsage)
I [2011-03-27 22:49:15] INFO: resque-0 [ok] process is running (ProcessRunning)
I [2011-03-27 22:49:45] INFO: resque-0 [ok] memory within bounds [784kb, 784kb] (MemoryUsage)
I [2011-03-27 22:49:45] INFO: resque-0 [ok] process is running (ProcessRunning)
那我就多出一个流程:
$ ps x | grep resque
7187 ?? Ss 0:00.02 sh -c cd /Volumes/Users/sven/Development/projects/travis && rake resque:work
7188 ?? S 0:05.11 resque-1.13.0: Waiting for builds
7183 s001 S+ 0:01.18 /Volumes/Users/sven/.rvm/rubies/ruby-1.8.7-p302/bin/ruby /Volumes/Users/sven/.rvm/gems/ruby-1.8.7-p302/bin/god -c config/resque.god -D
上帝似乎只记录了第一个的pid:
$ cat ~/.god/pids/resque-0.pid
7187
然后当我通过上帝停止 resque 手表时:
$ god stop resque
Sending 'stop' command
The following watches were affected:
resque-0
上帝给出了这个日志输出:
I [2011-03-27 22:51:22] INFO: resque-0 stop: default lambda killer
I [2011-03-27 22:51:22] INFO: resque-0 sent SIGTERM
I [2011-03-27 22:51:23] INFO: resque-0 process stopped
I [2011-03-27 22:51:23] INFO: resque-0 move 'up' to 'unmonitored'
I [2011-03-27 22:51:23] INFO: resque-0 moved 'up' to 'unmonitored'
但它实际上并没有终止这两个进程,而是让实际的工作进程保持活动状态:
$ ps x | grep resque
6864 ?? S 0:05.15 resque-1.13.0: Waiting for builds
6858 s001 S+ 0:01.36 /Volumes/Users/sven/.rvm/rubies/ruby-1.8.7-p302/bin/ruby /Volumes/Users/sven/.rvm/gems/ruby-1.8.7-p302/bin/god -c config/resque.god -D
【问题讨论】:
-
上帝总是会产生僵尸,对此你无能为力。
-
嗯,如果这是真的,那么
god stop命令的意义何在? -
@svenfuchs:有时答案是否定的。