【发布时间】:2014-01-26 22:21:59
【问题描述】:
我们最近开始遇到厨师客户在运行过程中死亡的问题,因为我们在运行列表的各个部分花费了更多时间,而这些部分通常进行得更快。我一直在使用家庭 wifi,而我的同事一直在使用工作 wifi,这本身就存在一些连接问题。
如果在 chef-client 运行时您的 ssh 连接中断了与机器的连接,这是否会以看似莫名其妙的方式使运行崩溃?我正在使用 PutTY 从我的 Win7 进行连接,而我的同事正在使用 Apple 终端应用程序。
我们一直在运行它的所有机器都是 Ubuntu 12.04(在 EC2 中)并且有大量剩余的磁盘空间 - 它们只使用了大约 1GB 和大约 5GB 的空闲空间。
这是来自/var/log/chef/client.log 的日志输出(使用/etc/chef/client.rb 中的log_location 指令设置为described here)。
[2014-01-08T00:27:07+00:00] WARN: Nodejs user is nodejs
[2014-01-08T00:27:07+00:00] WARN: Cloning resource attributes for group[nodejs] from prior resource (CHEF-3694)
[2014-01-08T00:27:07+00:00] WARN: Previous group[nodejs]: /var/chef/cache/cookbooks/nodejs/recipes/default.rb:26:in `from_file'
[2014-01-08T00:27:07+00:00] WARN: Current group[nodejs]: /var/chef/cache/cookbooks/spicoli-app/recipes/default.rb:38:in `from_file'
[2014-01-08T00:27:07+00:00] WARN: Cloning resource attributes for user[nodejs] from prior resource (CHEF-3694)
[2014-01-08T00:27:07+00:00] WARN: Previous user[nodejs]: /var/chef/cache/cookbooks/nodejs/recipes/default.rb:34:in `from_file'
[2014-01-08T00:27:07+00:00] WARN: Current user[nodejs]: /var/chef/cache/cookbooks/spicoli-app/recipes/default.rb:46:in `from_file'
[2014-01-08T00:27:30+00:00] WARN: Environment is _default
[2014-01-08T00:27:30+00:00] WARN: Nodejs user is nodejs
[2014-01-08T02:04:54+00:00] ERROR: Running exception handlers
[2014-01-08T02:04:54+00:00] ERROR: Exception handlers complete
[2014-01-08T02:04:54+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2014-01-08T02:04:55+00:00] ERROR: Input/output error - <STDOUT>
[2014-01-08T02:04:57+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
错误堆栈跟踪只有这个:
Generated at 2014-01-08 02:04:54 +0000
Errno::EIO: Input/output error - <STDOUT>
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:91:in `write'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:91:in `puts'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:91:in `puts'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:61:in `display_section'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:44:in `block (2 levels) in display'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:43:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:43:in `block in display'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:42:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:42:in `display'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:130:in `display_error'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:161:in `resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/doc.rb:159:in `resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/event_dispatch/dispatcher.rb:29:in `block in resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/event_dispatch/dispatcher.rb:29:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/event_dispatch/dispatcher.rb:29:in `resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource.rb:637:in `rescue in run_action'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource.rb:643:in `run_action'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:49:in `run_action'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:81:in `block (2 levels) in converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:81:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:81:in `block in converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection.rb:98:in `block in execute_each_resource'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:116:in `call'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:116:in `call_iterator_block'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:85:in `step'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:104:in `iterate'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:55:in `each_with_index'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection.rb:96:in `execute_each_resource'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:80:in `converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:433:in `converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:500:in `do_run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:199:in `block in run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:193:in `fork'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:193:in `run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application.rb:208:in `run_chef_client'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application/client.rb:312:in `block in run_application'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application/client.rb:304:in `loop'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application/client.rb:304:in `run_application'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application.rb:66:in `run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/bin/chef-client:26:in `<top (required)>'
/usr/bin/chef-client:23:in `load'
/usr/bin/chef-client:23:in `<main>'
这是一个非常普遍的错误!但它似乎确实表明 STDOUT 输出中断,这对于客户端断开是有意义的。
编辑:根据要求,这里是client.rb 文件的内容(名称自然是混淆了。)
$ cat /etc/chef/client.rb
log_level :auto
log_location "/var/log/chef/client.log"
chef_server_url "https://api.opscode.com/organizations/myapp"
validation_client_name "my-validator"
node_name "my-app-node"
编辑2:尝试使用sudo su -s /bin/bash root -c "screen chef-client"
屏幕在我吃午饭时终止,并记录了 ShellOut 命令对 npm install 的超时。这是在厨师客户在此操作上停留了一个多小时之后。
[2014-01-09T16:39:07+00:00] WARN: Environment is _default
[2014-01-09T16:39:07+00:00] WARN: Nodejs user is nodejs
[2014-01-09T18:16:28+00:00] ERROR: Running exception handlers
[2014-01-09T18:16:28+00:00] ERROR: Exception handlers complete
[2014-01-09T18:16:28+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2014-01-09T18:16:31+00:00] ERROR: execute[npm-install-app] (spicoli-app::default line 110) had an error: Mixlib::ShellOut::CommandTimeout: command timed out:
---- Begin output of npm --registry http://my.npm.repo.amazonaws.com:5984/registry/_design/app/_rewrite install --cache /home/nodejs/.npm --tmp /home/nodejs/tmp
--- snip: install messages from npm ---
[2014-01-09T18:16:33+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
这是一个与以前完全不同的错误。 stacktrace.out 文件也明确提到了ShellOut,所以它也完全不同。最奇怪的是,当我从命令行运行相同的 npm 命令时,不到一分钟就完成了。
所以我不确定是否有办法进一步诊断之前的故障,但我欢迎其他建议。对于这个新的失败,我询问了this followup question。
【问题讨论】:
-
当您在
screen中运行chef-client或作为服务运行时会发生什么? -
请发布您的client.rb
-
@StephenKing 我再次使用命令
sudo su -s /bin/bash root -c "screen chef-client"运行,稍后我会告诉你这是怎么回事。感谢您的建议!
标签: ssh chef-infra remote-access remote-server