Worker timeouts when setting up New Relic with Tornado 4 and Gunicorn

【Question title】: Worker timeouts when setting up New Relic with Tornado 4 and Gunicorn
【Posted】: 2017-06-01 01:45:00
【Question】:

I'm trying to configure New Relic for my gunicorn + tornado 4 application.

Locally, without gunicorn (just using tornado as the WSGI server), the New Relic setup works and I can see data in New Relic. I'm using the following code to configure the New Relic agent:

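import os

# Only start the agent when a config file path is provided.
config_file = os.environ.get('NEW_RELIC_CONFIG_FILE')
if config_file:
    import newrelic.agent
    # IS_PROD is assumed to be defined elsewhere in the application.
    environment = 'production' if IS_PROD else 'development'
    newrelic.agent.initialize(config_file, environment=environment)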

In production, however, running under gunicorn, the workers time out indefinitely:

gunicorn -b 0.0.0.0:8080 -w 3 -p gunicorn.pid -k tornado --access-logfile /var/log/gunicorn_access.log --error-logfile /var/log/gunicorn_error.log myapp.server:make_application\(\) -t 2 --log-level DEBUG --capture-output &> /dev/null &

...

[2017-01-17 05:16:37 +0000] [26957] [CRITICAL] WORKER TIMEOUT (pid:26985)
[2017-01-17 05:16:37 +0000] [26957] [CRITICAL] WORKER TIMEOUT (pid:26986)
[2017-01-17 05:16:37 +0000] [26957] [CRITICAL] WORKER TIMEOUT (pid:26987)
[2017-01-17 05:16:37 +0000] [26991] [INFO] Booting worker with pid: 26991
[2017-01-17 05:16:37 +0000] [26992] [INFO] Booting worker with pid: 26992
[2017-01-17 05:16:37 +0000] [26993] [INFO] Booting worker with pid: 26993
[2017-01-17 05:16:40 +0000] [26957] [CRITICAL] WORKER TIMEOUT (pid:26992)
[2017-01-17 05:16:40 +0000] [26957] [CRITICAL] WORKER TIMEOUT (pid:26993)
[2017-01-17 05:16:40 +0000] [26957] [CRITICAL] WORKER TIMEOUT (pid:26991)
[2017-01-17 05:16:40 +0000] [26997] [INFO] Booting worker with pid: 26997
[2017-01-17 05:16:40 +0000] [26998] [INFO] Booting worker with pid: 26998
[2017-01-17 05:16:40 +0000] [26999] [INFO] Booting worker with pid: 26999

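If I comment out the agent code above and run the same gunicorn command, the workers are stable and do not time out.

Even with the log level set to DEBUG, I can't find the root cause of the gunicorn workers timing out and restarting indefinitely. All I know is that the New Relic agent code above is the culprit.

Since I was able to integrate with New Relic successfully on my local machine, I suspect that my newrelic.ini and the agent code above are fine. Gunicorn is somehow getting in the way, but I'm not sure how or where to start troubleshooting.

I'm using: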
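One possible starting point is to measure how much memory the agent setup adds to each worker. The sketch below uses only the standard library `resource` module (Unix only); the helper names `peak_rss_kib` and `peak_growth_kib` are made up for illustration and are not part of New Relic, gunicorn, or tornado.

```python
import resource
import sys


def peak_rss_kib():
    """Return this process's peak resident set size in KiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    return peak // 1024 if sys.platform == "darwin" else peak


def peak_growth_kib(step):
    """Run step() and return how much the peak RSS grew, in KiB."""
    before = peak_rss_kib()
    step()
    return peak_rss_kib() - before
```

Wrapping the agent call, for example `peak_growth_kib(lambda: newrelic.agent.initialize(config_file, environment=environment))`, and logging the result in each worker would give a rough idea of the overhead. The peak only ever grows, so this reports an increase in the high-water mark and not the current usage, and any work the agent does later is not captured.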

newrelic==2.78.0.57
gunicorn==19.6.0
tornado==4.4


Tags: tornado gunicorn newrelic

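【Solution 1】:

Wow, it turned out to be a memory issue. When I spawned 1 worker instead of 3, everything worked. The New Relic instrumentation added just enough overhead to push my memory usage over the edge.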
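As a rough sketch of that change, the flags from the command in the question could be moved into a gunicorn config file with the worker count lowered to 1. The file name `gunicorn.conf.py` is an assumption; the values are copied from the original command line, and only `workers` differs.

```python
# gunicorn.conf.py (assumed file name)
bind = "0.0.0.0:8080"
workers = 1                 # was 3 (-w 3); one worker fit in memory with the agent enabled
worker_class = "tornado"    # -k tornado
timeout = 2                 # seconds, copied from -t 2
pidfile = "gunicorn.pid"
accesslog = "/var/log/gunicorn_access.log"
errorlog = "/var/log/gunicorn_error.log"
loglevel = "debug"
capture_output = True       # --capture-output
```

It would then be started with `gunicorn -c gunicorn.conf.py 'myapp.server:make_application()'`. The right worker count depends on how much memory the host has, so 1 is simply the value that worked here.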
