【Question title】: CRITICAL WORKER TIMEOUT error on gunicorn django
【Posted】: 2018-04-23 09:26:36
【Question】:

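I am trying to train a word2vec model and save it, then create some clusters based on that model. It runs fine locally, but when I build a Docker image and run it with gunicorn, it always gives me a timeout error. I tried the solutions described here, but they did not work for me.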

I am using

python 3.5
gunicorn 19.7.1
gevent 1.2.2
eventlet 0.21.0

Here is my gunicorn.conf file

#!/bin/bash

# Start Gunicorn processes
echo Starting Gunicorn.
exec gunicorn ReviewsAI.wsgi:application \
    --bind 0.0.0.0:8000 \
    --worker-class eventlet
    --workers 1
    --timeout 300000
    --graceful-timeout 300000
    --keep-alive 300000

I also tried the gevent and sync worker classes, but that did not work either. Can anyone tell me why this timeout error keeps occurring? Thanks.

Here is my log

Starting Gunicorn.
[2017-11-10 06:03:45 +0000] [1] [INFO] Starting gunicorn 19.7.1
[2017-11-10 06:03:45 +0000] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)
[2017-11-10 06:03:45 +0000] [1] [INFO] Using worker: eventlet
[2017-11-10 06:03:45 +0000] [8] [INFO] Booting worker with pid: 8
2017-11-10 06:05:00,307 : INFO : collecting all words and their counts
2017-11-10 06:05:00,309 : INFO : PROGRESS: at sentence #0, processed 0 words, keeping 0 word types
2017-11-10 06:05:00,737 : INFO : collected 11927 word types from a corpus of 1254665 raw words and 126 sentences
2017-11-10 06:05:00,738 : INFO : Loading a fresh vocabulary
2017-11-10 06:05:00,916 : INFO : min_count=1 retains 11927 unique words (100% of original 11927, drops 0)
2017-11-10 06:05:00,917 : INFO : min_count=1 leaves 1254665 word corpus (100% of original 1254665, drops 0)
2017-11-10 06:05:00,955 : INFO : deleting the raw counts dictionary of 11927 items
2017-11-10 06:05:00,957 : INFO : sample=0.001 downsamples 59 most-common words
2017-11-10 06:05:00,957 : INFO : downsampling leaves estimated 849684 word corpus (67.7% of prior 1254665)
2017-11-10 06:05:00,957 : INFO : estimated required memory for 11927 words and 200 dimensions: 25046700 bytes
2017-11-10 06:05:01,002 : INFO : resetting layer weights
2017-11-10 06:05:01,242 : INFO : training model with 4 workers on 11927 vocabulary and 200 features, using sg=0 hs=0 sample=0.001 negative=5 window=4
2017-11-10 06:05:02,294 : INFO : PROGRESS: at 6.03% examples, 247941 words/s, in_qsize 0, out_qsize 7
2017-11-10 06:05:03,423 : INFO : PROGRESS: at 13.65% examples, 269423 words/s, in_qsize 0, out_qsize 7
2017-11-10 06:05:04,670 : INFO : PROGRESS: at 23.02% examples, 286330 words/s, in_qsize 8, out_qsize 11
2017-11-10 06:05:05,745 : INFO : PROGRESS: at 32.70% examples, 310218 words/s, in_qsize 0, out_qsize 7
2017-11-10 06:05:07,054 : INFO : PROGRESS: at 42.06% examples, 308128 words/s, in_qsize 8, out_qsize 11
2017-11-10 06:05:08,123 : INFO : PROGRESS: at 51.75% examples, 320675 words/s, in_qsize 0, out_qsize 7
2017-11-10 06:05:09,355 : INFO : PROGRESS: at 61.11% examples, 320556 words/s, in_qsize 8, out_qsize 11
2017-11-10 06:05:10,436 : INFO : PROGRESS: at 70.79% examples, 328012 words/s, in_qsize 0, out_qsize 7
2017-11-10 06:05:11,663 : INFO : PROGRESS: at 80.16% examples, 327237 words/s, in_qsize 8, out_qsize 11
2017-11-10 06:05:12,752 : INFO : PROGRESS: at 89.84% examples, 332298 words/s, in_qsize 0, out_qsize 7
2017-11-10 06:05:13,784 : INFO : PROGRESS: at 99.21% examples, 336724 words/s, in_qsize 0, out_qsize 9
2017-11-10 06:05:13,784 : INFO : worker thread finished; awaiting finish of 3 more threads
2017-11-10 06:05:13,784 : INFO : worker thread finished; awaiting finish of 2 more threads
2017-11-10 06:05:13,784 : INFO : worker thread finished; awaiting finish of 1 more threads
2017-11-10 06:05:13,784 : INFO : worker thread finished; awaiting finish of 0 more threads
2017-11-10 06:05:13,784 : INFO : training on 6273325 raw words (4248672 effective words) took 12.5s, 339100 effective words/s
2017-11-10 06:05:13,785 : INFO : saving Word2Vec object under trained_models/mobile, separately None
2017-11-10 06:05:13,785 : INFO : not storing attribute syn0norm
2017-11-10 06:05:13,785 : INFO : not storing attribute cum_table
2017-11-10 06:05:14,026 : INFO : saved trained_models/mobile
[2017-11-10 06:05:43 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:8)
2017-11-10 06:05:43,712 : INFO : precomputing L2-norms of word weight vectors
[2017-11-10 06:05:44 +0000] [14] [INFO] Booting worker with pid: 14

【Comments】:

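  • In your "gunicorn.conf" file, did you add backslashes for line continuation after each of the exec gunicorn ... arguments? They only appear on the first two lines of the snippet you posted.
  • It depends on how long it takes to train the model. You could try increasing the timeout.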
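
As a hedged illustration of the line-continuation point above: one way to stop relying on backslashes is to move the options into a Python config file. This is a minimal sketch, assuming a file named gunicorn.conf.py (the file name is an assumption; the values are simply copied from the flags in the question) that is loaded with gunicorn's -c option.

```python
# gunicorn.conf.py (hypothetical file name; values mirror the flags in the question)
bind = "0.0.0.0:8000"       # same as --bind
worker_class = "eventlet"   # same as --worker-class
workers = 1                 # same as --workers
timeout = 300000            # same as --timeout, in seconds
graceful_timeout = 300000   # same as --graceful-timeout, in seconds
keepalive = 300000          # same as --keep-alive, in seconds
```

It could then be started with something like gunicorn -c gunicorn.conf.py ReviewsAI.wsgi:application, so the settings no longer depend on shell line continuations.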

Tags: python django docker gunicorn


【Solution 1】:

I had a similar problem. Updating gunicorn to version 19.9.0 solved it for me.

gunicorn 19.9.0

For anyone else who may run into the same problem: make sure to add a timeout. I personally use

gunicorn app.wsgi:application -w 2 -b :8000 --timeout 120

【Comments】:

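  • The --timeout 120 flag is what solved the problem for me. Thank you very much!!
  • What does the timeout do? I am having some similar issues on Heroku with a Python Plotly Dash app.
  • docs.gunicorn.org/en/stable/settings.html#timeout Default: 30. Workers silent for more than this many seconds are killed and restarted. Value is a positive number or 0. Setting it to 0 has the effect of infinite timeouts by disabling timeouts for all workers entirely. Generally, the default of thirty seconds should suffice. Only set this noticeably higher if you are sure of the repercussions for sync workers. For the non-sync workers it just means that the worker process is still communicating and is not tied to the length of time required to handle a single request.
  • You saved me! Thank you!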
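
To make the quoted documentation a bit more concrete, here is a simplified model of the rule it describes. It is only a sketch, not gunicorn's actual code; the function name worker_should_be_killed and its parameters are made up for illustration.

```python
def worker_should_be_killed(last_heartbeat, now, timeout):
    """Simplified model of the timeout rule quoted above (not gunicorn's code).

    last_heartbeat and now are timestamps in seconds; timeout is the
    configured --timeout value in seconds.
    """
    if timeout == 0:
        # A timeout of 0 disables the check entirely (infinite timeout).
        return False
    # A worker that has been silent for more than `timeout` seconds is
    # killed and restarted.
    return (now - last_heartbeat) > timeout
```

With the default of 30, a worker that has been silent for 31 seconds would be restarted under this model, while --timeout 120 would give the same worker up to two minutes.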