[Posted]: 2021-08-05 15:35:51
[Question]:
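A fully loaded multi-tenant Django application with 1000 WebSockets, served through Daphne/Channels, had been running fine for a few months. Then, suddenly, tenants were all calling the support line to report that the application was running slowly or hanging outright. HTTP REST API requests were still fast and error-free, so we narrowed the problem down to the WebSockets.
No application logs or operating system logs indicated a problem, so the only thing to go on is the exception shown below. It occurred again and again over a period of 2 days.
I am not expecting in-depth debugging help, just some off-the-cuff suggestions about what the cause might be.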
AWS Linux 1
Python 3.6.4
Elasticache Redis 5.0
channels==2.4.0
channels-redis==2.4.2
daphne==2.5.0
Django==2.2.13
Split configuration: HTTP is served by uwsgi, ASGI is served by daphne, both behind Nginx.
May 10 08:08:16 prod-b-web1: [pid 15053] [version 119.5.10.5086] [tenant_id -] [domain_name -] [pathname /opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/daphne/server.py] [lineno 288] [priority ERROR] [funcname application_checker] [request_path -] [request_method -] [request_data -] [request_user -] [request_stack -] Exception inside application: Lock is not acquired.
Traceback (most recent call last):
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 435, in receive
    real_channel
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 484, in receive_single
    await self.receive_clean_locks.acquire(channel_key)
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 152, in acquire
    return await self.locks[channel].acquire()
  File "/opt/python3.6/lib/python3.6/asyncio/locks.py", line 176, in acquire
    yield from fut
concurrent.futures._base.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/sessions.py", line 183, in __call__
    return await self.inner(receive, self.send)
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/middleware.py", line 41, in coroutine_call
    await inner_instance(receive, send)
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/consumer.py", line 59, in __call__
    [receive, self.channel_receive], self.dispatch
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels/utils.py", line 58, in await_many_dispatch
    await task
  File "/opt/releases/r119.5.10.5086/env/lib/python3.6/site-packages/channels_redis/core.py", line 447, in receive
    self.receive_lock.release()
  File "/opt/python3.6/lib/python3.6/asyncio/locks.py", line 201, in release
    raise RuntimeError('Lock is not acquired.')
RuntimeError: Lock is not acquired.
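
For reference, the same pair of exceptions (a CancelledError raised while waiting on one asyncio lock, followed by RuntimeError: Lock is not acquired. from releasing another) can be reproduced with plain asyncio, without channels_redis. The following is only a minimal sketch: the function names and the two lock names are hypothetical, chosen to echo the traceback, and it does not reproduce the library's actual code or establish what caused the failure in production.

```python
import asyncio


async def receive(receive_lock, clean_lock):
    """Hypothetical stand-in for a receive() that releases a lock in `finally`."""
    try:
        # clean_lock is held elsewhere, so this await blocks; the
        # cancellation is delivered here as CancelledError.
        await clean_lock.acquire()
    finally:
        # receive_lock is not held at this point, so asyncio raises
        # RuntimeError('Lock is not acquired.') while CancelledError is
        # still being handled.
        receive_lock.release()


async def main():
    receive_lock = asyncio.Lock()  # deliberately never acquired in this sketch
    clean_lock = asyncio.Lock()
    await clean_lock.acquire()  # force the task below to wait

    task = asyncio.ensure_future(receive(receive_lock, clean_lock))
    await asyncio.sleep(0)  # let the task run until it blocks on clean_lock
    task.cancel()  # cancel it while it is still waiting
    try:
        await task
    except RuntimeError as exc:
        return exc
    return None


def run_demo():
    # new_event_loop() is used instead of asyncio.run() so the sketch
    # also works on Python 3.6, the version listed in the question.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()
```

Calling run_demo() returns the RuntimeError, and its __context__ attribute is the CancelledError, which corresponds to the "During handling of the above exception" chain in the log. The sketch only shows that releasing an unheld asyncio.Lock during cancellation produces this output; why the lock would be unheld in the real application is left open.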
[Discussion]:
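- How are you managing your Python dependencies?
- Do you get the above error when you interrupt the process with a key press? I ask because the stack trace above points to self.stop(), which is only called during the application's cleanup, when there is an exception of type KeyboardInterrupt. How often do you see this stack trace printed?
- Is the traffic pattern still the same? How are you doing with respect to file locks? Is Redis up, and are any connections to Redis stuck in a closing or similar state? You can use ss or netstat to check these.
- This looks like a multithreading issue.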
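
The ss check suggested in the comments could be scripted roughly as follows. This is a sketch under stated assumptions: Redis is reached on port 6379 (the default, not confirmed in the question), the ss utility is installed on the web host, and its `ss -tan` output has the usual column order (state, Recv-Q, Send-Q, local address, peer address). The function names are hypothetical.

```python
import subprocess
from collections import Counter


def count_states(ss_output, port=6379):
    """Count TCP states of connections whose peer port matches `port`.

    Expects text in the layout printed by `ss -tan`:
    State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
    """
    suffix = ":{}".format(port)
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # The header row and malformed rows never match the port suffix.
        if len(fields) >= 5 and fields[4].endswith(suffix):
            counts[fields[0]] += 1
    return counts


def redis_connection_states(port=6379):
    """Run `ss -tan` and summarise connection states towards `port`."""
    result = subprocess.run(
        ["ss", "-tan"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
        check=True,
    )
    return count_states(result.stdout, port)
```

Running redis_connection_states() on the web host returns a Counter keyed by TCP state. A large or growing count for states such as CLOSE-WAIT would point towards connections that are not being closed cleanly, which is what the comment asks about.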
Tags: python django redis django-channels django-redis