【发布时间】:2022-06-30 23:42:36
【问题描述】:
我在 python 3.10 上运行 locust locust==2.8.6。我通过 AWS EKS 在 Kubernetes 上运行它。我分布式运行它,并尝试设置 1 个 master 和 5 个 worker。
主 pod 以命令开头:
command: ["locust"]
args: ["-f","$filename","--headless","--users=$clients","--spawn-rate=$hatch-rate","--run-time=$run-time","--only-summary","--master","--expect-workers=$num_slaves"]
工人从命令开始:
command: ["locust"]
args: ["-f","$filename","--worker","--master-host=locust-master$task_id"]
确实,在工作 pod 上,我可以运行 telnet locust-master1 5557 并确认通信。 (在这种情况下,$task_id=1)
我在主 pod 中看到如下日志:
[2022-04-27 22:53:16,969] locust-master1--1-z2lr8/INFO/root: Waiting for workers to be ready, 0 of 5 connected
[2022-04-27 22:53:17,109] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-tt7n5_fec1320a406b42319f3088bd9a7c181c' reported as ready. Currently 1 clients ready to swarm.
[2022-04-27 22:53:17,147] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-qv7kt_011dbeb9f15d452f935c5643fb463632' reported as ready. Currently 2 clients ready to swarm.
[2022-04-27 22:53:17,261] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-ks5wb_356fcf54ac2644e4badc684e3846520c' reported as ready. Currently 3 clients ready to swarm.
[2022-04-27 22:53:17,354] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-cbkbd_2c90cedde5224e1e9cf47bbb543b9097' reported as ready. Currently 4 clients ready to swarm.
[2022-04-27 22:53:17,364] locust-master1--1-z2lr8/INFO/locust.runners: Client 'locust-slave1-xfvsz_196bba3928c5491e896acd411798d48d' reported as ready. Currently 5 clients ready to swarm.
[2022-04-27 22:53:17,970] locust-master1--1-z2lr8/INFO/locust.main: Run time limit set to 5400 seconds
[2022-04-27 22:53:17,971] locust-master1--1-z2lr8/INFO/locust.main: Starting Locust 2.8.6
[2022-04-27 22:53:17,971] locust-master1--1-z2lr8/INFO/locust.runners: Sending spawn jobs of 50 users at 0.50 spawn rate to 5 ready clients
[2022-04-27 22:53:17,977] locust-master1--1-z2lr8/INFO/locust_submit_judgments: Locust Startup: job_id: 1434194
[2022-04-27 22:53:18,376] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-cbkbd_2c90cedde5224e1e9cf47bbb543b9097 failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:20,384] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-qv7kt_011dbeb9f15d452f935c5643fb463632 failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:20,385] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-ks5wb_356fcf54ac2644e4badc684e3846520c failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,391] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-tt7n5_fec1320a406b42319f3088bd9a7c181c failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,391] locust-master1--1-z2lr8/INFO/locust.runners: Worker locust-slave1-xfvsz_196bba3928c5491e896acd411798d48d failed to send heartbeat, setting state to missing.
[2022-04-27 22:53:22,392] locust-master1--1-z2lr8/INFO/locust.runners: The last worker went missing, stopping test.
[2022-04-27 22:53:22,392] locust-master1--1-z2lr8/INFO/locust_submit_judgments: Locust Teardown: sending query messages to Results DB
所以我确实看到工人注册了自己,但是一旦测试开始,主 pod 就会说工人无法发送心跳并将它们设置为丢失。如果我在没有--headless 的情况下运行主 pod,这意味着我可以打开 Web UI 并手动启动作业。我看到了同样的问题:当我手动启动作业时,会出现相同的心跳消息。
在工作 pod 上,我看到了我的调试启动日志,但没有任何迹象表明存在问题。
我在网上找不到关于如何设置分布式 locust 的指南(除了它被称为 locustio 和 0.x 版本时),从那时起情况发生了很大变化。
这里需要设置什么?如果不包含多行设置代码,我不确定要包含哪些代码。我正在尝试针对 postgres 进行测试,因此我正在考虑关注 https://docs.locust.io/en/stable/testing-other-systems.html,但在所有示例中,它们都包装了与我继承的代码不同的属性。
【问题讨论】:
标签: python kubernetes locust