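[Posted]: 2021-09-21 14:47:03
[Question]:
I am running several processes that communicate asynchronously.
- Process 1 sends a message to process 2 at regular intervals.
- Process 2 checks for messages at regular intervals.
- If no message is available, process 2 carries on with other work until the next interval.
The problem: if process 2 starts trying to receive before process 1 has sent anything, all subsequent receives fail as well. Conversely, if process 1 has sent a message before process 2 first tries to receive, all subsequent receives succeed.
A minimal example using Python and mpi4py: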
import sys
from time import sleep
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    print("Process 1 started")
    sys.stdout.flush()
    sleep(3)
    for i in range(10):
        sys.stdout.flush()
        comm.send(f"Message {i}", dest=1)
        print(f"Process 1 sent message {i}")
        sys.stdout.flush()
        sleep(1)
elif rank == 1:
    print("Process 2 started")
    sys.stdout.flush()
    sleep(1)
    for i in range(1000):
        req = comm.irecv()
        success, message = req.test()
        if success:
            print(f"Process 2 received {message}")
        else:
            print(f"Process 2 didn't receive anything on attempt {i}")
        sys.stdout.flush()
        sleep(3)
This produces the following output:
Process 1 started
Process 2 started
Process 2 didn't receive anything on attempt 0
Process 1 sent message 0
Process 1 sent message 1
Process 2 didn't receive anything on attempt 1
Process 1 sent message 2
Process 1 sent message 3
Process 2 didn't receive anything on attempt 2
Process 1 sent message 4
Process 1 sent message 5
...
Removing the first sleep from process 1 results in a successful chain of sends and receives:
Process 1 started
Process 2 started
Process 1 sent message 0
Process 1 sent message 1
Process 2 received Message 0
Process 1 sent message 2
Process 1 sent message 3
Process 2 received Message 1
Process 1 sent message 4
Process 1 sent message 5
Process 1 sent message 6
Process 2 received Message 2
What is my implementation missing to make the first version work?
[Comments]:
- What happens if you specify source=0 for irecv in the receiver?
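A likely explanation, based on how MPI matches messages: every call to comm.irecv() posts a new receive request, and a request whose test() returns False stays pending even after the loop drops its reference to it. Incoming messages are matched against posted receives in the order they were posted, so "Message 0" would complete the abandoned request from attempt 0 rather than the one just created, and each new request queues up behind the older ones. Specifying source=0 alone would not change this. Below is a minimal sketch of one possible fix, which keeps a single request alive until it completes. The MessagePoller class and its poll method are hypothetical names introduced here for illustration; they are not part of mpi4py.

```python
class MessagePoller:
    """Non-blocking polling that reuses one outstanding receive request.

    Hypothetical helper (not an mpi4py API). `comm` is assumed to be an
    mpi4py communicator such as MPI.COMM_WORLD.
    """

    def __init__(self, comm, source=0):
        self.comm = comm
        self.source = source
        self.request = None  # the receive that is currently pending, if any

    def poll(self):
        """Return (success, message) without blocking."""
        if self.request is None:
            # Post a new receive only when no earlier one is outstanding.
            self.request = self.comm.irecv(source=self.source)
        success, message = self.request.test()
        if success:
            # The request has been consumed; post a fresh one on the next call.
            self.request = None
        return success, message
```

On rank 1, creating poller = MessagePoller(comm) before the loop and replacing the irecv/test pair with success, message = poller.poll() should let each message reach a request that is still being tested. An alternative under the same assumption is to call comm.iprobe(source=0) and only call comm.recv(source=0) when it returns True, so that no receive is ever left pending.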