Do FastAPI asynchronous background tasks block other requests? Answers

【Question title】: FastAPI asynchronous background tasks blocks other requests?
【Posted】: 2021-05-19 07:57:48
【Question】:

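I want to run a simple background task in FastAPI that does some computation before dumping the result into the database. However, the computation blocks the server from receiving any further requests.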

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
db = Database()  # async database client (not shown in the question)

async def task(data):
    otherdata = await db.fetch("some sql")
    newdata = somelongcomputation(data, otherdata)  # this blocks other requests
    await db.execute("some sql", newdata)


@app.post("/profile")
async def profile(data: Data, background_tasks: BackgroundTasks):  # Data: request model (not shown)
    background_tasks.add_task(task, data)
    return {}

What is the best way to solve this?

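【Comments】:

If the computation is heavy and involves no I/O, multiprocessing is the better choice.
I'm deploying with the FastAPI Docker image, which uses all of the server's CPU cores by default. I don't want to bring in another service such as Celery, because the product is still at the prototype stage and has no users.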

Tags: python fastapi

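【Answer 1】:

Your task is defined as async, which means FastAPI (or rather Starlette) will run it on the asyncio event loop. And because somelongcomputation is synchronous (that is, it is not waiting on I/O but actually computing), it blocks the event loop for as long as it runs.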
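To make the blocking visible without a web server, here is a minimal sketch using only the standard library. The names `blocking_computation`, `heartbeat`, and `largest_gap` are hypothetical, and `time.sleep` stands in for `somelongcomputation`. The heartbeat plays the role of other requests: if the event loop is blocked, its ticks stall.

```python
import asyncio
import time


def blocking_computation(seconds: float) -> None:
    """Stand-in for somelongcomputation: occupies the thread and never awaits."""
    time.sleep(seconds)


async def heartbeat(ticks: list, interval: float, count: int) -> None:
    """Record a timestamp every `interval` seconds, like a loop serving other requests."""
    for _ in range(count):
        ticks.append(time.perf_counter())
        await asyncio.sleep(interval)


async def largest_gap(offload: bool) -> float:
    """Run the computation alongside the heartbeat; return the longest pause between ticks."""
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks, 0.05, 10))
    await asyncio.sleep(0)  # yield once so the heartbeat records its first tick
    if offload:
        loop = asyncio.get_running_loop()
        # Runs in the default thread pool; the event loop stays free.
        await loop.run_in_executor(None, blocking_computation, 0.3)
    else:
        # Called directly inside the coroutine; the event loop is stuck here.
        blocking_computation(0.3)
    await beat
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print("direct call, longest pause:", round(asyncio.run(largest_gap(offload=False)), 2))
    print("thread pool, longest pause:", round(asyncio.run(largest_gap(offload=True)), 2))
```

With the direct call, the longest pause should be roughly the length of the computation. With the thread pool, it should stay close to the heartbeat interval.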

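I see a few ways to solve this:

Use more workers (e.g. uvicorn main:app --workers 4). This allows up to 4 somelongcomputation calls to run in parallel.
Rewrite your task so that it is not async (i.e. define it as def task(data): ... etc.). Starlette will then run it in a separate thread.

Use fastapi.concurrency.run_in_threadpool, which also runs it in a separate thread. Like so: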

from fastapi.concurrency import run_in_threadpool
async def task(data):
    otherdata = await db.fetch("some sql")
    newdata = await run_in_threadpool(lambda: somelongcomputation(data, otherdata))
    await db.execute("some sql", newdata)

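Or use asyncio's run_in_executor directly (which run_in_threadpool used under the hood when this was written; later Starlette releases delegate to anyio instead):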

import asyncio
async def task(data):
    otherdata = await db.fetch("some sql")
    loop = asyncio.get_running_loop()
    newdata = await loop.run_in_executor(None, lambda: somelongcomputation(data, otherdata))
    await db.execute("some sql", newdata)

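You can even pass a concurrent.futures.ProcessPoolExecutor as the first argument to run_in_executor to run it in a separate process. Note that a lambda cannot be pickled, so in that case pass the function and its arguments directly instead of wrapping them in a lambda.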

Spawn a separate thread or process yourself, e.g. using concurrent.futures.
Use something more heavyweight, such as Celery (also mentioned in the FastAPI documentation).
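As a rough sketch of the ProcessPoolExecutor variant mentioned above, using only the standard library: the body of `somelongcomputation`, the `pool` parameter, and `main` are assumptions for illustration, and the database calls are left out. The worker function has to be defined at module level and is passed without a lambda, because arguments sent to a worker process must be picklable.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


def somelongcomputation(data, otherdata):
    """Hypothetical CPU-bound stand-in; defined at module level so it can be pickled."""
    return sum(x * x for x in range(data)) + otherdata


async def task(data, otherdata, pool):
    loop = asyncio.get_running_loop()
    # Pass the function and its arguments separately. A lambda cannot be
    # pickled, so it cannot be shipped to a worker process.
    return await loop.run_in_executor(pool, somelongcomputation, data, otherdata)


async def main():
    # One pool reused for all tasks; ProcessPoolExecutor() created inline on
    # every call would start fresh worker processes each time.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return await task(1000, 5, pool)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a FastAPI app the pool would more likely be created once at startup and shut down on exit rather than inside a `with` block per call. That wiring is omitted here.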

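【Comments】:

I ran into the same problem and was wondering: why not just use asyncio.create_task(task(data))? I've been running some tests and it seems to be a solution.
You mean without BackgroundTasks? Are you sure that works? asyncio.create_task runs the task (and therefore somelongcomputation) on the event loop, which would then be blocked just as in the question. run_in_threadpool works because it runs the computation directly in the underlying thread pool, bypassing the event loop.
If not using async spawns another thread, isn't that better than using async?
@Crashalot It depends. Have a look at some of the answers here: stackoverflow.com/questions/27435284/…, and maybe here: discuss.python.org/t/….
Where exactly does concurrent.futures.ProcessPoolExecutor need to be passed? In newdata = await loop.run_in_executor(ProcessPoolExecutor(), lambda: somelongcomputation(data, otherdata))?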