You are overcomplicating things here. You don't need to run multiple queries in a loop; a single query will do:
SELECT d1.name as drug_1, d2.name as drug_2, description
FROM interactions i
INNER JOIN drugs d1 ON i.id1 = d1.id
INNER JOIN drugs d2 ON i.id2 = d2.id
WHERE
d1.id in (... id list, see below ...)
AND d2.id in (... same id list, see below ...)
AND d1.id < d2.id
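I use the INNER JOIN syntax here rather than listing multiple tables in the FROM clause so that the join conditions are grouped in one dedicated place, which makes the WHERE conditions easier to reason about.
The above passes all your drug_list["drug_list_ids"] ids to both in (...) conditions, and then restricts the database to valid combinations only with the d1.id < d2.id clause. This produces the full set of possible (ordered) combinations of d1.id and d2.id, just like your for loops, albeit with a strict sort order (using (8, 1548) and (8, 3579) rather than (1548, 8) and (3579, 8)).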
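To see the effect of the d1.id < d2.id filter without a Postgres server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. The table layout and sample rows are invented for illustration, and sqlite3 uses ? placeholders rather than psycopg2's %s, so the IN list is spelled out by hand here:

```python
import sqlite3

# In-memory stand-in database; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE drugs (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE interactions (id1 INTEGER, id2 INTEGER, description TEXT);
    INSERT INTO drugs VALUES (8, 'drug A'), (1548, 'drug B'), (3579, 'drug C');
    INSERT INTO interactions VALUES
        (8, 1548, 'A with B'),
        (8, 3579, 'A with C'),
        (1548, 8, 'B with A (reverse order)');
""")

ids = [3579, 8, 1548]              # input order does not matter
marks = ", ".join("?" * len(ids))  # one ? per id
query = f"""
    SELECT d1.name, d2.name, description
    FROM interactions i
    INNER JOIN drugs d1 ON i.id1 = d1.id
    INNER JOIN drugs d2 ON i.id2 = d2.id
    WHERE d1.id IN ({marks})
      AND d2.id IN ({marks})
      AND d1.id < d2.id
    ORDER BY d1.id, d2.id
"""
# The id list is bound twice, once per IN (...) test.
rows = conn.execute(query, ids + ids).fetchall()
for row in rows:
    print(row)
```

Only the rows whose first id is the smaller one come back; the (1548, 8) row is filtered out by the d1.id < d2.id condition.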
Psycopg2 actually accepts tuples as placeholder values and expands them to the correct syntax for a ... IN ... test; in this case the driver includes the parentheses:
query_string = """\
SELECT d1.name as drug_1, d2.name as drug_2, description
FROM interactions i
INNER JOIN drugs d1 ON i.id1 = d1.id
INNER JOIN drugs d2 ON i.id2 = d2.id
WHERE
d1.id in %s
AND d2.id in %s
AND d1.id < d2.id
"""
with pg_get_cursor(pool) as cursor:
    cursor.execute(query_string, (
        tuple(drug_list["drug_list_ids"]),
        tuple(drug_list["drug_list_ids"]),
    ))
    ddi_list = cursor.fetchall()
Or you could use a Postgres ... = ANY(ARRAY[...]) test instead of ... IN ..., and make use of the fact that psycopg2 interpolates lists as ARRAY values:
query_string = """\
SELECT d1.name as drug_1, d2.name as drug_2, description
FROM interactions i
INNER JOIN drugs d1 ON i.id1 = d1.id
INNER JOIN drugs d2 ON i.id2 = d2.id
WHERE
d1.id = ANY(%s)
AND d2.id = ANY(%s)
AND d1.id < d2.id
"""
with pg_get_cursor(pool) as cursor:
    cursor.execute(query_string, (drug_list["drug_list_ids"], drug_list["drug_list_ids"]))
    ddi_list = cursor.fetchall()
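One edge case worth guarding against is a list with fewer than two ids: no pair can be formed, so the query can be skipped. As far as I know, psycopg2 also renders an empty tuple as (), which Postgres rejects as a syntax error in an IN test. The sketch below wraps the ANY variant in a hypothetical helper, fetch_interactions(); the _StubCursor class is only a stand-in so the example runs without a database:

```python
QUERY = """\
SELECT d1.name as drug_1, d2.name as drug_2, description
FROM interactions i
INNER JOIN drugs d1 ON i.id1 = d1.id
INNER JOIN drugs d2 ON i.id2 = d2.id
WHERE
    d1.id = ANY(%s)
    AND d2.id = ANY(%s)
    AND d1.id < d2.id
"""


def fetch_interactions(cursor, drug_ids):
    """Hypothetical helper: return interaction rows for all pairs of drug_ids."""
    ids = sorted(set(drug_ids))  # de-duplicate; order is irrelevant to the query
    if len(ids) < 2:
        return []  # no pair can be formed, so skip the query entirely
    cursor.execute(QUERY, (ids, ids))  # psycopg2 adapts lists to ARRAY values
    return cursor.fetchall()


class _StubCursor:
    """Stand-in for a real cursor; it only records what was executed."""

    def __init__(self):
        self.calls = []

    def execute(self, query, params):
        self.calls.append((query, params))

    def fetchall(self):
        return []


stub = _StubCursor()
single = fetch_interactions(stub, [8])          # no query is sent
several = fetch_interactions(stub, [8, 8, 1548])  # one query, duplicates removed
```

With a real psycopg2 cursor you would pass that cursor in place of the stub.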
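If that were not possible, turning your loops into a list comprehension is a little tricky. Not because list comprehensions can't handle nested loops (just list them in nesting order, from left to right), but because you need multiple statements in the loop body to produce the result values. However, because psycopg2's cursor.execute() always returns None, you can use cursor.execute(...) or cursor to produce the next iterable to loop over, so you'd have something like: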
[v ... for ... in outer loops ... for v in (cursor.execute(...) or cursor)]
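This makes use of the fact that you can loop directly over a cursor to fetch rows. There is no need to call cursor.fetchall() anyway, nor to test whether that specific query has results.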
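The trick can be tried without a database. The sketch below uses a hypothetical FakeCursor class that mimics only the two psycopg2 behaviours relied on here (execute() returns None, and the cursor is iterable); the "..." query string and the sample rows are placeholders, not real data:

```python
class FakeCursor:
    """Hypothetical stand-in: execute() returns None, iteration yields rows."""

    def __init__(self, table):
        self._table = table  # maps (id1, id2) -> list of result rows
        self._rows = []

    def execute(self, query, params):
        # Remember the rows for these parameters; no real SQL is run.
        self._rows = self._table.get(tuple(params), [])
        return None  # explicit, to mirror psycopg2

    def __iter__(self):
        return iter(self._rows)


cursor = FakeCursor({
    (8, 1548): [("A", "B", "note 1")],
    (8, 3579): [("A", "C", "note 2"), ("A", "C", "note 3")],
})
pairs = [(8, 1548), (1548, 3579), (8, 3579)]

# execute() returns None, so `None or cursor` evaluates to the cursor itself,
# which the inner `for` then iterates over. A pair with no rows adds nothing.
rows = [row for pair in pairs for row in (cursor.execute("...", pair) or cursor)]
```

The pair (1548, 3579) has no rows in the fake table and contributes nothing to the result, which is why no explicit emptiness test is needed.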
Your nested for loops can be expressed more compactly with itertools.combinations():
from itertools import combinations
query_string = """\
SELECT d1.name as drug_1, d2.name as drug_2, description
FROM interactions i
INNER JOIN drugs d1 ON i.id1 = d1.id
INNER JOIN drugs d2 ON i.id2 = d2.id
WHERE d1.id = %s AND d2.id = %s
"""
with pg_get_cursor(pool) as cursor:
    combos = combinations(drug_list["drug_list_ids"], r=2)
    ddi_list = [
        v
        for id1, id2 in combos
        for v in (cursor.execute(query_string, (id1, id2)) or cursor)
    ]
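As a quick check of the combinations() claim, the sketch below compares it against a pair of nested loops. The loops from the question are not reproduced here, so this assumes they pair each id with every id that comes after it in the list; the sample ids are made up:

```python
from itertools import combinations

ids = [8, 1548, 3579, 42]  # sample values, not real drug ids

# Nested loops: pair each id with every id that follows it in the list.
nested = []
for i, id1 in enumerate(ids):
    for id2 in ids[i + 1:]:
        nested.append((id1, id2))

# combinations() yields the same pairs, in the same order.
combos = list(combinations(ids, r=2))
print(nested == combos)
```

Note that combinations() keeps the input order within each pair, so (3579, 42) appears as written rather than as (42, 3579).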
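However, this is not efficient at all (it sends a large number of separate queries to the database), nor is it all that readable. And it is not needed, as shown above.
If you must also have tighter control over the id pairings, you'd have to use a nested tuple test; put the d1.id and d2.id columns in a parenthesised pair on the left-hand side and use an IN ((v1, v2), (v3, v4), ...) test on the right-hand side, passed to cursor.execute() as a tuple of tuples: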
from itertools import combinations
query_string = """\
SELECT d1.name as drug_1, d2.name as drug_2, description
FROM interactions i
INNER JOIN drugs d1 ON i.id1 = d1.id
INNER JOIN drugs d2 ON i.id2 = d2.id
WHERE
(d1.id, d2.id) IN %s
"""
# tuple of (id1, id2) tuples
combos = tuple(combinations(drug_list["drug_list_ids"], r=2))
with pg_get_cursor(pool) as cursor:
    cursor.execute(query_string, (combos,))
    ddi_list = cursor.fetchall()
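One thing to watch with explicit pairs: combinations() preserves the input order inside each pair. If the interactions table stores each pair only once with the smaller id first (an assumption, suggested by the d1.id < d2.id filter used earlier), then sorting the ids before pairing keeps the tuples in that same orientation. A minimal sketch with made-up ids:

```python
from itertools import combinations

ids = [1548, 8, 3579]  # sample values in arbitrary order

# Pairs follow the input order, so the first one is (1548, 8).
raw = tuple(combinations(ids, r=2))

# Sorting first guarantees id1 < id2 in every pair.
ordered = tuple(combinations(sorted(ids), r=2))

print(raw)
print(ordered)
```

If the table can hold a pair in either orientation, sorting is not enough and both orders would have to be included in the tuple of tuples.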