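【Posted】: 2021-06-08 19:08:18
【Question】:
I have this query on table A. It performs a left join on the associated table B and fetches the records that match either certain conditions on A or certain conditions on B: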
SELECT A.*, B.status FROM "A"
LEFT JOIN B ON B.a_id = A.id AND B.b_field = 20371
WHERE "A"."type" = 'SomeValue' AND "A"."deleted_at" IS NULL AND
(A.a_field = 20371 OR A.another_field = 69074 OR B.id IS NOT NULL)
ORDER BY "A"."updated_at" DESC LIMIT 10 OFFSET 0;
Here is the EXPLAIN ANALYZE output:
Limit  (cost=234623.62..234624.83 rows=10 width=635) (actual time=4034.840..4034.984 rows=10 loops=1)
  ->  Gather Merge  (cost=234623.62..344565.43 rows=909175 width=635) (actual time=4034.839..4034.982 rows=10 loops=1)
        Workers Planned: 5
        Workers Launched: 0
        ->  Sort  (cost=233623.54..234078.13 rows=181835 width=635) (actual time=4033.536..4033.540 rows=10 loops=1)
              Sort Key: A.updated_at DESC
              Sort Method: top-N heapsort  Memory: 34kB
              ->  Hash Left Join  (cost=113.31..229694.15 rows=181835 width=635) (actual time=5.680..4033.139 rows=79 loops=1)
                    Hash Cond: (A.id = B.a_id)
                    Filter: ((A.a_field = 20371) OR (A.another_field = 69074) OR (B.id IS NOT NULL))
                    Rows Removed by Filter: 860017
                    ->  Parallel Seq Scan on A  (cost=0.00..228898.94 rows=181835 width=635) (actual time=0.011..3833.346 rows=860096 loops=1)
                          Filter: ((deleted_at IS NULL) AND ((type)::text = 'SomeValue'::text))
                          Rows Removed by Filter: 5265254
                    ->  Hash  (cost=112.92..112.92 rows=31 width=8) (actual time=0.107..0.108 rows=79 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 12kB
                          ->  Index Scan using index_B_on_b_field on B B  (cost=0.42..112.92 rows=31 width=8) (actual time=0.014..0.087 rows=79 loops=1)
                                Index Cond: (b_field = 20371)
Planning Time: 0.790 ms
Execution Time: 4035.090 ms
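As you can see, Postgres does not use any index on the main table to discard most of A's records (by A.type, A.deleted_at, etc.). Instead it does a Parallel Seq Scan on A, presumably because the OR condition also has to consider B's records.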
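Since the claim above depends on which indexes actually exist, here is a small sketch for listing them. It only assumes the standard pg_indexes system view; 'A' and 'B' are the placeholder table names used in this question, so substitute the real ones.

```sql
-- List every index defined on the two tables involved in the query.
-- 'A' and 'B' are the anonymized table names from this question (assumption:
-- the real names must be substituted, with the exact case stored in the catalog).
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('A', 'B')
ORDER BY tablename, indexname;
```

The plans in this question mention index_A_on_a_field, index_A_on_another_field and index_B_on_b_field; any other index on A would only show up in a listing like this one.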
Here is the same query without the OR B.id IS NOT NULL condition:
SELECT A.*, B.status FROM "A"
LEFT JOIN B ON B.a_id = A.id AND B.b_field = 20371
WHERE "A"."type" = 'SomeValue' AND "A"."deleted_at" IS NULL AND
(A.a_field = 20371 OR A.another_field = 69074)
ORDER BY "A"."updated_at" DESC LIMIT 10 OFFSET 0;
And its EXPLAIN ANALYZE output:
Limit  (cost=1397.52..1397.55 rows=10 width=635) (actual time=0.018..0.019 rows=0 loops=1)
  ->  Sort  (cost=1397.52..1397.64 rows=48 width=635) (actual time=0.017..0.018 rows=0 loops=1)
        Sort Key: A.updated_at DESC
        Sort Method: quicksort  Memory: 25kB
        ->  Hash Left Join  (cost=128.65..1396.49 rows=48 width=635) (actual time=0.013..0.014 rows=0 loops=1)
              Hash Cond: (A.id = B.a_id)
              ->  Bitmap Heap Scan on A  (cost=15.33..1282.98 rows=48 width=635) (actual time=0.012..0.013 rows=0 loops=1)
                    Recheck Cond: ((a_field = 20371) OR (another_field = 69074))
                    Filter: ((deleted_at IS NULL) AND ((type)::text = 'SomeValue'::text))
                    ->  BitmapOr  (cost=15.33..15.33 rows=325 width=0) (actual time=0.011..0.012 rows=0 loops=1)
                          ->  Bitmap Index Scan on index_A_on_a_field  (cost=0.00..10.87 rows=325 width=0) (actual time=0.006..0.006 rows=0 loops=1)
                                Index Cond: (a_field = 20371)
                          ->  Bitmap Index Scan on index_A_on_another_field  (cost=0.00..4.44 rows=1 width=0) (actual time=0.005..0.005 rows=0 loops=1)
                                Index Cond: (another_field = 69074)
              ->  Hash  (cost=112.92..112.92 rows=31 width=4) (never executed)
                    ->  Index Scan using index_B_on_b_field on B B  (cost=0.42..112.92 rows=31 width=4) (never executed)
                          Index Cond: (b_field = 20371)
Planning Time: 0.552 ms
Execution Time: 0.104 ms
Is there a way to rewrite this query, or to make Postgres use some of the indexes on the main table A?
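One rewrite that may be worth trying is to split the OR across the two tables into two branches combined with UNION ALL, so that each branch can be planned on its own. This is only a sketch and has not been run against the real schema. It assumes that A.* contains the updated_at column used for ordering and that A.id is indexed (for example as the primary key); whether the planner then picks the bitmap scan on A for the first branch and the index on B.b_field for the second is not guaranteed.

```sql
-- Branch 1: rows that qualify through A's own columns.
-- B is still left-joined so that B.status is returned (NULL when no B row matches).
(SELECT A.*, B.status
 FROM "A"
 LEFT JOIN B ON B.a_id = A.id AND B.b_field = 20371
 WHERE "A"."type" = 'SomeValue' AND "A"."deleted_at" IS NULL
   AND (A.a_field = 20371 OR A.another_field = 69074))
UNION ALL
-- Branch 2: rows that qualify only because a matching B row exists.
-- IS NOT TRUE excludes rows already returned by branch 1 and, unlike NOT (...),
-- keeps rows where a_field or another_field is NULL.
(SELECT A.*, B.status
 FROM B
 JOIN "A" ON A.id = B.a_id
 WHERE B.b_field = 20371
   AND "A"."type" = 'SomeValue' AND "A"."deleted_at" IS NULL
   AND (A.a_field = 20371 OR A.another_field = 69074) IS NOT TRUE)
ORDER BY updated_at DESC
LIMIT 10 OFFSET 0;
```

The two branches are disjoint by construction, which is why UNION ALL is used here rather than UNION: it avoids a deduplication step and keeps duplicate rows exactly as the original query would return them. Comparing EXPLAIN ANALYZE for this form against the original is the only way to tell whether it actually helps.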
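【Discussion】:
-
You don't need table B in the main query; you can move it into an EXISTS(...) term.
-
I do need it. I edited the query to add a column from B to the SELECT clause. @wildplasser
-
So we were optimizing the wrong query (plan)?
-
No, the query plan is correct.
-
"Make Postgres use some of the indexes on the main table A": you haven't told us what those indexes are.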
Tags: sql postgresql indexing