Amazon RDS - Postgres not using index for SELECT queries

【Question title】: Amazon RDS - Postgres not using index for SELECT queries
【Posted】: 2020-09-01 02:47:56
【Question】:

I have a feeling I'm doing something wrong, but I can't seem to figure out what.

I am trying to run the following query:

Select col1, col2, col3, col4, col5, day, month, year,
       sum(num1) as sum_num1, 
       sum(num2) as sum_num2,
       count(*) as count_items
from test_table where day = 10 and month = 5 and year = 2020
group by col1, col2, col3, col4, col5, day, month, year;

I also have an index on day, month, year, which I created with the following command:

CREATE INDEX CONCURRENTLY testtable_dmy_idx on test_table (day, month, year);

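I then found the setting that turns sequential scans on and off, and tried the query both ways.

When I run the query above with SET enable_seqscan TO on; (which is the default, by the way) under EXPLAIN (analyze,buffers,timing), I get the following output: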

-- Select Query with Sequential scan on 

QUERY PLAN
Finalize GroupAggregate  (cost=9733303.39..10836008.34 rows=5102790 width=89) (actual time=1100914.091..1110820.480 rows=491640 loops=1)
"  Group Key: col1, col2, col3, col4, col5, day, month, year"
"  Buffers: shared hit=25020 read=2793049 dirtied=10040, temp read=74932 written=75039"
  I/O Timings: read=1059425.134
  ->  Gather Merge  (cost=9733303.39..10607468.38 rows=6454984 width=89) (actual time=1100911.426..1110193.876 rows=795097 loops=1)
        Workers Planned: 2
        Workers Launched: 2
"        Buffers: shared hit=76964 read=8416562 dirtied=33686, temp read=230630 written=230956"
        I/O Timings: read=3178066.529
        ->  Partial GroupAggregate  (cost=9732303.36..9861403.04 rows=3227492 width=89) (actual time=1100791.915..1107668.495 rows=265032 loops=3)
"              Group Key: col1, col2, col3, col4, col5, day, month, year"
"              Buffers: shared hit=76964 read=8416562 dirtied=33686, temp read=230630 written=230956"
              I/O Timings: read=3178066.529
              ->  Sort  (cost=9732303.36..9740372.09 rows=3227492 width=81) (actual time=1100788.479..1105630.411 rows=2630708 loops=3)
"                    Sort Key: col1, col2, col3, col4, col5"
                    Sort Method: external merge  Disk: 241320kB
                    Worker 0:  Sort Method: external merge  Disk: 246776kB
                    Worker 1:  Sort Method: external merge  Disk: 246336kB
"                    Buffers: shared hit=76964 read=8416562 dirtied=33686, temp read=230630 written=230956"
                    I/O Timings: read=3178066.529
                    ->  Parallel Seq Scan on test_table  (cost=0.00..9074497.49 rows=3227492 width=81) (actual time=656277.982..1073808.146 rows=2630708 loops=3)
                          Filter: ((day = 10) AND (month = 5) AND (year = 2020))
                          Rows Removed by Filter: 24027044
                          Buffers: shared hit=76855 read=8416561 dirtied=33686
                          I/O Timings: read=3178066.180
Planning Time: 4.017 ms
Execution Time: 1111033.041 ms
Total time - Around 18 minutes

Then, when I SET enable_seqscan TO off; and run the same query under EXPLAIN, I get the following:

-- Select Query with Sequential scan off

QUERY PLAN
Finalize GroupAggregate  (cost=10413126.05..11515831.01 rows=5102790 width=89) (actual time=59211.363..66579.750 rows=491640 loops=1)
"  Group Key: col1, col2, col3, col4, col5, day, month, year"
"  Buffers: shared hit=3 read=104091, temp read=77942 written=78052"
  I/O Timings: read=28662.857
  ->  Gather Merge  (cost=10413126.05..11287291.05 rows=6454984 width=89) (actual time=59211.262..65973.857 rows=795178 loops=1)
        Workers Planned: 2
        Workers Launched: 2
"        Buffers: shared hit=33 read=218096, temp read=230092 written=230418"
        I/O Timings: read=51560.508
        ->  Partial GroupAggregate  (cost=10412126.03..10541225.71 rows=3227492 width=89) (actual time=57013.922..62453.555 rows=265059 loops=3)
"              Group Key: col1, col2, col3, col4, col5, day, month, year"
"              Buffers: shared hit=33 read=218096, temp read=230092 written=230418"
              I/O Timings: read=51560.508
              ->  Sort  (cost=10412126.03..10420194.76 rows=3227492 width=81) (actual time=57013.423..60368.530 rows=2630708 loops=3)
"                    Sort Key: col1, col2, col3, col4, col5"
                    Sort Method: external merge  Disk: 246944kB
                    Worker 0:  Sort Method: external merge  Disk: 246120kB
                    Worker 1:  Sort Method: external merge  Disk: 241408kB
"                    Buffers: shared hit=33 read=218096, temp read=230092 written=230418"
                    I/O Timings: read=51560.508
                    ->  Parallel Bitmap Heap Scan on test_table  (cost=527733.84..9754320.16 rows=3227492 width=81) (actual time=18155.864..30957.312 rows=2630708 loops=3)
                          Recheck Cond: ((day = 10) AND (month = 5) AND (year = 2020))
                          Rows Removed by Index Recheck: 1423
                          Heap Blocks: exact=13374 lossy=44328
                          Buffers: shared hit=3 read=218096
                          I/O Timings: read=51560.508
                          ->  Bitmap Index Scan on testtable_dmy_idx  (cost=0.00..525797.34 rows=7745982 width=0) (actual time=18148.218..18148.228 rows=7892123 loops=1)
                                Index Cond: ((day = 10) AND (month = 5) AND (year = 2020))
                                Buffers: shared hit=3 read=46389
                                I/O Timings: read=17368.250
Planning Time: 2.787 ms
Execution Time: 66783.481 ms
Total Time - Around 1 min

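I don't understand why I am seeing this behavior or what I am doing wrong. I expected Postgres to optimize the query automatically, but that is not happening.

Any help would be much appreciated.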

Edit 1:

More information about the RDS Postgres version:

SELECT version();

PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

Edit 2:

Running with SET max_parallel_workers_per_gather TO 0 (the default is 2, as shown by SHOW max_parallel_workers_per_gather):

-- Select Query with Sequential scan ON
QUERY PLAN
GroupAggregate  (cost=11515667.22..11799074.58 rows=5102790 width=89) (actual time=1120868.377..1133231.165 rows=491640 loops=1)
"  Group Key: col1, col2, col3, col4, col5, day, month, year"
"  Buffers: shared hit=92456 read=8400966, temp read=295993 written=296321"
  I/O Timings: read=1041723.362
  ->  Sort  (cost=11515667.22..11535032.17 rows=7745982 width=81) (actual time=1120865.304..1129419.809 rows=7892123 loops=1)
"        Sort Key: col1, col2, col3, col4, col5"
        Sort Method: external merge  Disk: 734304kB
"        Buffers: shared hit=92456 read=8400966, temp read=295993 written=296321"
        I/O Timings: read=1041723.362
        ->  Seq Scan on test_table  (cost=0.00..9888011.58 rows=7745982 width=81) (actual time=663266.269..1070560.993 rows=7892123 loops=1)
              Filter: ((day = 10) AND (month = 5) AND (year = 2020))
              Rows Removed by Filter: 72081131
              Buffers: shared hit=92450 read=8400966
              I/O Timings: read=1041723.362
Planning Time: 5.829 ms
Execution Time: 1133422.968 ms
Total Time - Around 18 mins

And then:

-- Select Query with Sequential scan OFF
QUERY PLAN
GroupAggregate  (cost=12190966.21..12474373.57 rows=5102790 width=89) (actual time=109048.306..119255.079 rows=491640 loops=1)
"  Group Key: col1, col2, col3, col4, col5, day, month, year"
"  Buffers: shared hit=3 read=218096, temp read=295993 written=296321"
  I/O Timings: read=55697.723
  ->  Sort  (cost=12190966.21..12210331.17 rows=7745982 width=81) (actual time=109047.621..115468.268 rows=7892123 loops=1)
"        Sort Key: col1, col2, col3, col4, col5"
        Sort Method: external merge  Disk: 734304kB
"        Buffers: shared hit=3 read=218096, temp read=295993 written=296321"
        I/O Timings: read=55697.723
        ->  Bitmap Heap Scan on test_table  (cost=527733.84..10563310.57 rows=7745982 width=81) (actual time=16941.764..62203.367 rows=7892123 loops=1)
              Recheck Cond: ((day = 10) AND (month = 5) AND (year = 2020))
              Rows Removed by Index Recheck: 4270
              Heap Blocks: exact=39970 lossy=131737
              Buffers: shared hit=3 read=218096
              I/O Timings: read=55697.723
              ->  Bitmap Index Scan on testtable_dmy_idx  (cost=0.00..525797.34 rows=7745982 width=0) (actual time=16933.964..16933.964 rows=7892123 loops=1)
                    Index Cond: ((day = 10) AND (month = 5) AND (year = 2020))
                    Buffers: shared hit=3 read=46389
                    I/O Timings: read=16154.294
Planning Time: 3.684 ms
Execution Time: 119440.147 ms
Total Time - Around 2 mins

Edit 3:

I checked the number of inserts, updates, deletes, live tuples and dead tuples with the following query:

SELECT n_tup_ins as "inserts",n_tup_upd as "updates",n_tup_del as "deletes", n_live_tup as "live_tuples", n_dead_tup as "dead_tuples"
FROM pg_stat_user_tables
where relname = 'test_table';

which gave the following result:

| inserts     | updates | deletes   | live_tuples | dead_tuples |
|-------------|---------|-----------|-------------|-------------|
| 296590964   | 0       | 412400995 | 79717032    | 7589442     |

I then ran the following command:

VACUUM (VERBOSE, ANALYZE) test_table

which produced the following output:

[2020-05-15 18:34:08] [00000] vacuuming "public.test_table"
[2020-05-15 18:37:13] [00000] scanned index "testtable_dmy_idx" to remove 7573896 row versions
[2020-05-15 18:37:56] [00000] scanned index "testtable_unixts_idx" to remove 7573896 row versions
[2020-05-15 18:38:16] [00000] "test_table": removed 7573896 row versions in 166450 pages
[2020-05-15 18:38:16] [00000] index "testtable_dmy_idx" now contains 79973254 row versions in 1103313 pages
[2020-05-15 18:38:16] [00000] index "testtable_unixts_idx" now contains 79973254 row versions in 318288 pages
[2020-05-15 18:38:16] [00000] "test_table": found 99 removable, 2196653 nonremovable row versions in 212987 out of 8493416 pages
[2020-05-15 18:38:16] [00000] vacuuming "pg_toast.pg_toast_25023"
[2020-05-15 18:38:16] [00000] index "pg_toast_25023_index" now contains 0 row versions in 1 pages
[2020-05-15 18:38:16] [00000] "pg_toast_25023": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
[2020-05-15 18:38:16] [00000] analyzing "public.test_table"
[2020-05-15 18:38:27] [00000] "test_table": scanned 30000 of 8493416 pages, containing 282611 live rows and 0 dead rows; 30000 rows in sample, 80011093 estimated total rows
[2020-05-15 18:38:27] completed in 4 m 19 s 58 ms

After that, the same statistics query returns:

| inserts   | updates | deletes   | live_tuples | dead_tuples |
|-----------|---------|-----------|-------------|-------------|
| 296590964 | 0       | 412400995 | 80011093    | 0           |

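【Comments】:

Have you run ANALYZE and/or VACUUM on the table? What is the value of work_mem?
Parallel plans are harder to interpret than non-parallel ones. Could you repeat this with max_parallel_workers_per_gather=0? Hopefully anything we learn there will carry over to the parallel case.
Did you run these plans repeatedly, alternating between them, to rule out caching effects?
@jjanes I added the version info in an edit, but here it is as well: PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
Splitting the date into three {year,month,day} columns gives you three low-cardinality columns, on which an index has hardly any effect. (A complication might be that the three columns could be nullable, which could cause another level of disaster.)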

Tags: sql postgresql performance amazon-rds

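【Answer 1】:

In general, and for your query in particular, COUNT(*) and SUM(...) in a GROUP BY query tend to be performance killers. The reason is that, to get the count and sum for each multi-column group, Postgres has to visit the representation of every record in the index. Postgres therefore cannot logically eliminate any records, and it tends not to use an index in this situation.

One scenario in which a GROUP BY query can use an index is when the query has a HAVING clause that uses the MIN or MAX of some column. Also, if your query has a WHERE clause, an index may be usable there. Your current query, however, cannot be optimized much.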

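【Comments】:

Hey Tim, if that's the case, why doesn't Postgres pick the more efficient way to run the query? Why do I have to turn off enable_seqscan before running the query?
In this case the index scan is an optimization; the planner just didn't choose it.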

【Answer 2】:

   Rows Removed by Filter: 24027044
   Buffers: shared hit=76855 read=8416561 dirtied=33686
   I/O Timings: read=3178066.180

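In the seq scan, a lot of buffers are being dirtied. I'm guessing you haven't vacuumed your table enough recently. Or autovac is falling behind because you accepted the default settings, which (until v12) are far too slow for most modern dedicated systems.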

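Also, 24027044 / 8416561 = about 2.85 rows per page. That is an extremely low number. Are your tuples extremely wide? Is your table extremely bloated? Neither of these answers your question, though, because the planner should know about them and take them into account. But we might need to know them to figure out where the planner is going wrong. (These calculations might be off, because I don't know which numbers are pro-rated across the workers and which are not, but I don't think a factor of 3 changes the conclusion that something here is odd.)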
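The per-worker uncertainty mentioned above can be sidestepped by redoing the arithmetic on the non-parallel plan from Edit 2 of the question, where nothing is averaged over loops. The sketch below only re-uses figures printed in that plan. The 8 kB block size is an assumption (it is the PostgreSQL default), and the variable names are purely illustrative.

```python
# Figures copied from the non-parallel "Seq Scan" node in Edit 2 of the question.
rows_returned = 7_892_123    # rows=... on the Seq Scan node
rows_removed = 72_081_131    # Rows Removed by Filter
pages_hit = 92_450           # Buffers: shared hit
pages_read = 8_400_966       # Buffers: shared read

total_rows = rows_returned + rows_removed
total_pages = pages_hit + pages_read
rows_per_page = total_rows / total_pages

# Assumption: default 8 kB PostgreSQL block size.
BLOCK_SIZE = 8192
bytes_per_row = BLOCK_SIZE / rows_per_page

print(f"total rows:    {total_rows}")
print(f"total pages:   {total_pages}")
print(f"rows per page: {rows_per_page:.2f}")
print(f"page space per row (bytes, approx.): {bytes_per_row:.0f}")
```

Both totals agree with the VACUUM VERBOSE output in Edit 3 (79973254 row versions, 8493416 pages). That suggests the row counts in the parallel plan are per-process averages while its buffer counts are totals. On these numbers the density comes out at roughly 9.4 rows per page rather than 2.85, which is still fairly sparse.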

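8416561 * 1024 * 8 / 3178.066 / 1024 / 1024 = 20 MB/s. That seems rather low. What IO settings have you provisioned on your RDS "hardware"? Your seq_page_cost and random_page_cost settings may be wrong for your actual IO capacity. (Although correcting them may not help much; see below.)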
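The same throughput estimate can be written out as a small sketch and applied to both seq scan plans in the question. It assumes the default 8 kB block size. It also assumes that "I/O Timings: read" in the parallel plan is summed over the leader and both workers, which the plan's numbers suggest but which is not confirmed here. The helper name `mib_per_second` is made up for illustration.

```python
BLOCK_SIZE = 8192        # assumption: default 8 kB block size
MIB = 1024 * 1024

def mib_per_second(pages_read: int, io_read_ms: float) -> float:
    """Throughput implied by a node's 'read' buffers and 'I/O Timings: read'."""
    return pages_read * BLOCK_SIZE / (io_read_ms / 1000.0) / MIB

# Parallel Seq Scan node: I/O time appears to be summed over 3 processes.
parallel = mib_per_second(8_416_561, 3_178_066.180)

# Non-parallel Seq Scan node from Edit 2: a single process did all the reads.
serial = mib_per_second(8_400_966, 1_041_723.362)

print(f"parallel plan: {parallel:.1f} MiB/s per unit of summed I/O time")
print(f"serial plan:   {serial:.1f} MiB/s")
```

If the summing assumption holds, the parallel figure is a per-process rate, and the single-process plan gives the better picture of what the volume delivered: roughly three times the 20 MB/s figure.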

For your bitmap heap scan:

Heap Blocks: exact=13374 lossy=44328
Buffers: shared hit=3 read=218096

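It looks like all the qualifying tuples are concentrated in a very small number of blocks (compared to the overall table size shown by the seq scan). I think the planner does not adequately take this into account for bitmap scans. There is a patch out there for this, but it missed the deadline for v13. (If no one goes and reviews it, it might miss the deadline for v14 as well, nudge nudge.) Basically, the planner knows that the "day" column is highly correlated with the physical order of the table, and it uses that knowledge to conclude that the bitmap heap scan will be almost entirely sequential IO. But it fails to also infer that it will scan only a small fraction of the table. This issue makes the bitmap scan look just like the seq scan, but with an extra layer of overhead (consulting the index), so it is no surprise that it does not get used.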
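To put numbers on "a small fraction of the table" and on how close the two cost estimates are, here is a sketch using only figures from the non-parallel plans in Edit 2 and the VACUUM VERBOSE output in Edit 3. The variable names are illustrative and nothing here is measured independently.

```python
# Bitmap Heap Scan node, non-parallel plan (Edit 2).
exact_blocks = 39_970
lossy_blocks = 131_737
heap_blocks_visited = exact_blocks + lossy_blocks

# Cross-check: buffers touched by the heap scan minus those of its index child.
heap_buffers = (3 + 218_096) - (3 + 46_389)

table_pages = 8_493_416                  # from VACUUM VERBOSE (Edit 3)
heap_fraction = heap_blocks_visited / table_pages
lossy_share = lossy_blocks / heap_blocks_visited

# Planner cost estimate vs. measured time for the two scan nodes (Edit 2).
seq_cost, seq_ms = 9_888_011.58, 1_070_560.993
bitmap_cost, bitmap_ms = 10_563_310.57, 62_203.367
cost_ratio = bitmap_cost / seq_cost      # > 1: planner thinks bitmap is dearer
time_ratio = seq_ms / bitmap_ms          # > 1: seq scan was actually slower

print(f"heap pages visited: {heap_blocks_visited} ({heap_fraction:.2%} of table)")
print(f"lossy share of visited pages: {lossy_share:.1%}")
print(f"estimated cost, bitmap / seq: {cost_ratio:.3f}")
print(f"measured time, seq / bitmap:  {time_ratio:.1f}x")
```

On these figures the bitmap scan reads about 2% of the heap, yet its estimated cost is about 7% higher than the seq scan's, while the measured scan time is roughly 17 times lower. The large lossy share also indicates that the bitmap did not fit in work_mem, which ties in with the earlier comment asking for its value.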

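【Comments】:

Thank you so much for the answer, it makes a lot of sense now. I've added the analysis with max_parallel_workers_per_gather=0. The time taken is more or less the same.
@jjanes The OP appears to have the index on {day,month,year}; it probably should be {year,month,day}.
@wildplasser Yes. I assumed they were more interested in periodicity than chronology. The query at hand doesn't suggest that, but we can't see all their other queries. Do you think reordering would make a difference?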