Primary key scanning in partitioned table: answers

【Question title】: Primary key scanning in partitioned table
【Posted】: 2021-07-03 05:50:22
【Question description】:

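I have a very large table that I need to partition by date (in my case via a trigger). The problem is that I can fetch data very quickly with a timestamp filter, but I cannot get good performance when fetching a specific row by its primary key.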

The parent table is:

CREATE TABLE parent_table (
    guid uuid NOT NULL DEFAULT uuid_generate_v4(), -- This is gonna be the primary key
    tm timestamptz NOT NULL, -- Timestamp, on which partitions are based
    value int4 not null default -1, -- Just a value
    CONSTRAINT z_detections_pk PRIMARY KEY (guid)
);
CREATE INDEX parent_table_tm_idx ON parent_table USING btree (tm DESC);

I created a simple trigger that creates a new partition whenever a row arrives for a new date:

CREATE OR REPLACE FUNCTION parent_table_insert_fn()
 RETURNS trigger
 LANGUAGE plpgsql
AS $function$
DECLARE
    schema_name           varchar(255)        := 'public';
    table_master          varchar(255)        := 'parent_table';
    table_part            varchar(255)        := '';
    table_date_underscore varchar(255)        := '';
    constraint_tm_start timestamp with time zone;
    constraint_tm_end timestamp with time zone;
BEGIN
    table_part := table_master || '_' || to_char(timezone('utc', new.tm), 'YYYY_MM_DD');
    table_date_underscore := '' || to_char(timezone('utc', new.tm), 'YYYY_MM_DD');
    PERFORM
        1
    from
        information_schema.tables 
    WHERE 
      table_schema = schema_name
      AND table_name = table_part
    limit 1;
    IF NOT FOUND
    then
        constraint_tm_start := to_char(timezone('utc', new.tm), 'YYYY-MM-DD')::timestamp at time zone 'utc';
        constraint_tm_end := constraint_tm_start + interval '1 day';
        execute '
          CREATE TABLE ' || schema_name || '.' || table_part || ' (
              CONSTRAINT parent_table_' || table_date_underscore || '_pk PRIMARY KEY (guid),
              CONSTRAINT parent_table_' || table_date_underscore || '_ck CHECK ( tm >= ' || QUOTE_LITERAL(constraint_tm_start) || ' and tm < ' || QUOTE_LITERAL(constraint_tm_end) || ' )
          ) INHERITS (' || schema_name || '.' || table_master || ');
          CREATE INDEX parent_table_' || table_date_underscore || '_tidx ON ' || schema_name || '.' || table_part || ' USING btree (tm desc);
        ';
    END IF;
    execute '
        INSERT INTO ' || schema_name || '.' || table_part || '
        SELECT ( (' || QUOTE_LITERAL(NEW) || ')::' || schema_name || '.' || TG_RELNAME || ' ).*;';   
    RETURN NULL;
END;
$function$
;

Enable the trigger on the parent table:

create trigger parent_table_insert_fn_trigger before insert
on parent_table for each row execute function parent_table_insert_fn();

And insert some data into it:

insert into parent_table(guid, tm, value)
values
('1f4835c0-2b22-4cfc-ab3c-940af679ace6', '2021-04-06 14:00:00+03:00', 1),
('5ca37d57-e79e-4e1f-ace7-91eb671f3a82', '2021-04-07 15:30:00+03:00', 2),
('b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808', '2021-04-07 17:10:00+03:00', 3),
('ad69cd35-5b20-466f-9d5c-61fa5d41bc5f', '2021-04-08 16:50:00+03:00', 66),
('bb0ec87a-72bb-438e-8f4c-2cdc3ae7d525', '2021-03-21 19:00:00+03:00', -10);

After these operations I end up with the parent table plus 4 partition tables:

parent_table
parent_table_2021_03_21
parent_table_2021_04_06
parent_table_2021_04_07
parent_table_2021_04_08
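
As a side note, the list above can be checked from the catalog. The following is a minimal sketch, assuming the parent lives in the `public` schema as in the trigger function; `pg_inherits` is the system catalog that records inheritance relationships.

```sql
-- List every child table that inherits from parent_table.
-- Assumption: the parent is public.parent_table, as in the trigger above.
SELECT inhrelid::regclass AS partition_name
FROM pg_inherits
WHERE inhparent = 'public.parent_table'::regclass
ORDER BY inhrelid::regclass::text;
```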

Check that the index is used for a timestamp filter:

explain analyze
select * from parent_table where tm > '2021-04-07 10:00:00+03:00' and tm <= '2021-04-07 16:30:00+03:00';

> > >
Append  (cost=0.00..14.43 rows=8 width=28) (actual time=0.017..0.020 rows=1 loops=1)
  ->  Seq Scan on parent_table parent_table_1  (cost=0.00..0.00 rows=1 width=28) (actual time=0.002..0.002 rows=0 loops=1)
        Filter: ((tm > '2021-04-07 10:00:00+03'::timestamp with time zone) AND (tm <= '2021-04-07 16:30:00+03'::timestamp with time zone))
  ->  Bitmap Heap Scan on parent_table_2021_04_07 parent_table_2  (cost=4.22..14.39 rows=7 width=28) (actual time=0.013..0.015 rows=1 loops=1)
        Recheck Cond: ((tm > '2021-04-07 10:00:00+03'::timestamp with time zone) AND (tm <= '2021-04-07 16:30:00+03'::timestamp with time zone))
        Heap Blocks: exact=1
        ->  Bitmap Index Scan on parent_table_2021_04_07_tidx  (cost=0.00..4.22 rows=7 width=0) (actual time=0.008..0.008 rows=1 loops=1)
              Index Cond: ((tm > '2021-04-07 10:00:00+03'::timestamp with time zone) AND (tm <= '2021-04-07 16:30:00+03'::timestamp with time zone))
Planning Time: 0.381 ms
Execution Time: 0.053 ms

This is fine and works as I expected.

But selecting by a specific primary key gives me the following EXPLAIN ANALYZE output:

explain analyze
select * from parent_table where guid = 'b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808';

> > >
Append  (cost=0.00..32.70 rows=5 width=28) (actual time=0.021..0.035 rows=1 loops=1)
  ->  Seq Scan on parent_table parent_table_1  (cost=0.00..0.00 rows=1 width=28) (actual time=0.003..0.004 rows=0 loops=1)
        Filter: (guid = 'b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808'::uuid)
  ->  Index Scan using parent_table_2021_04_06_pk on parent_table_2021_04_06 parent_table_2  (cost=0.15..8.17 rows=1 width=28) (actual time=0.008..0.008 rows=0 loops=1)
        Index Cond: (guid = 'b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808'::uuid)
  ->  Index Scan using parent_table_2021_04_07_pk on parent_table_2021_04_07 parent_table_3  (cost=0.15..8.17 rows=1 width=28) (actual time=0.008..0.009 rows=1 loops=1)
        Index Cond: (guid = 'b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808'::uuid)
  ->  Index Scan using parent_table_2021_04_08_pk on parent_table_2021_04_08 parent_table_4  (cost=0.15..8.17 rows=1 width=28) (actual time=0.004..0.004 rows=0 loops=1)
        Index Cond: (guid = 'b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808'::uuid)
  ->  Index Scan using parent_table_2021_03_21_pk on parent_table_2021_03_21 parent_table_5  (cost=0.15..8.17 rows=1 width=28) (actual time=0.006..0.006 rows=0 loops=1)
        Index Cond: (guid = 'b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808'::uuid)
Planning Time: 0.345 ms
Execution Time: 0.076 ms

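This query gives me poor performance (I guess?), especially on very large partitioned tables, for example with 10M+ rows per partition.

So my question is: what should I do to avoid scanning every partition for a simple primary key lookup?

Note: I am using PostgreSQL 13.1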

Update 2021-04-07 15:22+03:00: on a semi-production table I get these results:

Timestamp filter

Append  (cost=0.00..809.35 rows=16616 width=32) (actual time=0.037..5.612 rows=16865 loops=1)
  ->  Seq Scan on wifi_logs t_1  (cost=0.00..0.00 rows=1 width=32) (actual time=0.010..0.011 rows=0 loops=1)
        Filter: ((tm >= '2020-04-07 14:00:00+03'::timestamp with time zone) AND (tm <= '2020-04-07 17:00:00+03'::timestamp with time zone))
  ->  Index Scan using wifi_logs_tm_idx_2020_04_07 on wifi_logs_2020_04_07 t_2  (cost=0.29..726.27 rows=16615 width=32) (actual time=0.026..4.655 rows=16865 loops=1)
        Index Cond: ((tm >= '2020-04-07 14:00:00+03'::timestamp with time zone) AND (tm <= '2020-04-07 17:00:00+03'::timestamp with time zone))
Planning Time: 14.869 ms
Execution Time: 6.151 ms

GUID (primary key filter)

  ->  Seq Scan on wifi_logs t_1  (cost=0.00..0.00 rows=1 width=32) (actual time=0.015..0.016 rows=0 loops=1)
        Filter: (guid = '78bc5537-4f2f-4e83-8abd-4241ac3f9f27'::uuid)
  ->  Seq Scan on wifi_logs_2014_12_04 t_4  (cost=0.00..1.01 rows=1 width=32) (actual time=0.006..0.006 rows=0 loops=1)
        Filter: (guid = '78bc5537-4f2f-4e83-8abd-4241ac3f9f27'::uuid)
        Rows Removed by Filter: 1
  --
  -- TONS OF PARTITION TABLE SCANS
  ---
  ->  Index Scan using wifi_logs_2021_03_18_pk on wifi_logs_2021_03_18 t_387  (cost=0.42..8.44 rows=1 width=32) (actual time=0.011..0.011 rows=0 loops=1)
        Index Cond: (guid = '78bc5537-4f2f-4e83-8abd-4241ac3f9f27'::uuid)
  ->  Seq Scan on wifi_logs_1970_01_01 t_388  (cost=0.00..3.60 rows=1 width=32) (actual time=0.020..0.020 rows=0 loops=1)
        Filter: (guid = '78bc5537-4f2f-4e83-8abd-4241ac3f9f27'::uuid)
        Rows Removed by Filter: 119
  ->  Index Scan using wifi_logs_2021_03_19_pk on wifi_logs_2021_03_19 t_389  (cost=0.42..8.44 rows=1 width=32) (actual time=0.012..0.012 rows=0 loops=1)
        Index Cond: (guid = '78bc5537-4f2f-4e83-8abd-4241ac3f9f27'::uuid)
  --
  -- ANOTHER TONS OF PARTITION TABLE SCANS
  ---
  ->  Index Scan using wifi_logs_2021_04_07_pk on wifi_logs_2021_04_07 t_408  (cost=0.42..8.44 rows=1 width=32) (actual time=0.010..0.010 rows=0 loops=1)
        Index Cond: (guid = '78bc5537-4f2f-4e83-8abd-4241ac3f9f27'::uuid)
Planning Time: 97.662 ms
Execution Time: 36.756 ms

【Question comments】:

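You should use native partitioning in Postgres, which is much faster than inheritance-based partitioning. But regardless: if your query does not include the partition key, it will always be slower than doing the same thing on a non-partitioned table.
Execution Time: 0.076 ms. What kind of performance are you after?
@FrankHeikens If I have 1500+ partitions, the actual query gets very slow (not as fast as querying a single large table without those partitions). upd: and 0.076 ms is still more than the 0.053 ms I get for the more complex condition (the timestamp filter)
@a_horse_with_no_name I'll update the question
@a_horse_with_no_name Updated the question: I just added sample output from the semi-production database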
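
For reference, the native (declarative) partitioning suggested in the first comment could look roughly like the sketch below. The table names `parent_table_native` and `parent_table_native_2021_04_07` are hypothetical, and `uuid_generate_v4()` assumes the `uuid-ossp` extension is installed, as in the question. Note that on a declaratively partitioned table a primary key has to include the partition key, so the key becomes `(guid, tm)`. A lookup by `guid` alone would still have to probe every partition.

```sql
-- Sketch only: hypothetical table names, daily range partitions on tm.
CREATE TABLE parent_table_native (
    guid  uuid        NOT NULL DEFAULT uuid_generate_v4(),
    tm    timestamptz NOT NULL,
    value int4        NOT NULL DEFAULT -1,
    -- The partition key (tm) must be part of the primary key.
    PRIMARY KEY (guid, tm)
) PARTITION BY RANGE (tm);

-- One partition per UTC day; the upper bound is exclusive.
CREATE TABLE parent_table_native_2021_04_07
    PARTITION OF parent_table_native
    FOR VALUES FROM ('2021-04-07 00:00:00+00') TO ('2021-04-08 00:00:00+00');

CREATE INDEX ON parent_table_native (tm DESC);
```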

Tags: sql postgresql indexing plpgsql

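【Solution 1】:

This is normal, and there is no way around it other than to:

- create fewer partitions, so that fewer partitions have to be scanned
- add a condition on tm to the query, so that not all partitions are scanned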
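
The second option can be sketched against the sample data from the question. This assumes the caller knows, at least roughly, when the row was written; the UTC day bounds below are chosen by hand to match the sample row and are not part of the original post. With the per-partition CHECK constraints in place, the planner should be able to skip children whose range cannot match.

```sql
-- Primary key lookup narrowed by the partition key (tm).
-- Assumption: the row is known to fall on 2021-04-07 (UTC).
EXPLAIN ANALYZE
SELECT *
FROM parent_table
WHERE guid = 'b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808'
  AND tm >= '2021-04-07 00:00:00+00'
  AND tm <  '2021-04-08 00:00:00+00';
```

If the timestamp is not available to the caller, this option does not apply and every partition still has to be probed.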

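You will notice that the planning time greatly exceeds the query execution time. To help with that, you can:

- create fewer partitions, so the optimizer has less work to do
- use prepared statements to avoid the planning work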
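
A minimal sketch of the prepared-statement option follows. The statement name `guid_lookup` is made up for illustration. By default PostgreSQL may still build custom plans for the first few executions before settling on a cached generic plan; the `plan_cache_mode` setting (available since PostgreSQL 12) can force a generic plan, and whether that helps here is something to measure rather than assume.

```sql
-- Prepare once per session; later executions can reuse the cached plan.
PREPARE guid_lookup(uuid) AS
    SELECT * FROM parent_table WHERE guid = $1;

-- Optional: always use the generic plan instead of re-planning.
SET plan_cache_mode = force_generic_plan;

EXECUTE guid_lookup('b57bfbf6-7ed0-4dde-a40b-9fa2e6f24808');

-- Clean up when the statement is no longer needed.
DEALLOCATE guid_lookup;
```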

【Comments】: