[Posted]: 2020-01-02 12:24:24
[Question]:
I have a booking table and a customer table with the following schemas:
Booking table:
Table "public.booking"
Column | Type | Collation | Nullable | Default
-----------------------+--------------------------+-----------+----------+---------
deleted | boolean | | |
booking_id | character varying | | not null |
reference_number | character varying | | |
checkin_date | timestamp with time zone | | |
checkout_date | timestamp with time zone | | |
status | character varying | | |
version | integer | | not null |
comments | text | | |
extra_information | json | | |
cancellation_reason | character varying | | |
cancellation_datetime | timestamp with time zone | | |
created_at | timestamp with time zone | | not null | now()
modified_at | timestamp with time zone | | not null | now()
Indexes:
"booking_pkey" PRIMARY KEY, btree (booking_id)
"ix_booking_reference_number" UNIQUE, btree (reference_number)
"idx_booking_sort_checkin" btree (checkin_date, created_at)
"idx_booking_sort_checkout" btree (checkout_date, created_at)
"idx_booking_stay_dates" btree (checkin_date, checkout_date DESC)
"ix_booking_deleted" btree (deleted)
"ix_booking_status" btree (status)
"trgm_booking_ref_num" gist (reference_number gist_trgm_ops)
Customer table:
Table "public.booking_customer"
Column | Type | Collation | Nullable | Default
-----------------------+--------------------------+-----------+----------+---------
deleted | boolean | | |
customer_id | character varying | | not null |
booking_id | character varying | | not null |
first_name | character varying | | |
last_name | character varying | | |
phone | character varying | | |
email | character varying | | |
created_at | timestamp with time zone | | not null | now()
modified_at | timestamp with time zone | | not null | now()
Indexes:
"booking_customer_pkey" PRIMARY KEY, btree (customer_id, booking_id)
"book_cust_idx" btree (booking_id, customer_id)
"idx_booking_customer_full_name" btree (((first_name::text || ' '::text) || last_name::text))
"ix_booking_customer_deleted" btree (deleted)
"ix_booking_customer_email" btree (email)
"ix_booking_customer_first_name" btree (first_name)
"ix_booking_customer_last_name" btree (last_name)
"ix_booking_customer_phone" btree (phone)
"trgm_cust_first_name" gist (first_name gist_trgm_ops)
"trgm_cust_full_name" gist (((first_name::text || ' '::text) || last_name::text) gist_trgm_ops)
"trgm_cust_last_name" gist (last_name gist_trgm_ops)
I am running the following query:
EXPLAIN ANALYZE
SELECT bk.booking_id, bk.created_at, bk.checkin_date
FROM booking bk
WHERE bk.reference_number = '9123889123' OR
EXISTS (
SELECT 1 FROM booking_customer cust
WHERE cust.booking_id = bk.booking_id AND (
cust.email = '9123889123' OR
cust.phone = '9123889123'
) AND
cust.deleted = false
)
ORDER BY bk.checkin_date DESC, bk.created_at DESC
LIMIT 10 OFFSET 0;
It produces the following query plan:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..365.54 rows=10 width=31) (actual time=57.861..865.883 rows=3 loops=1)
-> Index Scan Backward using idx_booking_sort_checkin on booking bk (cost=0.42..14419601.66 rows=394937 width=31) (actual time=57.858..865.877 rows=3 loops=1)
Filter: (((reference_number)::text = '9916092871'::text) OR (alternatives: SubPlan 1 or hashed SubPlan 2))
Rows Removed by Filter: 676681
SubPlan 1
-> Bitmap Heap Scan on booking_customer cust (cost=14.08..18.10 rows=1 width=0) (never executed)
Recheck Cond: (((booking_id)::text = (bk.booking_id)::text) AND (((email)::text = '9916092871'::text) OR ((phone)::text = '9916092871'::text)))
Filter: (NOT deleted)
-> BitmapAnd (cost=14.08..14.08 rows=1 width=0) (never executed)
-> Bitmap Index Scan on book_cust_idx (cost=0.00..4.49 rows=8 width=0) (never executed)
Index Cond: ((booking_id)::text = (bk.booking_id)::text)
-> BitmapOr (cost=9.34..9.34 rows=65 width=0) (never executed)
-> Bitmap Index Scan on ix_booking_customer_email (cost=0.00..4.67 rows=33 width=0) (never executed)
Index Cond: ((email)::text = '9916092871'::text)
-> Bitmap Index Scan on ix_booking_customer_phone (cost=0.00..4.67 rows=32 width=0) (never executed)
Index Cond: ((phone)::text = '9916092871'::text)
SubPlan 2
-> Bitmap Heap Scan on booking_customer cust_1 (cost=9.38..264.83 rows=65 width=32) (actual time=0.047..0.050 rows=3 loops=1)
Recheck Cond: (((email)::text = '9916092871'::text) OR ((phone)::text = '9916092871'::text))
Filter: (NOT deleted)
Heap Blocks: exact=3
-> BitmapOr (cost=9.38..9.38 rows=65 width=0) (actual time=0.042..0.042 rows=0 loops=1)
-> Bitmap Index Scan on ix_booking_customer_email (cost=0.00..4.67 rows=33 width=0) (actual time=0.019..0.019 rows=0 loops=1)
Index Cond: ((email)::text = '9916092871'::text)
-> Bitmap Index Scan on ix_booking_customer_phone (cost=0.00..4.67 rows=32 width=0) (actual time=0.023..0.023 rows=3 loops=1)
Index Cond: ((phone)::text = '9916092871'::text)
Planning time: 0.782 ms
Execution time: 865.956 ms
(28 rows)
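As the plan shows, Postgres evaluates the reference_number and booking_id conditions as a Filter predicate, even though I have indexes on both columns.
However, if I remove the OR condition from the WHERE clause, it starts using the index.
For this query: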
EXPLAIN ANALYZE
SELECT bk.booking_id, bk.created_at, bk.checkin_date
FROM booking bk
WHERE bk.reference_number = '9123889123'
ORDER BY bk.checkin_date DESC, bk.created_at DESC
LIMIT 10 OFFSET 0;
the query plan looks like this:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=8.45..8.46 rows=1 width=31) (actual time=0.021..0.021 rows=0 loops=1)
-> Sort (cost=8.45..8.46 rows=1 width=31) (actual time=0.020..0.020 rows=0 loops=1)
Sort Key: checkin_date DESC, created_at DESC
Sort Method: quicksort Memory: 25kB
-> Index Scan using ix_booking_reference_number on booking bk (cost=0.42..8.44 rows=1 width=31) (actual time=0.014..0.014 rows=0 loops=1)
Index Cond: ((reference_number)::text = '9123889123'::text)
Planning time: 0.334 ms
Execution time: 0.042 ms
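I can't understand why the behaviour changes. Both reference_number and booking_id have unique indexes. Also, what are the two SubPlans in the first query? Do they affect query performance as well?
I created the gist index on reference_number so that the LIKE queries I run elsewhere can use an index.
Is there anything I can change to improve the query's performance?
I have 500,000 records in the booking table and 2 million records in the customer table.
[Discussion]:
Tags: database postgresql performance indexing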
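One rewrite that might be worth testing is to split the OR into two branches and combine them with UNION, so that each branch can be planned on its own. This is only a sketch and has not been run against this data. It assumes the planner would then use ix_booking_reference_number for the first branch and ix_booking_customer_email / ix_booking_customer_phone for the second, which would have to be confirmed with EXPLAIN ANALYZE. The derived-table alias matches is a made-up name.

```sql
-- Sketch only: same filter as the original query, with the OR split into
-- two branches so that neither forces a scan of the whole booking table.
SELECT booking_id, created_at, checkin_date
FROM (
    -- Branch 1: bookings matched directly by reference number.
    SELECT bk.booking_id, bk.created_at, bk.checkin_date
    FROM booking bk
    WHERE bk.reference_number = '9123889123'

    UNION

    -- Branch 2: bookings that have a non-deleted customer with a
    -- matching email or phone.
    SELECT bk.booking_id, bk.created_at, bk.checkin_date
    FROM booking bk
    JOIN booking_customer cust ON cust.booking_id = bk.booking_id
    WHERE (cust.email = '9123889123' OR cust.phone = '9123889123')
      AND cust.deleted = false
) AS matches  -- hypothetical alias
ORDER BY checkin_date DESC, created_at DESC
LIMIT 10 OFFSET 0;
```

UNION (rather than UNION ALL) removes the duplicates that appear when a booking matches both branches or has several matching customers, so this should return the same rows as the original query.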