【发布时间】:2018-03-21 07:25:32
【问题描述】:
我有一张桌子
CREATE TABLE timedevent
(
id bigint NOT NULL,
eventdate timestamp with time zone NOT NULL,
newstateids character varying(255) NOT NULL,
sourceid character varying(255) NOT NULL,
CONSTRAINT timedevent_pkey PRIMARY KEY (id)
) WITH (OIDS=FALSE);
与PKid。
我必须从一组可能的来源中查询具有特定新状态和来源的两个日期之间的行。
我在eventdate 和newstateids 上创建了一个btree 索引,并在sourceid 上创建了一个(哈希索引)。只有date 上的索引使查询更快 - 似乎其他两个没有使用。为什么呢?我怎样才能使我的查询更快?
CREATE INDEX eventdate_index ON timedevent USING btree (eventdate);
CREATE INDEX newstateids_index ON timedevent USING btree (newstateids COLLATE pg_catalog."default");
CREATE INDEX sourceid_index_hash ON timedevent USING hash (sourceid COLLATE pg_catalog."default");
这是 Hibernate 生成的查询:
select this_.id as id1_0_0_, this_.description as descript2_0_0_, this_.eventDate as eventDat3_0_0_, this_.locationId as location4_0_0_, this_.newStateIds as newState5_0_0_, this_.oldStateIds as oldState6_0_0_, this_.sourceId as sourceId7_0_0_
from TimedEvent this_
where ((this_.newStateIds=? and this_.sourceId in (?, ?, ?, ?, ?, ?)))
and this_.eventDate between ? and ?
limit ?
编辑:
抱歉标题具有误导性,但似乎 postges 使用了所有索引。问题是我的查询时间仍然保持不变。这是我得到的查询计划:
Limit (cost=25130.29..33155.77 rows=321 width=161) (actual time=705.330..706.744 rows=279 loops=1)
Buffers: shared hit=6 read=8167 written=61
-> Bitmap Heap Scan on timedevent this_ (cost=25130.29..33155.77 rows=321 width=161) (actual time=705.330..706.728 rows=279 loops=1)
Recheck Cond: (((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Zimmer" (...)
Filter: ((eventdate >= '2017-11-01 15:41:00+01'::timestamp with time zone) AND (eventdate <= '2018-03-20 14:58:16.724+01'::timestamp with time zone))
Buffers: shared hit=6 read=8167 written=61
-> BitmapAnd (cost=25130.29..25130.29 rows=2122 width=0) (actual time=232.990..232.990 rows=0 loops=1)
Buffers: shared hit=6 read=2152
-> Bitmap Index Scan on sourceid_index_hash (cost=0.00..1403.36 rows=39182 width=0) (actual time=1.195..1.195 rows=9308 loops=1)
Index Cond: ((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Z (...)
Buffers: shared hit=6 read=26
-> Bitmap Index Scan on state_index (cost=0.00..23726.53 rows=777463 width=0) (actual time=231.160..231.160 rows=776520 loops=1)
Index Cond: ((newstateids)::text = 'ACTIV'::text)
Buffers: shared read=2126
Total runtime: 706.804 ms
按照 a_horse_with_no_name 的建议使用 btree on (sourceid, newstateids) 创建索引后,成本降低了:
Limit (cost=125.03..8150.52 rows=321 width=161) (actual time=13.611..14.454 rows=279 loops=1)
Buffers: shared hit=18 read=4336
-> Bitmap Heap Scan on timedevent this_ (cost=125.03..8150.52 rows=321 width=161) (actual time=13.609..14.432 rows=279 loops=1)
Recheck Cond: (((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Zimmer","r (...)
Filter: ((eventdate >= '2017-11-01 15:41:00+01'::timestamp with time zone) AND (eventdate <= '2018-03-20 14:58:16.724+01'::timestamp with time zone))
Buffers: shared hit=18 read=4336
-> Bitmap Index Scan on src_state_index (cost=0.00..124.95 rows=2122 width=0) (actual time=0.864..0.864 rows=4526 loops=1)
Index Cond: (((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Zimmer (...)
Buffers: shared hit=18 read=44
Total runtime: 14.497 ms"
【问题讨论】:
-
请edit您的问题并添加您使用的确切
create index语句和使用explain (analyze, buffers)生成的执行计划。 Formatted text 请no screen shots -
@zerkms:这不是真的。
-
Postgres 确实有可能使用例如
IN条件的位图索引扫描和between条件的另一个索引。这是否有意义以及计划者是否真正选择这样做取决于许多不同的因素。但是“只使用一个索引”的一般说法是错误的。 -
(newstateids, sourceid)上的单个(btree)索引可能是比两个单列索引更好的选择 -
@zerkms: 是的,如果这意味着创建更少或更小的索引(例如,因为这些索引可以用于更多查询,而不仅仅是一个)
标签: postgresql indexing postgresql-performance