Why is only one index used?

Asked: 2018-03-21 07:25:33

Tags: postgresql indexing postgresql-performance

I have a table:

CREATE TABLE timedevent
(
  id bigint NOT NULL,
  eventdate timestamp with time zone NOT NULL,
  newstateids character varying(255) NOT NULL,
  sourceid character varying(255) NOT NULL,
  CONSTRAINT timedevent_pkey PRIMARY KEY (id)
) WITH (OIDS=FALSE);

with id as the primary key.

I need to query for events between two dates that have a certain new state and whose source is one of a set of possible sources.

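I created btree indexes on eventdate and on newstateids, and a hash index on sourceid. Only the index on the date makes the query faster; the other two seem not to be used. Why is that? How can I make the query faster?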

CREATE INDEX eventdate_index     ON timedevent USING btree (eventdate);
CREATE INDEX newstateids_index   ON timedevent USING btree (newstateids COLLATE pg_catalog."default");
CREATE INDEX sourceid_index_hash ON timedevent USING hash  (sourceid COLLATE pg_catalog."default");

This is the query Hibernate generates:

select this_.id as id1_0_0_, this_.description as descript2_0_0_, this_.eventDate as eventDat3_0_0_, this_.locationId as location4_0_0_, this_.newStateIds as newState5_0_0_, this_.oldStateIds as oldState6_0_0_, this_.sourceId as sourceId7_0_0_ 
from TimedEvent this_
where ((this_.newStateIds=? and this_.sourceId in (?, ?, ?, ?, ?, ?)))
    and this_.eventDate between ? and ?
    limit ?

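Edit:
Sorry for the misleading title; it seems Postgres does use all the indexes. The problem is that my query time still stays the same. This is the query plan I get: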

Limit  (cost=25130.29..33155.77 rows=321 width=161) (actual time=705.330..706.744 rows=279 loops=1)
  Buffers: shared hit=6 read=8167 written=61
  ->  Bitmap Heap Scan on timedevent this_  (cost=25130.29..33155.77 rows=321 width=161) (actual time=705.330..706.728 rows=279 loops=1)
        Recheck Cond: (((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Zimmer" (...)
        Filter: ((eventdate >= '2017-11-01 15:41:00+01'::timestamp with time zone) AND (eventdate <= '2018-03-20 14:58:16.724+01'::timestamp with time zone))
        Buffers: shared hit=6 read=8167 written=61
        ->  BitmapAnd  (cost=25130.29..25130.29 rows=2122 width=0) (actual time=232.990..232.990 rows=0 loops=1)
              Buffers: shared hit=6 read=2152
              ->  Bitmap Index Scan on sourceid_index_hash  (cost=0.00..1403.36 rows=39182 width=0) (actual time=1.195..1.195 rows=9308 loops=1)
                    Index Cond: ((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Z (...)
                    Buffers: shared hit=6 read=26
              ->  Bitmap Index Scan on state_index  (cost=0.00..23726.53 rows=777463 width=0) (actual time=231.160..231.160 rows=776520 loops=1)
                    Index Cond: ((newstateids)::text = 'ACTIV'::text)
                    Buffers: shared read=2126
Total runtime: 706.804 ms

After creating an index with btree(sourceid, newstateids), as a_horse_with_no_name suggested, the cost dropped:

Limit  (cost=125.03..8150.52 rows=321 width=161) (actual time=13.611..14.454 rows=279 loops=1)
  Buffers: shared hit=18 read=4336
  ->  Bitmap Heap Scan on timedevent this_  (cost=125.03..8150.52 rows=321 width=161) (actual time=13.609..14.432 rows=279 loops=1)
        Recheck Cond: (((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Zimmer","r (...)
        Filter: ((eventdate >= '2017-11-01 15:41:00+01'::timestamp with time zone) AND (eventdate <= '2018-03-20 14:58:16.724+01'::timestamp with time zone))
        Buffers: shared hit=18 read=4336
        ->  Bitmap Index Scan on src_state_index  (cost=0.00..124.95 rows=2122 width=0) (actual time=0.864..0.864 rows=4526 loops=1)
              Index Cond: (((sourceid)::text = ANY ('{"root,kus-chemnitz,ize-159,Anwesend Bad","root,kus-chemnitz,ize-159,Alarmruf","root,kus-chemnitz,ize-159,Bett Alarm 1","root,kus-chemnitz,ize-159,Bett Alarm 2","root,kus-chemnitz,ize-159,Anwesend Zimmer (...)
              Buffers: shared hit=18 read=44
Total runtime: 14.497 ms

1 answer:

Answer 0 (score: 0)

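Essentially only one index is used because, for several indexes to be useful together, the database has to merge them (or combine the results of searching each one), and that is expensive. In this case it chose not to: it used the index for one predicate and checked the remaining predicates directly against the rows it found.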

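A single B-tree index over multiple columns works better, as a_horse_with_no_name suggested in the comments. Note that the column order matters: put the columns used for single-value (equality) lookups first and the columns used for range searches last, so that the range search covers as narrow a slice of the index as possible. The database then walks the index, using the first column to narrow down the candidate rows (ideally by a lot), then applies the second column and its predicate within that subset, and so on.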
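As a minimal sketch of that ordering rule applied to the table in the question: the two columns compared for equality (sourceid, newstateids) come first and the range column (eventdate) comes last. The index name is hypothetical, and adding eventdate as a third column is an assumption that goes beyond the two-column index the asker actually created.

```sql
-- Hypothetical index name; the three-column layout is an assumption,
-- not the index from the question (that one was btree(sourceid, newstateids)).
-- Equality columns first, range column last.
CREATE INDEX src_state_date_index
    ON timedevent USING btree (sourceid, newstateids, eventdate);
```

Whether the extra column pays off depends on how selective the date range is for a given source and state, so it is worth checking the resulting plan rather than assuming an improvement.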

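Using separate B-tree indexes for predicates combined with AND makes little sense to the database. It has to use one index to select all rows that satisfy one predicate, then use another index, reading that index's blocks from disk as well, only to get rows that satisfy the condition tied to the second index but possibly not the other conditions. It may be cheaper to load the rows after the first index lookup and check the other predicates directly on them than to use another index.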
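One way to see which of these strategies the planner picks is to run the query under EXPLAIN with the ANALYZE and BUFFERS options, which produces plans like the ones in the question. The sketch below reuses literal values visible in those plans; the shortened column list, the two-element source list, and the LIMIT value are assumptions made for illustration.

```sql
-- Sketch only: the column list, the reduced IN list and the LIMIT value are assumptions.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, eventdate, newstateids, sourceid
FROM timedevent
WHERE newstateids = 'ACTIV'
  AND sourceid IN ('root,kus-chemnitz,ize-159,Anwesend Bad',
                   'root,kus-chemnitz,ize-159,Alarmruf')
  AND eventdate BETWEEN '2017-11-01 15:41:00+01'
                    AND '2018-03-20 14:58:16.724+01'
LIMIT 100;
```

In the output, a BitmapAnd node means two index bitmaps were combined, while a single Bitmap Index Scan followed by a Filter line means the remaining predicate was checked against the fetched rows.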