Improving a slow query involving date interval overlap

Time: 2016-08-02 09:16:32

Tags: performance postgresql postgresql-9.5

I have a query of this form:

select {{AggregateFunctionsInvokations}}
from {{LargeTable}}
where {{DateIntervalOverlappingFiltering}}
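  • The table will have more than 100,000,000 rows.
  • The target date interval used for filtering (a variable in each issued query) ranges from 1 day to 10 months.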

My goal is for these queries to take no more than 10 seconds in the worst case. How can I achieve that? Is it even possible?

Below is the SQL code for a test case.

create schema tmp3;

create table tmp3.Info1(
    emissionId text not null,
    userId text not null,
    minInstant timestamp not null,  
    maxInstant timestamp not null
);

insert into tmp3.Info1(
    emissionId,
    userId,
    minInstant,
    maxInstant
)
select
    (floor(random() * (4000 - 22 + 1)) + 22)::text,
    (floor(random() * (250000 - 33 + 1)) + 33)::text,
    min(least(X.a, X.b)),
    max(greatest(X.a, X.b))
from (
    select
        '2015-07-10 20:00:00'::timestamp + random() * ('2016-08-02 16:00:00'::timestamp - '2015-07-10 20:00:00'::timestamp) as a,
        '2015-07-10 20:00:00'::timestamp + random() * ('2016-08-02 16:00:00'::timestamp - '2015-07-10 20:00:00'::timestamp) as b
    from generate_series(1,45000000)
) X
group by 1, 2;

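Index to ensure that each emission + user pair is unique. (Note: for this test it is not needed, because the preceding "insert" statement already guarantees uniqueness, but it will be required for the "normal" operation of my application.)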

create unique index Info1_UQ001 on tmp3.Info1 using btree (
    emissionId asc,
    userId asc
);

Covering indexes, so that the query below takes less time. Which one will the Postgres planner use?

create index Info1_IX001 on tmp3.Info1 using btree (minInstant asc, maxInstant asc, userId);
create index Info1_IX002 on tmp3.Info1 using btree (minInstant asc, maxInstant desc, userId);
create index Info1_IX003 on tmp3.Info1 using btree (minInstant desc, maxInstant asc, userId);
create index Info1_IX004 on tmp3.Info1 using btree (minInstant desc, maxInstant desc, userId);
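One way to see which of these indexes, if any, the planner picks is to prefix a query with EXPLAIN. The following is only a sketch, assuming the table and indexes above already exist; the shortened select list is for illustration, and the date range is one of the ranges tested in this question. Note that the ANALYZE option actually executes the query, so on a table this size it takes as long as the query itself; plain EXPLAIN only shows the estimated plan.

```sql
-- Show the chosen plan with real timings and buffer usage.
-- Warning: "analyze" runs the query; drop it to see only the estimated plan.
explain (analyze, buffers)
select
    count(distinct X.userId) as numUsers
from tmp3.Info1 X
where
    X.minInstant <= '2016-01-25'::timestamp
    and
    X.maxInstant >= '2015-12-10'::timestamp;
```

The top-level scan node in the output ("Seq Scan", "Index Scan", "Index Only Scan" or "Bitmap Heap Scan") names the access path, and any index that is used is listed by name.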

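VACUUM reclaims storage occupied by dead tuples. ANALYZE collects statistics about the contents of tables in the database [...] Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries.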

vacuum analyze tmp3.Info1;
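To confirm that the statistics were actually refreshed, the standard statistics views can be inspected. This is a rough sketch rather than part of the original test case; it relies on the fact that unquoted identifiers are folded to lower case, so the table appears as 'info1' and the columns as 'mininstant' and 'maxinstant'.

```sql
-- When was the table last vacuumed/analyzed, and how many tuples does it hold?
select
    s.last_vacuum,
    s.last_analyze,
    s.n_live_tup,
    s.n_dead_tup
from pg_stat_user_tables s
where s.schemaname = 'tmp3'
  and s.relname = 'info1';

-- Per-column statistics the planner uses for the two timestamp columns.
select
    p.attname,
    p.n_distinct,
    p.correlation
from pg_stats p
where p.schemaname = 'tmp3'
  and p.tablename = 'info1'
  and p.attname in ('mininstant', 'maxinstant');
```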

Results:

--*
--Range: 2015-12-10...2016-01-25
--Execution plan performed: "Seq Scan" on the "tmp3.Info1" table.
--Total execution time: 5 minutes 16 seconds.
--*
--Range: 2015-09-10...2015-09-21
--Execution plan performed: "Seq Scan" on the "tmp3.Info1" table.
--Total execution time: 2 minutes 47 seconds.

select
    min(extract(epoch from (X.maxInstant - X.minInstant))) as minSessionTime,
    round(avg(extract(epoch from (X.maxInstant - X.minInstant)))) as avgSessionTime,
    max(extract(epoch from (X.maxInstant - X.minInstant))) as maxSessionTime,
    count(distinct X.userId) as numUsers
from tmp3.Info1 X
where
    --http://stackoverflow.com/questions/325933/determine-whether-two-date-ranges-overlap/325964#325964
    --(StartA <= EndB) and (EndA >= StartB)
    --min: "2015-07-10 20:00:00"
    --max: "2016-08-02 15:59:59.624544"
    --(X.minInstant <= @EndB)
    --and
    --(X.maxInstant >= @StartB)    
    X.minInstant <= '2016-01-25'::timestamp
    and
    X.maxInstant >= '2015-12-10'::timestamp;
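The overlap test used in the where clause, (StartA <= EndB) and (EndA >= StartB), can be checked in isolation on a handful of rows. The sketch below uses hypothetical sample intervals (they are not taken from tmp3.Info1) against the same target range, 2015-12-10 to 2016-01-25.

```sql
-- Hypothetical sample intervals to illustrate the overlap predicate.
select
    s.label,
    (s.minInstant <= '2016-01-25'::timestamp
     and
     s.maxInstant >= '2015-12-10'::timestamp) as overlapsTarget
from (values
    ('ends before target',  '2015-11-01'::timestamp, '2015-12-09'::timestamp),
    ('straddles start',     '2015-12-01'::timestamp, '2015-12-15'::timestamp),
    ('inside target',       '2015-12-20'::timestamp, '2016-01-10'::timestamp),
    ('contains target',     '2015-10-01'::timestamp, '2016-03-01'::timestamp),
    ('starts after target', '2016-01-26'::timestamp, '2016-02-10'::timestamp)
) as s(label, minInstant, maxInstant);
```

Only the first and last rows yield false; every interval that touches the target range in any way yields true. Because the predicate constrains minInstant only from above and maxInstant only from below, a wide target range matches a large share of the table.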

0 answers:

No answers yet.