如何在主键约束上唯一索引?

时间:2019-04-11 12:54:28

标签: postgresql indexing

我有一个分区表,并在其上创建了唯一索引。

我正在尝试运行一些查询,其中一些使用主键约束,而另一些使用我创建的索引。我希望查询使用唯一索引而不是主要约束。

我尝试重新编制索引,但是没有用。

这是两个查询

1)在这里,我创建的索引正在被使用。

查询计划是:

Finalize Aggregate  (cost=296958.94..296958.95 rows=1 width=8) (actual time=927.948..927.948 rows=1 loops=1)
->  Gather  (cost=296958.72..296958.93 rows=2 width=8) (actual time=927.887..933.730 rows=3 loops=1)
     Workers Planned: 2
     Workers Launched: 2
     ->  Partial Aggregate  (cost=295958.72..295958.73 rows=1 width=8) actual time=924.885..924.885 rows=1 loops=3)
           ->  Parallel Append  (cost=0.68..293370.57 rows=1035261 width=8) (actual time=0.076..852.758 rows=825334 loops=3)
                 ->  Parallel Index Only Scan using testdate2019jan_april_cost_mo_user_id_account_id_resource__idx5 on testdate2019jan_april_cost_mod3rem2  (cost=0.68..146591.56 rows=525490 width=8) (actual time=0.082..388.130 rows=421251 loops=3)
                       Index Cond: (user_id = 1)
                       Heap Fetches: 3922
                 ->  Parallel Index Only Scan using testdate2018sept_dec_cost_mod_user_id_account_id_resource__idx5 on testdate2018sept_dec_cost_mod3rem2  (cost=0.68..141570.15 rows=509767 width=8) (actual time=0.057..551.572 rows=606125 loops=2)
                       Index Cond: (user_id = 1)
                       Heap Fetches: 0
                 ->  Parallel Index Scan using testdate2018jan_april_cost_mo_account_id_user_id_resource__idx2 on testdate2018jan_april_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=1)
                       Index Cond: (user_id = 1)
                 ->  Parallel Index Scan using testdate2018may_august_cost_m_account_id_user_id_resource__idx1 on testdate2018may_august_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=1)
                       Index Cond: (user_id = 1)
                 ->  Parallel Index Scan using testdate2019may_august_cost_m_account_id_user_id_resource__idx2 on testdate2019may_august_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8)    (actual time=0.002..0.002 rows=0 loops=1)
                       Index Cond: (user_id = 1)
                 ->  Parallel Index Scan using testdate2019sept_dec_cost_mod_account_id_user_id_resource__idx2 on testdate2019sept_dec_cost_mod3rem2  (cost=0.12..8.14 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)
                       Index Cond: (user_id = 1)
 Planning Time: 0.754 ms
 Execution Time: 933.797 ms

在上面的查询中,我的索引testdate2018may_august_cost_m_account_id_user_id_resource__idx1的使用如我所愿。

2)这里没有使用我创建的索引,而是使用了主要约束。

 Sort Method: quicksort  Memory: 25kB
 Buffers: shared hit=2 read=66080
 ->  Finalize GroupAggregate  (cost=388046.40..388187.55 rows=6 width=61) (actual time=510.710..513.262 rows=10 loops=1)
     Group Key: c_1.instance_type, c_1.currency
     Buffers: shared hit=2 read=66080
     ->  Gather Merge  (cost=388046.40..388187.24 rows=12 width=85) (actual time=510.689..513.303 rows=28 loops=1)
           Workers Planned: 2
           Workers Launched: 2
           Buffers: shared hit=26 read=206407
           ->  Partial GroupAggregate  (cost=387046.38..387185.83 rows=6 width=85) (actual time=504.731..507.277 rows=9 loops=3)
                 Group Key: c_1.instance_type, c_1.currency
                 Buffers: shared hit=26 read=206407
                 ->  Sort  (cost=387046.38..387056.71 rows=4130 width=36) (actual time=504.694..504.933 rows=3895 loops=3)
                       Sort Key: c_1.instance_type, c_1.currency
                       Sort Method: quicksort  Memory: 404kB
                       Worker 0:  Sort Method: quicksort  Memory: 354kB
                       Worker 1:  Sort Method: quicksort  Memory: 541kB
                       Buffers: shared hit=20 read=206407
                       ->  Parallel Append  (cost=0.13..386798.33 rows=4130 width=36) (actual time=0.081..501.720 rows=3895 loops=3)
                             Buffers: shared hit=6 read=206405
                             Subplans Removed: 3
                             ->  Parallel Index Scan using testdate2019may_august_cost_mod3rem2_pkey on testdate2019may_august_cost_mod3rem2 c_1  (cost=0.13..8.15 rows=1 width=36) (actual time=0.008..0.008 rows=0 loops=1)
                                   Index Cond: ((usage_start_date >= (CURRENT_DATE - 30)) AND (user_id = '1'::bigint))
                                   Filter: ((instance_type IS NOT NULL) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE))
                                   Buffers: shared hit=1
                             ->  Parallel Index Scan using testdate2019sept_dec_cost_mod3rem2_pkey on testdate2019sept_dec_cost_mod3rem2 c_2  (cost=0.13..8.15 rows=1 width=36) (actual time=0.006..0.006 rows=0 loops=1)
                                   Index Cond: ((usage_start_date >= (CURRENT_DATE - 30)) AND (user_id = '1'::bigint))
                                   Filter: ((instance_type IS NOT NULL) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE))
                                   Buffers: shared hit=1
                             ->  Parallel Seq Scan on testdate2019jan_april_cost_mod3rem2 c  (cost=0.00..258266.58 rows=4125 width=36) (actual time=0.076..501.060 rows=3895 loops=3)
                                   Filter: ((instance_type IS NOT NULL) AND (user_id = '1'::bigint) AND ((account_id)::text = '807331824280'::text) AND (usage_end_date <= CURRENT_DATE) AND (usage_start_date >= (CURRENT_DATE - 30)))
                                   Rows Removed by Filter: 1504689
                                   Buffers: shared hit=4 read=206405
Planning Time: 1.290 ms
Execution Time: 513.439 ms

在上面的查询testdate2019sept_dec_cost_mod3rem2_pkey中,它是主要的约束条件。

我希望它使用我创建的索引而不是主要约束。 根据分区,我的第二个查询计划是否正确?

表创建查询:

CREATE TABLE a2i.testawscost_line_item (
    line_item_id uuid NOT NULL,
    account_id character varying(255) COLLATE pg_catalog."default",
    availability_zone character varying(255) COLLATE pg_catalog."default",
    base_cost double precision,
    base_rate double precision,
    cost double precision,
    currency character varying(255) COLLATE pg_catalog."default",
    instance_family character varying(255) COLLATE pg_catalog."default",
    instance_type character varying(255) COLLATE pg_catalog."default",
    line_item_type character varying(255) COLLATE pg_catalog."default",
    operating_system character varying(255) COLLATE pg_catalog."default",
    operation character varying(255) COLLATE pg_catalog."default",
    payer_account_id character varying(255) COLLATE pg_catalog."default",
    product_code character varying(255) COLLATE pg_catalog."default",
    product_family character varying(255) COLLATE pg_catalog."default",
    product_group character varying(255) COLLATE pg_catalog."default",
    product_name character varying(255) COLLATE pg_catalog."default",
    rate double precision,
    rate_description character varying(255) COLLATE pg_catalog."default",
    reservation_id character varying(255) COLLATE pg_catalog."default",
    resource_type character varying(255) COLLATE pg_catalog."default",
    sku character varying(255) COLLATE pg_catalog."default",
    tax_type character varying(255) COLLATE pg_catalog."default",
    unit character varying(255) COLLATE pg_catalog."default",
    usage_end_date timestamp without time zone,
    usage_quantity double precision,
    usage_start_date timestamp without time zone,
    usage_type character varying(255) COLLATE pg_catalog."default",
    user_id bigint,
    resource_id character varying(255) COLLATE pg_catalog."default",
    CONSTRAINT testawscost_line_item_pkey PRIMARY KEY 
        (line_item_id, usage_start_date, user_id),
    CONSTRAINT fkptp4hyur3i4yj88wo3rxnaf05 FOREIGN KEY (resource_id)
        REFERENCES a2i.awscost_resource (resource_id) MATCH SIMPLE
            ON UPDATE NO ACTION
            ON DELETE NO ACTION
) PARTITION BY hash(user_id);

分区:

create table a2i.testuser_cost_mod3rem0
    partition of a2i.testawscost_line_item
        for values with (MODULUS 3, REMAINDER 0) 
    partition by range(usage_start_date);

create table a2i.testuser_cost_mod3rem1
    partition of a2i.testawscost_line_item
        for values with (MODULUS 3, REMAINDER 1) 
    partition by range(usage_start_date);

create table a2i.testuser_cost_mod3rem2
    partition of a2i.testawscost_line_item
        for values with (MODULUS 3, REMAINDER 2)
    partition by range(usage_start_date);

2019年分区的分区:

create table a2i.testdate2019jan_april_cost_mod3rem0
    partition of a2i.testuser_cost_mod3rem0
        for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');

create table a2i.testdate2019may_august_cost_mod3rem0
    partition of a2i.testuser_cost_mod3rem0
        for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');

create table a2i.testdate2019sept_dec_cost_mod3rem0
    partition of a2i.testuser_cost_mod3rem0
        for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');

create table a2i.testdate2019jan_april_cost_mod3rem1
    partition of a2i.testuser_cost_mod3rem1
        for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');

create table a2i.testdate2019may_august_cost_mod3rem1
    partition of a2i.testuser_cost_mod3rem1
        for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');

create table a2i.testdate2019sept_dec_cost_mod3rem1
    partition of a2i.testuser_cost_mod3rem1
        for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');

create table a2i.testdate2019jan_april_cost_mod3rem2
    partition of a2i.testuser_cost_mod3rem2
        for values from ('2019-01-01 00:00:00') to ('2019-05-01 00:00:00');

create table a2i.testdate2019may_august_cost_mod3rem2
    partition of a2i.testuser_cost_mod3rem2
        for values from ('2019-05-01 00:00:00') to ('2019-09-01 00:00:00');

create table a2i.testdate2019sept_dec_cost_mod3rem2
    partition of a2i.testuser_cost_mod3rem2
        for values from ('2019-09-01 00:00:00') to ('2020-01-01 00:00:00');

索引:

CREATE UNIQUE INDEX awscost_line_item_unique_pkey ON a2i.awscost_line_item (
    account_id, user_id, resource_id, usage_start_date, usage_end_date, usage_type,
    usage_quantity, line_item_type, sku, rate, base_rate, base_cost,
    "cost", currency, product_code, operation
);

对于第一个查询计划,查询为:

  explain analyze select sum(cost) from testawscost_line_item where 
  user_id='1';

第二个查询:

 explain (analyze/*, buffers*/) SELECT sum (c.cost),
 sum (case when c.resource_type = 'Compute' then c.cost end) as computeCost,
 sum (case when c.resource_type = 'Storage' then c.cost end) as storageCost,
 sum (case when c.resource_type = 'Network' then c.cost end) as networkCost,
 sum (case when c.resource_type not in ('Compute', 'Network', 'Storage') 
 then c.cost end) as otherCost,
 c.currency,
 c.instance_type as productFamily,
 avg (c.rate) FROM testawscost_line_item c WHERE
 (c.user_id ='1') AND (c.account_id = '807331824280') AND
 (c.usage_start_date >= current_date-30 AND c.usage_end_date <= 
 current_date) AND
 (c.instance_type is not null )
 GROUP BY c.instance_type, c.currency
 ORDER BY 1 desc

1 个答案:

答案 0 :(得分:1)

问题在于您的索引对于第一个查询有效,但不适用于第二个查询。

resource_id在索引中,但不在查询中,因此之后的所有索引列均不可用于查询。 PostgreSQL决定使用小得多的主键索引。

此查询的理想索引是:

CREATE INDEX ON a2i.testawscost_line_item (user_id, account_id, usage_start_date)
   WHERE instance_type IS NOT NULL;

我假设usage_end_date上的条件比usage_start_date上的条件具有更高的选择性。