Optimizing a PostgreSQL query with ORDER BY, LEFT JOIN and LIMIT

Date: 2013-07-26 12:24:15

Tags: sql postgresql query-optimization

I'm looking for help because I can't find a good way to optimize this query:

SELECT
b.book_id,
b.asin,
b.type_book
FROM book b
LEFT JOIN product_maj_plateforme pmp
        ON pmp.book_id_fk = b.book_id AND pmp.plateforme_id_fk = 1
WHERE
deleted = 0
AND (
        pmp.book_id_fk IS NULL OR (
        pmp.book_id_fk IS NOT NULL
        AND pmp.date_updated < now() - INTERVAL '1 SECOND' * b.ttl
))
AND asin IS NOT NULL
AND asin != ''
AND asin != '0'
AND price_achat > 0
ORDER BY b.ttl asc
LIMIT 400;

Here is my execution plan:

'Limit  (cost=0.00..8702.51 rows=400 width=20) (actual time=3284.647..3287.175 rows=400 loops=1)'
'  Output: b.book_id, b.asin, b.type_book, b.ttl'
'  ->  Nested Loop Left Join  (cost=0.00..4499522.06 rows=206815 width=20) (actual time=3284.647..3287.128 rows=400 loops=1)'
'        Output: b.book_id, b.asin, b.type_book, b.ttl'
'        Filter: ((pmp.book_id_fk IS NULL) OR ((pmp.book_id_fk IS NOT NULL) AND (pmp.date_updated < (now() - ('00:00:01'::interval * (b.ttl)::double precision)))))'
'        Rows Removed by Filter: 631559'
'        ->  Index Scan using book_ix_test on public.book l  (cost=0.00..324455.62 rows=620444 width=20) (actual time=0.014..893.522 rows=631959 loops=1)'
'              Output: b.book_id, b.code, b.date_created, b.date_updated, b.titre, b.deleted, b.encours, b.date_encours, b.prix_achat, b.prix_vente, b.marchand_id_achat, b.etat_id_achat, b.poids, b.prix_vente_max, b.prix_vente_conc, b.asin, b.item_condition, b.type_book, b.author, b.edition, b.quantity, b.ttl'
'              Filter: ((b.asin IS NOT NULL) AND ((b.asin)::text <> ''::text) AND ((b.asin)::text <> '0'::text) AND (b.prix_achat > 0::double precision) AND (b.deleted = 0) AND (b.encours = 0))'
'              Rows Removed by Filter: 441235'
'        ->  Index Scan using produit_maj_plateforme_uniq on public.produit_maj_plateforme pmp  (cost=0.00..6.71 rows=1 width=12) (actual time=0.003..0.003 rows=1 loops=631959)'
'              Output: pmp.book_id_fk, pmp.date_updated'
'              Index Cond: ((pmp.book_id_fk = b.book_id) AND (pmp.plateforme_id_fk = 1))'
'Total runtime: 3287.333 ms'

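I think the problem is the Nested Loop Left Join, but I can't find a way to avoid it.

This query takes about 10 seconds to execute on my dedicated server (32 GB RAM, SSD).

Thanks in advance.

Edit:

PostgreSQL version: 9.3 (on Debian)

Table book: ~1200000 rows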

CREATE TABLE book
(
  book_id serial NOT NULL,
  code character varying(255) NOT NULL,
  date_created timestamp without time zone NOT NULL,
  date_updated timestamp without time zone,
  titre character varying(1000),
  deleted smallint DEFAULT 0,
  encours smallint NOT NULL DEFAULT 0,
  date_encours timestamp without time zone,
  price_achat double precision NOT NULL DEFAULT 0,
  price_vente double precision NOT NULL DEFAULT 0,
  marchand_id_achat integer,
  etat_id_achat integer,
  poids double precision,
  price_vente_max double precision NOT NULL DEFAULT 0,
  price_vente_conc double precision NOT NULL DEFAULT 0,
  asin character varying(255),
  item_condition smallint,
  type_book smallint NOT NULL DEFAULT 0,
  author character varying(250),
  edition character varying(250),
  quantity smallint NOT NULL DEFAULT 1,
  ttl integer DEFAULT 64800,
  CONSTRAINT livre_pkey PRIMARY KEY (book_id)
);

CREATE INDEX book_ix_get_to_be_checked_2
  ON book
  USING btree
  (type_book DESC, ttl);

CREATE INDEX book_ix_test
  ON book
  USING btree
  (ttl);

Table product_maj_plateforme: ~1200000 rows × 6

CREATE TABLE product_maj_plateforme
(
  product_maj_plateforme_id serial NOT NULL,
  book_id_fk integer NOT NULL,
  plateforme_id_fk integer NOT NULL,
  date_updated timestamp without time zone NOT NULL,
  CONSTRAINT produit_maj_plateforme_pkey PRIMARY KEY (product_maj_plateforme_id),
  CONSTRAINT book_id_fk_key FOREIGN KEY (book_id_fk)
      REFERENCES book (book_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE RESTRICT,
  CONSTRAINT plateforme_id_fk_key FOREIGN KEY (plateforme_id_fk)
      REFERENCES plateforme (plateforme_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE RESTRICT
);

CREATE UNIQUE INDEX produit_maj_plateforme_uniq
  ON product_maj_plateforme
  USING btree
  (book_id_fk, plateforme_id_fk);

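Don't worry about misspelled column names: I translated them from French to English to make the schema easier to follow, and I may have missed a few.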

2 Answers:

Answer 0: (score: 0)

Assuming deleted, asin and price_achat are columns of your book table, is this faster?

SELECT
b.book_id,
b.asin,
b.type_book
FROM (
        SELECT *
        FROM book
        WHERE deleted = 0
          AND asin IS NOT NULL
          AND asin != ''
          AND asin != '0'
          AND price_achat > 0
) b
LEFT JOIN product_maj_plateforme pmp
        ON pmp.book_id_fk = b.book_id AND pmp.plateforme_id_fk = 1
WHERE
        pmp.book_id_fk IS NULL OR
        (
          pmp.book_id_fk IS NOT NULL
          AND pmp.date_updated < now() - INTERVAL '1 SECOND' * b.ttl
        )
ORDER BY b.ttl asc
LIMIT 400;
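
PostgreSQL will usually flatten a simple subquery like this one into the outer query, so the resulting plan may be the same as for the original statement. One way to check, sketched here with the table and column names from the question, is to run each variant under EXPLAIN (ANALYZE, BUFFERS) and compare the node types, the actual row counts and the "Rows Removed by Filter" figures:

```sql
-- Sketch only: assumes the schema and column names shown in the question.
-- ANALYZE really executes the statement; BUFFERS adds cache hit/read counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    b.book_id,
    b.asin,
    b.type_book
FROM (
    SELECT *
    FROM book
    WHERE deleted = 0
      AND asin IS NOT NULL
      AND asin != ''
      AND asin != '0'
      AND price_achat > 0
) b
LEFT JOIN product_maj_plateforme pmp
       ON pmp.book_id_fk = b.book_id
      AND pmp.plateforme_id_fk = 1
WHERE pmp.book_id_fk IS NULL
   OR pmp.date_updated < now() - INTERVAL '1 SECOND' * b.ttl
ORDER BY b.ttl ASC
LIMIT 400;
```

If both plans show the same Nested Loop Left Join driven by an index scan on book_ix_test, the subquery has not changed anything for the planner.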

Answer 1: (score: 0)

Try something like:

SELECT
b.book_id,
b.asin,
b.type_book
FROM book b
WHERE deleted = 0          
AND asin != ''
AND asin != '0'
AND price_achat > 0
AND NOT EXISTS (SELECT 1
                FROM product_maj_plateforme pmp
                WHERE pmp.book_id_fk = b.book_id AND pmp.plateforme_id_fk = 1
                  AND pmp.date_updated > now() - INTERVAL '1 SECOND' * b.ttl)

ORDER BY b.ttl asc
LIMIT 400;

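This slightly changes the logic of the query (for example, a book whose date_updated is exactly equal to the cutoff is now returned, whereas the original query skips it), but it may suit your needs and be faster.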
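
The difference comes from the comparison operator: the original query keeps a book only when date_updated is strictly older than the cutoff, while NOT EXISTS with date_updated > cutoff also keeps a book whose date_updated equals the cutoff. A variant that stays closer to the original, as a sketch assuming the same schema, uses >= inside the subquery. A book with a NULL ttl that already has a platform row would still be returned here although the original query filters it out, so this assumes ttl is never NULL in practice:

```sql
-- Sketch only: same tables and columns as in the question.
SELECT
    b.book_id,
    b.asin,
    b.type_book
FROM book b
WHERE b.deleted = 0
  AND b.asin <> ''        -- also rejects NULL asin, because NULL <> '' is not true
  AND b.asin <> '0'
  AND b.price_achat > 0
  AND NOT EXISTS (
        -- a "fresh" platform row: updated at or after the cutoff
        SELECT 1
        FROM product_maj_plateforme pmp
        WHERE pmp.book_id_fk = b.book_id
          AND pmp.plateforme_id_fk = 1
          AND pmp.date_updated >= now() - INTERVAL '1 SECOND' * b.ttl
  )
ORDER BY b.ttl ASC
LIMIT 400;
```

Because produit_maj_plateforme_uniq allows at most one row per (book, platform) pair, "no fresh row exists" covers both cases of the original filter: no row at all, or a row older than the cutoff.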

Another way to speed up the query is to create a composite index on (ttl, deleted, asin) or (ttl, deleted, price_achat).
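
As a sketch of what such indexes could look like (the index names are made up for illustration, and the column names are the ones used in the queries above): the first statement is a composite index as suggested, the second is a partial index on ttl restricted to the rows the query can return, which is a different technique from the one named in this answer. Either one only helps with the filter on book. The plan in the question also discards 631559 rows in the join filter, which these indexes do not address, so any gain should be confirmed with EXPLAIN ANALYZE.

```sql
-- Composite index: ttl comes first so the index can still supply
-- rows in ORDER BY b.ttl order.
CREATE INDEX book_ix_ttl_deleted_asin      -- hypothetical name
  ON book
  USING btree
  (ttl, deleted, asin);

-- Alternative (an assumption, not part of the answer): a partial index
-- that only contains the books passing the static filters of the query.
CREATE INDEX book_ix_ttl_to_check          -- hypothetical name
  ON book
  USING btree
  (ttl)
  WHERE deleted = 0
    AND asin IS NOT NULL
    AND asin <> ''
    AND asin <> '0'
    AND price_achat > 0;
```

The planner only considers a partial index when the query's WHERE clause implies the index predicate, so the conditions in the query have to match the ones in the index definition.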