PostgreSQL ORDER BY with GROUP BY and millions of rows

Time: 2015-03-22 17:40:14

Tags: postgresql

I am using this query:

SELECT ttt.id_tour_term, tt.id_tour, tt.min FROM (
 SELECT id_tour, MIN(pricefrom) AS min, date_start
 FROM tours_terms 
 WHERE pricefrom > 500
 GROUP BY id_tour, date_start
 ORDER BY min
 ) tt INNER JOIN tours_terms ttt ON tt.id_tour = ttt.id_tour 
 AND tt.min = ttt.pricefrom AND tt.date_start = ttt.date_start
 LIMIT 10;

This is the table definition:

CREATE TABLE "public"."tours_terms" ( 
"id" BIGINT DEFAULT nextval('tours_terms_id_seq'::regclass) NOT NULL UNIQUE, 
"id_tour_term" BIGINT, 
......
"id_tour" BIGINT NOT NULL, 
......
"date_start" Date NOT NULL, 
......
"pricefrom" BIGINT NOT NULL, 
......
PRIMARY KEY ( "id" )
);

CREATE INDEX "index123" ON "public"."tours_terms" USING btree( "id_tour" ASC NULLS LAST, "date_start" ASC NULLS LAST, "pricefrom" ASC NULLS LAST );

EXPLAIN ANALYZE:

Limit  (cost=223761.30..552651.91 rows=10 width=24) (actual time=17221.437..17221.576 rows=10 loops=1)
  ->  Merge Join  (cost=223761.30..782875.35 rows=17 width=24) (actual time=17221.435..17221.568 rows=10 loops=1)
        Merge Cond: ((ttt.id_tour = tours_terms.id_tour) AND (ttt.date_start = tours_terms.date_start) AND (ttt.pricefrom = (min(tours_terms.pricefrom))))
        ->  Index Scan using index123 on tours_terms ttt  (cost=0.43..535445.38 rows=2798151 width=28) (actual time=0.026..0.120 rows=10 loops=1)
        ->  Sort  (cost=223760.87..224431.57 rows=268280 width=20) (actual time=17221.382..17221.388 rows=10 loops=1)
              Sort Key: tours_terms.id_tour, tours_terms.date_start, (min(tours_terms.pricefrom))
              Sort Method: external sort  Disk: 46968kB
              ->  Sort  (cost=196217.39..196888.09 rows=268280 width=20) (actual time=10421.422..11989.747 rows=1412964 loops=1)
                    Sort Key: (min(tours_terms.pricefrom))
                    Sort Method: external merge  Disk: 41424kB
                    ->  GroupAggregate  (cost=0.43..172027.42 rows=268280 width=20) (actual time=0.035..4988.960 rows=1412964 loops=1)
                          ->  Index Only Scan using index123 on tours_terms  (cost=0.43..149223.67 rows=2682793 width=20) (actual time=0.026..2322.927 rows=2701457 loops=1)
                                Index Cond: (pricefrom > 500)
                                Heap Fetches: 441882
Total runtime: 17290.012 ms

The version is 9.3.6.

Can anyone help me optimize this query?

The problem is that the table has millions of rows in which "id_tour", "pricefrom", and "date_start" are duplicated, so I have to group by something.
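For reference, here is one possible rewrite to experiment with. It is a sketch, not a tested fix. It assumes the goal is the 10 cheapest (id_tour, date_start) groups, and that one row per group is enough when several rows tie at the minimum price (the join above returns every tied row). It uses DISTINCT ON, which is available in 9.3. It also moves the ORDER BY to the outer query, next to the LIMIT: in the plan above the Merge Join re-sorts the subquery result by id_tour and date_start, so the inner ORDER BY min does not control which 10 rows come back.

```sql
-- Sketch only: assumes one row per (id_tour, date_start) group is wanted.
SELECT id_tour_term, id_tour, pricefrom AS min
FROM (
    -- DISTINCT ON keeps the first row of each group; with this ORDER BY
    -- that is the row with the lowest pricefrom above 500.
    SELECT DISTINCT ON (id_tour, date_start)
           id_tour_term, id_tour, date_start, pricefrom
    FROM tours_terms
    WHERE pricefrom > 500
    ORDER BY id_tour, date_start, pricefrom
) tt
-- Ordering here, together with LIMIT, is what makes "cheapest 10" well defined.
ORDER BY pricefrom
LIMIT 10;
```

The inner ORDER BY matches the column order of index123, so the planner may be able to read groups in index order instead of sorting, though it can no longer use an index-only scan because id_tour_term is not in the index. Whether this is actually faster on this table would need to be checked with EXPLAIN ANALYZE.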

0 answers:

No answers yet.