优化PG查询

时间:2012-07-12 04:00:47

标签: sql postgresql optimization pg

该查询经常在应用程序中使用,而且过于昂贵。

我可以做些什么来优化它并将总时间带到毫秒(而不是几百毫秒)?

注意:

  • 删除DISTINCT改进(低至约460毫秒),但我需要摆脱笛卡尔积:((是的,显示更好的避免它的方法)
  • 删除OREDER BY name会有所改善,但不会显着。

查询:

SELECT DISTINCT properties.*
FROM properties JOIN developments ON developments.id = properties.development_id

 -- Development allocations
 LEFT JOIN allocation_items   AS dev_items  ON dev_items.development_id = properties.development_id
 LEFT JOIN allocations        AS dev_allocs ON dev_items.allocation_id = dev_allocs.id

 -- Group allocations
 LEFT JOIN properties_property_groupings ppg  ON ppg.property_id = properties.id
 LEFT JOIN property_groupings pg              ON pg.id = ppg.property_grouping_id
 LEFT JOIN allocation_items prop_items        ON prop_items.property_grouping_id = pg.id
 LEFT JOIN allocations prop_allocs            ON prop_allocs.id = prop_items.allocation_id

WHERE
  (properties.status <> 'deleted') AND ((
    properties.status <> 'inactive'
    AND (
     (dev_allocs.receiving_company_id = 175 OR prop_allocs.receiving_company_id = 175)
     AND developments.status = 'active'
    )
    OR developments.company_id = 175
   )
   AND EXISTS (
     SELECT 1 FROM development_participations dp
     JOIN participations p ON p.id = dp.participation_id
     WHERE dp.allowed
       AND p.user_id = 387 AND p.company_id = 175
       AND dp.development_id = properties.development_id
     LIMIT 1
   )
)
ORDER BY properties.name

EXPLAIN ANALYZE

 Unique  (cost=72336.86..72517.53 rows=1606 width=4336) (actual time=703.766..710.920 rows=219 loops=1)
   ->  Sort  (cost=72336.86..72340.87 rows=1606 width=4336) (actual time=703.765..704.698 rows=5091 loops=1)
         Sort Key: properties.name, properties.id, properties.status, properties.level, etc etc (all columns)
         Sort Method: external sort  Disk: 1000kB
         ->  Nested Loop Left Join  (cost=0.00..69258.84 rows=1606 width=4336) (actual time=25.230..366.489 rows=5091 loops=1)
               Filter: ((((properties.status)::text <> 'inactive'::text) AND ((dev_allocs.receiving_company_id = 175) OR (prop_allocs.receiving_company_id = 175)) AND ((developments.status)::text = 'active'::text)) OR (developments.company_id = 175))
               ->  Nested Loop Left Join  (cost=0.00..57036.99 rows=41718 width=4355) (actual time=25.122..247.587 rows=99567 loops=1)
                     ->  Nested Loop Left Join  (cost=0.00..47616.39 rows=21766 width=4355) (actual time=25.111..163.827 rows=39774 loops=1)
                           ->  Nested Loop Left Join  (cost=0.00..41508.16 rows=21766 width=4355) (actual time=25.101..112.452 rows=39774 loops=1)
                                 ->  Nested Loop Left Join  (cost=0.00..34725.22 rows=21766 width=4351) (actual time=25.087..68.104 rows=19887 loops=1)
                                       ->  Nested Loop Left Join  (cost=0.00..28613.00 rows=21766 width=4351) (actual time=25.076..39.360 rows=19887 loops=1)
                                             ->  Nested Loop  (cost=0.00..27478.54 rows=1147 width=4347) (actual time=25.059..29.966 rows=259 loops=1)
                                                   ->  Index Scan using developments_pkey on developments  (cost=0.00..25.17 rows=49 width=15) (actual time=0.048..0.127 rows=48 loops=1)
                                                         Filter: (((status)::text = 'active'::text) OR (company_id = 175))
                                                   ->  Index Scan using index_properties_on_development_id on properties  (cost=0.00..559.95 rows=26 width=4336) (actual time=0.534..0.618 rows=5 loops=48)
                                                         Index Cond: (development_id = developments.id)
                                                         Filter: (((status)::text <> 'deleted'::text) AND (SubPlan 1))
                                                         SubPlan 1
                                                           ->  Limit  (cost=0.00..10.00 rows=1 width=0) (actual time=0.011..0.011 rows=0 loops=2420)
                                                                 ->  Nested Loop  (cost=0.00..10.00 rows=1 width=0) (actual time=0.011..0.011 rows=0 loops=2420)
                                                                       Join Filter: (dp.participation_id = p.id)
                                                                       ->  Seq Scan on development_participations dp  (cost=0.00..1.71 rows=1 width=4) (actual time=0.004..0.008 rows=1 loops=2420)
                                                                             Filter: (allowed AND (development_id = properties.development_id))
                                                                       ->  Index Scan using index_participations_on_user_id on participations p  (cost=0.00..8.27 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=3148)
                                                                             Index Cond: (user_id = 387)
                                                                             Filter: (company_id = 175)
                                             ->  Index Scan using index_allocation_items_on_development_id on allocation_items dev_items  (cost=0.00..0.70 rows=23 width=8) (actual time=0.003..0.016 rows=77 loops=259)
                                                   Index Cond: (development_id = properties.development_id)
                                       ->  Index Scan using allocations_pkey on allocations dev_allocs  (cost=0.00..0.27 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=19887)
                                             Index Cond: (dev_items.allocation_id = id)
                                 ->  Index Scan using index_properties_property_groupings_on_property_id on properties_property_groupings ppg  (cost=0.00..0.29 rows=2 width=8) (actual time=0.001..0.001 rows=2 loops=19887)
                                       Index Cond: (property_id = properties.id)
                           ->  Index Scan using property_groupings_pkey on property_groupings pg  (cost=0.00..0.27 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=39774)
                                 Index Cond: (id = ppg.property_grouping_id)
                     ->  Index Scan using index_allocation_items_on_property_grouping_id on allocation_items prop_items  (cost=0.00..0.36 rows=6 width=8) (actual time=0.001..0.001 rows=2 loops=39774)
                           Index Cond: (property_grouping_id = pg.id)
               ->  Index Scan using allocations_pkey on allocations prop_allocs  (cost=0.00..0.27 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=99567)
                     Index Cond: (id = prop_items.allocation_id)
 Total runtime: 716.692 ms
(39 rows)

2 个答案:

答案 0 :(得分:2)

回答我自己的问题。

此查询有两大问题:

  1. 6 LEFT JOIN生成笛卡尔积(即使在小数据集上也会产生数十亿的记录)。
  2. DISTINCT必须对那十亿个记录数据集进行排序。
  3. 所以我必须消除那些。

    我这样做的方法是将JOIN替换为2个子查询(这里不提供它,因为它应该非常明显)。

    结果,实际时间从~700-800ms下降到~45ms,这或多或少是可以接受的。

答案 1 :(得分:1)

大部分时间花在磁盘排序上,您应该通过更改work_mem来使用RAM:

SET work_mem TO '20MB';

再次检查EXPLAIN ANALYZE