我正在评估postgres从Oracle迁移。以下查询运行速度太慢,请查找解释计划:
explain analyze select DISTINCT EVENT.ID,
ORIGIN.ID AS RIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN,
EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE,
ORIGIN.DEPTH, ORIGIN.EVTYPE, ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR,
ORIGIN.CONTRIBUTOR OCONTRIBUTOR,
MAGNITUDE.ID AS MAGID, MAGNITUDE.MAGNITUDE, MAGNITUDE.TYPE AS MAGTYPE
from event.event
left join event.origin on event.id = origin.eventid
left join event.magnitude on origin.id = event.magnitude.origin_id
WHERE EXISTS(
select origin_id from event.magnitude
where magnitude.magnitude>=7.2
and origin.id=origin_id)
order by ORIGIN.TIME desc, MAGNITUDE.MAGNITUDE desc,
EVENT.ID,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID
"Unique (cost=740549.86..741151.42 rows=15039 width=80) (actual time=17791.557..17799.092 rows=5517 loops=1)"
" -> Sort (cost=740549.86..740587.45 rows=15039 width=80) (actual time=17791.556..17792.220 rows=5517 loops=1)"
" Sort Key: origin."time", event.magnitude.magnitude, event.id, event.preferred_origin_id, origin.id, event.contributor, origin.latitude, origin.longitude, origin.depth, origin.evtype, origin.catalog, origin.author, origin.contributor, event.magnitude.id, event.magnitude.type"
" Sort Method: quicksort Memory: 968kB"
" -> Nested Loop Left Join (cost=34642.50..739506.42 rows=15039 width=80) (actual time=6.927..17769.788 rows=5517 loops=1)"
" -> Hash Semi Join (cost=34642.50..723750.23 rows=14382 width=62) (actual time=6.912..17744.858 rows=2246 loops=1)"
" Hash Cond: (origin.id = event.magnitude.origin_id)"
" -> Merge Left Join (cost=0.00..641544.72 rows=6133105 width=62) (actual time=0.036..16221.008 rows=6133105 loops=1)"
" Merge Cond: (event.id = origin.eventid)"
" -> Index Scan using event_key_index on event (cost=0.00..163046.53 rows=3272228 width=12) (actual time=0.017..1243.616 rows=3276192 loops=1)"
" -> Index Scan using origin_fk_index on origin (cost=0.00..393653.81 rows=6133105 width=54) (actual time=0.013..3033.657 rows=6133105 loops=1)"
" -> Hash (cost=34462.73..34462.73 rows=14382 width=4) (actual time=6.668..6.668 rows=3198 loops=1)"
" Buckets: 2048 Batches: 1 Memory Usage: 113kB"
" -> Bitmap Heap Scan on magnitude (cost=324.65..34462.73 rows=14382 width=4) (actual time=1.682..5.414 rows=3198 loops=1)"
" Recheck Cond: (magnitude >= 7.2)"
" -> Bitmap Index Scan on mag_index (cost=0.00..321.05 rows=14382 width=0) (actual time=1.331..1.331 rows=3198 loops=1)"
" Index Cond: (magnitude >= 7.2)"
" -> Index Scan using mag_fkey_index on magnitude (cost=0.00..1.06 rows=3 width=22) (actual time=0.007..0.009 rows=2 loops=2246)"
" Index Cond: (origin.id = event.magnitude.origin_id)"
"Total runtime: 17799.669 ms"
此查询在Oracle中运行1秒钟,而在postgres中需要16秒,差异告诉我,我在某处做错了。这是在具有12G RAM的本地Mac机器上的新安装。
我有:
effective_cache_size=4096MB
shared_buffer=2048MB
work_mem=100MB
答案 0 :(得分:1)
从DISTINCT
子句中删除SELECT
。这是多余的,因为无论如何你都选择PRIMARY KEY
列。
将LEFT JOIN
更改为INNER JOIN
。正如其他人所提到的,您的EXISTS
子句会消除NULL
生成的所有LEFT JOIN
输出。
创建两个索引:
magnitude (magnitude, origin_id)
magnitude (origin_id, magnitude)
(根据您的magnitude >= 7.2
条件的选择性,每种方法可能更有效率)
SELECT EVENT.ID,
ORIGIN.ID AS RIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN,
EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE,
ORIGIN.DEPTH, ORIGIN.EVTYPE, ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR,
ORIGIN.CONTRIBUTOR OCONTRIBUTOR,
MAGNITUDE.ID AS MAGID, MAGNITUDE.MAGNITUDE, MAGNITUDE.TYPE AS MAGTYPE
FROM event.event
JOIN event.origin
ON origin.eventid = event.id
JOIN event.magnitude
ON magnitude.origin_id = origin.id
WHERE origin.id IN
(
SELECT origin_id
FROM magnitude
WHERE magnitude >= 7.2
)
ORDER BY
ORIGIN.TIME desc, MAGNITUDE.MAGNITUDE desc,
EVENT.ID,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID
要检查的示例查询:
WITH event (id) AS
(
VALUES
(1)
),
origin (id, eventid) AS
(
VALUES
(1, 1),
(2, 1)
),
magnitude (id, origin_id, magnitude) AS
(
VALUES
(1, 1, 4),
(2, 1, 8),
(3, 3, 6)
)
SELECT *
FROM event
LEFT JOIN
origin
ON origin.eventid = event.id
LEFT JOIN
magnitude
ON magnitude.origin_id = origin.id
WHERE EXISTS
(
SELECT origin_id
FROM magnitude
WHERE magnitude.magnitude >= 7.2
AND origin.id = origin_id
)
4
会返回8
和origin 1
这两个等级。
答案 1 :(得分:0)
至少在您发布此问题的其他论坛中已经回答过,将“在event.id = origin.eventid上的左连接event.origin”部分更改为内部联接,因为您的EXISTS查询无论如何,进一步向下限制它。
答案 2 :(得分:0)
有一些有用的链接可供学习如何在PostgreSQL wiki解释您自己的执行计划。
答案 3 :(得分:0)
根据计划:
排序(成本= 740549.86..740587.45行= 15039宽度= 80)(实际时间= 17791.55 ......
排序运营商是主要的瓶颈。尝试删除order by
声明?