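I have a database that contains audit tables. These audit tables hold copies of records from the main tables, written by triggers, and I need to build a kind of point-in-time report from these audit records.
Every table has an updated_at column, which is the timestamp at which the record was inserted. Combined with the id column or something similar, it lets us pinpoint a specific record (I mention this because the id column cannot be unique in these tables).
What I need to do is pull records using some column filters and then keep only the "most recent" version of each of those records (so, typically a MAX(updated_at) type of result set). Then I need to join 3 other audit tables and fetch the related records from them, again only the "most recent" ones (the last table, audit_products, I use to get a COUNT() of the products associated with the related record)... but only up to the updated_at returned by the initial part of the query. That is my problem.
Here is the query I would like to run (it does not work)... in theory: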
SELECT gr.id, gr.title_e, gr.title_f, wp.updated_at, wp.to_epmd_orig, wp.to_epmd_revise, wp.to_epmd_actual, lc.code_e, lc.code_f, pr.total
FROM (
    SELECT fw.id, fw.group_id, fw.updated_at, fw.to_epmd_orig, fw.to_epmd_revise, fw.to_epmd_actual, fw.to_epmd_late_code
    FROM audit_pc_pub_workplans AS fw
    RIGHT JOIN (
        SELECT id, MAX(updated_at) AS updated_at
        FROM audit_pc_pub_workplans
        WHERE completed = 1 AND DATE(to_epmd_actual) BETWEEN '2013-04-01' AND '2014-03-31'
        GROUP BY id
    ) AS aw ON (aw.id = fw.id AND aw.updated_at = fw.updated_at)
) as wp
LEFT JOIN (
    SELECT fg.id, fg.updated_at, fg.title_e, fg.title_f
    FROM audit_groups AS fg
    RIGHT JOIN (
        SELECT id, MAX(updated_at) AS updated_at
        FROM audit_groups
        WHERE updated_at <= wp.updated_at AND del_date IS NULL
        GROUP BY id
    ) AS ag ON (ag.id = fg.id AND ag.updated_at = fg.updated_at)
) AS gr ON (gr.id = wp.group_id)
LEFT JOIN (
    SELECT fc.id, fc.code_e, fc.code_f
    FROM audit_wp_late_codes AS fc
    RIGHT JOIN (
        SELECT id, MAX(updated_at) AS updated_at
        FROM audit_wp_late_codes
        WHERE updated_at <= wp.updated_at AND delete_date IS NULL
        GROUP BY id
    ) AS ac ON (ac.id = fc.id AND ac.updated_at = fc.updated_at)
) AS lc ON (lc.id = wp.to_epmd_late_code)
LEFT JOIN (
    SELECT fp.prod_group, COUNT(DISTINCT fp.id) AS total
    FROM audit_products AS fp
    RIGHT JOIN (
        SELECT id, MAX(updated_at) AS updated_at
        FROM audit_products
        WHERE updated_at <= wp.updated_at AND del_date_2 IS NULL
        GROUP BY id
    ) AS ap ON (ap.id = fp.id AND ap.updated_at = fp.updated_at)
    GROUP BY fp.prod_group
) AS pr ON (pr.prod_group = gr.id)
ORDER BY gr.title_e ASC
;
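For completeness I have left all the column names as they are, but as an outsider you can ignore some of them. The key ones are the id-type columns, used mainly in the ON () clauses, and the updated_at column in every table.
The problem is specifically my references to wp.updated_at inside all of the LEFT JOIN subqueries. I read around a bit and found that I can use the LATERAL keyword when joining, which should let me expose the wp table to those subqueries, and it does in fact let me run the query... but when I ran it (the audit_products table has about 8M records, by the way), I gave up after waiting 2 hours.
Is there a way to rewrite this query so that it runs in a reasonable amount of time? I could even live with a 15-30 minute run... just not several hours.
Any help would be greatly appreciated!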
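Answer 0 (score: 1)
First, I don't think you need outer joins in your query. Second, consider using the OLAP function ROW_NUMBER() to retrieve the latest record; it is often faster than aggregation. Something like: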
SELECT gr.id, gr.title_e, ...
FROM (
    -- rn = 1 marks the newest version of each workplan id
    SELECT fw.id, ... fw.to_epmd_late_code,
           ROW_NUMBER() OVER (PARTITION BY fw.id ORDER BY fw.updated_at DESC) AS rn
    FROM audit_pc_pub_workplans AS fw
    WHERE fw.completed = 1 AND DATE(fw.to_epmd_actual) BETWEEN '2013-04-01' AND '2014-03-31'
) AS wp
INNER JOIN LATERAL (
    -- rn = 1 marks the newest group version not newer than the workplan row
    SELECT fg.id, fg.updated_at, fg.title_e, fg.title_f,
           ROW_NUMBER() OVER (PARTITION BY fg.id ORDER BY fg.updated_at DESC) AS rn
    FROM audit_groups AS fg
    WHERE fg.updated_at <= wp.updated_at AND wp.rn = 1 AND fg.del_date IS NULL
) AS gr ON (gr.id = wp.group_id AND gr.rn = 1)
...
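To make the pattern above concrete, here is a minimal, self-contained sketch. It assumes PostgreSQL (LATERAL suggests it, though the question never names the database). The demo_workplans and demo_groups tables and their rows are hypothetical stand-ins for audit_pc_pub_workplans and audit_groups, trimmed to the columns that matter. It picks the latest workplan version per id, then the latest group version as of that workplan's updated_at. Unlike the snippet above, it pushes the id match inside the lateral subquery and uses ORDER BY ... LIMIT 1 there instead of a second ROW_NUMBER(); that is a swapped-in variation, not part of the original suggestion.

```sql
-- Hypothetical, cut-down stand-ins for the audit tables (not the real schema).
CREATE TEMP TABLE demo_workplans (
    id         integer,
    group_id   integer,
    updated_at timestamp,
    completed  integer
);
CREATE TEMP TABLE demo_groups (
    id         integer,
    updated_at timestamp,
    title_e    text,
    del_date   timestamp
);

INSERT INTO demo_workplans VALUES
    (1, 10, '2013-05-01 09:00', 1),
    (1, 10, '2013-06-01 09:00', 1);   -- newer version of workplan 1

INSERT INTO demo_groups VALUES
    (10, '2013-04-15 09:00', 'Old title',    NULL),
    (10, '2013-05-20 09:00', 'Mid title',    NULL),   -- latest version as of 2013-06-01
    (10, '2013-07-01 09:00', 'Future title', NULL);   -- newer than the workplan, must be ignored

SELECT wp.id, wp.updated_at, gr.title_e
FROM (
    -- Number the versions of each workplan, newest first.
    SELECT w.*,
           ROW_NUMBER() OVER (PARTITION BY w.id ORDER BY w.updated_at DESC) AS rn
    FROM demo_workplans AS w
    WHERE w.completed = 1
) AS wp
INNER JOIN LATERAL (
    -- For this workplan row, take the newest group version that is not
    -- newer than the workplan row itself.
    SELECT g.id, g.title_e
    FROM demo_groups AS g
    WHERE g.id = wp.group_id
      AND g.updated_at <= wp.updated_at
      AND g.del_date IS NULL
    ORDER BY g.updated_at DESC
    LIMIT 1
) AS gr ON true
WHERE wp.rn = 1;   -- keep only the newest version of each workplan
```

With the sample rows this returns a single row for workplan 1, carrying the 2013-06-01 version and the group title 'Mid title'. If each audit table has an index on (id, updated_at), the lateral lookup can be a short index scan per workplan row, which may matter for the 8M-row audit_products table. That index is an assumption, since the question does not show which indexes exist.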