How to optimize a JOIN in PostgreSQL

Time: 2016-12-12 20:14:41

Tags: sql postgresql

I have four tables from which I pull information:

user: first_name
mongouser: email, card_status
transaction: transaction_type, balance, posted_at, is_atm, is_purchase
user_login: user_id, login_date, login_id ...

Everything worked well until I added the fourth table, user_login. The fourth JOIN makes the whole query slow. The query I wrote is shown below:

SELECT * FROM 
(SELECT
ssluserid,
first_name,
m.email,
zipcode,
date_part('year',age(birthday)) AS birthday,
(current_date - DATE(created_date)) AS duration,
CASE WHEN card_status = 'ACTIVE' THEN 1 ELSE 0 END AS IS_ACTIVE,
SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_AMT,
SUM(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_AMT,
COUNT(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_CT,
COUNT(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_CT,
MIN(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_LOAD,
MAX(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_LOAD,
MIN(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_SPEND,
MAX(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_SPEND,
  SUM(CASE WHEN transaction_type = 'Debit' AND is_atm = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days'
                                    THEN balance END) AS ATM_AMT,
  SUM(CASE WHEN transaction_type = 'Debit' AND is_purchase = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days'
                                    THEN balance END) AS POS_AMT,
  SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days' 
                                    THEN balance END) AS LOAD_VOL,
  COUNT(CASE WHEN DATE(login_date) >= CURRENT_DATE - INTERVAL '90 days' THEN 
login_id END) AS CT_LOGIN
FROM
mongouser m
LEFT OUTER JOIN
user u
ON m.userid = u.id
LEFT OUTER JOIN transactions t
ON u.id = t.user_id
LEFT OUTER JOIN user_login l
ON m.userid = l.user_id
GROUP BY 1,2,3,4,5,6,7) t
WHERE LAST_LOAD >= CURRENT_DATE - INTERVAL '90 days'
ORDER BY 9 DESC;

This query has been running for almost 40 minutes... Is there any way to optimize it?

1 answer:

Answer 0: (score: 1)

Look closely at your own description and you already know where the problem is. Before, you had this:

LEFT OUTER JOIN user u
ON m.userid = u.id

and you said things were not slow. Then you added this:

LEFT OUTER JOIN user_login l
ON m.userid = l.user_id

and you said things became slow. You probably have an index on m.userid. Do you have an index on l.user_id? If not, create one:

CREATE INDEX foo ON user_login ( user_id );
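
Before and after creating the index, it may help to confirm what PostgreSQL actually has and how it plans the new join. The following is only a rough sketch under some assumptions: the index name user_login_user_id_idx is made up for illustration, IF NOT EXISTS on CREATE INDEX needs PostgreSQL 9.5 or later, and the EXPLAIN query is a cut-down version of the join from the question rather than the full report.

```sql
-- List the indexes that already exist on user_login
-- (pg_indexes is a standard PostgreSQL system view).
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'user_login';

-- Create the index only if it is missing.
-- The index name is hypothetical; IF NOT EXISTS assumes PostgreSQL 9.5+.
CREATE INDEX IF NOT EXISTS user_login_user_id_idx
    ON user_login (user_id);

-- Refresh planner statistics so the new index is costed sensibly.
ANALYZE user_login;

-- Show the plan for just the join that was added last.
-- Plain EXPLAIN only plans the query; it does not execute it.
EXPLAIN
SELECT m.userid, COUNT(l.login_id) AS ct_login
FROM mongouser m
LEFT OUTER JOIN user_login l
    ON m.userid = l.user_id
GROUP BY m.userid;
```

Whether the planner ends up using the index depends on the table sizes and statistics, so comparing the plan output before and after creating it is the most direct way to see if it made a difference.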