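I have a problem with PostgreSQL 9.6.5.
I want to select the users from one table that are not present in another table. My query is logically correct and "works" in some cases, but not in others.
Let me explain. I wrote this query: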
SELECT count(*)
FROM api.user_accounts AS aua
LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
WHERE tua.id IS NULL
AND pua.creation_date > '2018-03-01' AND pua.creation_date < '2018-04-01'
The query above never finishes (an infinite loop, or just a very long-running query?), so I ran EXPLAIN SELECT... on it and got this:
Aggregate  (cost=7743.42..7743.43 rows=1 width=8)
  ->  Nested Loop  (cost=394.68..7743.42 rows=1 width=0)
        Join Filter: (aua.user_account_id = pua.id)
        ->  Hash Left Join  (cost=394.68..2175.00 rows=1 width=4)
              Hash Cond: (aua.id = tua.api_user_account_id)
              Filter: (tua.id IS NULL)
              ->  Seq Scan on user_accounts aua  (cost=0.00..1396.55 rows=72155 width=8)
              ->  Hash  (cost=253.19..253.19 rows=11319 width=8)
                    ->  Seq Scan on user_accounts tua  (cost=0.00..253.19 rows=11319 width=8)
        ->  Seq Scan on user_accounts pua  (cost=0.00..5460.89 rows=8603 width=4)
              Filter: ((creation_date > '2018-03-01 00:00:00+01'::timestamp with time zone) AND (creation_date < '2018-04-01 00:00:00+02'::timestamp with time zone))
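Both the Hash Left Join and the Nested Loop in this plan are estimated at rows=1, even though aua has 72155 rows and tua only 11319. One way to check that estimate is to run just the "not in teams.user_accounts" part under EXPLAIN (ANALYZE, BUFFERS), which actually executes the statement and prints the real row counts next to the estimates. This is only a sketch that reuses the tables from the query above; no output is shown because none was captured:

```sql
-- Runs the statement for real, so only the cheap part of the query is used here.
-- Compare "rows=" (estimate) with "actual ... rows=" on the Hash Left Join node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM api.user_accounts AS aua
LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
WHERE tua.id IS NULL;
```

If the actual row count on that node is far above 1, the nested loop on top of it is being chosen on a misleading estimate.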
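Now, get ready: I re-ran the original query with a different date. Instead of selecting all dates before '2018-04-01', I selected the dates before '2018-04-02'. And that works!!
Query, second attempt: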
SELECT count(*)
FROM api.user_accounts AS aua
LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
WHERE tua.id IS NULL
AND pua.creation_date > '2018-03-01' AND pua.creation_date < '2018-04-02'
----------------------------------------------change here-----------------^
EXPLAIN, second attempt:
Aggregate  (cost=7669.27..7669.28 rows=1 width=8)
  ->  Hash Join  (cost=2175.01..7669.26 rows=1 width=0)
        Hash Cond: (pua.id = aua.user_account_id)
        ->  Seq Scan on user_accounts pua  (cost=0.00..5460.89 rows=8895 width=4)
              Filter: ((creation_date > '2018-03-01 00:00:00+01'::timestamp with time zone) AND (creation_date < '2018-04-02 00:00:00+02'::timestamp with time zone))
        ->  Hash  (cost=2175.00..2175.00 rows=1 width=4)
              ->  Hash Left Join  (cost=394.68..2175.00 rows=1 width=4)
                    Hash Cond: (aua.id = tua.api_user_account_id)
                    Filter: (tua.id IS NULL)
                    ->  Seq Scan on user_accounts aua  (cost=0.00..1396.55 rows=72155 width=8)
                    ->  Hash  (cost=253.19..253.19 rows=11319 width=8)
                          ->  Seq Scan on user_accounts tua  (cost=0.00..253.19 rows=11319 width=8)
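The only structural difference between the two plans is the top join: Nested Loop in the first, Hash Join in the second, with nearly identical estimated costs (7743.42 vs 7669.26). One way to test whether the nested loop is what makes the first query slow is to discourage it for a single transaction and look at the plan again. This is a sketch, assuming it is acceptable to change planner settings in a test session; enable_nestloop is a standard planner setting, and turning it off only penalizes nested loops rather than forbidding them:

```sql
BEGIN;
-- SET LOCAL only lasts until the end of this transaction.
SET LOCAL enable_nestloop = off;

EXPLAIN
SELECT count(*)
FROM api.user_accounts AS aua
LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
WHERE tua.id IS NULL
  AND pua.creation_date > '2018-03-01' AND pua.creation_date < '2018-04-01';

ROLLBACK;
```

This is meant as a diagnostic, not as a setting to leave switched off in production.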
Edit: with tua.id IS NOT NULL the query runs, but it returns the opposite of the result I want.
Edit 2: I found a workaround, but it is not pretty...
SELECT COUNT(*) FROM
(SELECT aua.* FROM api.user_accounts AS aua
INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
WHERE pua.creation_date > '2018-03-01' AND pua.creation_date < '2018-04-01'
LIMIT (SELECT COUNT(*) FROM api.user_accounts)
) AS aua
LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
WHERE tua.id IS NULL
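The LIMIT in this workaround does not actually cut off any rows; its apparent effect is to stop the planner from merging the subquery into the outer query, so the date-filtered join is computed first. A commonly used and slightly cheaper way to get the same kind of fence is OFFSET 0, which avoids the extra COUNT(*) over api.user_accounts. This is a sketch of that variant, assuming only aua.id is needed from the subquery:

```sql
SELECT count(*)
FROM (
    SELECT aua.id
    FROM api.user_accounts AS aua
    INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
    WHERE pua.creation_date > '2018-03-01'
      AND pua.creation_date < '2018-04-01'
    OFFSET 0  -- skips no rows; present only to keep the subquery from being flattened
) AS aua
LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
WHERE tua.id IS NULL;
```

Like the LIMIT version, this works around the plan choice rather than explaining it.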
So, why are PostgreSQL's plans for the two queries so different? Why does the second one work and not the first? I don't understand what is going on.
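For comparison, the "users in api.user_accounts with no row in teams.user_accounts" condition from the first query can also be written with NOT EXISTS instead of LEFT JOIN ... IS NULL. This is a sketch, assuming teams.user_accounts.id is never NULL (for example because it is the primary key), in which case both forms return the same count:

```sql
SELECT count(*)
FROM api.user_accounts AS aua
INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
WHERE pua.creation_date > '2018-03-01'
  AND pua.creation_date < '2018-04-01'
  AND NOT EXISTS (
      -- true when no teams row references this api user account
      SELECT 1
      FROM teams.user_accounts AS tua
      WHERE tua.api_user_account_id = aua.id
  );
```

Comparing the EXPLAIN output of this form with the two plans above may show whether the row estimate for the "not in teams" step changes.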