Left outer join

Asked: 2018-06-14 11:47:23

Tags: sql postgresql date join

I have a question about PostgreSQL 9.6.5.

I want to select the users in one table who are not in another table. My query is logically sound; it "works" in some cases but not in others.

Let me explain. I wrote this query:

          SELECT count(*)
            FROM api.user_accounts AS aua
 LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
      INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
           WHERE tua.id IS NULL 
             AND pua.creation_date > '2018-03-01' AND pua.creation_date < '2018-04-01'

This query never finishes (an infinite loop, or just a very long-running query?), so I ran EXPLAIN SELECT ... and got this:

Aggregate  (cost=7743.42..7743.43 rows=1 width=8)
  ->  Nested Loop  (cost=394.68..7743.42 rows=1 width=0)
        Join Filter: (aua.user_account_id = pua.id)
        ->  Hash Left Join  (cost=394.68..2175.00 rows=1 width=4)
              Hash Cond: (aua.id = tua.api_user_account_id)
              Filter: (tua.id IS NULL)
              ->  Seq Scan on user_accounts aua  (cost=0.00..1396.55 rows=72155 width=8)
              ->  Hash  (cost=253.19..253.19 rows=11319 width=8)
                    ->  Seq Scan on user_accounts tua  (cost=0.00..253.19 rows=11319 width=8)
        ->  Seq Scan on user_accounts pua  (cost=0.00..5460.89 rows=8603 width=4)
              Filter: ((creation_date > '2018-03-01 00:00:00+01'::timestamp with time zone) AND (creation_date < '2018-04-01 00:00:00+02'::timestamp with time zone))
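
One thing worth checking in this plan is the rows=1 estimate on the Hash Left Join. If the real row count is far larger, the Nested Loop above it would rescan public.user_accounts once per outer row, which might explain the runtime. This is a hypothesis, not a confirmed diagnosis. A minimal sketch for comparing the estimate with the actual count, assuming the left join on its own finishes quickly (it uses only the tables and columns from the query above):

```sql
-- Run only the left join + IS NULL filter and compare the planner's
-- estimated "rows=" with the "actual ... rows=" figure in the output.
EXPLAIN (ANALYZE)
SELECT aua.id
  FROM api.user_accounts AS aua
  LEFT OUTER JOIN teams.user_accounts AS tua
    ON aua.id = tua.api_user_account_id
 WHERE tua.id IS NULL;
```

If the actual row count is in the tens of thousands while the estimate stays at 1, the misestimate is the likely reason the planner considers a Nested Loop cheap here.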

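Now, brace yourself: I reran the query with a different date. Instead of restricting it to dates before '2018-04-01', I restricted it to dates before '2018-04-02'. That works!!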

Query, second attempt:

          SELECT count(*)
            FROM api.user_accounts AS aua
 LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
      INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
           WHERE tua.id IS NULL 
             AND pua.creation_date > '2018-03-01' AND pua.creation_date < '2018-04-02'
    ----------------------------------------------change here-----------------^

EXPLAIN, second attempt:

Aggregate  (cost=7669.27..7669.28 rows=1 width=8)
  ->  Hash Join  (cost=2175.01..7669.26 rows=1 width=0)
        Hash Cond: (pua.id = aua.user_account_id)
        ->  Seq Scan on user_accounts pua  (cost=0.00..5460.89 rows=8895 width=4)
              Filter: ((creation_date > '2018-03-01 00:00:00+01'::timestamp with time zone) AND (creation_date < '2018-04-02 00:00:00+02'::timestamp with time zone))
        ->  Hash  (cost=2175.00..2175.00 rows=1 width=4)
              ->  Hash Left Join  (cost=394.68..2175.00 rows=1 width=4)
                    Hash Cond: (aua.id = tua.api_user_account_id)
                    Filter: (tua.id IS NULL)
                    ->  Seq Scan on user_accounts aua  (cost=0.00..1396.55 rows=72155 width=8)
                    ->  Hash  (cost=253.19..253.19 rows=11319 width=8)
                          ->  Seq Scan on user_accounts tua  (cost=0.00..253.19 rows=11319 width=8)
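
The two plans have almost the same estimated cost (7743.42 vs 7669.27), so a small change in the date filter's row estimate (8603 vs 8895) may be enough to tip the planner from a Nested Loop to a Hash Join. To test whether the Nested Loop is the culprit, one could discourage it for a single transaction and rerun the first query. This is a diagnostic sketch, not a recommended permanent setting:

```sql
BEGIN;

-- Discourage nested-loop joins for this transaction only.
SET LOCAL enable_nestloop = off;

EXPLAIN (ANALYZE)
SELECT count(*)
  FROM api.user_accounts AS aua
  LEFT OUTER JOIN teams.user_accounts AS tua
    ON aua.id = tua.api_user_account_id
 INNER JOIN public.user_accounts AS pua
    ON aua.user_account_id = pua.id
 WHERE tua.id IS NULL
   AND pua.creation_date > '2018-03-01'
   AND pua.creation_date < '2018-04-01';

-- Nothing was modified; ROLLBACK simply ends the transaction
-- and discards the SET LOCAL.
ROLLBACK;
```

If the query finishes quickly with a Hash Join under this setting, that would support the idea that the plan choice, rather than the data, causes the slowdown.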

Edit:

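  • When I change the WHERE clause to tua.id IS NOT NULL, the query runs, but it returns the opposite of what I want.
  • I tried it on other machines, and all the queries work there. The problem seems to be local. (I have checked whether the versions differ.)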
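
Since the same queries behave differently on other machines, differences in planner statistics or planner settings are a plausible but unconfirmed cause. A sketch for refreshing the statistics on the three tables and listing some settings that could be compared across machines (the choice of settings shown is only a starting point):

```sql
-- Refresh planner statistics for the tables involved.
ANALYZE api.user_accounts;
ANALYZE teams.user_accounts;
ANALYZE public.user_accounts;

-- Exact server version, to compare with the other machines.
SELECT version();

-- Planner-related settings that commonly differ between installations.
SELECT name, setting
  FROM pg_settings
 WHERE name IN ('work_mem',
                'random_page_cost',
                'seq_page_cost',
                'effective_cache_size',
                'default_statistics_target',
                'enable_nestloop',
                'enable_hashjoin');
```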

Edit 2: I found a workaround, but it is not pretty...

SELECT COUNT(*) FROM
    (SELECT aua.* FROM api.user_accounts AS aua
    INNER JOIN public.user_accounts AS pua ON aua.user_account_id = pua.id
    WHERE pua.creation_date > '2018-03-01' AND pua.creation_date < '2018-04-01'
    LIMIT (SELECT COUNT(*) FROM api.user_accounts)
    ) AS aua
LEFT OUTER JOIN teams.user_accounts AS tua ON aua.id = tua.api_user_account_id
WHERE tua.id IS NULL
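
The LIMIT in the subquery probably helps because PostgreSQL does not flatten a subquery that contains LIMIT, so the inner join is planned on its own before the left join. A possibly cleaner alternative, offered as an untested sketch against this schema, is to write the "not in the other table" condition with NOT EXISTS, which the planner can treat as an anti-join directly. It assumes teams.user_accounts.id is never NULL (for example, a primary key), in which case it should return the same count as the LEFT JOIN ... IS NULL form:

```sql
SELECT count(*)
  FROM api.user_accounts AS aua
 INNER JOIN public.user_accounts AS pua
    ON aua.user_account_id = pua.id
 WHERE pua.creation_date > '2018-03-01'
   AND pua.creation_date < '2018-04-01'
   -- Keep only API accounts with no matching row in teams.user_accounts.
   AND NOT EXISTS (
         SELECT 1
           FROM teams.user_accounts AS tua
          WHERE tua.api_user_account_id = aua.id
       );
```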

So, why are PostgreSQL's plans for the two queries so different? Why does the second one work but not the first? I don't understand what is happening.

0 answers:

No answers yet