Joining two queries returns far more rows than expected?

Time: 2019-03-27 21:22:49

Tags: sql postgresql join window-functions

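I have two queries. Each returns about 60 rows, but when I join them the result has 900 rows. Is there a way to join them so that the result still has 60 rows?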

Query 1:

SELECT 
    f.id_user,
    f.topup_date,
    f.topup_value,
    LEAD(f.topup_date) OVER (PARTITION BY(f.id_user) ORDER BY f.topup_date DESC),
    f.topup_date::timestamp - LEAD(f.topup_date::timestamp) OVER (PARTITION BY(f.id_user) ORDER BY f.topup_date DESC),
    CASE WHEN f.topup_value >= 20 THEN 'Y' ELSE 'N' end,
    CASE WHEN f.topup_value >= 20 THEN LEAD(f.topup_date) OVER (PARTITION BY (f.id_user) ORDER BY f.topup_date DESC) END
FROM topups AS f

Query 2:

SELECT 
    CAST(t2.topup_value as float)/CAST(t1.topup_value as float) 
FROM (
    SELECT 
        t1.id_user,
        t1.topup_value,
        ROW_NUMBER() OVER (PARTITION BY t1.id_user ORDER BY t1.topup_date ) AS rowrank
    FROM topups t1 
) AS t1 
INNER JOIN topups t2 ON t1.id_user=t2.id_user
WHERE t1.rowrank = 1
GROUP BY
    t2.id_user,
    t2.topup_value,
    t2.topup_date,
    t1.topup_value,
    t1.rowrank
ORDER BY 
    t2.id_user,
    t2.topup_date DESC

Joined query:

SELECT 
    f.id_user,
    f.topup_date,
    f.topup_value,
    LEAD(f.topup_date) OVER (PARTITION BY(f.id_user) ORDER BY f.topup_date DESC),
    f.topup_date::timestamp - LEAD(f.topup_date::timestamp) OVER (PARTITION BY(f.id_user) ORDER BY f.topup_date DESC),
    CASE WHEN f.topup_value >= 20 then 'Y' ELSE 'N' END,
    CASE WHEN f.topup_value >= 20 THEN LEAD(f.topup_date) OVER (PARTITION BY (f.id_user) ORDER BY f.topup_date desc) END,
    CAST(t2.topup_value AS float)/CAST(t1.topup_value AS float) 
FROM (
    SELECT 
        t1.id_user,
        t1.topup_value,
        ROW_NUMBER() OVER (PARTITION BY t1.id_user ORDER BY t1.topup_date ) AS rowrank
    FROM topups t1
) AS t1 
INNER JOIN topups t2 ON t1.id_user = t2.id_user 
INNER JOIN topups f  ON f.id_user = t2.id_user
WHERE t1.rowrank = 1
GROUP BY 
    f.id_user,
    f.topup_date,
    f.topup_value,
    t2.topup_value,
    t1.topup_value,
    t2.id_user,
    t2.topup_date
ORDER BY 
    t2.id_user,
    t2.topup_date DESC, 
    f.id_user,
    f.topup_date DESC

1 answer:

Answer 0 (score: 0)

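You want to combine two query results. For each row in one result, you want to find one row in the other result. So look at the first row of the first query result. You apparently want to join it with exactly one row of the second query result. Which row is that? Which columns do you compare to find this matching row?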

Suppose these are your query results:

col1 | col4 | col7 | col6 | col3
-----+------+------+------+-----
A    | B    |  100 |  110 | E
A    | B    |   19 |   22 | E
F    | G    |   80 |   78 | H
F    | I    |   22 |   12 | J

col4 | col2 | col1 | col3 | col8
-----+------+------+------+-----
B    |  333 | A    | E    |   89
B    |  211 | A    | E    |   84
G    |  815 | F    | H    |   77
I    |  639 | F    | J    |   79

You want a result like this:

col1 | col4 | col7 | col6 | col3 | col4 | col2 | col1 | col3 | col8
-----+------+------+------+------+------+------+------+------+-----
A    | B    |  100 |  110 | E    | B    |  333 | A    | E    |   89
A    | B    |   19 |   22 | E    | B    |  211 | A    | E    |   84
F    | G    |   80 |   78 | H    | G    |  815 | F    | H    |   77
F    | I    |   22 |   12 | J    | I    |  639 | F    | J    |   79

But what you get is something like this:

col1 | col4 | col7 | col6 | col3 | col4 | col2 | col1 | col3 | col8
-----+------+------+------+------+------+------+------+------+-----
A    | B    |  100 |  110 | E    | B    |  333 | A    | E    |   89
A    | B    |  100 |  110 | E    | B    |  211 | A    | E    |   84
A    | B    |   19 |   22 | E    | B    |  333 | A    | E    |   89
A    | B    |   19 |   22 | E    | B    |  211 | A    | E    |   84
F    | G    |   80 |   78 | H    | G    |  815 | F    | H    |   77
F    | G    |   80 |   78 | H    | I    |  639 | F    | J    |   79
F    | I    |   22 |   12 | J    | G    |  815 | F    | H    |   77
F    | I    |   22 |   12 | J    | I    |  639 | F    | J    |   79

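You get this because you picked a single column to join the two query results on (id_user in your case, col1 in mine). Look at the first row of the first query result above. It has col1 = 'A'. If I join the second query result on col1, there are two matching rows, because the second query result contains two rows with col1 = 'A'. I end up with more matches than I want.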
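The multiplication is easy to reproduce. The following is a minimal sketch for PostgreSQL that loads the two sample results above as inline CTEs (the names q1, q2 and joined_rows are only illustrative) and counts the joined rows per col1 value:

```sql
-- Sample data from the two result sets above, as inline CTEs.
WITH q1 (col1, col4, col7, col6, col3) AS (
    VALUES ('A', 'B', 100, 110, 'E'),
           ('A', 'B',  19,  22, 'E'),
           ('F', 'G',  80,  78, 'H'),
           ('F', 'I',  22,  12, 'J')
),
q2 (col4, col2, col1, col3, col8) AS (
    VALUES ('B', 333, 'A', 'E', 89),
           ('B', 211, 'A', 'E', 84),
           ('G', 815, 'F', 'H', 77),
           ('I', 639, 'F', 'J', 79)
)
-- Join on col1 only: every q1 row pairs with every q2 row sharing its col1.
SELECT q1.col1, count(*) AS joined_rows
FROM q1
JOIN q2 ON q2.col1 = q1.col1
GROUP BY q1.col1
ORDER BY q1.col1;
-- A: 2 x 2 = 4 joined rows, F: 2 x 2 = 4 joined rows, 8 in total instead of 4
```

The same arithmetic explains the question's numbers: if a user has n topups, joining on id_user alone produces n x n rows for that user.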

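So which columns do we want to match on? In my example these are col1, col3 and col4. Look at the third row of the first query result. It has col1 = 'F' and col3 = 'H' and col4 = 'G'. Only one row in the second result set matches col1 = 'F' and col3 = 'H' and col4 = 'G'. (The chosen columns must identify a row uniquely for this to work: the two 'A' rows in my sample share col1, col3 and col4, so they would still match each other twice and would need one more distinguishing column in the join condition.) So my query would be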

select *
from (<query 1 here>) q1
join (<query 2 here>) q2 on q2.col1 = q1.col1 and q2.col3 = q1.col3 and q2.col4 = q1.col4;
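A join condition gives exactly one partner per row only if the chosen columns are unique in the joined result. One way to check this before joining is to look for duplicate keys. This is a sketch against the second sample result, written for PostgreSQL; rows_per_key is an illustrative alias:

```sql
-- Second sample result as an inline CTE.
WITH q2 (col4, col2, col1, col3, col8) AS (
    VALUES ('B', 333, 'A', 'E', 89),
           ('B', 211, 'A', 'E', 84),
           ('G', 815, 'F', 'H', 77),
           ('I', 639, 'F', 'J', 79)
)
-- List every candidate key value that occurs more than once.
SELECT col1, col3, col4, count(*) AS rows_per_key
FROM q2
GROUP BY col1, col3, col4
HAVING count(*) > 1;
-- returns one row: A | E | B | 2
```

Any row this check returns marks a key that will still multiply rows in the join, so the join condition needs a further column for it. An empty result means the key is safe to join on.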

I would rather name the columns I want in the result explicitly and drop the duplicated ones:

select q1.col1, q2.col4, q1.col7, q1.col6, q1.col3, q2.col2, q2.col8
from (<query 1 here>) q1
join (<query 2 here>) q2 on q2.col1 = q1.col1 and q2.col3 = q1.col3 and q2.col4 = q1.col4
order by q1.col1, q2.col4, q1.col7;
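Applied to the question's topups table, the same pattern might look like the sketch below. It rests on assumptions the question does not confirm: that (id_user, topup_date) is unique in topups, and that the second query may be extended to return id_user and topup_date so there is something to join on. The aliases previous_topup_date and ratio_to_first and the sample rows in the CTE are made up for illustration; drop the CTE to run it against the real table.

```sql
-- Hypothetical sample rows standing in for the real topups table.
WITH topups (id_user, topup_date, topup_value) AS (
    VALUES (1, DATE '2019-01-01', 10),
           (1, DATE '2019-01-15', 25),
           (2, DATE '2019-02-01', 40)
)
SELECT q1.id_user,
       q1.topup_date,
       q1.topup_value,
       q1.previous_topup_date,
       q2.ratio_to_first
FROM (
    -- Query 1 (shortened): one row per topup.
    SELECT f.id_user,
           f.topup_date,
           f.topup_value,
           LEAD(f.topup_date) OVER (PARTITION BY f.id_user
                                    ORDER BY f.topup_date DESC) AS previous_topup_date
    FROM topups AS f
) AS q1
JOIN (
    -- Query 2, extended to expose the key columns id_user and topup_date.
    SELECT t2.id_user,
           t2.topup_date,
           CAST(t2.topup_value AS float) / CAST(t1.topup_value AS float) AS ratio_to_first
    FROM (
        SELECT id_user,
               topup_value,
               ROW_NUMBER() OVER (PARTITION BY id_user ORDER BY topup_date) AS rowrank
        FROM topups
    ) AS t1
    JOIN topups AS t2 ON t2.id_user = t1.id_user
    WHERE t1.rowrank = 1
) AS q2
  ON  q2.id_user    = q1.id_user
  AND q2.topup_date = q1.topup_date   -- second key column keeps the join 1:1
ORDER BY q1.id_user, q1.topup_date DESC;
```

With the three sample rows this returns three rows, one per topup. If two topups of the same user can share a topup_date, the pair is not a key and a further distinguishing column would be needed in the join condition.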