MySQL - Join in a correlated subquery

Time: 2013-05-09 21:37:03

Tags: mysql sql

I am writing a query that assigns users and their respective domains to IP addresses. No IP address may end up with duplicate users.

Here is what I have so far in SQL Fiddle: http://sqlfiddle.com/#!2/39c51/2/0

I have a table containing all (several hundred thousand) current assignments. A smaller-scale example looks like this:

mysql> select * from test.usermap;
+-------------+-------+-------------------+
| vip         | user  | domain            |
+-------------+-------+-------------------+
| 100.50.20.1 | joe   | joesdomain.com    |
| 100.50.20.1 | bob   | joesdomain.com    |
| 100.50.20.2 | tom   | domain2.com       |
| 100.50.20.2 | fred  | domain2.com       |
| 100.50.20.2 | sally | domain2.com       |
| 100.50.20.3 | admin | athriddomain.com  |
| 100.50.20.4 | admin | numberfour.com    |
| 100.50.20.3 | sally | fivewithsally.com |
| 100.50.20.4 | jim   | thesix.com        |
| 100.50.20.1 | admin | seven.com         |
| 100.50.20.1 | sally | seven.com         |
| 100.50.20.1 | sue   | seven.com         |
| 100.50.20.5 |       |                   |
| 100.50.20.6 |       |                   |
+-------------+-------+-------------------+
14 rows in set (0.00 sec)

I have another table containing users that have not been assigned yet; again, a small-scale example:

mysql> select * from test.newusers;
+-------+-----------+
| user  | domain    |
+-------+-----------+
| jim   | eight.com |
| sally | eight.com |
| admin | nine.com  |
| james | ten.com   |
| jane  | ten.com   |
+-------+-----------+
5 rows in set (0.00 sec)
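
To reproduce the sample data locally, something like the following should work. It is only a sketch: the column types and lengths are assumptions (the real schema is not shown here), and the blank user/domain values on 100.50.20.5 and 100.50.20.6 are taken to be empty strings rather than NULL, since the mysql client would otherwise print NULL.

```sql
-- Sketch only: column types and lengths are assumptions, not the production schema.
CREATE DATABASE IF NOT EXISTS test;

CREATE TABLE test.usermap (
    vip    VARCHAR(15) NOT NULL,
    user   VARCHAR(32) NOT NULL DEFAULT '',
    domain VARCHAR(64) NOT NULL DEFAULT ''
);

CREATE TABLE test.newusers (
    user   VARCHAR(32) NOT NULL,
    domain VARCHAR(64) NOT NULL
);

-- Current assignments; the last two rows are free addresses (empty user/domain).
INSERT INTO test.usermap (vip, user, domain) VALUES
    ('100.50.20.1', 'joe',   'joesdomain.com'),
    ('100.50.20.1', 'bob',   'joesdomain.com'),
    ('100.50.20.2', 'tom',   'domain2.com'),
    ('100.50.20.2', 'fred',  'domain2.com'),
    ('100.50.20.2', 'sally', 'domain2.com'),
    ('100.50.20.3', 'admin', 'athriddomain.com'),
    ('100.50.20.4', 'admin', 'numberfour.com'),
    ('100.50.20.3', 'sally', 'fivewithsally.com'),
    ('100.50.20.4', 'jim',   'thesix.com'),
    ('100.50.20.1', 'admin', 'seven.com'),
    ('100.50.20.1', 'sally', 'seven.com'),
    ('100.50.20.1', 'sue',   'seven.com'),
    ('100.50.20.5', '',      ''),
    ('100.50.20.6', '',      '');

-- Users waiting for an address.
INSERT INTO test.newusers (user, domain) VALUES
    ('jim',   'eight.com'),
    ('sally', 'eight.com'),
    ('admin', 'nine.com'),
    ('james', 'ten.com'),
    ('jane',  'ten.com');
```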

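The idea here is to assign all of the eight.com users to .5, since that is the earliest IP that has neither 'jim' nor 'sally', then nine.com to .2 and ten.com to .1, based on their respective user conflicts (or lack thereof).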

The result I am looking for is:

+-------------+-------+-----------+
| vip         | user  | domain    |
+-------------+-------+-----------+
| 100.50.20.1 | james | ten.com   |
| 100.50.20.1 | jane  | ten.com   |
| 100.50.20.2 | admin | nine.com  |
| 100.50.20.5 | jim   | eight.com |
| 100.50.20.5 | sally | eight.com |
+-------------+-------+-----------+
5 rows in set (0.01 sec)

I can do this with a subquery nested inside a correlated subquery, as follows:

mysql> select  
(
    select vip 
    from test.usermap
    where vip not in
    (
        select distinct vip 
        from test.usermap  
        where user in
        (
            select user 
            from test.newusers 
            where domain = n.domain
        )
    )
    order by inet_aton(vip) asc
    limit 1
) as vip, n.user, n.domain 
from test.newusers n
order by inet_aton(vip) asc;
+-------------+-------+-----------+
| vip         | user  | domain    |
+-------------+-------+-----------+
| 100.50.20.1 | james | ten.com   |
| 100.50.20.1 | jane  | ten.com   |
| 100.50.20.2 | admin | nine.com  |
| 100.50.20.5 | jim   | eight.com |
| 100.50.20.5 | sally | eight.com |
+-------------+-------+-----------+
5 rows in set (0.00 sec)

But this is very inefficient. My production usermap and newusers tables have 300k and 50k rows respectively, so this approach is not feasible.

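I tried to make it more efficient by using a join instead of nested subqueries, so I replaced the inner queries with a join and referenced the outer query's column in the ON clause, but that does not seem to be possible: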

mysql> select 
(
    select distinct vip 
    from test.usermap u 
    join test.newusers r
        on r.domain = n.domain
        and r.user != u.user
    order by inet_aton(vip) asc limit 1
) as vip, n.user, n.domain
from test.newusers n;
ERROR 1054 (42S22): Unknown column 'n.domain' in 'on clause'
mysql> 

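The logic of the query itself seems sound, though, because it works fine when I replace the outer-query reference with the string constant it stands for: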

mysql> select
(
    select distinct vip 
    from test.usermap u 
    join test.newusers r
        on r.domain = 'ten.com'
        and r.user != u.user
    order by inet_aton(vip) asc limit 1
) as vip, n.user, n.domain
from test.newusers n
where domain = 'ten.com';
+-------------+-------+---------+
| vip         | user  | domain  |
+-------------+-------+---------+
| 100.50.20.1 | james | ten.com |
| 100.50.20.1 | jane  | ten.com |
+-------------+-------+---------+
2 rows in set (0.00 sec)

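My question is: is there a way to reference a column from the outer query inside the join of the inner query? If not, what alternatives (if any) exist that avoid nesting subqueries in such an inefficient way?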

Again, I have a fiddle here: http://sqlfiddle.com/#!2/39c51/2/0

1 Answer:

Answer 0 (score: 3)

I'm not sure how much more efficient (if at all) this will be, but the query can be rewritten without nesting multiple subqueries:

SELECT  INET_NTOA(MIN(INET_ATON(UserMap.VIP))) AS VIP,
        NewUsers.User, 
        NewUsers.Domain
FROM    NewUsers
        CROSS JOIN UserMap
        LEFT JOIN
        (   SELECT  u.Domain, m.VIP
            FROM    NewUsers u
                    INNER JOIN UserMap m
                        ON u.User = m.User
        ) ex
            ON ex.Domain = NewUsers.Domain
            AND ex.VIP = UserMap.VIP
WHERE   ex.Domain IS NULL
GROUP BY NewUsers.User, NewUsers.Domain
ORDER BY VIP ASC;   

Example on your SQL Fiddle
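
To see why the anti-join works, it may help to run the derived table ex on its own. It lists every (domain, VIP) pair where some incoming user of that domain is already mapped to that VIP; the LEFT JOIN with WHERE ex.Domain IS NULL then keeps only the pairs that are absent from this list. The DISTINCT and ORDER BY below are added purely for readability and are not part of the query above.

```sql
-- Conflicting (domain, VIP) pairs: a VIP is unusable for a domain
-- if any of that domain's new users already appears on it.
SELECT DISTINCT u.Domain, m.VIP
FROM   NewUsers u
       INNER JOIN UserMap m
           ON u.User = m.User
ORDER BY u.Domain, INET_ATON(m.VIP);

-- With the sample data from the question this yields:
--   eight.com   100.50.20.1   (sally)
--   eight.com   100.50.20.2   (sally)
--   eight.com   100.50.20.3   (sally)
--   eight.com   100.50.20.4   (jim)
--   nine.com    100.50.20.1   (admin)
--   nine.com    100.50.20.3   (admin)
--   nine.com    100.50.20.4   (admin)
-- ten.com has no conflicts, so it contributes no rows.
```

Every VIP not listed for a domain is a candidate, and MIN(INET_ATON(...)) picks the lowest one: 100.50.20.5 for eight.com, 100.50.20.2 for nine.com, and 100.50.20.1 for ten.com.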

Addendum

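The query above will not return rows for which no VIP is available. For example, if the free addresses 100.50.20.5 and 100.50.20.6 are removed from UserMap, your query will return: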

VIP             USER    DOMAIN
NULL            jim     eight.com
NULL            sally   eight.com
100.50.20.1     james   ten.com
100.50.20.1     jane    ten.com
100.50.20.2     admin   nine.com

The query I wrote will only return rows where VIP is not null:

VIP             USER    DOMAIN
100.50.20.1     james   ten.com
100.50.20.1     jane    ten.com
100.50.20.2     admin   nine.com

To work around this you can use a UNION:

SELECT  INET_NTOA(MIN(INET_ATON(a.VIP))) AS VIP,
        a.User, 
        a.Domain
FROM    (   SELECT  UserMap.VIP,
                    NewUsers.User, 
                    NewUsers.Domain
            FROM    NewUsers
                    CROSS JOIN UserMap
                    LEFT JOIN
                    (   SELECT  u.Domain, m.VIP
                        FROM    NewUsers u
                                INNER JOIN UserMap m
                                    ON u.User = m.User
                    ) ex
                        ON ex.Domain = NewUsers.Domain
                        AND ex.VIP = UserMap.VIP
            WHERE   ex.Domain IS NULL
            UNION ALL
            SELECT  NULL AS VIP,
                    NewUsers.User,
                    NewUsers.Domain
            FROM    NewUsers
        ) a
GROUP BY a.User, a.Domain
ORDER BY VIP ASC;

Revised Example on SQL Fiddle

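I'm not sure what your logic is for handling the case where no VIP is available, so I can't really suggest a solution for that part. But you can use this to get the next VIP: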

SELECT  INET_NTOA(MAX(INET_ATON(UserMap.VIP)) + 1) AS NextVIP
FROM    UserMap

Another issue with your question is conflicts within NewUsers, e.g. if your NewUsers table contained these records:

('jim','eight.com'),
('sally','eight.com'),
('jim','eleven.com'),
('sally','eleven.com');

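Both your query and mine would assign all of these to VIP 100.50.20.5. If this can happen, I think the best way to deal with it is to insert usernames from only one domain at a time. However, it can also be done using nothing but JOINs: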

To simplify the query I created 2 views:

CREATE VIEW UsedVIP
AS
    SELECT  u.Domain, m.VIP
    FROM    NewUsers u
            INNER JOIN UserMap m
                ON u.User = m.User;

CREATE VIEW NewUserMap 
AS
    SELECT  UserMap.VIP,
            NewUsers.User, 
            NewUsers.Domain
    FROM    NewUsers
            CROSS JOIN UserMap
            LEFT JOIN UsedVIP ex
                ON ex.Domain = NewUsers.Domain
                AND ex.VIP = UserMap.VIP
    WHERE   ex.Domain IS NULL;

The final query is:

SELECT  INET_NTOA(MIN(INET_ATON(a.VIP))) AS VIP,
        a.User, 
        a.Domain
FROM    NewUserMap a
        LEFT JOIN NewUserMap b
            ON a.User = b.user
            AND a.VIP = b.VIP
            AND a.Domain > b.domain
        LEFT JOIN NewUserMap c
            ON a.User = c.user
            AND b.Domain = c.domain
            AND b.VIP < c.VIP
WHERE   c.user IS NULL
GROUP BY a.User, a.Domain
ORDER BY VIP ASC;

Which returns:

VIP             USER    DOMAIN
100.50.20.1     jane    ten.com
100.50.20.1     james   ten.com
100.50.20.2     admin   nine.com
100.50.20.5     sally   eight.com
100.50.20.5     jim     eight.com
100.50.20.6     jim     eleven.com
100.50.20.6     sally   eleven.com

Example on SQL Fiddle
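
For comparison, the one-domain-at-a-time approach mentioned earlier could look roughly like the sketch below. It is not taken from the fiddle: the literal 'eight.com', the user variable @vip, and the choice to write straight into UserMap are assumptions for illustration. Because each domain's rows are inserted before the next domain is processed, a later domain with the same usernames sees the updated UserMap and moves on to the next free VIP.

```sql
-- Sketch: assign one domain per run (here 'eight.com', chosen for illustration).
-- Step 1: find the lowest VIP on which none of this domain's new users exists yet.
SELECT INET_NTOA(MIN(INET_ATON(m.VIP))) INTO @vip
FROM   UserMap m
WHERE  NOT EXISTS
       (   SELECT  1
           FROM    UserMap x
                   INNER JOIN NewUsers n
                       ON n.User = x.User
           WHERE   x.VIP = m.VIP
           AND     n.Domain = 'eight.com'
       );

-- Step 2: record the assignment; inserts nothing if no VIP was free (@vip IS NULL).
INSERT INTO UserMap (VIP, User, Domain)
SELECT  @vip, n.User, n.Domain
FROM    NewUsers n
WHERE   n.Domain = 'eight.com'
AND     @vip IS NOT NULL;
```

On the sample data, running this for 'eight.com' picks 100.50.20.5; running it again with 'eleven.com' in both places would then pick 100.50.20.6, because jim and sally are by then recorded on 100.50.20.5.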