How can I improve these 4 self-joins?

Time: 2014-11-06 18:17:14

Tags: mysql sql performance join self-join

Suppose I have the following sample data set:

emplid | Citizenship |
100001 | USA         |
100001 | CAN         |
100001 | CHN         |
100002 | USA         |
100002 | CHN         |
100003 | USA         |

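I want to rearrange it so that each employee's citizenships appear on a single row. We can assume that an employee has at most four citizenships. The output would look like this: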

emplid | Citizenship_1 | Citizenship_2 | Citizenship_3
100001 | USA          | CHN           | CAN
100002 | USA          | CHN           |
100003 | USA          |               |

This is the only workable solution I have been able to come up with:

SELECT e.emplid, MAX(e.citizenship) AS citizenship1, 
                 MAX(e1.citizenship) AS citizenship2, 
                 MAX(e2.citizenship) AS citizenship3, 
                 MAX(e3.citizenship) AS citizenship4
FROM employee e
LEFT JOIN employee e1 ON e1.emplid = e.emplid AND e1.citizenship < e.citizenship
LEFT JOIN employee e2 ON e2.emplid = e1.emplid AND e2.citizenship < e1.citizenship
LEFT JOIN employee e3 ON e3.emplid = e2.emplid AND e3.citizenship < e2.citizenship
GROUP BY e.emplid

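As the data set keeps growing, this becomes less and less efficient, but I cannot find a way to rewrite the query.

2 Answers:

Answer 0 (score: 1)

Why not concatenate the citizenships into a single list?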

select e.emplid, group_concat(citizenship) as citizenships
from employee e
group by e.emplid;

If you want four separate columns, you can do the following:

select e.emplid,
       substring_index(group_concat(citizenship), ',', 1) as c1,
       (case when count(*) >= 2
             then substring_index(substring_index(group_concat(citizenship), ',', 2), ',', -1)
        end) as c2,
       (case when count(*) >= 3
             then substring_index(substring_index(group_concat(citizenship), ',', 3), ',', -1)
        end) as c3,
       (case when count(*) >= 4
             then substring_index(substring_index(group_concat(citizenship), ',', 4), ',', -1)
        end) as c4
from employee e
group by e.emplid;
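
One caveat: without an ORDER BY inside GROUP_CONCAT, MySQL does not guarantee the order in which the values are concatenated, so which citizenship ends up in which column is not fixed. A minimal sketch of a deterministic variant follows. It assumes the question's employee(emplid, citizenship) table and sorts in descending order only because that matches the sample output; the derived-table alias x and the column names list and n are illustrative.

```sql
-- Sketch only: assumes employee(emplid, citizenship) as in the question.
-- The list is built once per employee, in a fixed (descending) order,
-- and then split into up to four columns.
SELECT x.emplid,
       SUBSTRING_INDEX(x.list, ',', 1) AS c1,
       CASE WHEN x.n >= 2
            THEN SUBSTRING_INDEX(SUBSTRING_INDEX(x.list, ',', 2), ',', -1)
       END AS c2,
       CASE WHEN x.n >= 3
            THEN SUBSTRING_INDEX(SUBSTRING_INDEX(x.list, ',', 3), ',', -1)
       END AS c3,
       CASE WHEN x.n >= 4
            THEN SUBSTRING_INDEX(SUBSTRING_INDEX(x.list, ',', 4), ',', -1)
       END AS c4
FROM (
    SELECT emplid,
           GROUP_CONCAT(citizenship ORDER BY citizenship DESC) AS list,
           COUNT(*) AS n
    FROM employee
    GROUP BY emplid
) AS x;
```

This relies on the citizenship values themselves containing no commas, since the comma is the default GROUP_CONCAT separator.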

Answer 1 (score: 0)

This solution ranks each employee's citizenships alphabetically and then places the results in the corresponding columns.

-- Window functions such as RANK() require MySQL 8.0 or later.
SELECT
    emplid,
    MAX(CASE WHEN R = 1 THEN Citizenship END) AS Citizenship_1,
    MAX(CASE WHEN R = 2 THEN Citizenship END) AS Citizenship_2,
    MAX(CASE WHEN R = 3 THEN Citizenship END) AS Citizenship_3
FROM
    -- Rank each employee's citizenships alphabetically: 1, 2, 3, ...
    (SELECT emplid, Citizenship,
            RANK() OVER (PARTITION BY emplid ORDER BY Citizenship) AS R
     FROM employee) AS DATA
GROUP BY
    emplid
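
On MySQL versions before 8.0, which have no window functions, the same ranking idea can be approximated with a correlated subquery that counts how many of the employee's citizenships sort at or after the current one. The following is a rough sketch, not a tuned solution. It assumes the question's employee(emplid, citizenship) table and that each (emplid, citizenship) pair occurs only once; the aliases r, e2 and rn are illustrative.

```sql
-- Sketch only: assumes employee(emplid, citizenship) with no duplicate pairs.
-- rn = 1 for the alphabetically last citizenship, 2 for the next, and so on,
-- which reproduces the descending order of the question's sample output.
SELECT r.emplid,
       MAX(CASE WHEN r.rn = 1 THEN r.citizenship END) AS citizenship1,
       MAX(CASE WHEN r.rn = 2 THEN r.citizenship END) AS citizenship2,
       MAX(CASE WHEN r.rn = 3 THEN r.citizenship END) AS citizenship3,
       MAX(CASE WHEN r.rn = 4 THEN r.citizenship END) AS citizenship4
FROM (
    SELECT e.emplid,
           e.citizenship,
           (SELECT COUNT(*)
              FROM employee e2
             WHERE e2.emplid = e.emplid
               AND e2.citizenship >= e.citizenship) AS rn
    FROM employee e
) AS r
GROUP BY r.emplid;
```

For the sample data, employee 100001 gets rn = 1 for USA, 2 for CHN and 3 for CAN, so the row reads USA, CHN, CAN with citizenship4 left NULL. Whether this outperforms the chained self-joins depends on the data and on available indexes, so it is worth checking with EXPLAIN.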