Remove duplicate text within a column

Asked: 2019-01-14 20:05:48

Tags: sql sql-server tsql

In a temp table, I have a column that holds a list of email addresses, and the list may contain duplicates. For example:

Row#1: test@gmail.com; test@gmail.com; test@yahoo.com; abc@gmail.com
Row#2: abc@yahoo.com; abcde@yahoo.com; abcde@yahoo.com

Desired result:

Row#1: test@gmail.com; test@yahoo.com; abc@gmail.com
Row#2: abc@yahoo.com; abcde@yahoo.com

Is it possible to do this in SQL Server?

1 answer:

Answer 0 (score: 7)

Well, assuming SQL Server 2017 or later, and that you have a key column (or a combination of columns), you can use STRING_SPLIT and STRING_AGG together:

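WITH CTE AS
(
    -- Split each list into one row per address and drop duplicates per key.
    -- TRIM removes the space after each ';' so identical addresses compare equal.
    SELECT  DISTINCT 
                T.KeyColumn,
                TRIM(E.value) AS Email
    FROM dbo.YourTable T
    OUTER APPLY STRING_SPLIT(T.Email, ';') E
)
-- Reassemble the distinct addresses into one list per key (order is not guaranteed).
SELECT  KeyColumn,
        STRING_AGG(Email, '; ') AS Email
FROM CTE
GROUP BY KeyColumn
;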
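
To try the query against the data from the question, a minimal sketch follows. The temp table name #Emails and the key column Id are assumptions made for illustration; they do not come from the question. Because each sample value has a space after every semicolon, the split values are trimmed before DISTINCT is applied.

```sql
-- Hypothetical sample table mirroring the rows in the question.
-- #Emails and Id are assumed names, used only for this sketch.
CREATE TABLE #Emails (Id int PRIMARY KEY, Email varchar(4000));

INSERT INTO #Emails (Id, Email) VALUES
    (1, 'test@gmail.com; test@gmail.com; test@yahoo.com; abc@gmail.com'),
    (2, 'abc@yahoo.com; abcde@yahoo.com; abcde@yahoo.com');

WITH CTE AS
(
    -- One row per distinct, trimmed address per Id.
    SELECT DISTINCT
           T.Id,
           TRIM(E.value) AS Email
    FROM #Emails T
    OUTER APPLY STRING_SPLIT(T.Email, ';') E
)
SELECT Id,
       STRING_AGG(Email, '; ') AS Email
FROM CTE
GROUP BY Id;

DROP TABLE #Emails;
```

Row 1 should come back with three distinct addresses and row 2 with two. The order of the addresses inside each list is not guaranteed, since neither STRING_SPLIT (without an ordinal) nor STRING_AGG (without WITHIN GROUP) promises one.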

Update for SQL Server 2016:

If STRING_AGG is not available, you have to fall back on an older method; for example:

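WITH CTE AS
(
    -- One row per distinct, trimmed address per key (TRIM needs 2017, so use LTRIM/RTRIM).
    SELECT  DISTINCT 
                T.KeyColumn,
                LTRIM(RTRIM(E.value)) AS Email
    FROM dbo.YourTable T
    OUTER APPLY STRING_SPLIT(T.Email, ';') E
)
-- FOR XML PATH('') concatenates the rows; STUFF strips the leading '; '.
SELECT  t.KeyColumn,
        Email = STUFF(( SELECT '; ' + CONVERT(varchar(255), Email)
                        FROM CTE
                        WHERE KeyColumn = t.KeyColumn
                        FOR XML PATH(''), TYPE).value('.[1]','nvarchar(max)'), 1, 2, '')
FROM CTE t
GROUP BY t.KeyColumn
;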
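
Neither query above promises that the addresses come back in their original order, although the desired result in the question keeps the order of first appearance. If that matters, one possible sketch, assuming SQL Server 2022 or Azure SQL Database (where STRING_SPLIT accepts a third enable_ordinal argument), is to record each address's first position and sort on it while aggregating. The CTE name Parts and the column FirstPos are illustrative names only.

```sql
-- Assumes SQL Server 2022+ or Azure SQL Database: the third argument (1)
-- makes STRING_SPLIT return an extra "ordinal" column with each item's position.
WITH Parts AS
(
    SELECT  T.KeyColumn,
            LTRIM(RTRIM(E.value)) AS Email,
            MIN(E.ordinal)        AS FirstPos   -- position of the first occurrence
    FROM dbo.YourTable T
    OUTER APPLY STRING_SPLIT(T.Email, ';', 1) E
    GROUP BY T.KeyColumn, LTRIM(RTRIM(E.value))
)
SELECT  KeyColumn,
        STRING_AGG(Email, '; ') WITHIN GROUP (ORDER BY FirstPos) AS Email
FROM Parts
GROUP BY KeyColumn;
```

Grouping by the trimmed address replaces the DISTINCT of the earlier queries, and WITHIN GROUP (ORDER BY FirstPos) makes the output order explicit instead of leaving it to the engine.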