Delete duplicate IDs from a table - performance improvement

Time: 2016-08-26 14:38:47

Tags: sql sql-server sql-server-2014 query-performance

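I have a table that contains duplicate codes. I need to clean it up by deleting the duplicates, while keeping at least one row for each code.

My table looks like this: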

FriendlyFunctionCode      MemberFirmId     FunctionLevel3Desc
1                         Value1           Value2
1                         Value2           Value3
2                         Value4           Value5

I need something like this (it doesn't matter which row is kept, as long as at least one remains):

FriendlyFunctionCode      MemberFirmId     FunctionLevel3Desc
1                         Value1           Value2
2                         Value4           Value5

I have this query, but its performance is terrible:

SELECT MemberFirmId, FriendlyFunctionCode
INTO #ToDeleteRepeated
FROM [dbo].[FirmFunction]
GROUP BY MemberFirmId, FriendlyFunctionCode
HAVING COUNT(1) > 1

DECLARE @Code VARCHAR(100), @Desc VARCHAR(250)

WHILE ((SELECT COUNT(1) FROM #ToDeleteRepeated) > 0) 
BEGIN
    SELECT TOP 1 @Code = FriendlyFunctionCode FROM #ToDeleteRepeated
    WHILE ((SELECT COUNT(1) FROM [FirmFunction] WHERE FriendlyFunctionCode = @Code) > 0) 
    BEGIN
        SELECT TOP 1 @Desc = FunctionLevel3Desc FROM [FirmFunction] WHERE FriendlyFunctionCode = @Code
        DELETE FROM [FirmFunction] WHERE FriendlyFunctionCode = @Code AND FunctionLevel3Desc = @Desc
    END
END

Any suggestions?

4 Answers:

Answer 0: (score: 3)

WITH CTE AS (
    SELECT MemberFirmId, FriendlyFunctionCode,
           ROW_NUMBER() OVER (PARTITION BY FriendlyFunctionCode ORDER BY FriendlyFunctionCode) AS RN
    FROM [dbo].[FirmFunction]
)
DELETE CTE WHERE CTE.RN > 1

Answer 1: (score: 2)

Delete using a CTE with row_number():

;with cte as (
select *, row_number() over(partition by friendlyfunctioncode order by memberfirmid) rn
 from deletingtable)
delete from cte where rn > 1

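It runs with the following execution plan:

Table/Clustered Index Scan -> Sort (if there is no index) -> Segment -> Sequence Project -> Filter, then Delete.

With a suitable index on FriendlyFunctionCode, it runs faster, in a single scan.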
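
As a rough sketch of the index mentioned above, something like the following could be created before running the delete. The index name IX_FirmFunction_FriendlyFunctionCode is made up for illustration, and the table is assumed to be dbo.FirmFunction from the question (this answer's query calls it deletingtable). MemberFirmId is added as a second key column on the assumption that the ORDER BY column of the window also has to be in index order for the Sort to go away; it is worth confirming in the actual execution plan.

```sql
-- Hypothetical index name; table and column names are taken from the question.
-- The key order matches PARTITION BY friendlyfunctioncode ORDER BY memberfirmid,
-- so the rows can be read already ordered for ROW_NUMBER().
CREATE NONCLUSTERED INDEX IX_FirmFunction_FriendlyFunctionCode
    ON dbo.FirmFunction (FriendlyFunctionCode, MemberFirmId);
```

If the table is only cleaned up once, the cost of building the index may outweigh the sort it saves, so this is mostly useful when the index is also wanted for other queries.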

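Answer 2: (score: 1)

You can use a window function like this. It saves having to use a cursor (which performs poorly in SQL Server). You can run the inner select on its own to see what it does with the row numbers.

Test data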

CREATE TABLE #TestData (FriendlyFunctionCode int, MemberFirmId nvarchar(10), FunctionLevel3Desc nvarchar(10))
INSERT INTO #TestData
VALUES
(1,'Value1','Value2')
,(1,'Value2','Value3')
,(2,'Value4','Value5')

Query

SELECT
a.FriendlyFunctionCode
,a.MemberFirmId
,a.FunctionLevel3Desc
INTO #SavedData
FROM
(
    SELECT
    FriendlyFunctionCode
    ,MemberFirmId
    ,FunctionLevel3Desc
    ,ROW_NUMBER() OVER(PARTITION BY FriendlyFunctionCode ORDER BY FriendlyFunctionCode) RowNum
    FROM #TestData
) a
WHERE a.RowNum = 1

TRUNCATE TABLE #TestData

INSERT INTO #TestData (FriendlyFunctionCode, MemberFirmId, FunctionLevel3Desc)
SELECT
FriendlyFunctionCode
,MemberFirmId
,FunctionLevel3Desc
FROM #SavedData

DROP TABLE #SavedData

Result

FriendlyFunctionCode    MemberFirmId    FunctionLevel3Desc  
1                       Value1          Value2              
2                       Value4          Value5                  

Answer 3: (score: 0)

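You can use MAX and group by FriendlyFunctionCode.

SELECT
       FriendlyFunctionCode,
       MAX(MemberFirmId) AS MemberFirmId,
       MAX(FunctionLevel3Desc) AS FunctionLevel3Desc
    INTO #StagingTable
    FROM
       FirmFunction
    GROUP BY
       FriendlyFunctionCode

Then truncate your table and select the rows back in... or just create a new table altogether and insert the distinct (max) records into it.

TRUNCATE TABLE FirmFunction

INSERT INTO FirmFunction (FriendlyFunctionCode,MemberFirmId,FunctionLevel3Desc)
SELECT * FROM #StagingTable

A safer option is to create a table, FirmFunction2 for example, with the same schema as the original, insert into that, and then rename it....

SELECT TOP 1 * INTO FirmFunction2 FROM FirmFunction WHERE 1=0

INSERT INTO FirmFunction2 (FriendlyFunctionCode, MemberFirmId, FunctionLevel3Desc)
SELECT
       FriendlyFunctionCode,
       MAX(MemberFirmId) AS MemberFirmId,
       MAX(FunctionLevel3Desc) AS FunctionLevel3Desc
    FROM
       FirmFunction
    GROUP BY
       FriendlyFunctionCode

Then you can check the data in FirmFunction2 and, if you are happy with it, drop the other table and rename FirmFunction2.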
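
The final drop-and-rename step is not spelled out in this answer. A minimal sketch of it, assuming both tables live in the dbo schema and that nothing else (foreign keys, views, permissions) depends on the original table, might look like this:

```sql
-- Sketch only: run after verifying the contents of FirmFunction2.
-- Assumes the dbo schema and no dependent objects on dbo.FirmFunction.
BEGIN TRANSACTION;

DROP TABLE dbo.FirmFunction;

-- sp_rename takes the current (schema-qualified) name and the new name without a schema.
EXEC sp_rename 'dbo.FirmFunction2', 'FirmFunction';

COMMIT TRANSACTION;
```

Note that SELECT ... INTO copies only the column definitions, so any indexes and constraints from the original table would need to be recreated on the renamed table.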
