Question

我继承了一个SQL Server 2008 R2项目，其中包括从另一个表中更新表：

Table1（约150,000行）有3个电话号码字段（Tel1，Tel2，Tel3）
Table2（约有20,000行）有3个电话号码字段（Phone1，Phone2，Phone3）

..当这些数字中的任何一个匹配时，Table1应该更新。

目前的代码如下：

UPDATE t1
SET surname = t2.surname, Address1=t2.Address1, DOB=t2.DOB, Tel1=t2.Phone1, Tel2=t2.Phone2, Tel3=t2.Phone3,
FROM Table1 t1 
inner join Table2 t2
on
(t1.Tel1 = t2.Phone1 and t1.Tel1 is not null) or
(t1.Tel1 = t2.Phone2 and t1.Tel1 is not null) or
(t1.Tel1 = t2.Phone3 and t1.Tel1 is not null) or
(t1.Tel2 = t2.Phone1 and t1.Tel2 is not null) or
(t1.Tel2 = t2.Phone2 and t1.Tel2 is not null) or
(t1.Tel2 = t2.Phone3 and t1.Tel2 is not null) or
(t1.Tel3 = t2.Phone1 and t1.Tel3 is not null) or
(t1.Tel3 = t2.Phone2 and t1.Tel3 is not null) or
(t1.Tel3 = t2.Phone3 and t1.Tel3 is not null);

但是，此查询运行时间超过30分钟。

执行计划表明主要瓶颈是Nested Loop上的聚集索引扫描周围的Table1。两个表都在ID列上有聚簇索引。

由于我的DBA技能非常有限，任何人都可以提出改善此查询性能的最佳方法吗？为每列添加Tel1，Tel2和Tel3的索引是最佳移动，还是可以更改查询以提高性能？

Answer 1

首先，我建议从选择中删除所有OR条件。

看看这是否更快（它将您的更新转换为3个不同的更新）：

UPDATE t1
SET surname = t2.surname, Address1=t2.Address1, DOB=t2.DOB, Tel1=t2.Phone1, Tel2=t2.Phone2, Tel3=t2.Phone3,
FROM Table1 t1 
inner join Table2 t2
on
(t1.Tel1 is not null AND t1.Tel1 IN (t2.Phone1, t2.Phone2, t2.Phone3);

UPDATE t1
SET surname = t2.surname, Address1=t2.Address1, DOB=t2.DOB, Tel1=t2.Phone1, Tel2=t2.Phone2, Tel3=t2.Phone3,
FROM Table1 t1 
inner join Table2 t2
on
(t1.Tel2 is not null AND t1.Tel2 IN (t2.Phone1, t2.Phone2, t2.Phone3);

UPDATE t1
SET surname = t2.surname, Address1=t2.Address1, DOB=t2.DOB, Tel1=t2.Phone1, Tel2=t2.Phone2, Tel3=t2.Phone3,
FROM Table1 t1 
inner join Table2 t2
on
(t1.Tel3 is not null AND t1.Tel3 IN (t2.Phone1, t2.Phone2, t2.Phone3);

Answer 2

首先规范化您的表数据：

insert into Table1Tel 
select primaryKey, Tel1 as 'tel' from Table1 where Tel1 is not null
union select primaryKey, Tel2 from Table1 where Tel2 is not null
union select primaryKey, Tel3 from Table1 where Tel3 is not null

insert into Table2Phone 
select primaryKey, Phone1 as 'phone' from Table2 where Phone1 is not null
union select primaryKey, Phone2 from Table2 where Phone2 is not null
union select primaryKey, Phone3 from Table2 where Phone3 is not null

这些规范化表格是存储电话号码的一种更好的方式，而非附加列。

然后你可以在表格中加入这样的东西：

update t1
set surname = t2.surname, 
    Address1 = t2.Address1, 
    DOB = t2.DOB
from Table1 t1 
     inner join Table1Tel tel
         on t1.primaryKey = tel.primaryKey
     inner join Table2Phone phone
         on tel.tel = phone.phone
     inner join Table2 t2
         on phone.primaryKey = t2.primaryKey

请注意，这并不能解决数据中dupe的基本问题 - 例如，如果您的数据中包含Joe和Jane Bloggs，并且具有相同的电话号码（即使在不同的字段中），您也会将这两个记录更新为是一样的。

Answer 3

请尝试以下查询，并告诉我完成执行需要多长时间。

UPDATE t1
SET surname = t2.surname, Address1=t2.Address1, DOB=t2.DOB, Tel1=t2.Phone1, Tel2=t2.Phone2, Tel3=t2.Phone3,
FROM Table1 t1 
inner join Table2 t2
on (
    '|'+cast(t2.Phone1 as varchar(15)+'|'+cast(t2.Phone1 as varchar(15)+'|'+cast(t2.Phone1 as varchar(15)+'|' LIKE '%|'+cast(t1.Tel1 as varchar(15)+'|%'
    or '|'+cast(t2.Phone1 as varchar(15)+'|'+cast(t2.Phone1 as varchar(15)+'|'+cast(t2.Phone1 as varchar(15)+'|' LIKE '%|'+cast(t1.Tel2 as varchar(15)+'|%'
    or '|'+cast(t2.Phone1 as varchar(15)+'|'+cast(t2.Phone1 as varchar(15)+'|'+cast(t2.Phone1 as varchar(15)+'|' LIKE '%|'+cast(t1.Tel3 as varchar(15)+'|%'
    )

用1 LIKE替换3 OR应该更快。试试吧。

Answer 4

您还可以尝试以下内容，以避免冗余更新。

具有大量JOIN条件的SQL查询非常慢

4 个答案: