Question

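I have two SQL Server 2008 tables: an "Import" table containing new data and a "Destination" table holding the live data. The two tables are similar but not identical (the CRM system has changed the number of columns in the Destination table), but both have three "phone number" fields: Tel1, Tel2 and Tel3. I need to delete from the Import table every record for which any of its phone numbers already exists in the Destination table.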

I have tried a simple query (just a SELECT to test):

select t2.account_id
from ImportData t2,  Destination t1 
where 
(t2.Tel1!='' AND (t2.Tel1 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
or
(t2.Tel2!='' AND (t2.Tel2 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
or
(t2.Tel3!='' AND (t2.Tel3 IN (t1.Tel1,t1.Tel2,t1.Tel3)))

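...but I know this is almost certainly not the right way to do it, especially since it is very slow. Can anyone point me in the right direction?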

Answer 1

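This query needs a little more information. To write it efficiently, we need to know whether each load contains mostly duplicates or mostly new records. I am assuming account_id is the primary key and has a clustered index.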

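I would use a temp-table approach: create a normalized table #tmp, to be indexed on the phone number and account_id, like this:

SELECT Phone, account_id INTO #tmp
FROM 
   (SELECT account_id, Tel1, Tel2, Tel3
   FROM Destination) p
UNPIVOT
   -- TelColumn receives the source column name (Tel1, Tel2 or Tel3)
   (Phone FOR TelColumn IN 
      (Tel1, Tel2, Tel3)
) AS unpvt;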

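Create a nonclustered index on this table, with the phone number as the first column and the account number as the second. You cannot avoid one full table scan, so I am assuming you can scan the import table (which is probably the smaller one). Then just join against this table using a NOT EXISTS qualifier, as described below. And of course drop the temp table once processing is done.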
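The remaining steps might look roughly like this. It is only a sketch, assuming a temp table #tmp that holds one row per Destination phone number in columns Phone and account_id; the index name IX_tmp_Phone is made up for illustration. Because the goal is to delete the matching Import rows, the sketch uses EXISTS; NOT EXISTS would select the rows to keep instead.

```sql
-- Assumption: #tmp(Phone, account_id) has one row per Destination phone number.
-- IX_tmp_Phone is a hypothetical index name.
CREATE NONCLUSTERED INDEX IX_tmp_Phone ON #tmp (Phone, account_id);

-- Delete Import rows where any non-blank phone number already exists.
-- Each EXISTS is a plain equality lookup on Phone, the leading index column.
DELETE i
FROM ImportData AS i
WHERE (i.Tel1 <> '' AND EXISTS (SELECT 1 FROM #tmp AS t WHERE t.Phone = i.Tel1))
   OR (i.Tel2 <> '' AND EXISTS (SELECT 1 FROM #tmp AS t WHERE t.Phone = i.Tel2))
   OR (i.Tel3 <> '' AND EXISTS (SELECT 1 FROM #tmp AS t WHERE t.Phone = i.Tel3));

-- Clean up once processing is done.
DROP TABLE #tmp;
```

Swapping `DELETE i` for `SELECT i.account_id` first is a cheap way to preview which rows would go.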

Answer 2

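EXISTS short-circuits the query, rather than making a full pass over the table the way a join does. You could also restructure the WHERE clause if this still does not perform the way you want.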

SELECT *
FROM ImportData t2
WHERE NOT EXISTS (
    select 1 
    from Destination t1
    where (t2.Tel1!='' AND (t2.Tel1 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
          or
          (t2.Tel2!='' AND (t2.Tel2 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
          or
          (t2.Tel3!='' AND (t2.Tel3 IN (t1.Tel1,t1.Tel2,t1.Tel3)))
    )
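Note that the SELECT above returns the Import rows that have no match, i.e. the rows to keep. To actually remove the duplicates, the same predicate can be used with EXISTS in a DELETE. A minimal sketch, reusing the table and column names from the question (untested against the real schema):

```sql
-- Sketch: delete Import rows whose non-blank phone number
-- appears in any phone column of a Destination row.
DELETE t2
FROM ImportData AS t2
WHERE EXISTS (
    SELECT 1
    FROM Destination AS t1
    WHERE (t2.Tel1 <> '' AND t2.Tel1 IN (t1.Tel1, t1.Tel2, t1.Tel3))
       OR (t2.Tel2 <> '' AND t2.Tel2 IN (t1.Tel1, t1.Tel2, t1.Tel3))
       OR (t2.Tel3 <> '' AND t2.Tel3 IN (t1.Tel1, t1.Tel2, t1.Tel3))
);
```

NULL phone numbers never satisfy `<> ''`, so rows with NULLs in a Tel column are not matched on that column.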

Answer 3

I am not sure how this query will perform, but since I went to the trouble of writing it, I will post it anyway...

;with aaa(tel)
as
(
select Tel1
from Destination
union
select Tel2
from Destination
union
select Tel3
from Destination
)
,bbb(tel, id)
as
(
select Tel1, account_id
from ImportData
union
select Tel2, account_id
from ImportData
union
select Tel3, account_id
from ImportData
)

select distinct b.id
from bbb b
where b.tel <> '' -- skip blank numbers, as the question's query does
and b.tel in
(
select a.tel
from aaa a
intersect
select b2.tel
from bbb b2
)

Finding duplicates between two tables
