Making a SQL query that finds duplicates faster

Time: 2009-08-17 05:01:11

Tags: sql sql-server sql-server-2005 tsql optimization

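I have a query that finds duplicate rows subject to some extra conditions, but it doesn't feel fast enough. Is there any way to make this query faster?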

v_listing contains the important information:

SELECT DISTINCT  code, name, comm, address, area 
FROM v_listing t1
WHERE EXISTS (SELECT NULL
                FROM v_listing t2
                WHERE t1.comm = t2.comm
                AND t1.address = t2.address
                AND t1.area = t2.area
                AND (t1.code > t2.code OR t1.code < t2.code))
ORDER BY comm, address, area

2 answers:

Answer 0: (score: 3)

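The EXISTS clause performs a semi-join, which is not the best way to compare two very large tables. In this case it is a single table joined to itself, but the point stands. What you want is an inner join: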

SELECT DISTINCT  
    t1.code, 
    t1.name, 
    t1.comm, 
    t1.address, 
    t1.area 
FROM 
    v_listing t1
    inner join v_listing t2 on
        t1.comm = t2.comm
        AND t1.address = t2.address
        AND t1.area = t2.area
        AND t1.code <> t2.code
ORDER BY t1.comm, t1.address, t1.area

Also make sure that all of the join columns are indexed. That will improve the speed considerably as well.
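
As a rough sketch of the indexing advice: the question does not show what v_listing is built on, so the example below assumes it is a view over a hypothetical base table named dbo.listing. The table name and the index name are illustrative only. A composite index keyed on the three equality columns, with code and name carried as included columns (available from SQL Server 2005), might look like this:

```sql
-- Assumption: v_listing is a view over a base table called dbo.listing.
-- Both dbo.listing and the index name are hypothetical.
-- Key columns match the equality predicates of the self-join;
-- code and name are included so the query can be served from the index alone.
CREATE NONCLUSTERED INDEX IX_listing_comm_address_area
    ON dbo.listing (comm, address, area)
    INCLUDE (code, name);
```

If comm, address and area are wide character columns, their combined size may exceed the 900-byte index key limit in SQL Server 2005, in which case a narrower key would be needed.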

Answer 1: (score: 0)

This one change alone should help a lot (the two range comparisons on code joined by OR become a single <>):

SELECT DISTINCT code, name, comm, address, area
FROM v_listing t1
WHERE EXISTS ( SELECT NULL
        FROM v_listing t2
            WHERE t1.comm = t2.comm
            AND t1.address = t2.address
            AND t1.area = t2.area
            AND t1.code <> t2.code)
ORDER BY comm, address, area

Alternatively, you could do this:

SELECT comm, address, area, MIN(code), MAX(code), MIN(name), COUNT(*)
FROM v_listing t1
GROUP BY comm, address, area
HAVING COUNT(*) > 1
ORDER BY comm, address, area
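
The GROUP BY query returns one row per group rather than the individual duplicate rows. If each row is needed, as in the question's query, one possible variation (not something either answer proposes) is a windowed aggregate, which SQL Server 2005 supports. This is a sketch that assumes code is never NULL; the derived-table alias and the min_code/max_code column names are made up for illustration:

```sql
-- Read v_listing once and attach the smallest and largest code of each
-- (comm, address, area) group to every row in that group.
SELECT DISTINCT code, name, comm, address, area
FROM (
    SELECT code, name, comm, address, area,
           MIN(code) OVER (PARTITION BY comm, address, area) AS min_code,
           MAX(code) OVER (PARTITION BY comm, address, area) AS max_code
    FROM v_listing
) AS grouped              -- hypothetical alias
WHERE min_code <> max_code -- the group holds at least two different codes
ORDER BY comm, address, area;
```

A group whose smallest and largest code differ contains at least two distinct codes, so every row in it has a partner with a different code. Whether this beats the join or EXISTS forms depends on the data and indexes, so it is worth comparing execution plans.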