Question

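I am trying to figure out the most efficient SQL query to achieve the following.

I have a table that contains ZipCodes / PostalCodes. Let's assume the following structure:

table_codes: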

ID   |  ZipCode
---------------
1       1234
2       1235
3       456

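and so on.

Users of my application fill in a profile in which they have to enter their ZipCode (PostalCode). Since a user will sometimes enter a ZipCode that is not defined in my table, I am trying to suggest the best match based on the zip the user entered.

I am using the following query: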

Declare @entered_zipcode varchar(10)
set @entered_zipcode = '23456'


SELECT TOP 1 table_codes.ZipCode
FROM    table_codes
where   @entered_zipcode  LIKE table_codes.ZipCode + '%'
or table_codes.ZipCode + '%' like @entered_zipcode  + '%'
ORDER BY table_codes.ZipCode, LEN(table_codes.ZipCode) DESC

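Basically, I am trying the following:

If @entered_zipcode is longer than any zip code in the table, I try to get the code from the table that is the best prefix match for @entered_zipcode.
If @entered_zipcode is shorter than any existing code in the table, I try to use it as a prefix and get the best match in the table.

In addition, I am building a temporary table with the following structure: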

#tmpTable
------------------------------------------------------------------------------------
ID | user1_enteredzip | user1_bestmatchzip | user2_enteredzip | user2_bestmatchzip |
------------------------------------------------------------------------------------
1  |   12             |     *1234*         |       4567       |       **456**      |
2  |
3  |
4  |

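The entered zip is what the user typed, and the codes between *..* are the best-matching codes from my lookup table, which I am trying to fill in with the query below.

The query seems a bit long, which is why I am asking for help optimizing it: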

        update  #tmpTable
        set     user1_bestmatchzip = ( SELECT TOP 1
                                            zipcode
                                    FROM    table_codes
                                    where   #tmpTable.user1_enteredzip  LIKE table_codes.zipcode + '%'
                                            or table_codes.zipcode + '%' like #tmpTable.user1_enteredzip + '%'
                                    ORDER BY table_codes.zipcode, LEN(table_codes.zipcode) DESC
                                  ),
                user2_bestmatchzip = ( SELECT TOP 1
                                            zipcode
                                    FROM    table_codes
                                    where   #tmpTable.user2_enteredzip  LIKE table_codes.zipcode + '%'
                                            or table_codes.zipcode + '%' like #tmpTable.user2_enteredzip + '%'
                                    ORDER BY table_codes.zipcode, LEN(table_codes.zipcode) DESC
                                  )
         from #tmpTable

Answer 1

What if you changed your temp table to this:

id  |  user | enteredzip | bestmatchzip 
10  |  1    | 12345      | 12345
20  |  2    | 12         | 12345

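That is, use a column to hold the user number (1 or 2). That way you update one row at a time, with a single subquery instead of one per user column.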
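
A minimal sketch of that layout, assuming ZipCode and the entered values are varchar columns. The table name #tmpTableLong and the column types are made up for illustration; the matching predicate is the one from the question, with ZipCode + '%' LIKE x + '%' shortened to ZipCode LIKE x + '%'.

```sql
-- Hypothetical long-format temp table: one row per (id, user) pair.
CREATE TABLE #tmpTableLong (
    id           int         NOT NULL,
    [user]       tinyint     NOT NULL,   -- 1 or 2; bracketed because USER is a reserved word
    enteredzip   varchar(10) NOT NULL,
    bestmatchzip varchar(10) NULL
);

-- One correlated subquery now covers every user.
UPDATE t
SET    bestmatchzip = (
           SELECT TOP 1 c.ZipCode
           FROM   table_codes AS c
           WHERE  t.enteredzip LIKE c.ZipCode + '%'     -- table code is a prefix of the input
              OR  c.ZipCode    LIKE t.enteredzip + '%'  -- input is a prefix of the table code
           ORDER BY c.ZipCode
       )
FROM   #tmpTableLong AS t;
```

If the wide layout is still needed afterwards, it can be rebuilt from these rows by joining on id.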

Also, ORDER BY takes time. Do you have an index on the zip code column? Couldn't you add a "length" field to the zip codes table to precompute the zip code lengths?
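
Something along these lines, as a rough sketch. The index name IX_table_codes_ZipCode and the column name ZipLength are invented here, and ZipCode is assumed to be a character column.

```sql
-- Index on the column used for matching and sorting.
CREATE INDEX IX_table_codes_ZipCode
    ON table_codes (ZipCode);

-- Precomputed length, stored with the row so LEN() is not evaluated per query.
ALTER TABLE table_codes
    ADD ZipLength AS LEN(ZipCode) PERSISTED;
```

Whether the optimizer actually uses the index for these LIKE predicates is worth checking in the execution plan; a pattern built from the column itself may still end up as a scan.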

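Edit: I think ordering by LEN is pointless, so you can drop it. If zip codes cannot repeat, ordering by zip code alone is enough. If they can repeat, the LEN of the duplicates will always be equal anyway!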

Answer 2

You are comparing the leading characters of two strings. What if you compared substrings of the shorter of the two lengths instead?

select top 1 zipcode
from table_codes
where substring(zipcode, 1, case when len(zipcode) > len (@entered_zipcode) then len(@entered_zipcode) else len (zipcode) end) 
    = substring (@entered_zipcode, 1, case when len(zipcode) > len (@entered_zipcode) then len(@entered_zipcode) else len (zipcode) end) 
order by len (zipcode) desc

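This removes the OR and would allow an index to be used in @input_zipcode LIKE table_codes.ZipCode + '%'. Also, it seems to me that the ordering of the results in the question is wrong: shorter zip codes come first.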
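
For anyone who wants to try this out, here is a self-contained sketch using the sample rows from the question. The table variable @codes, the CROSS APPLY that computes the common length once, and the ZipCode tie-breaker in the ORDER BY are my additions for the demo, not part of the query above.

```sql
-- Sample data from the question, held in a table variable for the demo.
DECLARE @codes TABLE (ID int, ZipCode varchar(10));
INSERT INTO @codes (ID, ZipCode)
VALUES (1, '1234'), (2, '1235'), (3, '456');

DECLARE @entered_zipcode varchar(10) = '4567';

SELECT TOP 1 c.ZipCode
FROM   @codes AS c
CROSS APPLY (
           -- n = length of the shorter of the two strings
           SELECT CASE WHEN LEN(c.ZipCode) > LEN(@entered_zipcode)
                       THEN LEN(@entered_zipcode)
                       ELSE LEN(c.ZipCode)
                  END AS n
       ) AS m
WHERE  LEFT(c.ZipCode, m.n) = LEFT(@entered_zipcode, m.n)
ORDER BY LEN(c.ZipCode) DESC, c.ZipCode;  -- longest code first, then a stable tie-break
-- With the three sample rows and '4567', this returns 456.
```

Note that an empty @entered_zipcode makes n equal to 0, so every row matches; that case probably needs to be filtered out separately.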

SQL Server best match query with update (T-SQL)
