Why is an index scan used instead of an index seek?

Asked: 2013-05-20 18:38:33

Tags: sql sql-server clustered-index

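I've created unique clustered indexes on two large tables that must be joined, but the query plan does not use those indexes. When I force their use with a hint, an index scan is used and performance gets worse. The unique key of one table is a foreign key of the second table, so this has me confused. Here is the schema. The two tables are LOC and POL. LOC has more than 7 million rows and POL has more than 6 million rows.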

CREATE TABLE [dbo].[LOC](
    [acct_num] [char](30) NOT NULL,
    [cntr_num] [char](30) NOT NULL,
    [lob_cde] [char](2) NOT NULL,
    [ste_locn_nme] [char](30) NOT NULL,
    [buldg_num] [char](20) NOT NULL,
    [prctr_cde] [char](3) NULL,
        ...more fields...
) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IDX_All] ON [dbo].[LOC] 
(
    [acct_num] ASC,
    [cntr_num] ASC,
    [lob_cde] ASC,
    [buldg_num] ASC,
    [prctr_cde] ASC,
    [spcl_cond_1_id] ASC,
    [spcl_cond_2_id] ASC,
    [spcl_cond_3_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

CREATE UNIQUE CLUSTERED INDEX [IDX_LOC_PKEY] ON [dbo].[LOC] 
(
    [acct_num] ASC,
    [lob_cde] ASC,
    [prctr_cde] ASC,
    [cntr_num] ASC,
    [ste_locn_nme] ASC,
    [buldg_num] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

... more non-unique indices on LOC, each on single columns: buldg_num, prctr_cde, acct_num

CREATE TABLE [dbo].[POL](
    [acct_num] [char](30) NOT NULL,
    [cntr_num] [char](30) NOT NULL,
    [lob_cde] [char](2) NOT NULL,
    [prctr_cde] [char](3) NULL,
        ...more fields...
) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IDX_All] ON [dbo].[POL] 
(
    [acct_num] ASC,
    [cntr_num] ASC,
    [lob_cde] ASC,
    [prctr_cde] ASC,
    [acct_nme] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

CREATE UNIQUE CLUSTERED INDEX [IDX_POL_PKEY] ON [dbo].[POL] 
(
    [acct_num] ASC,
    [lob_cde] ASC,
    [prctr_cde] ASC,
    [cntr_num] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

... more non-unique indices on POL, each on single columns: cntr_num, prctr_cde

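As you can see, the four fields of the POL table's unique key (acct_num, lob_cde, prctr_cde, cntr_num) are the first four columns of the LOC table's primary key. Here is one query I want to run, typical of the many queries that will join these two tables: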

select 
      [Easy matches] = COUNT(*)
FROM LOC INNER JOIN POL ON (
        LOC.acct_num = POL.acct_num 
    AND LOC.lob_cde = POL.lob_cde 
    AND LOC.prctr_cde = POL.prctr_cde
    AND LOC.cntr_num = POL.cntr_num)

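Without hints, the optimizer prefers to use the IDX_prctr_cde index on each table. The prctr_cde column is not very selective; there are only seven distinct values in the LOC and POL tables. If I hint that the query should use the IDX_cntr_num index, I get good performance, because it is a highly selective column (over 6 million distinct values in each table). acct_num is nearly as selective as cntr_num, also with over 6 million distinct values.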
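The selectivity figures above can be checked with a query along these lines. This is a minimal sketch that assumes only the column names from the schema shown earlier; the output column aliases are made up for illustration.

```sql
-- Compare the number of distinct values per join column against the row count.
-- Note: COUNT(DISTINCT col) ignores NULLs, and prctr_cde is nullable.
SELECT
    COUNT(*)                  AS total_rows,
    COUNT(DISTINCT prctr_cde) AS distinct_prctr_cde,
    COUNT(DISTINCT cntr_num)  AS distinct_cntr_num,
    COUNT(DISTINCT acct_num)  AS distinct_acct_num
FROM POL;
```

Running the same statement against LOC gives the corresponding figures for the other side of the join.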

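Why does the optimizer default to a non-selective index? And why does switching to the unique clustered index make the query run so much slower? (10, 20, even 30 times slower.)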

Note: the hints I used are:

OPTION ( 
        TABLE HINT(POL, INDEX (IDX_POL_PKEY)),
        TABLE HINT(LOC, INDEX (IDX_LOC_PKEY))
       )
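For reference, the same index hints can also be written inline on each table with WITH (INDEX(...)). The sketch below applies them to the query shown earlier and wraps it in SET STATISTICS IO and SET STATISTICS TIME, so that logical reads and elapsed time can be compared with the unhinted run. It illustrates one way to compare the two plans and does not diagnose the slowdown.

```sql
-- Report per-table logical reads and CPU/elapsed time in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Same join as above, with the clustered indexes forced via inline table hints.
SELECT [Easy matches] = COUNT(*)
FROM LOC WITH (INDEX(IDX_LOC_PKEY))
INNER JOIN POL WITH (INDEX(IDX_POL_PKEY))
    ON  LOC.acct_num  = POL.acct_num
    AND LOC.lob_cde   = POL.lob_cde
    AND LOC.prctr_cde = POL.prctr_cde
    AND LOC.cntr_num  = POL.cntr_num;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running the query a second time without the WITH (INDEX(...)) clauses gives the baseline numbers for the plan the optimizer chooses on its own.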

Note: I am using SQL Server 2005 and SQL Server 2008.

0 answers:

No answers