Question

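My table has two columns, ID1 and ID2. ID1 to ID2 is a one-to-many relationship, but the ID2 values under a given ID1 may not be consecutive. Given an ID1 and n, I need to filter the table by ID1 and then return every nth row. For example, if ID1 is 1 and, after filtering, the ID2 values under it are 1 2 3 4 7 8 9 11 12, and n is 3, the result should be 1, 4, 9.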
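To make the example concrete, here is a minimal sketch that reproduces it with a table variable. The name @demo and the extra row for ID1 = 2 are made up for illustration and are not part of the real table.

```sql
-- Hypothetical demo data: nine ID2 values under ID1 = 1, plus one unrelated row.
DECLARE @demo TABLE (ID1 INT, ID2 INT);
INSERT INTO @demo (ID1, ID2)
VALUES (1, 1), (1, 2), (1, 3), (1, 4), (1, 7), (1, 8), (1, 9), (1, 11), (1, 12),
       (2, 5);

DECLARE @n INT = 3;

-- Number the filtered rows by ID2, then keep rows 1, 1 + n, 1 + 2n, ...
SELECT t.ID2
FROM (
    SELECT ID2, ROW_NUMBER() OVER (ORDER BY ID2) AS rownum
    FROM @demo
    WHERE ID1 = 1
) AS t
WHERE t.rownum % @n = 1
ORDER BY t.ID2;
-- returns 1, 4, 9 (row numbers 1, 4 and 7)
```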

I am using SQL Server 2012.

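I wrote the following query, but it runs slowly. A single ID1 can have more than 1M ID2 values, and the n we want to use in the query ranges from 100K to 250K. The query currently takes 600 ms to 1200 ms, which is too slow for our project. Is there a way to improve the query? I would like each query to run in under 500 ms.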

declare @now datetime = getutcdate()

declare @ID1 INT = 1518

declare @Size INT = 100000;

select t.ID1, t.ID2
from (
    select ID1, ID2, row_number() over (order by ID2) as rownum
    from table1
    where ID1 = @ID1
) as t
where t.rownum % @Size = 1

select datediff(ms, @now, getutcdate())

Thanks

Answer 1

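A clustered index on ID1 is what you want. It will help limit I/O by scanning only the rows for @ID1. Nothing else is stable enough to index on (neither @Size nor the ID2 row number).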

Keep in mind that even the clustered index here may break more elsewhere than it fixes.
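As a rough sketch of the index suggestion above: the index names are made up, and the statement assumes table1 does not already have a clustered index (a table can have only one). Adding ID2 as a second key column is my assumption and goes beyond what the answer says. The idea is that rows for one ID1 would then already be stored in ID2 order, which may let row_number() over (order by ID2) skip a sort.

```sql
-- Hypothetical index name; assumes table1 has no clustered index yet.
CREATE CLUSTERED INDEX IX_table1_ID1_ID2
    ON table1 (ID1, ID2);

-- Alternative if table1 already has a clustered index (for example on its
-- primary key): a nonclustered index on the same columns.
-- CREATE NONCLUSTERED INDEX IX_table1_ID1_ID2_nc
--     ON table1 (ID1, ID2);
```

Either way, compare the timings and the execution plan before and after, since other queries on table1 are affected by the change too.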

Answer 2

Since you already have test data set up, try something like this. If it is faster, we can generate @rowid from @Size.

declare @now datetime = getutcdate()

declare @ID1 INT = 1518
--declare @Size INT = 100000
declare @rowid table(id INT) 
insert into @rowid VALUES (1),(100001),(200001),(300001),(400001),(500001),(600001),(700001),(800001),(900001),(1000001),(1100001),(1200001),(1300001),(1400001),(1500001)

select t.ID1, t.ID2 from (

select ID1
     , ID2
     , row_number() over(order by Id2) as rownum
  from table1 
 where ID1 = @ID1
) as t
where EXISTS(SELECT 1 FROM @rowid a WHERE a.id = t.rownum)

select datediff(ms, @now, getutcdate())
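If the hard-coded list turns out to be faster, one possible way to build @rowid from @Size is sketched below. The variables @total and @i are names I introduced, and the primary key on @rowid is my addition. The extra count(*) is another pass over the rows for @ID1, so whether this beats the original query would need to be measured.

```sql
declare @ID1 INT = 1518;
declare @Size INT = 100000;  -- must be >= 1, otherwise the loop never ends
declare @rowid table (id INT primary key);

-- Count the rows for this ID1 so we know where to stop.
declare @total INT;
select @total = count(*) from table1 where ID1 = @ID1;

-- Insert 1, 1 + @Size, 1 + 2 * @Size, ... up to @total.
declare @i INT = 1;
while @i <= @total
begin
    insert into @rowid (id) values (@i);
    set @i = @i + @Size;
end;

select t.ID1, t.ID2
from (
    select ID1, ID2, row_number() over (order by ID2) as rownum
    from table1
    where ID1 = @ID1
) as t
where exists (select 1 from @rowid a where a.id = t.rownum);
```

With @Size between 100K and 250K and a little over 1M rows, the loop runs only a handful of times. For @Size greater than 1 it selects the same row numbers as rownum % @Size = 1.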

Optimize getting every nth row in a table after filtering by ID
