Storing data in two CSV strings vs. two DB tables for the fastest comparison

Time: 2017-08-14 18:10:09

Tags: sql sql-server csv

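The scenario is that we have two lists:

A: 23,45,g5,33

B: 11,12,45,g9

We want the fastest mechanism in SQL Server to check whether any value of B is in A. In this example, 45 is in A, so it must return true.

A solution should describe both how the lists are stored (CSV, table, etc.) and the comparison mechanism.

Each list is relatively small (about 10 values on average), but many comparisons are performed (few writes, many reads).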

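3 answers:

Answer 0: (score: 0)

I'm still confused about the core idea... but here is a simple solution that is better than comma-separated lists. Creating an index would of course make it faster. It is also faster than a loop.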


Split up (one row per value):

declare @table table (id char(4), v varchar(256))
insert into @table
values
('A','23'),
('A','45'),
('A','g5'),
('A','33'),
('B','11'),
('B','12'),
('B','45'),
('B','g9')


select distinct
    base.v
    --,base.*
    --,compare.*
from 
    @table base
inner join
    @table compare
    on compare.v = base.v
    and compare.id <> base.id

Using the dbo.DelimitedSplit8K function, with each list stored as a CSV string:

declare @table table (id char(4), v varchar(256))
insert into @table
values
('A','23,45,g5,33'),
('B','11,12,45,g9')

;with cte as(
    select 
        t.ID
        ,base.Item
    from 
        @table t
        cross apply dbo.DelimitedSplit8K(t.v,',') base)

select
    t.Item
from
    cte t
inner join
    cte x on 
    x.Item = t.Item
    and x.id <> t.id
where
    t.id = 'A'

Answer 1: (score: 0)

If you are stuck with delimited strings, consider the following:

Example:

Declare @YourTable Table ([ColA] varchar(50),[ColB] varchar(50))
Insert Into @YourTable Values 
 ('23,45,g5,33' ,'11,12,45,g9')
,('no,match'    ,'found,here')


Select * 
 from @YourTable A
 Cross Apply (
                Select Match=IsNull(sum(1),0)
                 From  [dbo].[udf-Str-Parse-8K](ColA,',') B1
                 Join  [dbo].[udf-Str-Parse-8K](ColB,',') B2 on B1.RetVal=B2.RetVal
             ) B

Returns

ColA          ColB          Match
23,45,g5,33   11,12,45,g9   1
no,match      found,here    0

The UDF, if interested:

CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(25))
Returns Table 
As
Return (  
    with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
           cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A ),
           cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter),
           cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S)

    Select RetSeq = Row_Number() over (Order By A.N)
          ,RetVal = LTrim(RTrim(Substring(@String, A.N, A.L)))
    From   cte4 A
);
--Original Source http://www.sqlservercentral.com/articles/Tally+Table/72993/
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||')
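
As a possible alternative to a custom splitter, the sketch below assumes SQL Server 2016 or later (database compatibility level 130 or higher), where the built-in STRING_SPLIT function is available. It is not part of the answer above; it only illustrates the same "split both strings and join" idea, with EXISTS so that the check can stop at the first match. The column alias HasMatch is an assumption made for illustration.

```sql
-- Same sample data as in the example above.
DECLARE @YourTable TABLE (ColA varchar(50), ColB varchar(50));
INSERT INTO @YourTable VALUES
 ('23,45,g5,33', '11,12,45,g9'),
 ('no,match',    'found,here');

-- STRING_SPLIT returns one row per item in a column named "value".
-- HasMatch is 1 when the two CSV strings share at least one item, else 0.
SELECT t.ColA,
       t.ColB,
       CASE WHEN EXISTS (
                SELECT 1
                FROM STRING_SPLIT(t.ColA, ',') AS a
                JOIN STRING_SPLIT(t.ColB, ',') AS b
                  ON LTRIM(RTRIM(b.value)) = LTRIM(RTRIM(a.value))
            ) THEN 1 ELSE 0 END AS HasMatch
FROM @YourTable AS t;
```

Unlike the UDF version, this returns a flag rather than a count of matches. Whether it is actually faster than a tally-based splitter would need to be measured on real data.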

Answer 2: (score: 0)

Based on the previous answer, I think it should look like this:

declare @table table (id char(4), v varchar(256))
insert into @table
values
('A','23'),
('A','45'),
('A','g5'),
('A','33'),
('B','11'),
('B','12'),
('B','45'),
('B','g9')


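-- EXISTS needs a query that returns no rows when there is no match.
-- An aggregate such as count(1) without GROUP BY always returns one row,
-- so "if exists(select count(1) ...)" would always print 'true'.
if exists( select 1
        from 
            @table base
        inner join
            @table compare
            on compare.v = base.v
            and base.id='A' and compare.id='B') 
    print 'true'
else
    print 'false'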

Whether the index should be on (id, v) or on (v, id) depends on how the data grows.
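
The indexing remark can be made concrete with a minimal sketch. It assumes the lists live in a permanent table rather than a table variable; the table name dbo.ListValue, the constraint and index names, and the output column HasCommonValue are all hypothetical, chosen only for illustration. It shows both index orders side by side; in practice one of them may be enough.

```sql
-- Hypothetical permanent table: one row per (list, value) pair.
CREATE TABLE dbo.ListValue
(
    id char(4)      NOT NULL,  -- list name, e.g. 'A' or 'B'
    v  varchar(256) NOT NULL,  -- one value belonging to that list
    CONSTRAINT PK_ListValue PRIMARY KEY CLUSTERED (id, v)  -- index on (id, v)
);

-- Optional second index leading on the value: index on (v, id).
CREATE NONCLUSTERED INDEX IX_ListValue_v_id ON dbo.ListValue (v, id);

INSERT INTO dbo.ListValue (id, v)
VALUES ('A','23'),('A','45'),('A','g5'),('A','33'),
       ('B','11'),('B','12'),('B','45'),('B','g9');

-- Returns 1 when lists A and B share at least one value, otherwise 0.
-- With the sample data, '45' is in both lists.
SELECT CASE WHEN EXISTS (
           SELECT 1
           FROM dbo.ListValue AS a
           JOIN dbo.ListValue AS b
             ON b.v = a.v
           WHERE a.id = 'A' AND b.id = 'B'
       ) THEN 1 ELSE 0 END AS HasCommonValue;
```

With only about 10 values per list, either index order is likely to be cheap for a single comparison. Comparing the actual execution plans on realistic data volumes is the safer way to choose.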