SQL Server full-text search with prefix and noise words

Date: 2014-06-02 16:04:21

Tags: sql sql-server full-text-search prefix stop-words

I don't understand why Query 1 and Query 2 below do not return similar result sets.

Sample

Query 1:

select * from dbo.tFTS_test where contains (*, '"qwe-asd*"')

Returns:

Id          Value
----------- ----------

(0 row(s) affected)

Query 2:

select * from dbo.tFTS_test where contains (*, '"qwe-asd"')

Returns:

Id          Value
----------- ----------
Informational: The full-text search condition contained noise word(s).
1           qwe-asd

(1 row(s) affected)

Table:

select * from dbo.tFTS_test

Returns:

Id          Value
----------- ----------
1           qwe-asd

(1 row(s) affected)

Here are some helper queries.

Helper query 1:

select * from sys.dm_fts_index_keywords_by_document (db_id(), object_id('dbo.tFTS_test'))

Returns:

keyword                         display_term  column_id   document_id          occurrence_count
------------------------------- ------------- ----------- -------------------- ----------------
0x007100770065                  qwe           2           1                    1
0x007100770065002D006100730064  qwe-asd       2           1                    1
0xFF                            END OF FILE   2           1                    1

(3 row(s) affected)

Helper query 2:

select p.*
from sys.fulltext_stoplists s
cross apply sys.dm_fts_parser ('"qwe-asd"', 1033, s.stoplist_id, 0) p
where s.name = 'FTS_test_stoplist'

Returns:

keyword                         group_id    phrase_id   occurrence  special_term     display_term  expansion_type source_term
------------------------------- ----------- ----------- ----------- ---------------- ------------- -------------- -----------
0x007100770065002D006100730064  1           0           1           Exact Match      qwe-asd       0              qwe-asd
0x007100770065                  1           0           1           Exact Match      qwe           0              qwe-asd
0x006100730064                  1           0           2           Noise Word       asd           0              qwe-asd

(3 row(s) affected)

Helper query 3:

select p.*
from sys.fulltext_stoplists s
cross apply sys.dm_fts_parser ('"qwe-asd*"', 1033, s.stoplist_id, 0) p
where s.name = 'FTS_test_stoplist'

Returns:

keyword                         group_id    phrase_id   occurrence  special_term     display_term  expansion_type source_term
------------------------------- ----------- ----------- ----------- ---------------- ------------- -------------- -----------
0x007100770065002D006100730064  1           0           1           Exact Match      qwe-asd       0              qwe-asd
0x007100770065                  1           0           1           Exact Match      qwe           0              qwe-asd
0x006100730064                  1           0           2           Exact Match      asd           0              qwe-asd

(3 row(s) affected)

The structure is as follows:

-- ****************************
-- Step 1. Cleanup FTS Structure
-- ****************************

if exists (select 1 from sys.fulltext_indexes where object_id = object_id('dbo.tFTS_test'))
    drop fulltext index on dbo.tFTS_test;
go
if exists (select 1 from sys.fulltext_catalogs where name = 'FTS_test')
    drop fulltext catalog FTS_test;
go
if exists (select 1 from sys.fulltext_stoplists where name = 'FTS_test_stoplist')
    drop fulltext stoplist FTS_test_stoplist;
go
if object_id ('dbo.tFTS_test') is not null
    drop table dbo.tFTS_test;
go

-- ****************************
-- Step 2. Create FTS Structure
-- ****************************

create table dbo.tFTS_test (
    Id int not null,
    Value varchar(100) not null,
    constraint [PK_tFTS_test] primary key clustered (Id asc)
);
go
create fulltext stoplist FTS_test_stoplist from system stoplist;
go
alter fulltext stoplist FTS_test_stoplist add 'asd' language 'English';
go
create fulltext catalog FTS_test with accent_sensitivity = off;
go
create fulltext index on dbo.tFTS_test (Value language English) key index PK_tFTS_test on (FTS_test);
go
if not exists (
    select 1
    from sys.fulltext_indexes i
    inner join sys.fulltext_stoplists l on l.stoplist_id = i.stoplist_id
    where i.object_id = object_id('dbo.tFTS_test') and l.name = 'FTS_test_stoplist'
)
    alter fulltext index on dbo.tFTS_test set stoplist FTS_test_stoplist;
go
insert into dbo.tFTS_test (Id, Value) values (1, 'qwe-asd');
go

P.S. Sorry for such a long question.

1 answer:

Answer 0 (score: 0)

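The difference is caused by the hyphen: full-text search treats your query string as two words rather than one. In addition, because "asd" is a noise word, it is not found.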
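To confirm that "asd" really is registered as a noise word, the stoplist contents can be inspected directly. The following is a minimal sketch, assuming the stoplist name FTS_test_stoplist from the question's setup script; it uses the catalog views sys.fulltext_stoplists and sys.fulltext_stopwords.

```sql
-- Assumption: the custom stoplist is named FTS_test_stoplist, as in the question.
-- Lists every language entry under which 'asd' is registered as a stopword.
select sl.name as stoplist_name,
       sw.stopword,
       sw.[language],
       sw.language_id
from sys.fulltext_stoplists as sl
inner join sys.fulltext_stopwords as sw
    on sw.stoplist_id = sl.stoplist_id
where sl.name = 'FTS_test_stoplist'
  and sw.stopword = 'asd';
```

If this returns no row, the stopword was probably not added for the language the index uses, and the noise-word explanation would not apply.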

When the word breaker encounters a word-breaking character in a term, it parses that character as a white-space character.

Word-breaking characters include the following:

  • $ (dollar sign)
  • , (comma)
  • & (ampersand)
  • # (number sign)
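How the word breaker actually handles these characters can be checked with sys.dm_fts_parser. This is only a sketch: the test strings are hypothetical variations on the question's value, 1033 is the English LCID used in the question, and NULL as the stoplist argument means no stoplist is applied, so only the word breaker's behavior shows up. The hyphenated term from the question is included for comparison. Note that sys.dm_fts_parser requires sysadmin membership.

```sql
-- Hypothetical test strings; only '"qwe-asd"' comes from the question.
-- Arguments: query string, LCID 1033 (English), NULL = no stoplist, 0 = accent-insensitive.
select t.term,
       p.occurrence,
       p.special_term,
       p.display_term
from (values ('"qwe$asd"'),
             ('"qwe,asd"'),
             ('"qwe&asd"'),
             ('"qwe#asd"'),
             ('"qwe-asd"')) as t (term)
cross apply sys.dm_fts_parser(t.term, 1033, null, 0) as p
order by t.term, p.occurrence;
```

Comparing the rows per term shows which separators split the string into separate tokens and which forms produce an additional combined token.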

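When the word breaker encounters a hyphen (-) in a term, it parses the term correctly. However, the full-text thesaurus component treats the hyphenated characters, together with the hyphen itself, as null characters. For example, if the original term is "well-known celebrity," the term appears as "celebrity" in the thesaurus file.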

This is from the Microsoft website. It does not describe the same problem, but one that shares the same root cause:

You obtain incorrect results when you run a full-text search query that uses a thesaurus file in SQL Server 2005