sys.dm_fts_parser dot issue

Time: 2013-02-11 19:52:26

Tags: sql-server-2008 tsql full-text-search

Could you please help me figure out why the following T-SQL expression:

DECLARE @SearchWord varchar(max)
SET @SearchWord = '"I went to primary school in London "'
SELECT * FROM sys.dm_fts_parser('FormsOf(INFLECTIONAL, '+ @SearchWord + ')', 1033, 0, 0) 
where display_term in 
( SELECT display_term FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "go to school")', 1033, null, 0) )

returns

[image: query result]

whereas

DECLARE @SearchWord varchar(max)
SET @SearchWord = '"I went. to primary school in London "'
SELECT * FROM sys.dm_fts_parser('FormsOf(INFLECTIONAL, '+ @SearchWord + ')', 1033, 0, 0) 
where display_term in 
( SELECT display_term FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "go to school")', 1033, null, 0) )

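returns

[image: query result]

In other words, when a single dot is added somewhere in the search string, the corresponding occurrence values shift by 8. Is something wrong with the dot, or with my T-SQL expression? Thanks in advance!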

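2 answers:

Answer 0: (score: 2)

First, the documentation says that occurrence indicates order, not position. This means the values can be relative rather than absolute and still show the order correctly.

Next, judging by observation, the first digit of the number indicates the zero-based sentence number (except for the first sentence, where there is no leading zero). The "dot" is actually a full stop, which ends a sentence in English, so it is not surprising that it has some significance. Look at the output of this query and you will see the "End Of Sentence" special term: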

DECLARE @SearchWord varchar(max) = N'"I went. to primary school in London. it was a nice school. to go there was fun"'
SELECT * 
FROM sys.dm_fts_parser('FormsOf(INFLECTIONAL, '+ @SearchWord + ')', 1033, 0, 0)

If you run your query against a longer text with several sentences...

DECLARE @SearchWord varchar(max) = N'"I went. to primary school in London. it was a nice school. to go there was fun"'
SELECT * 
FROM sys.dm_fts_parser('FormsOf(INFLECTIONAL, '+ @SearchWord + ')', 1033, 0, 0)
where display_term in 
(SELECT display_term FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "go to school")', 1033, null, 0))

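...you can see that for sentences 1 and 2 occurrence is indeed also the position of the word, but for sentences 3 and 4 it is not. I don't know why this happens, and nothing in the documentation explains it, but since the documentation does not say that occurrence is the same as position, it is not surprising.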
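Since occurrence only promises order, one way to get a gap-free word index is to rank the distinct occurrence values. The following is a minimal sketch, not something taken from the documentation. The variable @Query and the alias word_no are made up for illustration, and the filter assumes that the end-of-sentence marker rows are the ones whose keyword is 0xFF.

```sql
-- Sketch: derive a consecutive word number from occurrence.
-- @Query and word_no are illustrative names, not part of the original query.
DECLARE @SearchWord nvarchar(3000) = N'"I went. to primary school in London "';
DECLARE @Query nvarchar(4000) = N'FormsOf(INFLECTIONAL, ' + @SearchWord + N')';

SELECT DENSE_RANK() OVER (ORDER BY p.occurrence) AS word_no, -- same number for all forms of one word
       p.occurrence,
       p.display_term,
       p.special_term
FROM sys.dm_fts_parser(@Query, 1033, 0, 0) AS p
WHERE p.keyword <> 0xFF   -- assumption: skips the "End Of Sentence" marker rows
ORDER BY p.occurrence, p.display_term;
```

DENSE_RANK is used rather than ROW_NUMBER because the inflectional expansion returns several rows with the same occurrence, and those rows should share one word number.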

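These questions may also be of interest:

Answer 1: (score: 0)

Following @Pondlife's suggestion to use PATINDEX() to find the position (rather than the occurrence), I was able to work around the problem above while still using dm_fts_parser. Whether or not @SearchWord contains a dot, the T-SQL below returns the position and length of the matching (INFLECTIONAL) forms in the given text: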

DECLARE @SearchWord nvarchar(max)

SET @SearchWord = N'"I went. to primary school in London "'

SELECT DISTINCT y.pos, y.lgth
FROM
(
    SELECT w.*,
           PATINDEX(N'%[^a-z]' + w.display_term + N'[^a-z]%', @SearchWord) AS pos,
           LEN(w.display_term) AS lgth
    FROM sys.dm_fts_parser(N'FormsOf(INFLECTIONAL, ' + @SearchWord + ')', 1033, 0, 0) w
    WHERE w.display_term IN
          (SELECT display_term FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "go to school")', 1033, 0, 0))
      AND PATINDEX(N'%[^a-z]' + w.display_term + N'[^a-z]%', @SearchWord) <> 0
) y

and returns the following result set:

[image: result set]
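Two details of the PATINDEX approach are worth keeping in mind: PATINDEX reports only the first match of a pattern, and because the pattern starts with [^a-z], the reported position is that of the character just before the term. If a form can appear more than once, the positions could be collected with CHARINDEX instead. This is only a sketch under stated assumptions: @Text and @Term are hypothetical stand-ins for the searched text and one display_term, and the plain substring search does no word-boundary check, so 'school' would also match inside 'schools'.

```sql
-- Sketch: list every start position of one term in a text.
-- @Text and @Term are illustrative; in practice @Term would come from display_term.
DECLARE @Text nvarchar(max) = N'"I went. to primary school in London. it was a nice school."';
DECLARE @Term nvarchar(100) = N'school';

WITH hits AS
(
    -- first match, or 0 if the term does not occur at all
    SELECT CAST(CHARINDEX(@Term, @Text) AS bigint) AS pos
    UNION ALL
    -- next match, searching from one character past the previous hit
    SELECT CAST(CHARINDEX(@Term, @Text, h.pos + 1) AS bigint)
    FROM hits AS h
    WHERE h.pos > 0
)
SELECT pos, LEN(@Term) AS lgth
FROM hits
WHERE pos > 0;
```

The recursion stops at the first 0 returned by CHARINDEX. With the default MAXRECURSION limit of 100, a text containing more matches than that would need an OPTION (MAXRECURSION n) hint.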