MySQL full-text search: combining distance with a wildcard

Time: 2018-04-19 14:30:19

Tags: mysql full-text-search wildcard distance

My data looks like this:

[column "content"]
The quick red horse jumps over the quick dog
The quick brown horse
The quick brown horse jumps over the lazy dog
The quick brown horses jumps over the dog
quick as a mouse was the spider. The horse is brown.
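
To try the queries below locally, the sample rows can be loaded with something like the following. This is only a sketch: the table name `news` and the column `content` come from the queries in this question, while the `id` column, the column types, the index name `ft_content` and the choice of InnoDB are assumptions (the `@distance` proximity operator is only available on InnoDB full-text indexes).

```sql
-- Hypothetical schema: only `news` and `content` are taken from the question.
CREATE TABLE news (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  content TEXT NOT NULL,
  FULLTEXT KEY ft_content (content)
) ENGINE=InnoDB;

-- The five sample rows listed above.
INSERT INTO news (content) VALUES
  ('The quick red horse jumps over the quick dog'),
  ('The quick brown horse'),
  ('The quick brown horse jumps over the lazy dog'),
  ('The quick brown horses jumps over the dog'),
  ('quick as a mouse was the spider. The horse is brown.');
```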

I use MATCH and AGAINST to get all rows containing "horse" or "horses" together with "quick". So I know that the wildcard * works in BOOLEAN MODE.

SELECT * FROM news
WHERE   (MATCH (content) AGAINST ('+quick +horse*' IN BOOLEAN MODE));

With the next query I get all rows containing "horses" (plural) and "quick" within a distance of at most 3.

SELECT * FROM news
WHERE   (MATCH (content) AGAINST  ('"quick horses" @3' IN BOOLEAN MODE));

Now both combined: all rows containing "horse" or "horses" and "quick" within a distance of at most 3.

SELECT * FROM news
WHERE   (MATCH (content) AGAINST  ('"quick horse*" @3' IN BOOLEAN MODE));

The result set only contains the rows with "horse". The row with "horses" is not included!

For the full example, see: http://sqlfiddle.com/#!9/033e02/6

Does anyone have an idea?

1 answer:

Answer 0: (score: 0)

After finding out that this is a documented bug in MySQL, I looked for another way. Bug: https://bugs.mysql.com/bug.php?id=80723
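
If the word forms are known in advance, one way to sidestep the wildcard is to run one proximity search per form and OR the results together. This is only a sketch built from the two query shapes in the question; it has not been run against the fiddle, and it does not help with open-ended prefixes.

```sql
-- One proximity search per known word form, combined with OR.
-- Assumes only "horse" and "horses" need to be covered.
SELECT * FROM news
WHERE MATCH (content) AGAINST ('"quick horse" @3' IN BOOLEAN MODE)
   OR MATCH (content) AGAINST ('"quick horses" @3' IN BOOLEAN MODE);
```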

Idea: RegEx. Again, this is a hard road, because MySQL currently supports only a subset of the usual regular expression features.

Reference to groups in a MySQL regex?

https://dev.mysql.com/doc/refman/5.5/en/regexp.html

After many experiments, this worked for me: http://sqlfiddle.com/#!9/7007ac7/1

SELECT * FROM news
WHERE content REGEXP 'quick([[:space:][:punct:]])*(((([[:alnum:]])*)*[[:space:][:punct:]]){1,3})horse';
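
A REGEXP condition on its own cannot use the full-text index, so on a larger table it may be worth narrowing the candidates with the boolean full-text query from the question first and applying the regular expression only as the distance check. The following is a sketch under that assumption; it reuses the two conditions shown above unchanged and has not been benchmarked.

```sql
-- Step 1: the full-text index selects rows containing "quick" and a word starting with "horse".
-- Step 2: the regular expression keeps only rows where at most three
--         word-plus-separator tokens sit between "quick" and "horse".
SELECT * FROM news
WHERE MATCH (content) AGAINST ('+quick +horse*' IN BOOLEAN MODE)
  AND content REGEXP 'quick([[:space:][:punct:]])*(((([[:alnum:]])*)*[[:space:][:punct:]]){1,3})horse';
```

With the sample data, the last row ("quick as a mouse was the spider. The horse is brown.") passes the full-text condition but should be dropped by the regular expression, because too many words separate "quick" from "horse".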