How can I remove the bias toward shorter rows in MATCH / AGAINST?

Time: 2011-12-07 22:22:01

Tags: mysql match against

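I'm working on a simple search interface over a MyISAM table in MySQL, implemented with MATCH / AGAINST.

At first glance it seemed fine, but on closer inspection it appears to be biased toward shorter rows. I can only imagine this is because shorter rows get a higher score, since a higher percentage of their words match.

Below is the query I'm running against the MySQL database; the results, as displayed by the application, are in the screenshot below.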

SELECT
    report,
    status,
    GROUP_CONCAT(DISTINCT status) AS statuses,
    GROUP_CONCAT(DISTINCT docID) AS docIDs,
    GROUP_CONCAT(DISTINCT analyst) AS analysts,
    GROUP_CONCAT(DISTINCT region) AS regions,
    GROUP_CONCAT(DISTINCT country) AS countries,
    GROUP_CONCAT(DISTINCT topic) AS topics,
    GROUP_CONCAT(DISTINCT date) AS dates,
    MAX(date) AS date,
    MIN(date) AS mindate,
    MAX(docID) AS docID,
    GROUP_CONCAT(DISTINCT event) AS events,
    GROUP_CONCAT(DISTINCT rule) AS rules,
    GROUP_CONCAT(DISTINCT link SEPARATOR ' ') AS links,
    GROUP_CONCAT(DISTINCT province) AS provinces,
    MATCH (
        region, country, province, topic, event
    )
    AGAINST (
        'toxic china'
    ) AS score
FROM search_reports
GROUP BY report
ORDER BY score DESC

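To keep things simple while I work on this problem, I've left the AGAINST argument as a constant, so for now it only searches for 'toxic china'. Unexpectedly, some results that do not contain "china" are ranked higher than results that do contain that search keyword.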
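One thing that may be worth ruling out, offered as a sketch rather than a confirmed diagnosis: the query groups by report, but score is not aggregated. When ONLY_FULL_GROUP_BY is not enabled, MySQL is free to take a non-aggregated value from any row in each group, so the displayed score may come from a row other than the one that actually matched "china". The query below reuses the table and columns from the question; choosing MAX() as the aggregate is an assumption (SUM() or AVG() would rank differently).

```sql
-- Sketch: aggregate the relevance per report instead of letting MySQL
-- pick it from an arbitrary row of the group.
-- Table and column names come from the query above; MAX() is an assumed choice.
SELECT
    report,
    MAX(
        MATCH (region, country, province, topic, event)
        AGAINST ('toxic china')
    ) AS score
FROM search_reports
GROUP BY report
ORDER BY score DESC;
```

If the ordering changes noticeably with this version, the grouping rather than row length is probably contributing to the unexpected ranking.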

[Screenshot: Search Results]

1 answer:

Answer 0: (score: 1)

You might want to try IN BOOLEAN MODE:

AGAINST (
        'toxic china' IN BOOLEAN MODE
)

because that should be a true/false match on the terms.
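As a minimal sketch of how that could look in a full query (not necessarily what the answerer had in mind): in boolean mode a bare word is optional, while a leading + makes it required, so '+toxic +china' only matches rows that contain both words. The table and column names are taken from the question; the + operators and the WHERE filter are additions for illustration.

```sql
-- Sketch: require both words, then rank the remaining rows.
-- '+' marks a word as mandatory in boolean mode.
SELECT
    report,
    MATCH (region, country, province, topic, event)
    AGAINST ('+toxic +china' IN BOOLEAN MODE) AS score
FROM search_reports
WHERE MATCH (region, country, province, topic, event)
      AGAINST ('+toxic +china' IN BOOLEAN MODE)
ORDER BY score DESC;
```

Boolean-mode searches do not sort by relevance on their own, so the explicit ORDER BY is still needed, and the scores on MyISAM tend to be much coarser than in natural-language mode.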