Question

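I'm testing Elasticsearch by indexing Wikipedia topics.

My settings are below.

I expect the first result to be the one that matches the exact string, especially when the string consists of a single word.

Instead:

Searching for "g":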

curl "http://localhost:9200/my_index/_search?q=name:g&pretty=True"

returns [Changgyeonggung, Lopadotemachoselachogaleokranioleipsanodrimhypotrimmatosilphioparaomelitokatakechymenokichlepikossyphophattoperisteralektryonoptekephalliokigklopeleiolagoiosiobaphetraganopterygon, ..] as the first results (yes, serendipity time! If you're curious, it's a Greek dish [http://nifty.works/about/BgdKMmwV6B3r4pXJ/] :)

I thought this was because the scoring gives more weight to the letter "g" than to other words... but:

Searching for "google":

curl "http://localhost:9200/my_index/_search?q=name:google&pretty=True"

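returns

[Googlewhack, IGoogle, Google+, Google, ...] as the first results, whereas I would expect Google to be first.

What is wrong in my settings that keeps the exact keyword, when it exists, from being the top hit?

I used separate index and search analyzers because of this answer: [https://stackoverflow.com/a/15932838/305883]

Settings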

# make index with mapping
curl -X PUT localhost:9200/test-ngram -d '
{
  "settings": {
      "analysis": {
          "analyzer": {
              "index_analyzer": {
                  "type" : "custom",
                  "tokenizer": "lowercase",
                  "filter": ["asciifolding", "title_ngram"]
              },
              "search_analyzer": {
                  "type": "custom",
                  "tokenizer": "standard",
                  "filter": ["standard", "lowercase", "stop", "asciifolding"]
              }
          },
      "filter": {
          "title_ngram" : {
            "type" : "nGram",
            "min_gram" : 1,
            "max_gram" : 10
            }
          }
      }
  },

  "mappings": {
    "topic": {
      "properties": {
        "name": {
          "type": "string",
          "boost": 10.0,
          "index": "analyzed",
          "index_analyzer": "index_analyzer",
          "search_analyzer": "search_analyzer"
        }
      }
    }
  }
}
'

Answer 1

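This is because relevance works differently by default (see the section on TF/IDF: https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html). If you want exact term matches at the top of the results while still matching substrings and so on, you need to index name as a multi-field, like this: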

"name": {
    "type": "string",
    "index": "analyzed",
    // other analyzer stuff here
    "fields": {
        "raw":   { "type": "string", "index": "not_analyzed" }
    }
}

Then, in your bool query, you need to query both name and name.raw, and boost the results from name.raw.
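A minimal sketch of what such a bool query could look like, assuming the my_index index from the question and the name.raw sub-field from the mapping above. The boost value of 10, the QUERY variable, and the search_exact_first helper are illustrative names and numbers, not something from the original setup. Since name.raw is not_analyzed, the term clause only matches the stored title exactly, including case, so it uses "Google" rather than "google".

```shell
# Hypothetical query body: match on the n-gram field, plus a boosted
# exact match on the not_analyzed sub-field (the boost of 10 is arbitrary).
QUERY='{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": "google" } },
        { "term":  { "name.raw": { "value": "Google", "boost": 10 } } }
      ]
    }
  }
}'

# Hypothetical helper: send the query to the index used in the question.
search_exact_first() {
  curl -s -X POST "http://localhost:9200/my_index/_search?pretty" -d "$QUERY"
}
```

Calling search_exact_first against a running node should rank a document whose name is exactly "Google" above partial matches such as Googlewhack, because it satisfies both should clauses while the others satisfy only the first.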
