Elasticsearch: SKU analysis and search

Time: 2015-12-30 09:22:26

Tags: elasticsearch query-analyzer

This is driving me crazy; I have already tried everything I can think of. Here is the situation. I need to:

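  1. Convert all look-alike Russian letters to their English counterparts (both at analysis time and in the search request)
  2. Remove all non-letter and non-digit characters
  3. Use ngrams, because search tokens can come from anywhere in the string
  4. For example, you can search for 8009. I have the SKUs ALK-8009 and ALK-8022, and I can't understand why ALK-8022 scores higher than ALK-8009.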

          index_options: {
            settings: {
              index: {
                analysis: {
                  char_filter: {
                    russian_transliteration: {
                      type: "mapping",
                      mappings: ["а=>a",
                                 "в=>b",
                                 "е=>e",
                                 "к=>k",
                                 "м=>m",
                                 "н=>h",
                                 "о=>o",
                                 "р=>p",
                                 "с=>c",
                                 "т=>t",
                                 "у=>u",
                                 "х=>x"]
                    },
                    pattern_replace_char_filter: {
                      type: "pattern_replace",
                      pattern: "(\\s|[-_\/\.\(\)])*",
                      replacement: ""
                    }
                  },
                  tokenizer: {
                    sku_tokenizer: {
                      type: "nGram",
                      min_gram: 4,
                      max_gram: 15
                    },
                    sku_search_tokenizer: {
                      type: "edgeNGram",
                      min_gram: 4,
                      max_gram: 15
                    }
                  },
                  analyzer: {
                    sku_analyzer: {
                      type: "custom",
                      tokenizer: "sku_tokenizer",
                      char_filter: ["russian_transliteration","pattern_replace_char_filter"],
                      filter: ['lowercase']
                    },
                    sku_search_analyzer: {
                      type: "custom",
                      tokenizer: "sku_search_tokenizer",
                      char_filter: ["russian_transliteration","pattern_replace_char_filter"],
                      filter: ['lowercase']
                    }
                  }
                }
              }
            }
          },
          index_mappings: {
             sku: {
                  type: 'string',
                  analyzer: 'sku_analyzer',
                  fields: {
                    search: {type: 'string', analyzer: 'sku_search_analyzer'},
                    suggest:  {type: 'completion'}
                  }
              }
          }
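
To make it easier to see what `sku_analyzer` is meant to put into the index, here is a rough Python sketch of the same chain: the two char filters, lowercasing, then n-grams of length 4 to 15. It is only a simulation under assumptions, not Elasticsearch itself: the helper names (`normalize`, `ngrams`, `sku_analyzer`) are made up for illustration, lowercasing is applied before tokenizing rather than per token (the resulting tokens are the same), and the token order is not meant to mirror Lucene.

```python
import re

# Look-alike Cyrillic -> Latin pairs, copied from the russian_transliteration char filter.
# Only lowercase Cyrillic letters are listed, exactly as in the mapping above.
TRANSLIT = str.maketrans({
    "а": "a", "в": "b", "е": "e", "к": "k", "м": "m", "н": "h",
    "о": "o", "р": "p", "с": "c", "т": "t", "у": "u", "х": "x",
})

# Same character set as pattern_replace_char_filter: whitespace and - _ / . ( )
STRIP = re.compile(r"[\s\-_/.()]+")


def normalize(text):
    """Apply both char filters in order, then lowercase."""
    return STRIP.sub("", text.translate(TRANSLIT)).lower()


def ngrams(text, min_gram=4, max_gram=15):
    """Every substring whose length is between min_gram and max_gram."""
    return [
        text[i:i + n]
        for i in range(len(text))
        for n in range(min_gram, max_gram + 1)
        if i + n <= len(text)
    ]


def sku_analyzer(text):
    """Hypothetical stand-in for the sku_analyzer defined in the settings."""
    return ngrams(normalize(text))


if __name__ == "__main__":
    print(normalize("ALK-8009"))
    print(sku_analyzer("ALK-8009"))
```

In this simulation the token `8009` is produced for ALK-8009 but not for ALK-8022. Note also that an uppercase Cyrillic letter is not transliterated, because the mapping runs before lowercasing and lists lowercase letters only.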
    

This is my search query:

    {query: 
      {bool: 
        {should: 
          [{prefix: {sku: {value: "SEARCH-STRING", boost: 2}}}, 
           {match: {sku: {query: "SEARCH-STRING", boost: 1, fuzziness: 0}}}]
    }}}
    

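What I expect to get is only those results whose SKU contains the complete search string, not just part of it.

For example, ALK-80 is converted to alk80, and only results containing alk80 are the ones I need.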
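
The difference between "part of the search string" and "the complete search string" can be sketched in Python. The sketch assumes that the `match` query on `sku` analyzes the query text with the same n-gram analyzer used at index time (the mapping sets no separate search analyzer on `sku`, and the `sku.search` subfield is not queried) and that the default OR operator applies, so a single shared gram is enough for a hit. The second function shows a hypothetical alternative in which the query is normalized but kept as one token. All function names are invented for illustration, and scoring is not modelled at all.

```python
import re

# Cyrillic look-alikes and their Latin counterparts, in the same order as the mapping.
TRANSLIT = str.maketrans("авекмнорстух", "abekmhopctux")
STRIP = re.compile(r"[\s\-_/.()]+")


def normalize(text):
    """Transliterate, strip separators, lowercase."""
    return STRIP.sub("", text.translate(TRANSLIT)).lower()


def ngrams(text, min_gram=4, max_gram=15):
    """Set of all substrings with min_gram <= length <= max_gram."""
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


def matches_any_gram(query, sku):
    """Query is n-grammed as well; one gram in common counts as a match (OR)."""
    return bool(ngrams(normalize(query)) & ngrams(normalize(sku)))


def matches_whole_query(query, sku):
    """Hypothetical alternative: the normalized query stays a single token."""
    return normalize(query) in ngrams(normalize(sku))


skus = ["ALK-8009", "ALK-8022", "ALK-7009", "BRT-8009"]

if __name__ == "__main__":
    for query in ("ALK-8009", "ALK-80"):
        print(query, [s for s in skus if matches_any_gram(query, s)])
        print(query, [s for s in skus if matches_whole_query(query, s)])
```

With the single-token variant, a query only matches when its normalized form is itself one of the indexed grams, so it has to be between 4 and 15 characters long after normalization.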

0 answers:

No answers yet