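Elasticsearch NEST not matching correctly

Date: 2016-03-31 13:06:44

Tags: elasticsearch nest

Using Elasticsearch and NEST 2.x.

Based on some (crazy) user requirements, I need to copy all searchable fields into a single field, lowercased and with whitespace removed. When the user types something to search for, I lowercase it and strip the whitespace, then use the result as the search string.

For example: "The Quick Brown Fox" ... in Elasticsearch I want this to be "thequickbrownfox" for search purposes.

The following searches should all match the document above: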

  • thequick
  • rown
  • NF
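
To make the requirement concrete, here is a minimal plain C# sketch of the intended behavior, independent of Elasticsearch. The SearchNormalizer class and its Normalize method are hypothetical names used only for illustration; the sketch assumes "match" means a simple substring test on the normalized text.

```csharp
using System;
using System.Linq;

public static class SearchNormalizer
{
    // Hypothetical helper: drop every whitespace character, then lowercase.
    public static string Normalize(string text) =>
        new string(text.Where(c => !char.IsWhiteSpace(c)).ToArray()).ToLowerInvariant();

    public static void Main()
    {
        string indexed = Normalize("The Quick Brown Fox");
        Console.WriteLine(indexed); // thequickbrownfox

        // Each search string is normalized the same way and tested as a substring.
        foreach (var query in new[] { "thequick", "rown", "NF" })
        {
            bool matches = indexed.Contains(Normalize(query));
            Console.WriteLine($"{query}: {matches}"); // True for all three
        }
    }
}
```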

Here is how I create the index:

var customerSearchIdxDesc = new CreateIndexDescriptor(Constants.ElasticSearch.CustomerSearchIndexName)
    .Settings(f =>
        f.Analysis(analysis => analysis
                .Analyzers(analyzers => analyzers
                    .Custom(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram, a => a
                        .Filters("lowercase")
                        .Tokenizer(Constants.ElasticSearch.TokenizerNames.NoWhitespaceNGram)))
                .Tokenizers(tokenizers => tokenizers
                        .NGram(Constants.ElasticSearch.TokenizerNames.NoWhitespaceNGram, t => t
                            .MinGram(1)
                            .MaxGram(500)
                            .TokenChars(TokenChar.Digit, TokenChar.Letter, TokenChar.Punctuation, TokenChar.Symbol)
                        )
                )
        )
    )
    .Mappings(ms => ms.Map<ServiceModel.DtoTypes.Customer.SearchResult>(m => m
        .AutoMap()
        .Properties(p => p
            .String(n => n.Name(c => c.CustomerName).CopyTo(f =>
            {
                return new FieldsDescriptor<string>().Field("search");
            }).Index(FieldIndexOption.Analyzed).Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            .String(n => n.Name(c => c.ContactName)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            .String(n => n.Name(c => c.CustomerName)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            .String(n => n.Name(c => c.City)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            .String(n => n.Name(c => c.StateAbbreviation)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            .String(n => n.Name(c => c.Country)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            .String(n => n.Name(c => c.PostalCode)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            .String(n => n.Name(Constants.ElasticSearch.CombinedSearchFieldName)
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.LowercaseNGram))
            )
        )
    );

As you can see, I use the lowercase filter on the analyzer and restrict TokenChars so that whitespace is left out (well, that's the idea; it doesn't work).
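
One possible explanation for this: in the Elasticsearch n-gram tokenizer, token_chars lists the character classes that are kept inside tokens, and the tokenizer splits on any character outside those classes. Leaving whitespace out of the list therefore turns spaces into split points instead of removing them. The following is a plain C# simulation of that behavior, not NEST code; NGramSketch and WordBoundNGrams are hypothetical names, and for brevity only letters and digits are treated as token characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NGramSketch
{
    // Hypothetical simulation: split on every character that is not a letter
    // or digit, then emit all n-grams of each remaining chunk, lowercased.
    public static HashSet<string> WordBoundNGrams(string text, int minGram, int maxGram)
    {
        var grams = new HashSet<string>();
        var chunks = new string(text.Select(c => char.IsLetterOrDigit(c) ? c : ' ').ToArray())
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (var chunk in chunks)
            for (int start = 0; start < chunk.Length; start++)
                for (int len = minGram; len <= maxGram && start + len <= chunk.Length; len++)
                    grams.Add(chunk.Substring(start, len).ToLowerInvariant());

        return grams;
    }

    public static void Main()
    {
        var grams = WordBoundNGrams("The Quick Brown Fox", 1, 500);
        Console.WriteLine(grams.Contains("rown"));     // True: lies inside "brown"
        Console.WriteLine(grams.Contains("nf"));       // False: spans "brown" and "fox"
        Console.WriteLine(grams.Contains("thequick")); // False: spans "the" and "quick"
    }
}
```

Under this assumption, no n-gram ever crosses a word boundary, which would be consistent with searches appearing to match only within individual words.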

Here is what I use to search:

var response = client.Search<DtoTypes.Customer.SearchResult>(s =>
    s.From(0)
    .Take(Constants.ElasticSearch.MaxResults)
    .Query(q => q
        .MatchPhrase(mp => mp
            .Field(Constants.ElasticSearch.CombinedSearchFieldName)
            .Query(query))));

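So here are the problems:

  • Whitespace does not seem to be omitted (it looks like it only matches within individual words).
  • Partial matching only seems to work for suffixes. For example, searching for "aby" does not match "abyss", but "yss" does.
  • Searching for a whole word does not work, e.g. "quick" ... and searching for "theq" matches nothing.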

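1 Answer:

Answer 0 (score: 0)

I believe this solved my problem: I added a character filter, attached it to the analyzer, and switched to the EdgeNGram tokenizer. I don't know whether this is the optimal setup, but it seems to work.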

var customerSearchIdxDesc = new CreateIndexDescriptor(Constants.ElasticSearch.CustomerSearchIndexName)
    .Settings(f =>
        f.Analysis(analysis => analysis
                .CharFilters(cf => cf
                    .PatternReplace(Constants.ElasticSearch.FilterNames.RemoveWhitespace, pr => pr
                        .Pattern("\\s+") // match any run of whitespace (tabs, newlines), not only single spaces
                        .Replacement(string.Empty)
                    )
                )
                .Analyzers(analyzers => analyzers
                    .Custom(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer, a => a
                        .Filters("lowercase")
                        .CharFilters(Constants.ElasticSearch.FilterNames.RemoveWhitespace)
                        .Tokenizer(Constants.ElasticSearch.TokenizerNames.DefaultTokenizer)
                    )
                )
                .Tokenizers(tokenizers => tokenizers
                        .EdgeNGram(Constants.ElasticSearch.TokenizerNames.DefaultTokenizer, t => t
                            .MinGram(1)
                            .MaxGram(500)
                        )
                )
        )
    )
    .Mappings(ms => ms.Map<ServiceModel.DtoTypes.Customer.SearchResult>(m => m
        .AutoMap()
        .Properties(p => p
            .String(n => n.Name(c => c.CustomerName).CopyTo(f =>
            {
                return new FieldsDescriptor<string>().Field("search");
            }).Index(FieldIndexOption.Analyzed).Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            .String(n => n.Name(c => c.ContactName)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            .String(n => n.Name(c => c.CustomerName)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            .String(n => n.Name(c => c.City)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            .String(n => n.Name(c => c.StateAbbreviation)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            .String(n => n.Name(c => c.Country)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            .String(n => n.Name(c => c.PostalCode)
                        .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            .String(n => n.Name(Constants.ElasticSearch.CombinedSearchFieldName)
                        .Index(FieldIndexOption.Analyzed)
                        .Analyzer(Constants.ElasticSearch.AnalyzerNames.DefaultAnalyzer))
            )
        )
    );
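
As a rough illustration of what this analyzer chain would produce, the plain C# sketch below simulates the three steps (strip spaces, edge n-grams, lowercase) outside Elasticsearch. EdgeNGramSketch and Analyze are hypothetical names, and the simulation assumes the whole field value reaches the tokenizer as a single string.

```csharp
using System;
using System.Collections.Generic;

public static class EdgeNGramSketch
{
    // Hypothetical simulation of the analyzer chain from the answer.
    public static List<string> Analyze(string text, int minGram, int maxGram)
    {
        // Char filter step: strip spaces before tokenizing.
        string stripped = text.Replace(" ", string.Empty);

        // Edge n-gram step: only prefixes of the input are emitted,
        // then the lowercase filter is applied to each token.
        var tokens = new List<string>();
        for (int len = minGram; len <= Math.Min(maxGram, stripped.Length); len++)
            tokens.Add(stripped.Substring(0, len).ToLowerInvariant());

        return tokens;
    }

    public static void Main()
    {
        var tokens = Analyze("The Quick Brown Fox", 1, 500);
        Console.WriteLine(tokens.Count);                // 16
        Console.WriteLine(tokens.Contains("thequick")); // True: a prefix
        Console.WriteLine(tokens.Contains("rown"));     // False: not a prefix
    }
}
```

If this simulation reflects the real token stream, prefix searches such as "thequick" would be covered, while infix searches such as "rown" or "NF" from the question might not be. It may be worth testing those cases against the actual index, and trying the NGram tokenizer together with the same character filter if they fail.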