Question

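I am using the NEST client for Elasticsearch with an n-gram index analyzer. I am seeing some strange behavior: when I search for a word from its beginning, I get no results. However, if I search starting from the second character, it works perfectly. These are just ordinary English letters.

So, for example, it will find words containing 'kitty' if I search for 'itty', 'itt', 'tty', etc., but not for 'ki', 'kit', etc. It is almost as if the n-gram just skips the first character.

I am not sure whether this is caused by NEST or whether it is normal n-gram behavior. My index settings are similar to the ones in this post: Elasticsearch using NEST: How to configure analyzers to find partial words?, but my max-gram is only 10.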

Update

I simplified my code and verified the same behavior.

Here is the mapping configuration, defined using NEST:

const string index = "myApp";
const string type = "account";
const string indexAnalyzer = "custom_ngram_analyser";
const string searchAnalyzer = "standard";
const string tokenizer = "custom_ngram_tokenizer";
const string tokenFilter = "custom_ngram_tokenFilter";
...
client.CreateIndex(index, i => i
        .Analysis(ad => ad
            .Analyzers(a => a.Add(indexAnalyzer, new CustomAnalyzer() { Tokenizer = tokenizer }))
            .Tokenizers(t => t.Add(tokenizer, new NGramTokenizer() { MinGram = 1, MaxGram = 15 }))
            .TokenFilters(f => f.Add(tokenFilter, new NgramTokenFilter() { MinGram = 1, MaxGram = 15 })))
        .TypeName(type)
        .IdField(r => r.SetPath("accountId").SetIndex("not_analyzed").SetStored(true))
        .Properties(ps => ps.Number(p => p.Name(r => r.AccountId)
                                          .Index(NonStringIndexOption.not_analyzed)
                                          .Store(true))
                            .String(p => p.Name(r => r.AccountName)
                                          .Index(FieldIndexOption.analyzed)
                                          .IndexAnalyzer(indexAnalyzer)
                                          .SearchAnalyzer(searchAnalyzer)
                                          .Store(true)
                                          .TermVector(TermVectorOption.no))));

And here is the search that misses the first character:

SearchCriteria criteria = new SearchCriteria() { AccountName = "kitty" };

client.Search<SearchAccountResult>(s => s
    .Index(index)
    .Type(type)
    .Query(q => q.Bool(b => b.Must(d => d.Match(m => m.OnField(r => r.AccountName).QueryString(criteria.AccountName)))))
    .SortDescending("_score"));

Answer 1

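I ran into this problem because my index was originally case-sensitive, and all of my test data started with an uppercase letter.

I changed it to be case-insensitive, but the update did not take effect right away. Even though the analyzer appeared to be configured as case-insensitive, the index was not refreshed, presumably because documents that were already indexed keep the tokens produced by the old analyzer.

Deleting the index and repopulating it from scratch fixed the problem.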
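To illustrate why a case-sensitive n-gram index can look as if it skips the first character, here is a minimal sketch in plain C# that does not call NEST or Elasticsearch at all. The class `NGramCaseDemo` and its helpers `NGrams` and `Matches` are hypothetical names made up for this illustration. It assumes the index side stores every substring of the word unchanged, and that the `standard` search analyzer can be approximated, for a one-word query, by lowercasing the query.

```csharp
using System;
using System.Collections.Generic;

public static class NGramCaseDemo
{
    // Emit every substring whose length lies between minGram and maxGram,
    // roughly what an n-gram tokenizer produces for a single word.
    public static HashSet<string> NGrams(string text, int minGram, int maxGram)
    {
        var grams = new HashSet<string>(StringComparer.Ordinal);
        for (int start = 0; start < text.Length; start++)
        {
            for (int len = minGram; len <= maxGram && start + len <= text.Length; len++)
            {
                grams.Add(text.Substring(start, len));
            }
        }
        return grams;
    }

    // Simplified stand-in for the "standard" search analyzer on a one-word
    // query: the only step modelled here is lowercasing.
    public static bool Matches(HashSet<string> indexedGrams, string query)
    {
        return indexedGrams.Contains(query.ToLowerInvariant());
    }

    public static void Main()
    {
        // Index side without lowercasing: the grams keep the capital K.
        var caseSensitive = NGrams("Kitty", 1, 15);
        // Index side with the text lowercased first; for this example that
        // gives the same grams as lowercasing each gram afterwards.
        var lowercased = NGrams("Kitty".ToLowerInvariant(), 1, 15);

        foreach (var query in new[] { "ki", "kit", "itt", "tty" })
        {
            Console.WriteLine("{0,-4} case-sensitive: {1,-5} lowercased: {2}",
                query, Matches(caseSensitive, query), Matches(lowercased, query));
        }
    }
}
```

In this sketch only the queries that include the first letter ('ki', 'kit') miss against the case-sensitive grams, because the stored grams are 'Ki' and 'Kit' while the query has been lowercased. Grams taken from the rest of the word contain no uppercase letter, so 'itt' and 'tty' match either way. That matches the symptom described in the question when the test data starts with a capital letter.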

Elasticsearch - Nest - Missing first character