Elasticsearch NEST: partial/full-text search without wildcards

Date: 2016-10-03 13:18:05

Tags: c# .net elasticsearch nest

I want to provide a search engine for my users. Let's say the user class is:

public class User
{
    public string Code { get; set; }
    public string Name { get; set; }
}

My database contains users like these:

(1) new User { Code = "XW1234", Name = "John Doe" }, 
(2) new User { Code = "AD4567", Name = "Jane Doe" }

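So:

- when my query is "doe" (note the lower case), I want to see (1) and (2)
- when my query is "4", I want to see (1) and (2)
- when my query is "x", I want to see (1)
- when my query is "ja", I want to see (2)

I want this to work the same way as LIKE %doe% in SQL. Please don't mind the query length; in practice I require at least 3 letters. This is just an example.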
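
The expectations above can be pinned down as an in-memory reference: a case-insensitive "contains" check over both Code and Name. This is only a sketch of the target behavior, not an Elasticsearch query; ReferenceSearch and Contains are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string Code { get; set; }
    public string Name { get; set; }
}

public static class ReferenceSearch
{
    // Hypothetical helper: returns users whose Code or Name contains the query,
    // ignoring case. This mirrors LIKE '%query%' applied to both columns.
    public static IEnumerable<User> Contains(IEnumerable<User> users, string query)
    {
        return users.Where(u =>
            u.Code.IndexOf(query, StringComparison.OrdinalIgnoreCase) >= 0 ||
            u.Name.IndexOf(query, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}

public static class ReferenceSearchDemo
{
    public static void Main()
    {
        var users = new[]
        {
            new User { Code = "XW1234", Name = "John Doe" },
            new User { Code = "AD4567", Name = "Jane Doe" }
        };

        // Print the codes of the matching users for each sample query.
        foreach (var query in new[] { "doe", "4", "x", "ja" })
        {
            var codes = ReferenceSearch.Contains(users, query).Select(u => u.Code);
            Console.WriteLine(query + " -> " + string.Join(", ", codes));
        }
    }
}
```

For the two sample users, "doe" and "4" select both, "x" selects only XW1234, and "ja" selects only AD4567, which is the result the index-based search should reproduce.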

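I have a solution that uses wildcards. It works, but performance is poor.

I tried to configure the index to use an ngram tokenizer, but without success: I got an empty collection back.

I also tried this (the "starts with" approach): https://www.elastic.co/guide/en/elasticsearch/guide/current/_index_time_search_as_you_type.html, also without success.

Please provide C# code. I am not sure whether I translated the Elasticsearch JSON into C# correctly.

Edit: following the first comment, I tried this: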

private const string DefaultIndexName = "test";
private const string ElasticSearchServerUri = @"http://192.168.99.100:32769";

private static readonly IndexName UsersIndexName = "users";

public IElasticClient CreateElasticClient()
{
    var settings = CreateConnectionSettings();

    var client = new ElasticClient(settings);

    var studentsIndexDescriptor = new CreateIndexDescriptor(UsersIndexName)
        .Mappings(ms => ms
            .Map<User>(m => m
                .Properties(ps => ps
                    .String(s => s
                        .Name(n => n.Code)
                        .Analyzer("substring_analyzer")))));
    client.CreateIndex(UsersIndexName, descriptor => studentsIndexDescriptor
        .Settings(s => s
            .Analysis(a => a
                .Analyzers(analyzer => analyzer
                    .Custom("substring_analyzer", analyzerDescriptor => analyzerDescriptor.Tokenizer("standard").Filters("lowercase", "substring")))
                .TokenFilters(tf => tf
                    .NGram("substring", filterDescriptor => filterDescriptor.MinGram(2).MaxGram(15))))));

    return client;
}

private static ConnectionSettings CreateConnectionSettings()
{
    var uri = new Uri(ElasticSearchServerUri);
    var settings = new ConnectionSettings(uri);
    settings
        .DefaultIndex(DefaultIndexName);

    return settings;
}

I used this query:

public IEnumerable<User> Search(string query)
{
    var result = elasticClient.Search<User>(descriptor => descriptor
        .Query(q => q
            .QueryString(queryDescriptor => queryDescriptor.Query(query).Fields(fs => fs.Fields(f1 => f1.Code)))));
    return result.Documents;
}

It did not work.

I tried with the codes "1234" and "5678". Querying with "23" or "5" returned no results. When I search for "1234", it returns the correct user.

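1 Answer:

Answer 0 (score: 4)

I suspect that, with your code:

  1. When indexing users, you don't specify the users index, so the users are indexed into the default index.
  2. When searching, you don't specify the users index either, so the query runs against the default index, test. That index contains the indexed documents, but its code field has not been analyzed with substring_analyzer, because that analysis is defined only in the users index.

NEST provides a configuration option on ConnectionSettings, InferMappingFor<T>(), to associate a particular POCO type with a particular index name. That index is used when no index is specified on the request, and it takes precedence over the default index.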

    var uri = new Uri(ElasticSearchServerUri);
    var settings = new ConnectionSettings(uri)
        .DefaultIndex(DefaultIndexName)
        .InferMappingFor<User>(d => d
            .IndexName(UsersIndexName)
        );
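
The precedence described above (an index given on the request, then the index associated with the POCO type, then the default index) can be modelled in a few lines. This is a simplified sketch for illustration, not NEST's actual implementation; IndexResolution, Resolve, and the dictionary of type-to-index associations are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string Code { get; set; }
    public string Name { get; set; }
}

public static class IndexResolution
{
    // Hypothetical model: choose the index a request will target.
    // Order: explicit index on the request, then the per-type association,
    // then the default index.
    public static string Resolve(
        string requestIndex,
        Type documentType,
        IDictionary<Type, string> typeToIndex,
        string defaultIndex)
    {
        if (!string.IsNullOrEmpty(requestIndex))
            return requestIndex;

        string mapped;
        if (typeToIndex.TryGetValue(documentType, out mapped))
            return mapped;

        return defaultIndex;
    }
}

public static class IndexResolutionDemo
{
    public static void Main()
    {
        var noAssociations = new Dictionary<Type, string>();
        var withAssociation = new Dictionary<Type, string> { { typeof(User), "users" } };

        // Without an association, a User request falls back to the default index.
        Console.WriteLine(IndexResolution.Resolve(null, typeof(User), noAssociations, "test"));

        // With the association, the same request targets the users index.
        Console.WriteLine(IndexResolution.Resolve(null, typeof(User), withAssociation, "test"));
    }
}
```

In this model the first call resolves to test and the second to users, which corresponds to the difference between the original settings and the settings with InferMappingFor<User>().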
    

The rest of your code is correct. Here is a complete working example:

    private const string DefaultIndexName = "test";
    private const string ElasticSearchServerUri = @"http://localhost:9200";
    private const string UsersIndexName = "users";
    
    void Main()
    {
        var client = CreateElasticClient();
    
        var users = new[] {
            new User { Code = "XW1234", Name = "John Doe" },
            new User { Code = "AD4567", Name = "Jane Doe" }
        };
    
        client.IndexMany(users);
    
        // refresh the index after indexing so that the documents are immediately 
        // available for search. This is good for testing, 
        // but avoid doing it in production.
        client.Refresh(UsersIndexName);
    
        var result = client.Search<User>(descriptor => descriptor
            .Query(q => q
                .QueryString(queryDescriptor => queryDescriptor
                    .Query("1234")
                    .Fields(fs => fs
                        .Fields(f1 => f1.Code)
                    )
                )
            )
        );
    
        // outputs 1
        Console.WriteLine(result.Total);
    }
    
    public class User
    {
        public string Code { get; set; }
        public string Name { get; set; }
    }
    
    public IElasticClient CreateElasticClient()
    {
        var settings = CreateConnectionSettings();
        var client = new ElasticClient(settings);
    
        // Delete any existing index so that the example can be re-run.
        // Do not keep this in real code.
        if (client.IndexExists(UsersIndexName).Exists)
        {
            client.DeleteIndex(UsersIndexName);
        }
    
        client.CreateIndex(UsersIndexName, descriptor => descriptor
            .Mappings(ms => ms
                .Map<User>(m => m
                    .AutoMap()
                    .Properties(ps => ps
                        .String(s => s
                            .Name(n => n.Code)
                            .Analyzer("substring_analyzer")
                        )
                    )
                )
            )
            .Settings(s => s
                .Analysis(a => a
                    .Analyzers(analyzer => analyzer
                        .Custom("substring_analyzer", analyzerDescriptor => analyzerDescriptor
                            .Tokenizer("standard")
                            .Filters("lowercase", "substring")
                        )
                    )
                    .TokenFilters(tf => tf
                        .NGram("substring", filterDescriptor => filterDescriptor
                            .MinGram(2)
                            .MaxGram(15)
                        )
                    )
                )
            )
        );
    
        return client;
    }
    
    private static ConnectionSettings CreateConnectionSettings()
    {
        var uri = new Uri(ElasticSearchServerUri);
        var settings = new ConnectionSettings(uri)
            .DefaultIndex(DefaultIndexName)
            .InferMappingFor<User>(d => d
                .IndexName(UsersIndexName)
            );
    
        return settings;
    }
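
To see why a query such as "23" can match once the users index is the one being searched, it may help to approximate what substring_analyzer produces. The sketch below is a rough stand-in, not the Elasticsearch implementation: it assumes the standard tokenizer can be approximated by splitting on non-alphanumeric characters, and it treats a match as "the query and the indexed value share at least one term". SubstringAnalyzerSketch, Analyze, and Matches are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringAnalyzerSketch
{
    // Approximates the analyzer chain: tokenize, lowercase, then emit every
    // n-gram of each token whose length is between minGram and maxGram.
    public static ISet<string> Analyze(string text, int minGram = 2, int maxGram = 15)
    {
        var grams = new HashSet<string>();

        // Simplified tokenizer (assumption): split on anything that is not a letter or digit.
        var tokens = new string(text.Select(c => char.IsLetterOrDigit(c) ? c : ' ').ToArray())
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (var token in tokens.Select(t => t.ToLowerInvariant()))
        {
            for (var start = 0; start < token.Length; start++)
            {
                for (var length = minGram;
                     length <= maxGram && start + length <= token.Length;
                     length++)
                {
                    grams.Add(token.Substring(start, length));
                }
            }
        }

        return grams;
    }

    // Simplified matching (assumption): the same analysis is applied to the query,
    // and one shared term is enough for a match.
    public static bool Matches(string indexedValue, string query)
    {
        return Analyze(indexedValue).Overlaps(Analyze(query));
    }
}

public static class SubstringAnalyzerDemo
{
    public static void Main()
    {
        Console.WriteLine(string.Join(" ", SubstringAnalyzerSketch.Analyze("XW1234")));
        Console.WriteLine(SubstringAnalyzerSketch.Matches("XW1234", "23"));
        Console.WriteLine(SubstringAnalyzerSketch.Matches("AD4567", "1234"));
        Console.WriteLine(SubstringAnalyzerSketch.Matches("XW1234", "5"));
    }
}
```

In this model "23" is one of the grams of "xw1234", so it matches, while "1234" shares no gram with "ad4567". A single-character query such as "5" yields no grams at all with a minimum gram length of 2, so it matches nothing here; whether the real token filter behaves the same way for short tokens is worth checking against your Elasticsearch version.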