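Elasticsearch NEST aggregation

Asked: 2017-12-22 09:29:50

Tags: c# elasticsearch nest

I did a lot of googling and checked the NEST and Elasticsearch documentation, but I could not find a working example or a solution to my problem.

I put together an example. In it, I want to query the number of distinct Last_Names and the sum of the salaries for each family.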

    class Employee
    {
        public string First_Name { get; set; }
        public string Last_Name { get; set; }
        public int Salary { get; set; }

        public Employee(string first_name, string last_name, int salary)
        {
            this.First_Name = first_name;
            this.Last_Name = last_name;
            this.Salary = salary;
        }
        public Employee() { }
    }
    private void button4_Click(object sender, EventArgs e)
    {
        // Create 4 employees
        Employee al = new Employee("Al", "Bundy", 1500);
        Employee bud = new Employee("Bud", "Bundy", 975);
        Employee marcy = new Employee("Marcy", "Darcy", 4500);
        Employee jefferson = new Employee("Jefferson", "Darcy", 0);

        // add the 4 employees to the index
        client.Index<Employee>(al);
        client.Index<Employee>(bud);
        client.Index<Employee>(marcy);
        client.Index<Employee>(jefferson);

        // query the index
        var result = client.Search<Employee>(s => s
            .Aggregations(a => a
                .Terms("Families", ts => ts
                    .Field(o => o.Last_Name)
                    .Size(10)
                    .Aggregations(aa => aa
                        .Sum("FamilySalary", sa => sa
                            .Field(o => o.Salary)
                        )
                    )
                )
            )
        );

        // Get the number of different families (Result should be 2: Bundy and Darcy)  
        // and get the family-salary of family Bundy and the family-salary for the Darcys
        var names = result.Aggs.Terms("Families");
        // ?? var x = names.Sum("Bundy");           
    }

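I need the following information from Elasticsearch:

  * there are 2 different families in the index
  * family Bundy earns 2475
  * family Darcy earns 4500

Please help.

1 Answer:

Answer 0 (score: 2)

Here is a complete example: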

// Requires: using System; using System.Text; using Nest;
private static void Main()
{
    var defaultIndex = "employees";

    var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
        .InferMappingFor<Employee>(i => i
            .IndexName(defaultIndex)
        )
        .DefaultIndex(defaultIndex)
        // following settings are useful while developing
        // but probably don't want to use them in production
        .DisableDirectStreaming()
        .PrettyJson()
        .OnRequestCompleted(callDetails =>
        {
            if (callDetails.RequestBodyInBytes != null)
            {
                Console.WriteLine(
                    $"{callDetails.HttpMethod} {callDetails.Uri} \n" +
                    $"{Encoding.UTF8.GetString(callDetails.RequestBodyInBytes)}");
            }
            else
            {
                Console.WriteLine($"{callDetails.HttpMethod} {callDetails.Uri}");
            }

            Console.WriteLine();

            if (callDetails.ResponseBodyInBytes != null)
            {
                Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                         $"{Encoding.UTF8.GetString(callDetails.ResponseBodyInBytes)}\n" +
                         $"{new string('-', 30)}\n");
            }
            else
            {
                Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                         $"{new string('-', 30)}\n");
            }
        });

    var client = new ElasticClient(settings);

    if (client.IndexExists(defaultIndex).Exists)
        client.DeleteIndex(defaultIndex);

    client.CreateIndex(defaultIndex, c => c
        .Settings(s => s
            .NumberOfShards(1)
        )
        .Mappings(m => m
            .Map<Employee>(mm => mm
                .AutoMap()
            )
        )
    );

    // Create 4 employees
    var al = new Employee("Al", "Bundy", 1500);
    var bud = new Employee("Bud", "Bundy", 975);
    var marcy = new Employee("Marcy", "Darcy", 4500);
    var jefferson = new Employee("Jefferson", "Darcy", 0);

    client.IndexMany(new [] { al, bud, marcy, jefferson });

    // refresh the index after indexing. We do this here for example purposes,
    // but in a production system, it's preferable to use the refresh interval
    // see https://www.elastic.co/blog/refreshing_news
    client.Refresh(defaultIndex);

    // query the index
    var result = client.Search<Employee>(s => s
        .Aggregations(a => a
            .Terms("Families", ts => ts
                .Field(o => o.Last_Name.Suffix("keyword")) // use the keyword sub-field for terms aggregation
                .Size(10)
                .Aggregations(aa => aa
                    .Sum("FamilySalary", sa => sa
                        .Field(o => o.Salary)
                    )
                )
            )
        )
    );

    // Get the number of different families (Result should be 2: Bundy and Darcy)  
    // and get the family-salary of family Bundy and the family-salary for the Darcys
    var names = result.Aggs.Terms("Families");

    foreach(var name in names.Buckets)
    {
        var sum = name.Sum("FamilySalary");
        Console.WriteLine($"* family {name.Key} earns {sum.Value}");
    }
}

public class Employee
{
    public string First_Name { get; set; }
    public string Last_Name { get; set; }
    public int Salary { get; set; }

    public Employee(string first_name, string last_name, int salary)
    {
        this.First_Name = first_name;
        this.Last_Name = last_name;
        this.Salary = salary;
    }
    public Employee() { }
}

This outputs

    * family Bundy earns 2475
    * family Darcy earns 4500

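A few points:

  1. I explicitly created the index with a mapping for Employee using automapping. The example would work without creating the index or an explicit mapping, but I added it for clarity so you can see what gets created in the console output. You can change how Employee is mapped to suit your needs.
  2. All documents are bulk indexed in a single request.
  3. The index is refreshed after bulk indexing. In a production system you typically do not call refresh after an indexing operation, because it causes a Lucene segment to be written to the underlying inverted index. There is a background process that merges segments, but having many small segments can be a problem, so it is better to let the refresh interval do its job. Refresh is called here only to make the documents available for search immediately after indexing.
  4. The terms aggregation should run on the keyword sub-field that automapping creates when mapping a string property. A keyword field datatype indexes the value verbatim and uses doc values, a columnar data structure that works well for aggregations and sorting.
  5. A terms aggregation contains a collection of buckets, where each bucket key is a term. Each bucket may have sub-aggregations; in this case, a sum aggregation.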
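
To make point 5 concrete, here is a minimal in-memory sketch of what the terms aggregation with a sum sub-aggregation computes. It uses plain LINQ rather than NEST, so it is only an analogy for the bucket structure, not what Elasticsearch executes. The class `FamilyAggregationSketch` and the method `SumSalaryByFamily` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FamilyAggregationSketch
{
    // Hypothetical helper: groups (last name, salary) pairs by last name and
    // sums the salaries. Each dictionary entry plays the role of one bucket:
    // the key is the term, the value is the result of the sum sub-aggregation.
    public static Dictionary<string, int> SumSalaryByFamily(
        IEnumerable<(string LastName, int Salary)> employees)
    {
        return employees
            .GroupBy(e => e.LastName)
            .ToDictionary(g => g.Key, g => g.Sum(e => e.Salary));
    }

    public static void Main()
    {
        // Same four employees as in the question.
        var employees = new[]
        {
            (LastName: "Bundy", Salary: 1500),
            (LastName: "Bundy", Salary: 975),
            (LastName: "Darcy", Salary: 4500),
            (LastName: "Darcy", Salary: 0),
        };

        var families = SumSalaryByFamily(employees);

        // The number of entries corresponds to the number of buckets.
        Console.WriteLine($"* there are {families.Count} different families");
        foreach (var family in families)
        {
            Console.WriteLine($"* family {family.Key} earns {family.Value}");
        }
    }
}
```

In the NEST example, the analogous count of distinct families would be the number of buckets returned, for instance `names.Buckets.Count`, assuming the bucket collection exposes a `Count` property. Because the aggregation is limited by `.Size(10)`, that count reflects all families only while there are at most 10 distinct last names.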