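Get the top n values per group in Elasticsearch

Time: 2015-10-23 15:16:10

Tags: sorting elasticsearch sum aggregation

I need to get the top n users in Elasticsearch, ranked by the sum of a numeric field across different dates.

For example, to get the top 2 for the following documents: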

doc1 -> user_id: 1, name: hasan, numeric_field: 2, date_calculated: 03-10-2015
doc2 -> user_id: 2, name: veli, numeric_field: 3, date_calculated: 03-10-2015
doc3 -> user_id: 3, name: osman, numeric_field: 1, date_calculated: 03-10-2015
doc4 -> user_id: 1, name: hasan, numeric_field: 3, date_calculated: 04-10-2015
doc5 -> user_id: 2, name: veli, numeric_field: 5, date_calculated: 04-10-2015
doc6 -> user_id: 3, name: osman, numeric_field: 7, date_calculated: 04-10-2015
doc7 -> user_id: 1, name: hasan, numeric_field: 5, date_calculated: 05-10-2015
doc8 -> user_id: 2, name: veli, numeric_field: 8, date_calculated: 05-10-2015
doc9 -> user_id: 3, name: osman, numeric_field: 9, date_calculated: 05-10-2015

Sum of numeric_field grouped by user => hasan : 10, veli : 16, osman : 17

For this example, I need the top 2 as the result -> { osman : 17, veli : 16 }
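
To make the expected result concrete, here is a minimal plain-Python sketch (no Elasticsearch involved) that groups the nine sample documents by name, sums numeric_field, and keeps the two largest totals. The names DOCS and top_n_by_sum are made up for illustration.

```python
from collections import defaultdict

# The nine sample documents above, reduced to (name, numeric_field).
DOCS = [
    ("hasan", 2), ("veli", 3), ("osman", 1),
    ("hasan", 3), ("veli", 5), ("osman", 7),
    ("hasan", 5), ("veli", 8), ("osman", 9),
]


def top_n_by_sum(docs, n):
    """Group by name, sum the values, and return the n largest totals."""
    totals = defaultdict(int)
    for name, value in docs:
        totals[name] += value
    # Sort the (name, total) pairs by total, largest first.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


print(top_n_by_sum(DOCS, 2))  # [('osman', 17), ('veli', 16)]
```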

What kind of query should I use for this?

2 Answers:

Answer 0: (score: 3)

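@ChintanShah25's answer and @Val's comment helped a lot. The complete solution is below. Note that shard_size is important: if you do not set it to 0, you may see wrong results, because each shard then returns only its own top terms and a user's sum can be missing part of its data.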

{
  "size": 0,
  "aggs": {
    "user_agg": {
      "terms": {
        "field": "name",
        "shard_size": 0,
        "size": 2,
        "order": {
          "sum_agg": "desc"
        }
      },
      "aggs": {
        "sum_agg": {
          "sum": {
            "field": "numeric_field"
          }
        }
      }
    }
  }
}
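
If n or the field names vary, the same request body could be built programmatically. The following is only a sketch: build_top_n_query is a hypothetical helper, not part of any Elasticsearch client, and it just assembles a dict with the same shape as the query above. The shard_size default of 0 mirrors this answer; whether a given Elasticsearch version accepts 0 is an assumption to check against its documentation, so the helper lets you omit it.

```python
import json


def build_top_n_query(group_field, sum_field, n, shard_size=0):
    """Build a terms + sum aggregation body ordered by the sum, descending.

    Hypothetical helper for illustration only. shard_size=0 mirrors the
    answer above; pass shard_size=None to leave the setting out.
    """
    terms = {
        "field": group_field,
        "size": n,                      # keep only the top n buckets
        "order": {"sum_agg": "desc"},   # rank buckets by the sub-aggregation
    }
    if shard_size is not None:
        terms["shard_size"] = shard_size
    return {
        "size": 0,  # no search hits, aggregations only
        "aggs": {
            "user_agg": {
                "terms": terms,
                "aggs": {"sum_agg": {"sum": {"field": sum_field}}},
            }
        },
    }


body = build_top_n_query("name", "numeric_field", 2)
print(json.dumps(body, indent=2))
```

The resulting dict can be serialized and sent as the body of a search request with whichever HTTP or Elasticsearch client you already use.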

Answer 1: (score: 1)

You need to use Elasticsearch aggregations. I used the following query:

{
  "size": 0,
  "aggs": {
    "user_agg": {
      "terms": {
        "field": "name"
      },
      "aggs": {
        "sum_agg": {
          "sum": {
            "field": "numeric_field"
          }
        }
      }
    }
  }
}

This is the result I got:

"buckets": [
            {
               "key": "hasan",
               "doc_count": 3,
               "sum_agg": {
                  "value": 10
               }
            },
            {
               "key": "osman",
               "doc_count": 3,
               "sum_agg": {
                  "value": 17
               }
            },
            {
               "key": "veli",
               "doc_count": 3,
               "sum_agg": {
                  "value": 16
               }
            }
         ]

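I was not able to get the top n results. I tried using a top hits aggregation inside the sum aggregation, but it turns out that the sum aggregation does not support sub-aggregations.

You can try sorting on the sum_agg value. You can read more about aggregations here: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations.html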
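
As a rough sketch of sorting on the sum_agg value, the buckets from the response above could also be ranked on the client side in Python. The function name top_buckets is made up for illustration, and this only gives the right answer if the response contains a bucket for every user; the terms aggregation returns 10 buckets by default, so with more users some would be cut off before sorting.

```python
# Buckets copied from the response shown above.
buckets = [
    {"key": "hasan", "doc_count": 3, "sum_agg": {"value": 10}},
    {"key": "osman", "doc_count": 3, "sum_agg": {"value": 17}},
    {"key": "veli", "doc_count": 3, "sum_agg": {"value": 16}},
]


def top_buckets(buckets, n):
    """Sort buckets by their sum_agg value, largest first, and keep n."""
    ranked = sorted(buckets, key=lambda b: b["sum_agg"]["value"], reverse=True)
    return [(b["key"], b["sum_agg"]["value"]) for b in ranked[:n]]


print(top_buckets(buckets, 2))  # [('osman', 17), ('veli', 16)]
```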

I hope this helps!