Removing duplicates and sorting (aggs + sort)

Time: 2016-01-28 15:01:12

Tags: elasticsearch




"query": {..},
"sort": {.. "body.make": "asc" ..}


  // Here I'm collecting all body.vin values to remove duplicates
  // and then returning only the first in each result set.
  "aggs": {
    "dedup": {
      "terms": {
        "size": 8,
        "field": "body.vin"
      },
      "aggs": {
        "dedup_docs": {
          "top_hits": {
            "size": 1,
            "_source": false
          }
        }
      }
    }
  }


In addition, I have toyed with the idea of adjusting the score based on the query sort, so that the aggregation would return its results in the correct order by score, but there does not appear to be any way to use ...

  // Here again the same thing, except that I attempt to sort on body.make
  // in the document. However, I now realize that each bucket result is a
  // collection of the duplicates, so this sorts within each set of
  // duplicates and not across the final results.
  "aggs": {
    "dedup": {
      "terms": {
        "size": 8,
        "field": "body.vin"
      },
      "aggs": {
        "order": {
          "terms": {
            "field": "body.make",
            "order": {
              "_term": "asc"
            }
          },
          "aggs": {
            "dedup_docs": {
              "top_hits": {
                "size": 1,
                "_source": false
              }
            }
          }
        }
      }
    }
  },


1 Answer:

Answer 0: (score: 0)


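Trying to explain the problem made me realize that it can be done once you grasp the concept of buckets, or more precisely how they are passed from one aggregation to the next. I would still be interested in a sort + score adjustment solution, but it can be achieved with aggregations: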

// Here we first aggregate on body.make, so the first result might be
// {"toyota": {body.vin 123}, "toyota": {body.vin 123}...} and the
// next result passed into the dedup aggregation would be, say,
// {"nissan"...
  "aggs": {
    "sort": {
      "terms": {
        "size": 8,
        "field": "body.make",
        "order": {
          "_term": "desc"
        }
      },
      "aggs": {
        "dedup": {
          "terms": {
            "size": 8,
            "field": "body.vin"
          },
          "aggs": {
            "dedup_docs": {
              "top_hits": {
                "size": 1,
                "_source": false
              }
            }
          }
        }
      }
    }
  }
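
With this nesting, the ordered and de-duplicated documents are no longer in the top-level hits but spread across the buckets, so the client has to walk them. The following is only a minimal sketch in Python of how that walk might look, assuming the response follows the usual terms / top_hits layout (`buckets` lists, then `hits.hits`). The function name `flatten_dedup_hits` and the sample response, including its document IDs, are hypothetical and made up for illustration; they are not taken from the original post.

```python
from typing import Any, Dict, List


def flatten_dedup_hits(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Walk sort -> dedup -> dedup_docs and keep one hit per body.vin.

    The outer buckets arrive already ordered by body.make, so iterating
    them in order preserves the sort requested in the aggregation.
    """
    results: List[Dict[str, Any]] = []
    seen_vins = set()
    for make_bucket in response["aggregations"]["sort"]["buckets"]:
        for vin_bucket in make_bucket["dedup"]["buckets"]:
            vin = vin_bucket["key"]
            if vin in seen_vins:
                # Guard in case the same VIN shows up under two makes.
                continue
            seen_vins.add(vin)
            hits = vin_bucket["dedup_docs"]["hits"]["hits"]
            if hits:
                results.append(
                    {"make": make_bucket["key"], "vin": vin, "_id": hits[0]["_id"]}
                )
    return results


# Hypothetical, hand-written response fragment (not real cluster output).
sample_response = {
    "aggregations": {
        "sort": {
            "buckets": [
                {
                    "key": "toyota",
                    "doc_count": 2,
                    "dedup": {
                        "buckets": [
                            {
                                "key": "123",
                                "doc_count": 2,
                                "dedup_docs": {"hits": {"hits": [{"_id": "a1"}]}},
                            }
                        ]
                    },
                },
                {
                    "key": "nissan",
                    "doc_count": 1,
                    "dedup": {
                        "buckets": [
                            {
                                "key": "456",
                                "doc_count": 1,
                                "dedup_docs": {"hits": {"hits": [{"_id": "b7"}]}},
                            }
                        ]
                    },
                },
            ]
        }
    }
}

for row in flatten_dedup_hits(sample_response):
    print(row["make"], row["vin"], row["_id"])
```

Because `"_source": false` is set on `top_hits`, each hit carries only metadata such as `_id`, so the sketch returns IDs that could be fetched in a follow-up request if the full documents are needed.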