Remove duplicates and sort (aggs + sort)

Time: 2016-01-28 15:01:12

Tags: elasticsearch

I'm trying to find the best way to have a query return a sorted set and then use aggs to remove duplicates. The deduplication works fine, but when I add a sort to the query, for example:

"query": {..},
"sort": {.. "body.make": "asc" ..}

I'd like the aggs to return their results in that order as well, but they always appear to be sorted by the query score.

This is the aggregation I use to remove duplicates:

  // Here I'm collecting all body.vin values to remove duplicates
  // and then returning only the first in each result set.
  "aggs": {
    "dedup": {
      "terms": {
        "size": 8,
        "field": "body.vin"
      },
      "aggs": {
        "dedup_docs": {
          "top_hits": {
            "size": 1,
            "_source": false
          }
        }
      }
    }
  },

I tried putting a terms aggregation in between to see whether that would sort:

  // Here again the same thing, except that I attempt to sort on
  // body.make in the document. I now realize that each bucket is a
  // collection of the duplicates, so this sorts within each set of
  // duplicates and not the final results.
  "aggs": {
    "dedup": {
      "terms": {
        "size": 8,
        "field": "body.vin"
      },
      "aggs": {
        "order": {
          "terms": {
            "field": "body.make",
            "order": {
              "_term": "asc"
            }
          },
          "aggs": {
            "dedup_docs": {
              "top_hits": {
                "size": 1,
                "_source": false
              }
            }
          }
        }
      }
    }
  },

But the result of the aggregations is always based on the score.

I have also toyed with the idea of adjusting the score based on the query sort, so that the aggregation would return the correct order because it is ordered by score, but there does not appear to be any way to do that.
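The bucket behaviour described in the code comment above (the inner sort only orders entries inside each bucket of duplicates, not the final result list) can be illustrated without Elasticsearch. The following is a minimal in-memory sketch, not the actual aggregation engine. The sample documents and the helper names `group_by`, `vin_then_make` and `make_then_vin` are hypothetical, and the outer buckets simply keep input order as a stand-in for whatever order the outer terms aggregation uses.

```python
from collections import OrderedDict

# Hypothetical sample documents; only the field names (body.vin, body.make)
# come from the question, flattened here to "vin" and "make".
docs = [
    {"_id": "1", "vin": "V1", "make": "toyota"},
    {"_id": "2", "vin": "V1", "make": "toyota"},
    {"_id": "3", "vin": "V2", "make": "nissan"},
    {"_id": "4", "vin": "V3", "make": "audi"},
    {"_id": "5", "vin": "V3", "make": "audi"},
]


def group_by(items, field):
    """Group items into buckets keyed by `field`, keeping first-seen order."""
    buckets = OrderedDict()
    for item in items:
        buckets.setdefault(item[field], []).append(item)
    return buckets


def vin_then_make(items):
    """Outer buckets per vin; inner buckets per make, sorted ascending."""
    result = []
    for vin, dupes in group_by(items, "vin").items():
        inner = group_by(dupes, "make")
        for make in sorted(inner):  # sorts only inside one vin bucket
            result.append((make, vin, inner[make][0]["_id"]))
    return result


def make_then_vin(items):
    """Outer buckets per make, sorted ascending; inner buckets per vin."""
    result = []
    outer = group_by(items, "make")
    for make in sorted(outer):  # sorts across the whole result
        for vin, dupes in group_by(outer[make], "vin").items():
            result.append((make, vin, dupes[0]["_id"]))
    return result
```

In this sketch `vin_then_make(docs)` leaves the makes in their original order, because each vin bucket holds a single make and sorting within it changes nothing, whereas `make_then_vin(docs)` yields one document per vin ordered by make.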

If anyone has had success sorting results while removing duplicates, or has any ideas or suggestions, please let me know.

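1 Answer:

Answer 0 (score: 0)

This isn't the most ideal solution, since it only allows sorting on one field. The best approach would be to alter the score/boost of the sorted results.

Trying to explain it made me realise that it can be done once I grasped the concept of buckets, or rather how they are passed along. I would still be interested in a sort + score adjustment solution, but it can be achieved with aggregations: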

// Here we first aggregate on body.make, so the first results might be
// {"toyota": {body.vin 123}, "toyota": {body.vin 123}...} and the
// next result passed into the dedup aggregation would be, say,
// {"nissan"...
  "aggs": {
    "sort": {
      "terms": {
        "size": 8,
        "field": "body.make",
        "order": {
          "_term": "desc"
        }
      },
      "aggs": {
        "dedup": {
          "terms": {
            "size": 8,
            "field": "body.vin"
          },
          "aggs": {
            "dedup_docs": {
              "top_hits": {
                "size": 1,
                "_source": false
              }
            }
          }
        }
      }
    }
  },
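To read the sorted, deduplicated documents back out of this aggregation, the nested buckets need to be walked in order. The following is a minimal sketch of that step, assuming the usual response layout for terms and top_hits aggregations (`buckets`, `key`, `hits.hits`). The function name `ordered_doc_ids` and the sample response values are hypothetical; only the aggregation names `sort`, `dedup` and `dedup_docs` come from the answer.

```python
def ordered_doc_ids(response):
    """Walk sort -> dedup -> dedup_docs and collect one hit per vin bucket.

    Returns (make, vin, doc_id) tuples in the order the buckets arrive,
    which is the order requested on the outer "sort" terms aggregation.
    """
    rows = []
    for make_bucket in response["aggregations"]["sort"]["buckets"]:
        for vin_bucket in make_bucket["dedup"]["buckets"]:
            hits = vin_bucket["dedup_docs"]["hits"]["hits"]
            if hits:  # top_hits was asked for size 1, so take the first hit
                rows.append((make_bucket["key"], vin_bucket["key"], hits[0]["_id"]))
    return rows


# Hypothetical, trimmed response used only to exercise the function.
sample_response = {
    "aggregations": {
        "sort": {
            "buckets": [
                {
                    "key": "toyota",
                    "doc_count": 3,
                    "dedup": {
                        "buckets": [
                            {
                                "key": "vin123",
                                "doc_count": 2,
                                "dedup_docs": {"hits": {"hits": [{"_id": "a1"}]}},
                            },
                            {
                                "key": "vin456",
                                "doc_count": 1,
                                "dedup_docs": {"hits": {"hits": [{"_id": "a7"}]}},
                            },
                        ]
                    },
                },
                {
                    "key": "nissan",
                    "doc_count": 1,
                    "dedup": {
                        "buckets": [
                            {
                                "key": "vin789",
                                "doc_count": 1,
                                "dedup_docs": {"hits": {"hits": [{"_id": "b2"}]}},
                            }
                        ]
                    },
                },
            ]
        }
    }
}

rows = ordered_doc_ids(sample_response)
```

Because both terms aggregations use "size": 8, the flattened list can hold up to 64 entries (8 makes times 8 vins each), so the sizes may need adjusting if a fixed page length is wanted.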