Range aggregation in ElasticSearch

Time: 2018-06-19 21:57:29

Tags: elasticsearch elasticsearch-aggregation

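I need to compute a pipeline aggregation in ElasticSearch, but I don't know how to express it.

Each document has an email address and an amount. I need to output range buckets over the summed amounts, where the sums are grouped by unique email.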

{ "0 - 99": 300, "100 - 400": 100 ...}

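That would basically be the expected output (the keys will be transformed in my application code), indicating that 300 unique emails have each received a cumulative amount of at most 99 across all documents.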
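
To make the expected output concrete, here is a minimal client-side sketch of the same computation in plain Python: sum the amounts per email, then count how many emails fall into each range. The sample documents and the bucket labels are hypothetical, and the boundaries assume an inclusive lower bound and an exclusive upper bound.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names follow the question.
docs = [
    {"email": "a@example.com", "amount": 40},
    {"email": "a@example.com", "amount": 30},
    {"email": "b@example.com", "amount": 150},
    {"email": "c@example.com", "amount": 99},
    {"email": "c@example.com", "amount": 250},
]

# (label, inclusive lower bound, exclusive upper bound); None means unbounded.
# The labels are an assumption made for this illustration.
RANGES = [
    ("0 - 99", None, 100),
    ("100 - 299", 100, 300),
    ("300 - 599", 300, 600),
    ("600+", 600, None),
]


def count_emails_by_total(docs, ranges):
    # Step 1: total amount received per unique email.
    totals = defaultdict(float)
    for doc in docs:
        totals[doc["email"]] += doc["amount"]

    # Step 2: count how many emails have a total inside each range.
    counts = {label: 0 for label, _, _ in ranges}
    for total in totals.values():
        for label, low, high in ranges:
            if (low is None or total >= low) and (high is None or total < high):
                counts[label] += 1
                break
    return counts


result = count_emails_by_total(docs, RANGES)
print(result)
```

This only illustrates the semantics on a handful of documents; the point of the question is to have ElasticSearch do the same work server-side.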

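Intuitively, I would expect the query to look like the one below. However, range does not appear to be a pipeline aggregation (or to accept buckets_path).

What is the right approach here?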

{
 aggs: {
   users: {
     terms: {
       field: "email"
     },
     aggs: {
       amount_received: {
         sum: {
           field: "amount"
         }
       }
     }
   },
   amount_ranges: {
     range: {
       buckets_path: "users>amount_received",
       ranges: [
           { to: 99.0 },
           { from: 100.0, to: 299.0 },
           { from: 300.0, to: 599.0 },
           { from: 600.0 }
       ]
     }
   }
}
}

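1 answer:

Answer 0: (score: 4)

There is no pipeline aggregation that does this directly. However, I think I have come up with a solution that should fit your needs. The idea is to repeat the same terms/sum aggregation once per range you are interested in, and to attach a bucket_selector pipeline aggregation to each copy so that only the emails whose sum falls inside that range are kept.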

POST index/_search
{
  "size": 0,
  "aggs": {
    "users_99": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "-99": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived < 100"
          }
        }
      }
    },
    "users_100_299": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "100-299": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 100 && params.amountReceived < 300"
          }
        }
      }
    },
    "users_300_599": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "300-599": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 300 && params.amountReceived < 600"
          }
        }
      }
    },
    "users_600": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "600": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 600"
          }
        }
      }
    }
  }
}
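
The four aggregations in the request above differ only in their name and in the bucket_selector script, so the body could also be generated from a list of ranges. The following is a minimal sketch under that assumption; `build_body`, `range_script`, and the sub-aggregation name `in_range` are hypothetical names chosen for this illustration (the request above uses "-99", "100-299", and so on instead).

```python
import json

# (aggregation name, inclusive lower bound, exclusive upper bound); None = unbounded.
RANGES = [
    ("users_99", None, 100),
    ("users_100_299", 100, 300),
    ("users_300_599", 300, 600),
    ("users_600", 600, None),
]


def range_script(low, high):
    # Build the condition evaluated by bucket_selector for one range.
    parts = []
    if low is not None:
        parts.append(f"params.amountReceived >= {low}")
    if high is not None:
        parts.append(f"params.amountReceived < {high}")
    if not parts:
        raise ValueError("a range needs at least one bound")
    return " && ".join(parts)


def build_body(ranges, size=1000):
    # One terms/sum/bucket_selector aggregation per range.
    aggs = {}
    for name, low, high in ranges:
        aggs[name] = {
            "terms": {"field": "email", "size": size},
            "aggs": {
                "amount_received": {"sum": {"field": "amount"}},
                # "in_range" is a hypothetical sub-aggregation name.
                "in_range": {
                    "bucket_selector": {
                        "buckets_path": {"amountReceived": "amount_received"},
                        "script": range_script(low, high),
                    }
                },
            },
        }
    return {"size": 0, "aggs": aggs}


body = build_body(RANGES)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same `POST index/_search` request; adding or changing a range then only means editing the `RANGES` list.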

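In the result, the number of buckets in users_99 will be the number of unique emails whose summed amount is less than 100. Similarly, users_100_299 will contain as many buckets as there are unique emails whose summed amount is at least 100 and less than 300. And so on...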
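
Turning the response into per-range counts then comes down to taking the length of each `buckets` array. Here is a small sketch on a hypothetical, heavily trimmed response; a real response carries more fields per bucket, and `count_buckets` is a name made up for this example.

```python
# Hypothetical, trimmed search response: only the parts needed for counting.
response = {
    "aggregations": {
        "users_99": {
            "buckets": [{"key": "a@example.com"}, {"key": "d@example.com"}]
        },
        "users_100_299": {"buckets": [{"key": "b@example.com"}]},
        "users_300_599": {"buckets": [{"key": "c@example.com"}]},
        "users_600": {"buckets": []},
    }
}


def count_buckets(response):
    # Each surviving bucket is one unique email whose sum is in the range.
    return {
        name: len(agg["buckets"])
        for name, agg in response["aggregations"].items()
    }


counts = count_buckets(response)
print(counts)
```

Keep in mind that each terms aggregation returns at most `size` buckets, so with the request as written no count can exceed 1000.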