ElasticSearch - Filtering and aggregating nested documents

Time: 2015-11-30 13:05:25

Tags: elasticsearch aggregation

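I need to apply the following data processing:

  1. Get a subset of root documents using a filter or query.
  2. For this subset, filter the nested documents (a root document that has no matching nested documents should not be excluded from the results).
  3. Get the number of matching nested documents for each root document, so that it can be used in further aggregations.
  4. Group the root documents by a specific field and get the total number of nested documents for each group.

I think I understand how to implement items 1, 2 and 4, but I have not found how to get the number of nested documents so that it can be used in further aggregations. Here is the mapping used: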

    {
        "store": {
            "properties": {
                "id": {
                    "type": "long"
                },
                "categoryIds": {
                    "type": "long"
                },
                "campaigns": {
                    "type": "nested",
                    "properties": {
                        "id": {
                            "type": "long"
                        },
                        "startDate": {
                            "type": "date"
                        },
                        "endDate": {
                            "type": "date"
                        }
                    }
                }
            }
        }
    }
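
With this mapping, one possible shape for the request can be sketched as a Python dict. This is only a sketch under assumptions, not a verified answer: `build_request`, `root_query`, `campaign_ids` and the aggregation names `by_category`, `campaigns` and `matching` are hypothetical names chosen for illustration, and the body has not been run against any particular Elasticsearch version. It relies on the `terms`, `nested` and `filter` aggregations: the `doc_count` of the innermost `filter` bucket would be the number of matching nested documents per `categoryIds` group. It does not expose a per-root-document count as a separate value.

```python
import json


def build_request(root_query, campaign_ids):
    """Build a search body that counts matching nested campaigns per categoryIds bucket.

    root_query and campaign_ids are hypothetical parameters for this sketch.
    """
    return {
        "size": 0,  # only the aggregation results are of interest
        "query": root_query,  # selects the subset of root documents
        "aggs": {
            "by_category": {  # one bucket per categoryIds value
                "terms": {"field": "categoryIds"},
                "aggs": {
                    "campaigns": {  # switch context to the nested documents
                        "nested": {"path": "campaigns"},
                        "aggs": {
                            "matching": {  # keep only the matching campaigns
                                "filter": {
                                    "terms": {"campaigns.id": campaign_ids}
                                }
                            }
                        },
                    }
                },
            }
        },
    }


body = build_request({"match_all": {}}, [9, 10, 12])
print(json.dumps(body, indent=2))
```

Because the filter is applied inside the aggregation rather than in the query, root documents without matching nested documents still fall into their `categoryIds` bucket and simply contribute zero.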
    

Here are sample entities:

    {"id":9,"categoryIds":[3,4],"campaigns":[{"id":9,"startDate":"20151130","endDate":"20151202"},{"id":10,"startDate":"20151130","endDate":"20151202"},{"id":11,"startDate":"20151129","endDate":"20151130"},{"id":12,"startDate":"20151130","endDate":"20151202"}]}
    {"id":10,"categoryIds":[3],"campaigns":[{"id":10,"startDate":"20151130","endDate":"20151202"},{"id":11,"startDate":"20151129","endDate":"20151130"},{"id":12,"startDate":"20151130","endDate":"20151202"}]}
    {"id":11,"categoryIds":[4],"campaigns":[]}
    {"id":12,"categoryIds":[5],"campaigns":[]}
    

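For example, I have a filter on the nested entities that matches the nested documents (campaigns) with ids 9, 10 and 12. After filtering, I need the nested document count for each root document: for the root doc (store) with id = 9 the count will be 3, for id = 10 the count will be 2, and for ids 11 and 12 the count will be 0. Then I need to group the root documents by the categoryIds field and get the sum of the previously computed values for each group: for key = 3 the sum is 5 (3 from the 1st store and 2 from the 2nd), for key = 4 the sum is 3 (from the 1st store), and for key = 5 the sum is 0.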
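
The expected numbers can be reproduced in plain Python, outside Elasticsearch, to make the intended computation explicit. This sketch copies the sample entities (the date fields are dropped because they play no role in the counts); `matching_ids`, `count_matching`, `per_store` and `per_category` are names introduced here for illustration only.

```python
from collections import defaultdict

# Sample stores from the question, reduced to the fields that affect the counts.
stores = [
    {"id": 9, "categoryIds": [3, 4],
     "campaigns": [{"id": 9}, {"id": 10}, {"id": 11}, {"id": 12}]},
    {"id": 10, "categoryIds": [3],
     "campaigns": [{"id": 10}, {"id": 11}, {"id": 12}]},
    {"id": 11, "categoryIds": [4], "campaigns": []},
    {"id": 12, "categoryIds": [5], "campaigns": []},
]

# The nested filter: campaigns with these ids match.
matching_ids = {9, 10, 12}


def count_matching(store):
    """Number of nested campaigns in a store that pass the filter."""
    return sum(1 for c in store["campaigns"] if c["id"] in matching_ids)


# Per-root-document count; stores without matches stay in the result with 0.
per_store = {s["id"]: count_matching(s) for s in stores}

# Group by categoryIds; a store with several categories counts in each of them.
per_category = defaultdict(int)
for s in stores:
    for category in s["categoryIds"]:
        per_category[category] += per_store[s["id"]]

print(per_store)           # {9: 3, 10: 2, 11: 0, 12: 0}
print(dict(per_category))  # {3: 5, 4: 3, 5: 0}
```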

0 answers:

No answers