How do I sub-aggregate array values?

Asked: 2019-05-29 16:56:55

Tags: elasticsearch elasticsearch-aggregation

I have data in the following format (one of the fields contains an array):

[
  {
    "key1": "val1",
    "key2": "val2",
    "options": {
      "age": 15,
      "gender": "male",
      "questions": [
        {
          "question": "Fauvorite color",
          "answer": "White"
        },
        {
          "question": "Fauvorite flower",
          "answer": "Tulip"
        }
      ]
    }
  },
  {
    "key1": "val3",
    "key2": "val4",
    "options": {
      "age": 25,
      "gender": "female",
      "questions": [
        {
          "question": "Fauvorite color",
          "answer": "White"
        },
        {
          "question": "Fauvorite flower",
          "answer": "Daisies"
        }
      ]
    }
  }
]

I wrote the following aggregation:

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        // Some conditions here
      ]
    }
  },
  "aggs": {
    "appMetricsAggregation": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1d",
        "time_zone": "UTC",
        "min_doc_count": 1
      },
      "aggs": {
        "genderAgg": {
          "filters": {
            "filters": {
              "male": {
                "query_string": {
                  "query": "options.gender:male"
                }
              },
              "female": {
                "query_string": {
                  "query": "options.gender:female"
                }
              }
            }
          },
          "aggs": {
            "ageAgg": {
              "range": {
                "field": "options.age",
                "ranges": [
                  { "key": "child", "from": 0, "to": 10 },
                  { "key": "teenager", "from": 10, "to": 20 },
                  { "key": "young", "from": 20, "to": 30 },
                  { "key": "adult", "from": 30, "to": 40 },
                  { "key": "middle", "from": 40, "to": 50 },
                  { "key": "senior", "from": 50, "to": 60 },
                  { "key": "old", "from": 60 }
                ],
                "keyed": true
              },
              "aggs": {
                "appQuestions": {
                  "terms": {
                    "field": "options.questions.question.keyword"
                  },
                  "aggs": {
                    "appAnswers": {
                      "terms": {
                        "field": "options.questions.answer.keyword"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

I expect output similar to the following (response trimmed):

{
  "appMetricsAggregation": {
    "buckets": [
      {
        "key": "timestamp1",
        "genderAgg": {
          "male": {
            "ageAgg": {
              "child": {
                "appQuestions": {
                  "buckets": [
                    {
                      "key": "Fauvorite color",
                      "answers": {
                        "buckets": [
                          {
                            "key": "white",
                            "doc_count": 2
                          }
                        ]
                      }
                    },
                    {
                      "key": "Fauvorite flower",
                      "answers": {
                        "buckets": [
                          {
                            "key": "Tulip",
                            "doc_count": 1
                          },
                          {
                            "key": "Daisies",
                            "doc_count": 1
                          }
                        ]
                      }
                    }
                  ]
                }
              },
              "teenager": ///
            }
          },
          "female": ///
        }
      }, {
        ///...
      }
    ]
  }
}

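So basically, I want the distribution of answers for each question in the array. The problem is that every question bucket contains all of the answers, so the questions and answers are not related to each other. The doc_count values are correct, but the answers sub-aggregation contains more buckets than expected: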

{
  "key": "Fauvorite flowers",
  "doc_count": 2,
  "answers": {
    "buckets": [
      {
        "key": "white",
        "doc_count": 2
      },
      {
        "key": "daisies",
        "doc_count": 1
      },
      {
        "key": "green",
        "doc_count": 1
      },
      {
        "key": "tulips",
        "doc_count": 1
      }
    ]
  }
}
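
A likely cause, assuming `options.questions` uses the default `object` mapping (the index mapping is not shown here, so this is an assumption): Elasticsearch flattens an array of objects into independent multi-value fields, so the pairing between each `question` and its `answer` is lost at index time. Under that assumption, the first sample document would be indexed roughly like this:

```json
{
  // Sketch of the flattened internal form; assumes the default object mapping
  "options.questions.question": ["Fauvorite color", "Fauvorite flower"],
  "options.questions.answer": ["White", "Tulip"]
}
```

Every question term then co-occurs with every answer term of the same document, which would explain why each question bucket lists all of the answers.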

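Is there a way to link the questions and answers together? Perhaps through some kind of mapping (I have read a bit about mappings, but it did not work, possibly because I used it the wrong way), a parent-child relationship, or something similar?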
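
One approach that may fit here is the `nested` field type combined with a `nested` aggregation, which keeps each question/answer pair together as a hidden sub-document. The following is a minimal sketch, not a verified solution for this index: it assumes Elasticsearch 7.x (typeless mappings) and that the string sub-fields are still dynamically mapped with a `.keyword` sub-field. An existing `object` field cannot be switched to `nested` in place, so the index would have to be recreated and the data reindexed. The mapping part of the index-creation body could look like this:

```json
{
  // Assumption: Elasticsearch 7.x typeless mapping; on 6.x a type name is required
  "mappings": {
    "properties": {
      "options": {
        "properties": {
          "questions": { "type": "nested" }
        }
      }
    }
  }
}
```

The `appQuestions` block inside `ageAgg` would then be wrapped in a `nested` aggregation (`questionsNested` is a hypothetical name chosen for illustration):

```json
"aggs": {
  // questionsNested is a hypothetical name; it switches into the nested documents
  "questionsNested": {
    "nested": { "path": "options.questions" },
    "aggs": {
      "appQuestions": {
        "terms": { "field": "options.questions.question.keyword" },
        "aggs": {
          "appAnswers": {
            "terms": { "field": "options.questions.answer.keyword" }
          }
        }
      }
    }
  }
}
```

Inside the nested context, `doc_count` counts question/answer pairs rather than parent documents, so each `appAnswers` bucket should only contain answers given to that particular question.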

0 answers:

No answers yet