Nested sub-aggregation returns "wrong" parent doc_count

Posted: 2016-01-04 10:17:07

Tags: elasticsearch

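I have been at this for quite a while now and cannot see the problem. I would be very glad if someone with more ES experience could point me in the right direction. I have a parent type (college) and a child type (course). Each course has a three-level nested structure (subjectgroups > area > field). For a given college query, I am trying to show how many colleges offer courses in each individual subject group.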

Here is my mapping:

indexes:
    studiengaenge:
        index_name: studiengaenge_dev
        settings:
            index:
                analysis:
                    analyzer:
                        lc_term:
                            type: custom
                            tokenizer: keyword
                            filter: lowercase
        types:
            college:
                mappings:
                    id: ~
            course:
                mappings:
                    id: ~
                    name: ~
                    subjectgroups:
                        type: "nested"
                        properties:
                            name: { "type": "string", "index": "analyzed", "analyzer": "lc_term" }
                            area:
                                type: "nested"
                                properties:
                                    name: { "type": "string", "index": "analyzed", "analyzer": "lc_term" }
                                    field:
                                        type: "nested"
                                        properties:
                                            name: { "type": "string", "index": "analyzed", "analyzer": "lc_term" }
                _parent:
                    type: "college"

The query:

GET college/_search?search_type=count
{
  "query": {
    "has_child": {
      "type": "course",
      "query": {
        "filtered": {
          "query": {
            "match_all": {}
          },
          "filter": {
            "bool": {
              "must": [
                {
                  "nested": {
                    "path": "subjectgroups",
                    "filter": {
                      "terms": {
                        "subjectgroups.name": [
                          "lehramt"
                        ]
                      }
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "aggs": {
    "children": {
      "children": {
        "type": "course"
      },
      "aggs": {
        "fachgruppen": {
          "nested": {
            "path": "course.subjectgroups"
          },
          "aggs": {
            "filtered": {
              "filter": {
                "terms": {
                  "subjectgroups.name": [
                    "lehramt"
                  ]
                }
              },
              "aggs": {
                "fachgruppe": {
                  "terms": {
                    "field": "subjectgroups.name"
                  },
                  "aggs": {
                    "reverse_nested": {
                      "reverse_nested": {},
                      "aggs": {
                        "doc_count_college": {
                          "cardinality": {
                            "field": "_parent"
                          }
                        }
                      }
                    },
                    "studienbereich": {
                      "nested": {
                        "path": "course.subjectgroups.area"
                      },
                      "aggs": {
                        "studienbereich": {
                          "terms": {
                            "field": "subjectgroups.area.name"
                          },
                          "aggs": {
                            "reverse_nested": {
                              "reverse_nested": {},
                              "aggs": {
                                "doc_count_college": {
                                  "cardinality": {
                                    "field": "_parent"
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The result:

{
   "took": 7,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 123,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "children": {
         "doc_count": 12289,
         "fachgruppen": {
            "doc_count": 15029,
            "filtered": {
               "doc_count": 4582,
               "fachgruppe": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                     {
                        "key": "lehramt",
                        "doc_count": 4582,
                        "reverse_nested": {
                           "doc_count": 3786,
                           "doc_count_college": {
                              "value": 124
                           }
                        },
                        "studienbereich": {
                           "doc_count": 4582,
                           "studienbereich": {
                              "doc_count_error_upper_bound": 0,
                              "sum_other_doc_count": 0,
                              "buckets": [
                                 {
                                    "key": "schulische fächer",
                                    "doc_count": 3938,
                                    "reverse_nested": {
                                       "doc_count": 3399,
                                       "doc_count_college": {
                                          "value": 130
                                       }
                                    }
                                 },
                                 {
                                    "key": "berufliche fachrichtungen",
                                    "doc_count": 357,
                                    "reverse_nested": {
                                       "doc_count": 315,
                                       "doc_count_college": {
                                          "value": 105
                                       }
                                    }
                                 },
                                 {
                                    "key": "sonderpädagogik, inklusive pädagogik",
                                    "doc_count": 287,
                                    "reverse_nested": {
                                       "doc_count": 287,
                                       "doc_count_college": {
                                          "value": 32
                                       }
                                    }
                                 }
                              ]
                           }
                        }
                     }
                  ]
               }
            }
         }
      }
   }
}

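The problem is that even though there are only 123 results (colleges), the second-level subject group aggregation tells me that 130 colleges have "key": "schulische fächer". Any help is greatly appreciated. Thanks, Hannes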

1 answer:

Answer 0 (score: 0)

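The problem here is that Elasticsearch is not necessarily exact when it comes to aggregation counts. If it decides that the result of an aggregation would not be greatly affected by what a particular shard might return, Elasticsearch may ignore that shard entirely.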

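This gives you only an approximate result for the aggregation. Try configuring the index so that it has just a single shard of data. The results may then differ, and they may well be exact. That is not really a solution, though, because with a large amount of data you need to distribute it across several shards.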
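As a rough sketch of the single-shard test described above, an index could be created with one primary shard as shown here. The index name studiengaenge_dev_single is hypothetical, and the data would have to be reindexed into it, because the shard count of an existing index cannot be changed. Note also that the response in the question already reports "total": 1 under "_shards", so that index may well be single-shard already.

```json
PUT studiengaenge_dev_single
{
  "settings": {
    "index": {
      "number_of_shards": 1
    }
  }
}
```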

This property of Elasticsearch is the downside of its unmatched speed when it comes to aggregations and other operations.
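One more thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the cardinality aggregation used for doc_count_college in the question is itself an approximate, HyperLogLog++-based count, so values such as 124 or 130 against 123 matching colleges could simply be estimation error. A minimal sketch of that sub-aggregation with the precision_threshold parameter raised is shown below. The value 1000 is an arbitrary illustration, chosen only to sit well above the expected number of colleges, and a higher threshold costs more memory per bucket.

```json
"doc_count_college": {
  "cardinality": {
    "field": "_parent",
    "precision_threshold": 1000
  }
}
```

The same change would apply to both doc_count_college aggregations in the query, the one under fachgruppe and the one under studienbereich.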
