I currently have an ES query that uses geohash_grid and date_histogram to give me a list of "geo buckets":
"aggregations": {
  "zoomedInView": {
    "filter": {
      "geo_bounding_box": {
        "location": {
          "top_left": "-37, 140",
          "bottom_right": "-38, 146"
        }
      }
    },
    "aggregations": {
      "zoom1": {
        "geohash_grid": {
          "field": "location",
          "precision": 6
        },
        "aggs": {
          "ts": {
            "date_histogram": {
              "min_doc_count": 1,
              "field": "dateTime",
              "interval": "1m",
              "format": "DDD HH:mm"
            }
          },
          "map_zoom": {
            "geo_bounds": {
              "field": "location"
            }
          }
        }
      }
    }
  }
This gives me results that look like:
{
"key": "r1r0fu",
"map_zoom": {
"bounds": {
"top_left": {
"lat": -38.81073913909495,
"lon": 124.96536672115326
},
"bottom_right": {
"lat": -38.81329075805843,
"lon": 124.96823584660888
}
}
},
"ts": {
"buckets": [
{
"key_as_string": "136 20:15",
"key": 1463354100000,
},
{
"key_as_string": "137 04:30",
"key": 1463365800000,
"doc_count": 1
},
....
{
"key": "r1r0gx",
"map_zoom": {
"bounds": {
"top_left": {
"lat": -38.798130828887224,
"lon": 124.99871227890253
},
"bottom_right": {
"lat": -38.79820383526385,
"lon": 124.99872468411922
}
}
},
"ts": {
"buckets": [
{
"key_as_string": "136 23:21",
"key": 1463354460000,
},
{
"key_as_string": "137 02:30",
"key": 1463365800000,
},
{
"key_as_string": "137 03:31",
"key": 1463369460000,
}
]
}
},
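In the example above, the results are ordered by geo bucket (r1r0fu and r1r0gx), and within each bucket the event times are listed in order (formatted as day-of-year HH:mm) together with their counts.
What I would really like is:
1) Results ordered by time, which may mean that the same bucket appears more than once.
2) Only the minimum and maximum time shown in each bucket (if possible).
So the result above would ideally look like this: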
{
"key": "r1r0fu",
"map_zoom": {
"bounds": {
"top_left": {
"lat": -38.81073913909495,
"lon": 124.96536672115326
},
"bottom_right": {
"lat": -38.81329075805843,
"lon": 124.96823584660888
}
}
},
"ts": {
"buckets": [
{
"key_as_string": "136 20:15",
"key": 1463354100000,
},
]
}
},
{
"key": "r1r0gx",
"map_zoom": {
"bounds": {
"top_left": {
"lat": -38.798130828887224,
"lon": 124.99871227890253
},
"bottom_right": {
"lat": -38.79820383526385,
"lon": 124.99872468411922
}
}
},
"ts": {
"buckets": [
{
"key_as_string": "136 23:21",
"key": 1463354460000,
},
{
"key_as_string": "137 03:31",
"key": 1463369460000,
},
}
},
{
"key": "r1r0fu",
"map_zoom": {
"bounds": {
"top_left": {
"lat": -38.81073913909495,
"lon": 124.96536672115326
},
"bottom_right": {
"lat": -38.81329075805843,
"lon": 124.96823584660888
}
}
},
"ts": {
"buckets": [
{
"key_as_string": "137 04:30",
"key": 1463365800000,
}
]
}
},
...
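The results are ordered by time, so in this case the bucket r1r0fu appears twice. The event "key_as_string": "137 02:30" has been hidden because it is neither the minimum nor the maximum date.
Is this possible?
Many thanks!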
Answer 0 (score: 1)
If you want the results ordered by time, it is probably best to swap the date_histogram aggregation with the geohash_grid aggregation, like this:
{
  "aggregations": {
    "zoomedInView": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": "-37, 140",
            "bottom_right": "-38, 146"
          }
        }
      },
      "aggregations": {
        "ts": {
          "date_histogram": {
            "min_doc_count": 1,
            "field": "dateTime",
            "interval": "1m",
            "format": "DDD HH:mm"
          },
          "aggs": {
            "zoom1": {
              "geohash_grid": {
                "field": "location",
                "precision": 6
              }
            },
            "map_zoom": {
              "geo_bounds": {
                "field": "location"
              }
            }
          }
        }
      }
    }
  }
}
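That would solve problem 1). However, since every top-level bucket is now a time bucket, you can no longer get the minimum and maximum time per geo bucket. Give it a try and see whether it fits your needs.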
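If you need both 1) and 2), one option is to keep the original query (geohash_grid outermost) and reshape the response on the client. The following is only a sketch in Python, not something Elasticsearch does for you: `runs_by_time` is a hypothetical helper name, and it assumes the bucket layout shown in the question (each geo bucket has a "key" and a "ts" with "buckets", each carrying a numeric "key"). It sorts all time buckets by timestamp, groups consecutive entries that share a geohash, and keeps only the first and last time bucket of each run.

```python
from itertools import groupby


def runs_by_time(geo_buckets):
    """Turn geohash buckets into time-ordered runs with only min/max times kept."""
    # Bounds per geohash; note these cover the whole geo bucket, not a single run.
    bounds = {geo["key"]: geo.get("map_zoom") for geo in geo_buckets}

    # One (timestamp, geohash, time_bucket) entry per date_histogram bucket.
    events = [
        (ts["key"], geo["key"], ts)
        for geo in geo_buckets
        for ts in geo["ts"]["buckets"]
    ]
    events.sort(key=lambda e: (e[0], e[1]))

    result = []
    # groupby only merges adjacent items, so a geohash that shows up
    # again later in time starts a new run (and a new output bucket).
    for geohash, run in groupby(events, key=lambda e: e[1]):
        times = [e[2] for e in run]
        kept = times if len(times) == 1 else [times[0], times[-1]]
        result.append({
            "key": geohash,
            "map_zoom": bounds[geohash],
            "ts": {"buckets": kept},
        })
    return result
```

With the original query, the input would presumably be response["aggregations"]["zoomedInView"]["zoom1"]["buckets"]. Two time buckets with the same timestamp in different geohashes are ordered by geohash here, which is an arbitrary tie-break.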