Elasticsearch nested document query

Time: 2015-01-30 21:33:44

Tags: elasticsearch elasticsearch-plugin spring-data-elasticsearch

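I'm new to Elasticsearch. I have some idea of how filters, queries, and aggregations work, but I don't know how to solve the following problem. From the documents shown below, I want to query only the latest delivery (date and crate_quantity) of a company, and I'm not sure how to go about it. Is there a way to use a max aggregation to pull only the most recent delivery out of each document?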

POST /sanfrancisco/devlivery
{
"company1": {
    "delivery": [
        {
            "date": "01/01/2013",
            "crate_quantity": 5
        },
        {
            "date": "01/12/2013",
            "crate_quantity": 3
        },
        {
            "date": "01/24/2013",
            "crate_quantity": 2
        }
    ]
}
}

POST /sanfrancisco/devlivery
{
"company2": {
    "delivery": [
        {
            "date": "01/01/2015",
            "crate_quantity": 14
        },
        {
            "date": "12/31/2014",
            "crate_quantity": 20
        },
        {
            "date": "11/24/2014",
            "crate_quantity": 13
        }
    ]
}
}

1 answer:

Answer 0 (score: 0)

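If you want the latest delivery for one company at a time, I would probably set this up with a parent/child relationship. I used company as the parent type and delivery as the child type.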

I also added a custom date format so that the dates sort chronologically, the way you would expect.
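To see why the format matters, here is a small sketch in Python (an illustration only, not something Elasticsearch runs; `strptime` with `%m/%d/%Y` is assumed as the Python counterpart of MM/dd/yyyy). It compares a plain string comparison with a comparison on parsed dates for company2's deliveries:

```python
from datetime import datetime

# Delivery dates for company2, exactly as they appear in the question.
raw_dates = ["01/01/2015", "12/31/2014", "11/24/2014"]

# Comparing the raw strings works character by character from the left,
# so the month decides the order and the year is effectively ignored.
latest_as_string = max(raw_dates)

# Parsing each value first gives real dates, which compare chronologically.
latest_as_date = max(raw_dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(latest_as_string)  # 12/31/2014
print(latest_as_date)    # 01/01/2015
```

Mapping the field as a date with an explicit format gets you the second behavior when sorting.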

I set up the index like this:

DELETE /test_index

PUT /test_index
{
   "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0
   },
   "mappings": {
      "company": {
         "properties": {
            "name": {
               "type": "string",
               "index": "not_analyzed"
            }
         }
      },
      "delivery": {
         "_parent": {
            "type": "company"
         },
         "properties": {
            "crate_quantity": {
               "type": "long"
            },
            "date": {
               "type": "date",
               "format": "MM/dd/yyyy"
            }
         }
      }
   }
}

Then I indexed the documents using the bulk api:
PUT /test_index/_bulk
{"index": {"_index":"test_index", "_type":"company", "_id":1}}
{"name":"company1"}
{"index": {"_index":"test_index", "_type":"delivery", "_id":1, "_parent":1}}
{"date": "01/01/2013", "crate_quantity": 5}
{"index": {"_index":"test_index", "_type":"delivery", "_id":2, "_parent":1}}
{"date": "01/12/2013", "crate_quantity": 3}
{"index": {"_index":"test_index", "_type":"delivery", "_id":3, "_parent":1}}
{"date": "01/24/2013",  "crate_quantity": 2}
{"index": {"_index":"test_index", "_type":"company", "_id":2}}
{"name":"company2"}
{"index": {"_index":"test_index", "_type":"delivery", "_id":4, "_parent":2}}
{"date": "01/01/2015", "crate_quantity": 14}
{"index": {"_index":"test_index", "_type":"delivery", "_id":5, "_parent":2}}
{"date": "12/31/2014",  "crate_quantity": 20}
{"index": {"_index":"test_index", "_type":"delivery", "_id":6, "_parent":2}}
{"date": "11/24/2014",  "crate_quantity": 13 }
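The bulk body is newline-delimited JSON: each action line is followed by its source line, and the body ends with a newline. If you would rather generate it than write it by hand, something like the following Python sketch would work. `build_bulk_body` is a hypothetical helper of mine, not a client library function, and it assumes sequential ids as in the example above:

```python
import json

def build_bulk_body(companies, index="test_index"):
    """Build a bulk body from {company_name: [(date, crate_quantity), ...]}.

    Hypothetical helper: ids are assigned sequentially, and every delivery
    carries its company's id in "_parent", mirroring the request above.
    """
    lines = []
    delivery_id = 0
    for company_id, (name, deliveries) in enumerate(companies.items(), start=1):
        # Action line, then source line, for the parent document.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": "company", "_id": company_id}}))
        lines.append(json.dumps({"name": name}))
        for date, quantity in deliveries:
            delivery_id += 1
            # Each child names its parent in the action metadata.
            lines.append(json.dumps(
                {"index": {"_index": index, "_type": "delivery",
                           "_id": delivery_id, "_parent": company_id}}))
            lines.append(json.dumps({"date": date, "crate_quantity": quantity}))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"

companies = {
    "company1": [("01/01/2013", 5), ("01/12/2013", 3), ("01/24/2013", 2)],
    "company2": [("01/01/2015", 14), ("12/31/2014", 20), ("11/24/2014", 13)],
}
body = build_bulk_body(companies)
print(body)
```

The resulting string can be sent as the request body to the _bulk endpoint with whatever HTTP client you use.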

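Now I can query for the latest delivery of a particular company using a has_parent filter, sorting by date in descending order and taking only a single result, like this: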

POST /test_index/delivery/_search
{
   "size": 1,
   "sort": [
      {
         "date": {
            "order": "desc"
         }
      }
   ],
   "filter": {
      "has_parent": {
         "type": "company",
         "query": {
            "term": {
               "name": {
                  "value": "company1"
               }
            }
         }
      }
   }
}
...
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": null,
      "hits": [
         {
            "_index": "test_index",
            "_type": "delivery",
            "_id": "3",
            "_score": null,
            "_source": {
               "date": "01/24/2013",
               "crate_quantity": 2
            },
            "sort": [
               1358985600000
            ]
         }
      ]
   }
}
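Two things in the response are worth reading closely. hits.total is 3 because all three deliveries of company1 matched the filter; "size": 1 only limits how many are returned. The value under "sort" is the date as milliseconds since the epoch. A quick check in Python (my own sketch, assuming the date is interpreted as midnight UTC) confirms it is the 01/24/2013 delivery:

```python
from datetime import datetime, timezone

# Sort value copied from the response above (milliseconds since the epoch).
sort_value_ms = 1358985600000

# Convert to seconds and interpret as UTC.
as_date = datetime.fromtimestamp(sort_value_ms / 1000, tz=timezone.utc)
formatted = as_date.strftime("%m/%d/%Y")

print(formatted)  # 01/24/2013
```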

Here is the code I used when trying this out:

http://sense.qbox.io/gist/c519b0654448c8b7b0c7c85d613f1e88c0ad1d19