Question

In Kibana, I can view logs for various products (product.name), along with a timestamp and other information. Here is one of those logs:

{
  "_index": "xxx-2017.08.30",
  "_type": "logs",
  "_id": "xxxx",
  "_version": 1,
  "_score": null,
  "_source": {
    "v": "1.0",
    "level": "INFO",
    "timestamp": "2017-01-30T18:31:50.761Z",
    "product": {
      "name": "zzz",
      "version": "2.1.0-111"
    },
    "context": {
      ...
      ...
    }
  },
  "fields": {
    "timestamp": [
      1504117910761
    ]
  },
  "sort": [
    1504117910761
  ]
}

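There are several other logs for the same product, and several logs for different products as well.

However, I want to write a query that returns a single record for a given product.name (the one with the maximum timestamp value) and returns the same information for every other product. That is, one log per product, and for each product it should be the log with the maximum timestamp.

How can I achieve this?

I tried to follow the approach listed in: How to get latest values for each group with an Elasticsearch query?

And created this query: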

{
    "aggs": {
        "group": {
            "terms": {
                "field": "product.name"
            },
            "aggs": {
                "group_docs": {
                    "top_hits": {
                        "size": 1,
                        "sort": [
                            {
                                "timestamp": {
                                    "order": "desc"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}

However, I got an error:

  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [product.name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],

Do I absolutely need to set fielddata = true for this field in this case? If not, what should I do? If so, I am not sure how to set it. I tried doing it like this:

curl -XGET 'localhost:9200/xxx*/_search?pretty' -H 'Content-Type: application/json' -d'
{
    "properties": {
      "product.name": { 
        "type":     "text",
        "fielddata": true
      }
    },
    "aggs": {
        "group": {
            "terms": {
                "field": "product.name"
            },
            "aggs": {
                "group_docs": {
                    "top_hits": {
                        "size": 1,
                        "sort": [
                            {
                                "timestamp": {
                                    "order": "desc"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}'

However, I think there is something wrong with it (the syntax?), and I get this error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "parsing_exception",
        "reason" : "Unknown key for a START_OBJECT in [properties].",
        "line" : 3,
        "col" : 19
      }
    ],

Answer 1

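You get the error because you are trying to aggregate on a text field (product.name), which Elasticsearch 5 does not allow by default. You do not need to set fielddata to true. Instead, define product.name in the mapping as two fields: product.name itself (text) and a sub-field, product.name.keyword, like this: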

{
  "product.name": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  }
}
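
As a rough sketch of where such a mapping goes: mapping definitions are sent to the mapping API rather than placed inside a _search body, which is why the attempt in the question was rejected with a parsing error. Assuming Elasticsearch 5.x and the logs type shown in the sample document, a request along the lines of PUT <index>/_mapping/logs might carry the field in nested form. The index name and the nesting under product are assumptions based on the sample log, not something the original poster confirmed.

```json
{
  "properties": {
    "product": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
```

A sub-field added this way is only populated for documents indexed after the change, so existing logs would most likely need to be reindexed before they appear in the aggregation.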

Then you need to aggregate on product.name.keyword.
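
Putting this together with the query from the question, the search body might look like the sketch below. It assumes the product.name.keyword sub-field exists and is populated. The top-level "size": 0 (to suppress the ordinary hits) and the terms "size": 100 are illustrative additions, not part of the original query. The terms aggregation returns only 10 buckets by default, so the value should be at least the number of distinct products.

```json
{
  "size": 0,
  "aggs": {
    "group": {
      "terms": {
        "field": "product.name.keyword",
        "size": 100
      },
      "aggs": {
        "group_docs": {
          "top_hits": {
            "size": 1,
            "sort": [
              { "timestamp": { "order": "desc" } }
            ]
          }
        }
      }
    }
  }
}
```

Each bucket under aggregations.group.buckets should then hold one product name as its key, with the most recent log for that product in group_docs.hits.hits.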
