Search returns the wrong data

Asked: 2017-08-31 05:50:59

Tags: elasticsearch

My data set has more than a million rows. I used Logstash to integrate MySQL with Elasticsearch. When I send the following GET request in Postman,

http://localhost:9200/persondetails/Document/_search?q=*

I get this response:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
            {
                "_index": "persondetails",
                "_type": "Document",
                "_id": "%{idDocument}",
                "_score": 1,
                "_source": {
                    "iddocument": 514697,
                    "@timestamp": "2017-08-31T05:18:46.916Z",
                    "author": "vaibhav",
                    "expiry_date": null,
                    "@version": "1",
                    "description": "ly that",
                    "creation_date": null,
                    "type": 1
                }
            },
            {
                "_index": "persondetails",
                "_type": "Document_count",
                "_id": "AV4o0J3OJ5ftvuhV7i0H",
                "_score": 1,
                "_source": {
                    "query": {
                        "term": {
                            "author": "rishav"
                        }
                    }
                }
            }
        ]
    }
}

This is wrong: my table has more than 1 million rows, yet the response shows a total of only 2. I cannot find the mistake here.

When I open http://localhost:9200/_cat/indices?v it shows this:

  1. health: yellow

  2. status: open

  3. index: persondetails

  4. uuid: 4FiGngZcQfS0Xvu6IeHIfg

  5. pri: 5

  6. rep: 1

  7. docs.count: 2

  8. docs.deleted: 1054

  9. store.size: 125.4kb

  10. pri.store.size: 125.4kb

This is my logstash.conf file:

    input {
      jdbc {
        jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/persondetails"
        jdbc_user => "root"
        jdbc_password => ""
        schedule => "* * * * *"
        jdbc_validate_connection => true
        jdbc_driver_library => "/usr/local/Cellar/logstash/5.5.2/mysql-connector-java-3.1.14/mysql-connector-java-3.1.14-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        statement => "SELECT * FROM Document"
        type => "persondetails"
      }
    }
    output {
      elasticsearch {
        #protocol=>http
        index => "persondetails"
        document_type => "Document"
        document_id => "%{idDocument}"
        hosts => ["http://localhost:9200"]
        stdout { codec => rubydebug }
      }
    }
    

1 answer:

Answer 0 (score: 1)

Judging from your results, there is a problem with your Logstash configuration that is causing your documents to overwrite each other: no document_id is being resolved, so the index ends up with a single document whose id is the literal string "%{idDocument}".
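The mechanism behind this is Logstash's sprintf field references: a `%{field}` reference to a field that does not exist on the event is left in the output verbatim, so every event gets the same literal `_id` and each indexed document overwrites the previous one. Here is a minimal re-creation of that substitution rule in Python (an illustrative sketch, not Logstash itself):

```python
import re

def sprintf(template, event):
    """Logstash-style %{field} substitution: unknown fields stay verbatim."""
    def replace(match):
        field = match.group(1)
        return str(event[field]) if field in event else match.group(0)
    return re.sub(r"%\{([^}]+)\}", replace, template)

# The jdbc input lowercases column names by default, so the event
# carries "iddocument", not "idDocument".
event = {"iddocument": 514697, "author": "vaibhav"}

print(sprintf("%{idDocument}", event))  # wrong case: stays "%{idDocument}"
print(sprintf("%{iddocument}", event))  # matching case: resolves to "514697"
```

Since the reference never resolves, all million rows collapse onto the single id "%{idDocument}", which matches the search result you posted.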

Look at the following _source snippet from the result of the search query you provided:

"_source": {
            "iddocument": 514697,
            "@timestamp": "2017-08-31T05:18:46.916Z",
            "author": "vaibhav",
            "expiry_date": null,
            "@version": "1",
            "description": "ly that",
            "creation_date": null,
            "type": 1
}

Even the small size of the index suggests there are no more documents. You should check whether your jdbc input actually provides the "idDocument" field.
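A likely fix, sketched below under the assumption that the `_source` you posted is representative: the jdbc input lowercases column names by default, so the field arrives as `iddocument` and the sprintf reference must use that casing. Separately, `stdout` is its own output plugin and should not sit inside the `elasticsearch` block:

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "persondetails"
        document_type => "Document"
        # must match the lowercased column name emitted by the jdbc input
        document_id => "%{iddocument}"
      }
      # stdout is a separate output plugin, not an option of elasticsearch
      stdout { codec => rubydebug }
    }

Alternatively, setting `lowercase_column_names => false` on the jdbc input would preserve the original `idDocument` casing.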