Elasticsearch慢查询和高CPU使用率

时间:2019-10-05 11:03:51

标签: elasticsearch

我有一台具有32G RAM,8个CPU内核和SSD的机器。 Elasticsearch 5.3.0(也在2.2上尝试过) 一个索引中约有1500万份文档。性能非常慢,每个请求大约需要5s。我试图更改硬件等,结果相同。大约100%的CPU使用率,查询缓慢。我最终在同一台计算机上通过3个节点(通过docker-compose配置)的集群中。平均响应时间3-4秒。

日志

有趣的是,总请求时间远高于所有子查询时间的总和。

bash-4.3# curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
> {
>     "query": {
>         "match" : {
>             "Descr" : {
>                 "query" : "одежда"
>             }
>         }
>     },
> "profile": true,
> "size":0
>
> }'
{
  "took" : 2834,
  "timed_out" : false,
  "_shards" : {
    "total" : 18,
    "successful" : 18,
    "failed" : 0
  },
  "hits" : {
    "total" : 186229,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "profile" : {
    "shards" : [
      {
        "id" : "[12MdnUrXQR6uazc_BQ91Kw][.monitoring-es-2-2019.10.05][0]",
        "searches" : [
          {
            "query" : [
              {
                "type" : "TermQuery",
                "description" : "Descr:одежда",
                "time" : "0.07889800000ms",
                "time_in_nanos" : 78898,
                "breakdown" : {
                  "score" : 0,
                  "build_scorer_count" : 5,
                  "match_count" : 0,
                  "create_weight" : 6027,
                  "next_doc" : 0,
                  "match" : 0,
                  "create_weight_count" : 1,
                  "next_doc_count" : 0,
                  "score_count" : 0,
                  "build_scorer" : 72865,
                  "advance" : 0,
                  "advance_count" : 0
                }
              }
            ],
            "rewrite_time" : 11201,
            "collector" : [
              {
                "name" : "CancellableCollector",
                "reason" : "search_cancelled",
                "time" : "0.4055740000ms",
                "time_in_nanos" : 405574,
                "children" : [
                  {
                    "name" : "TotalHitCountCollector",
                    "reason" : "search_count",
                    "time" : "0.008022000000ms",
                    "time_in_nanos" : 8022
                  }
                ]
              }
            ]
          }
        ],
        "aggregations" : [ ]
      },
    ...

完整日志:https://pastebin.com/M13JnBrF

bash-4.3# curl localhost:9200/_nodes/hot_threads
::: {12MdnUr}{12MdnUrXQR6uazc_BQ91Kw}{ivsBOkyLS_qslFwYzjvlGA}{172.22.0.10}{172.22.0.10:9300}
   Hot threads at 2019-10-05T10:40:13.213Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   25.2% (126ms out of 500ms) cpu usage by thread 'elasticsearch[12MdnUr][search][T#7]'
     2/10 snapshots sharing following 26 elements
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.refillDocs(Lucene50PostingsReader.java:357)
       org.apache.lucene.codecs.lucene50.Lucene50PostingsReader$BlockDocsEnum.advance(Lucene50PostingsReader.java:441)
       org.apache.lucene.search.DisjunctionDISIApproximation.advance(DisjunctionDISIApproximation.java:66)
       org.apache.lucene.search.DisjunctionDISIApproximation.advance(DisjunctionDISIApproximation.java:66)
       org.apache.lucene.search.DisjunctionDISIApproximation.advance(DisjunctionDISIApproximation.java:66)
       org.apache.lucene.search.ReqOptSumScorer.score(ReqOptSumScorer.java:78)
       org.apache.lucene.search.ConjunctionScorer.score(ConjunctionScorer.java:64)
       org.apache.lucene.search.TopScoreDocCollector$SimpleTopScoreDocCollector$1.collect(TopScoreDocCollector.java:64)
       org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:221)
       org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:172)
       org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:669)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:473)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:397)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:108)
       org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:247)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:261)
       org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:331)
       org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:328)
       org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69)
       org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:618)
       org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:613)


       ....

完整日志https://pastebin.com/fuhzCPiF

bash-4.3# curl localhost:9200/_cat/thread_pool/search?v&h=node_name,name,active,rejected,completed
[1] 214
bash-4.3# node_name name   active queue rejected
nW1NCo0   search      1     0        0
_wVV_Q2   search     13    76        0
12MdnUr   search     13    57        0
bash-4.3# curl localhost:9200/_cat/shards
products                    1 r STARTED 3471828   5.2gb 172.22.0.8  nW1NCo0
products                    1 p STARTED 3471828   5.2gb 172.22.0.10 _wVV_Q2
products                    3 p STARTED 3464893   7.2gb 172.22.0.7  12MdnUr
products                    3 r STARTED 3464893   7.2gb 172.22.0.10 _wVV_Q2
products                    4 r STARTED 3476765   5.9gb 172.22.0.8  nW1NCo0
products                    4 p STARTED 3476765   5.9gb 172.22.0.7  12MdnUr
products                    2 p STARTED 3473709   5.6gb 172.22.0.7  12MdnUr
products                    2 r STARTED 3473709   5.6gb 172.22.0.10 _wVV_Q2
products                    0 r STARTED 3475382   5.3gb 172.22.0.8  nW1NCo0
products                    0 p STARTED 3475382   5.3gb 172.22.0.10 _wVV_Q2
cats                        1 p STARTED    7808   1.5mb 172.22.0.7  12MdnUr
cats                        1 r STARTED    7808   1.5mb 172.22.0.10 _wVV_Q2
cats                        3 p STARTED    7721   1.5mb 172.22.0.7  12MdnUr
cats                        3 r STARTED    7721   1.5mb 172.22.0.10 _wVV_Q2
cats                        4 r STARTED    7547   1.5mb 172.22.0.8  nW1NCo0
cats                        4 p STARTED    7547   1.5mb 172.22.0.7  12MdnUr
cats                        2 r STARTED    7557   1.5mb 172.22.0.8  nW1NCo0
cats                        2 p STARTED    7557   1.5mb 172.22.0.10 _wVV_Q2
cats                        0 r STARTED    7735   1.5mb 172.22.0.8  nW1NCo0
cats                        0 p STARTED    7735   1.5mb 172.22.0.10 _wVV_Q2
.monitoring-es-2-2019.10.04 0 p STARTED    5925   2.8mb 172.22.0.7  12MdnUr
.monitoring-es-2-2019.10.04 0 r STARTED    5925   2.8mb 172.22.0.10 _wVV_Q2
.monitoring-es-2-2019.10.05 0 r STARTED   51448  95.3mb 172.22.0.8  nW1NCo0
.monitoring-es-2-2019.10.05 0 p STARTED   51448    95mb 172.22.0.7  12MdnUr
mycats                      1 r STARTED       0    159b 172.22.0.8  nW1NCo0
mycats                      1 p STARTED       0    159b 172.22.0.10 _wVV_Q2
mycats                      3 r STARTED       0    159b 172.22.0.8  nW1NCo0
mycats                      3 p STARTED       0    159b 172.22.0.7  12MdnUr
mycats                      4 r STARTED       0    159b 172.22.0.8  nW1NCo0
mycats                      4 p STARTED       0    159b 172.22.0.7  12MdnUr
mycats                      2 p STARTED       0    159b 172.22.0.8  nW1NCo0
mycats                      2 r STARTED       0    159b 172.22.0.10 _wVV_Q2
mycats                      0 p STARTED       0    159b 172.22.0.7  12MdnUr
mycats                      0 r STARTED       0    159b 172.22.0.10 _wVV_Q2
.monitoring-data-2          0 r STARTED       5 253.9kb 172.22.0.8  nW1NCo0
.monitoring-data-2          0 p STARTED       5 253.9kb 172.22.0.7  12MdnUr
top - 12:59:14 up 14:33,  9 users,  load average: 34.30, 32.63, 31.71
Tasks: 202 total,   1 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s): 98.9 us,  0.8 sy,  0.0 ni,  0.1 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem : 32165704 total,   450184 free,  9569900 used, 22145620 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 22144364 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  847 1000      20   0 27.269g 5.917g 3.384g S 272.8 19.3  17:52.69 java
  696 1000      20   0 22.701g 5.139g 2.567g S 266.4 16.8  17:57.38 java
 1013 1000      20   0 20.437g 4.965g 2.419g S 222.9 16.2  13:39.35 java
  629 root      20   0 1685480 598716  26624 S  20.3  1.9   1:08.04 smartshopper
31905 root      20   0  119024 115836   7936 S  13.0  0.4   0:53.38 caddy
31816 1000      20   0   75800  68680    552 S   1.3  0.2   0:06.24 memcached
 1920 root      20   0   44556   4060   3420 R   0.3  0.0   0:00.05 top
 7155 root      20   0 2606652  89212  17256 S   0.3  0.3   8:55.51 dockerd
31876 root      20   0 1244108  27372  13996 S   0.3  0.1   0:02.59 influxd

0 个答案:

没有答案
相关问题