Amazon Elasticsearch Service ConnectionTimeout

Time: 2017-02-27 08:10:35

Tags: elasticsearch bulk connection-timeout

from elasticsearch import Elasticsearch, helpers

es_url = '*****.us-east-1.es.amazonaws.com'
# es_conn = Elasticsearch(es_url)
es_conn = Elasticsearch([{'host': es_url, 'port': 443, 'use_ssl': True}])

actions = []  # pending bulk actions
while True:
    for ....:
        actions.append(....)
        # Flush once 5000 actions have accumulated.
        if len(actions) >= 5000:
            helpers.bulk(es_conn, actions)
            actions = []
    # Flush the remainder and reset so it is not sent again on the next pass.
    if actions:
        helpers.bulk(es_conn, actions)
        actions = []

The code above runs on an EC2 instance, and it frequently raises the following error:

    helpers.bulk(es_conn, actions)
  File "/usr/local/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 194, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 162, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/usr/local/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 91, in _process_bulk_chunk
    raise e
ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='search-shinezoneels-pc3ib5rkhuylqynfoz6rph7gh4.us-east-1.es.amazonaws.com', port=443): Read timed out.)

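Meanwhile, I run the same code on an EMR instance, and the error never occurs there. Bulk indexing from the EC2 instance is about twice as fast as from the EMR instance, but it frequently fails with the error above. How can I fix this?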
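One direction that may be worth trying, offered as a sketch and not a confirmed fix: send smaller batches and retry a batch with exponential backoff when it times out. The helpers `chunked` and `send_with_retry` below are hypothetical names introduced for illustration, and `send` stands in for whatever actually ships a batch (for example a call to `helpers.bulk`). The batch size of 500 and the 60-second timeout in the usage comment are assumptions, not measured values.

```python
import time


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


def send_with_retry(send, batch, max_retries=3, base_delay=1.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call send(batch); on a retryable error, wait and try again.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    Once max_retries retries are used up, the last error is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return send(batch)
        except retry_on:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))


# Hypothetical usage with the client from the question (not executed here):
#
#   from elasticsearch.exceptions import ConnectionTimeout
#   for batch in chunked(actions, 500):
#       send_with_retry(
#           lambda b: helpers.bulk(es_conn, b, request_timeout=60),
#           batch,
#           retry_on=(ConnectionTimeout,),
#       )
```

Note that a read timeout does not prove the cluster rejected the batch, so a retry can index the same documents twice unless each action carries an explicit `_id`.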

0 answers:

No answers