Elasticsearch memory error during bulk insert

Date: 2015-05-15 01:20:37

Tags: python-2.7 elasticsearch encode

I am inserting 5000 records into Elasticsearch at a time. The total size of these records is 33936 (I got this using sys.getsizeof()).

Elasticsearch version: 1.5.0, Python 2.7, Ubuntu.

Here is the error:

Traceback (most recent call last):
  File "run_indexing.py", line 67, in <module>
    index_policy_content(datatable, source, policyids)
  File "run_indexing.py", line 60, in index_policy_content
    bulk(elasticsearch_instance, actions)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers.py", line 148, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers.py", line 107, in streaming_bulk
    resp = client.bulk(bulk_actions, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 70, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 568, in bulk
    params=params, body=self._bulk_body(body))
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 259, in perform_request
    body = body.encode('utf-8')
MemoryError

Please help me resolve this issue.

Thanks & regards, Afroze

1 Answer:

Answer 0 (score: 0)

If I had to guess, I'd say the memory error is happening in Python as it loads and serializes your data. Try cutting the batch size way down until you get something that works, then binary-search upward until it fails again. That should help you figure out a safe batch size.
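
A minimal sketch of that approach, assuming the elasticsearch-py helpers module shown in your traceback: the bulk helper accepts a chunk_size argument, and feeding it a generator instead of a fully built actions list means only one chunk is ever held in memory. The index name, doc type, and records variable below are placeholders, and 500 is just an arbitrary starting point for the binary search.

    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import bulk

    es = Elasticsearch()

    records = []  # placeholder: however run_indexing.py loads your 5000 records

    def generate_actions(records):
        # Yield actions one at a time instead of building the whole
        # list up front, so memory use stays bounded per chunk.
        for record in records:
            yield {
                "_index": "policies",   # hypothetical index name
                "_type": "policy",      # hypothetical doc type
                "_source": record,
            }

    # chunk_size caps how many actions go into each bulk request;
    # start small and increase it until you find where it breaks.
    success, errors = bulk(es, generate_actions(records), chunk_size=500)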

(Other useful information you may want to include: how much memory the server running your Python process has, how much memory your Elasticsearch server node has, and how much heap is allocated to its JVM.)
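
If it helps with gathering the Elasticsearch side of that, here is a sketch using the nodes stats API through the same Python client (assuming the node is reachable at the default localhost:9200):

    from __future__ import print_function  # the question is on Python 2.7
    from elasticsearch import Elasticsearch

    es = Elasticsearch()  # assumes the node is reachable at localhost:9200

    # nodes.stats() wraps the _nodes/stats API; the jvm metric reports
    # current heap usage and the configured heap maximum per node.
    stats = es.nodes.stats(metric='jvm')
    for node_id, node in stats['nodes'].items():
        mem = node['jvm']['mem']
        print(node['name'], mem['heap_used_in_bytes'], 'of', mem['heap_max_in_bytes'])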