Dumping bulk data into Elasticsearch using the Python API

Time: 2019-03-16 08:03:25

Tags: elasticsearch

I want to index the Shakespeare data in Elasticsearch using its Python API, but I am getting the following error:

    PUT http://localhost:9200/shakes/play/3 [status:400 request:0.098s]
{'error': {'root_cause': [{'type': 'mapper_parsing_exception', 'reason': 'failed to parse'}], 'type': 'mapper_parsing_exception', 'reason': 'failed to parse', 'caused_by': {'type': 'not_x_content_exception', 'reason': 'Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes'}}, 'status': 400}

Python script:

from elasticsearch import Elasticsearch
from elasticsearch import TransportError
import json

data = []

for line in open('shakespeare.json', 'r'):
    data.append(json.loads(line))

es = Elasticsearch()

res = 0
cl = []
# filtering data which i need
for d in data:
    if res == 0:
        res = 1 
        continue
    cl.append(data[res])
    res = 0

try:
    res = es.index(index = "shakes", doc_type = "play", id = 3, body = cl)
    print(res)
except TransportError as e:
    print(e.info)

I also tried using json.dumps, but I still get the same error. However, the code above does work when I add only a single element of the list to Elasticsearch.

1 Answer:

Answer 0 (score: 1)

You are not sending a bulk request to ES, just a simple create request; have a look here. That method works with a dictionary representing a single new document, not with a list of documents. Also, if you pass an id to the create request, you need to make that value dynamic; otherwise every document will overwrite the previous one under that id, and only the last document will survive. If your JSON file has one record per line, you should try the following; read here for the bulk documentation:

from elasticsearch import Elasticsearch
from elasticsearch import helpers

es = Elasticsearch()
op_list = []
# Note the raw string: backslashes in a Windows path must not be
# interpreted as escape sequences.
with open(r"C:\ElasticSearch\shakespeare.json") as json_file:
    for record in json_file:
        op_list.append({
            '_op_type': 'index',
            '_index': 'shakes',
            '_type': 'play',
            '_source': record
        })
helpers.bulk(client=es, actions=op_list)
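The list-based approach above first loads every action into memory. As a minimal sketch of a memory-friendlier variant, `helpers.bulk` also accepts a generator, so actions can be streamed straight from the file. The `generate_actions` and `load_shakespeare` names are my own for illustration, and the sketch assumes elasticsearch-py is installed and a cluster is reachable on localhost:9200:

```python
def generate_actions(path, index='shakes', doc_type='play'):
    """Yield one bulk 'index' action per line of a newline-delimited JSON file."""
    with open(path) as json_file:
        for record in json_file:
            yield {
                '_op_type': 'index',
                '_index': index,
                '_type': doc_type,
                '_source': record,
            }

def load_shakespeare(path='shakespeare.json'):
    # Assumes elasticsearch-py is installed and a cluster on localhost:9200.
    from elasticsearch import Elasticsearch, helpers
    es = Elasticsearch()
    # helpers.bulk consumes the generator lazily and batches requests,
    # returning (number of successes, list of errors).
    success, _ = helpers.bulk(es, generate_actions(path))
    return success
```

Because the generator yields one action at a time, the whole file never has to fit in memory at once.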