Using Python and Elasticsearch, how do I iterate over the returned JSON object?

Asked: 2016-07-26 17:42:08

Tags: python json elasticsearch

My code is as follows:

import json
from elasticsearch import Elasticsearch

es = Elasticsearch()

resp = es.search(index="mynewcontacts", body={"query": {"match_all": {}}})
response = json.dumps(resp)
data = json.loads(response)
#print data["hits"]["hits"][0]["_source"]["email"]
for row in data:
    print row["hits"]["hits"][0]["_source"]["email"]
return "OK"

This produces the following JSON (truncated for convenience):

{"timed_out": false, "took": 1, "_shards": {"successful": 5, "total": 5, "failed": 0}, "hits": {"max_score": 1.0, "total": 7, "hits": [{"_index": "mynewcontacts", "_type": "contact", "_score": 1.0, 
"_source": {"email": "sharon.zhuo@xxxxx.com.cn", "position": "Sr.Researcher", "last": "Zhuo", "first": "Sharon", "company": "Tabridge Executive Search"}, "_id": "AVYmLMlKJVSAh7zyC0xf"},
{"_index": "mynewcontacts", "_type": "contact", "_score": 1.0, "_source": {"email": "andrew.springthorpe@xxxxx.gr.jp", "position": "Vice President", "last": "Springthorpe", "first": "Andrew", "company": "SBC Group"}, "_id": "AVYmLMlRJVSAh7zyC0xg"}, {"_index": "mynewcontacts", "_type": "contact", "_score": 1.0, "_source": {"email": "mjbxxx@xxx.com", "position": "Financial Advisor", "last": "Bell", "first": "Margaret Jacqueline", "company": "Streamline"}, "_id": "AVYmLMlXJVSAh7zyC0xh"}, {"_index": "mynewcontacts", "_type": "contact", "_score": 1.0, "_source": {"email": "kokaixxx@xxxx.com", "position": "Technical Solutions Manager MMS North Asia", "last": "Okai", "first": "Kensuke", "company": "Criteo"}, "_id": "AVYmLMlfJVSAh7zyC0xi"}, {"_index": "mynewcontacts", "_type": "contact", "_score": 1.0, "_source": {"email": "mizuxxxxto@zszs.com", "position": "Sr. Strategic Account Executive", "last": "Kato", "first": "Mizuto", "company": "Twitter"}, "_id": "AVYmLMlkJVSAh7zyC0xj"}, {"_index": "mynewcontacts", "_type": "contact", "_score": 1.0, "_source": {"email": "abc@example.com", "position": "Design Manager", "last": "Okada", "first": "Kengo", "company": "ON Semiconductor"}, "_id": "AVYmLMlpJVSAh7zyC0xk"}, {"_index": "mynewcontacts", "_type": "contact", "_score": 1.0, "_source": {"email": "007@example.com", "position": "Legal Counsel", "last": "Lei", "first": "Yangzi (Karen)", "company": "Samsung China Semiconductor"}, "_id": "AVYmLMkUJVSAh7zyC0xe"}]}}

When I try:

print data["hits"]["hits"][0]["_source"]["email"]

it prints the first email, but when I try to loop with

for row in data:
    print row["hits"]["hits"][0]["_source"]["email"]

I get the error:

TypeError: string indices must be integers

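Can someone suggest how to iterate over these items correctly? Many thanks!

3 answers:

Answer 0 (score: 2)

What you are doing is looping over the keys of the dictionary. To print each email in the response, do this: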

for row in data["hits"]["hits"]:
    print row["_source"]["email"]

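There is also no need to round-trip through json: es.search already returns a Python dictionary. This should do what you want: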

from elasticsearch import Elasticsearch

es = Elasticsearch()

resp = es.search(index="mynewcontacts", body={"query": {"match_all": {}}})
for row in resp["hits"]["hits"]:
    print row["_source"]["email"]
return "OK"
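
As a self-contained illustration of the loop above, here is a minimal Python 3 sketch that runs without a live cluster. The sample_resp dictionary is a trimmed stand-in for the response shown in the question, and iter_emails is a hypothetical helper name, not part of the elasticsearch client.

```python
# Trimmed stand-in for the dictionary returned by es.search(); only the
# keys needed for this illustration are kept.
sample_resp = {
    "timed_out": False,
    "took": 1,
    "hits": {
        "total": 2,
        "hits": [
            {"_id": "AVYmLMlpJVSAh7zyC0xk", "_source": {"email": "abc@example.com"}},
            {"_id": "AVYmLMkUJVSAh7zyC0xe", "_source": {"email": "007@example.com"}},
        ],
    },
}


def iter_emails(resp):
    """Yield the email of every hit in a search response dictionary."""
    # resp["hits"]["hits"] is a list with one dictionary per matching document.
    for hit in resp["hits"]["hits"]:
        yield hit["_source"]["email"]


emails = list(iter_emails(sample_resp))
for email in emails:
    print(email)
```

With a real client, the same helper could be called on the value returned by es.search, since that value is already a dictionary.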

Answer 1 (score: 1)

I may be wrong, but it looks like you are not starting the for loop on the correct JSON item. Try:

for row in data['hits']['hits']:
    # Rest of loop here.
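
One way to fill in the rest of that loop, sketched in Python 3 under the assumption that each hit carries the _source fields shown in the question (first, last, email). The describe_hit name is hypothetical, and dict.get is used so that a document missing a field does not raise KeyError.

```python
def describe_hit(hit):
    """Build a one-line summary from a single entry of data['hits']['hits']."""
    source = hit.get("_source", {})
    # Join first and last name; strip() removes the stray space left
    # behind when either part is missing.
    name = "{} {}".format(source.get("first", ""), source.get("last", "")).strip()
    email = source.get("email", "<no email>")
    return "{} <{}>".format(name, email) if name else email


# Trimmed stand-in for the parsed response from the question.
data = {
    "hits": {
        "hits": [
            {"_source": {"first": "Kengo", "last": "Okada", "email": "abc@example.com"}},
            {"_source": {"email": "007@example.com"}},
        ]
    }
}

lines = [describe_hit(row) for row in data["hits"]["hits"]]
for line in lines:
    print(line)
```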

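Answer 2 (score: 1)

The response data you retrieved is a Python dictionary. If you run a for loop over it, it yields the dictionary's keys: in this case, the strings timed_out, took, and so on. Indexing one of those strings with "hits" is what raises the TypeError.

Clearly, what you want is to iterate over the list found at data["hits"]["hits"] in your response data. That is a list, with one dictionary per hit.

So, just do

for row in data["hits"]["hits"]:
    print row["_source"]["email"]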
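
To see where the error comes from, the following Python 3 sketch reproduces it on a trimmed stand-in for the response (the data dictionary here is illustrative, not the full output from the question). The exact wording of the TypeError message varies slightly between Python versions.

```python
# Trimmed stand-in for the parsed search response.
data = {
    "timed_out": False,
    "took": 1,
    "hits": {"hits": [{"_source": {"email": "abc@example.com"}}]},
}

# Iterating a dict yields its keys, in insertion order on Python 3.7+.
keys = [row for row in data]

# Each row is therefore a string such as "timed_out"; indexing a string
# with another string is what raises the TypeError from the question.
try:
    keys[0]["hits"]
    error = None
except TypeError as exc:
    error = str(exc)

# Iterating the inner list yields one dictionary per hit instead.
emails = [row["_source"]["email"] for row in data["hits"]["hits"]]

print(keys)
print(error)
print(emails)
```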