python ijson大文件循环获取名称

时间:2016-04-23 04:47:10

标签: python json

我需要用ijson解析一个大的json文件(除非有更好的东西),我想循环遍历请求中的所有产品名称并打印出来。我尝试使用此支持页面进行设置。 https://pypi.python.org/pypi/ijson/

这是我得到的当前输出

<addinfourl at 140643118020800 whose fp = <socket._fileobject object at 0x7fea07882850>>
<generator object items at 0x7fea077dc910>
<generator object <genexpr> at 0x7fea077dc960>

我的代码

import json
import requests
import lxml 
import ijson
import urllib
from urllib import urlopen


request = urlopen('www.jsonurl.com')
objects = ijson.items(request, 'items.name')
products = (o for o in objects if o ['type' == 'name'])
for product in products:
    print product

print request
print objects
print products

这是一段json数据

{"query":"*","sort":"relevance","responseGroup":"base","totalResults":5158058,"start":1,"numItems":10,"items":[{"itemId":7933617,"parentItemId":7933617,"name":"Nordic Ware Heavyweight Scone / Cornbread Pan","msrp":26.97,"salePrice":20.42,"upc":"011172016409","categoryPath":"Home/Kitchen & Dining/Cookware, Bakeware & Tools/Specialty Cookware","shortDescription":"&lt;p&gt;This Nordic Ware Scone Pan is made of a heavyweight cast aluminum. It can be used as a heavyweight scone or cornbread pan, and it is designed to cook your meal evenly and thoroughly. It features a non-stick interior coating for easy release and clean up.&lt;/p&gt;","longDescription":"&lt;b&gt;Nordic Ware Heavyweight Scone/Cornbread Pan:&lt;/b&gt;&lt;ul&gt;&lt;li&gt;Heavyweight cast aluminum&lt;/li&gt;&lt;li&gt;Ideal for scones and cornbread&lt;/li&gt;&lt;li&gt;Eight wedges&lt;/li&gt;&lt;li&gt;Cooks evenly and thoroughly&lt;/li&gt;&lt;li&gt;Non-stick interior coating for easy release and clean-up&lt;/li&gt;&lt;/ul&gt;","thumbnailImage":"http://i5.walmartimages.com/dfw/dce07b8c-c739/k2-_6fb32a28-c090-4377-81d5-e83273124841.v1.jpg","mediumImage":"http://i5.walmartimages.com/dfw/dce07b8c-ddb3/k2-_6f7df9fa-cb2d-4faf-afbc-8fa4185add59.v1.jpg","largeImage":"http://i5.walmartimages.com/dfw/dce07b8c-5bd3/k2-_6635f62a-5e0b-4c4e-a93d-ee85643f7397.v1.jpg","productTrackingUrl":"http://linksynergy.walmart.com/fs-bin/click?id=|LSNID|&offerid=223073.7200&type=14&catid=8&subid=0&hid=7200&tmpid=1082&RD_PARM1=http%253A%252F%252Fwww.walmart.com%252Fip%252FNordicWare-Heavyweight-Scone-Cornbread-Pan%252F7933617%253Faffp1%253DpjiPu5Y7cvNmz4xZOAs5j7QlW2mZPVmc1DR3BvmrkB4%2526affilsrc%253Dapi","standardShipRate":4.97,"marketplace":false,"modelNumber":"1640","productUrl":"http://c.affil.walmart.com/t/api02?l=http%3A%2F%2Fwww.walmart.com%2Fip%2FNordicWare-Heavyweight-Scone-Cornbread-Pan%2F7933617%3Faffp1%3DpjiPu5Y7cvNmz4xZOAs5j7QlW2mZPVmc1DR3BvmrkB4%26affilsrc%3Dapi%26veh%3Daff%26wmlspartner%3Dreadonlyapi","customerRating":"4.7","numReviews":20,"customerRatingImage":"http://i2.walmartimages.com/i/CustRating/4_7.gif","categoryNode":"4044_623679_133020","bundle":false,"stock":"Available","addToCartUrl":"http://c.affil.walmart.com/t/api02?l=http%3A%2F%2Faffil.walmart.com%2Fcart%2FaddToCart%3Fitems%3D7933617%7C1%26affp1%3DpjiPu5Y7cvNmz4xZOAs5j7QlW2mZPVmc1DR3BvmrkB4%26affilsrc%3Dapi%26veh%3Daff%26wmlspartner%3Dreadonlyapi","affiliateAddToCartUrl":"http://linksynergy.walmart.com/fs-bin/click?id=|LSNID|&offerid=223073.7200&type=14&catid=8&subid=0&hid=7200&tmpid=1082&RD_PARM1=http%253A%252F%252Faffil.walmart.com%252Fcart%252FaddToCart%253Fitems%253D7933617%257C1%2526affp1%253DpjiPu5Y7cvNmz4xZOAs5j7QlW2mZPVmc1DR3BvmrkB4%2526affilsrc%253Dapi","giftOptions":

1 个答案:

答案 0 :(得分:1)

您在我们的输出中看到的是:

print request:打开与网址的连接 - 这似乎是正确的,并不奇怪

print objects:正如输出所示,它是一个生成器,你可能会期望列表 值。但是由于对象实际上是一个生成器(你使用ijson要求这个),你应该 消耗它的值。通过list(objects)

进行典型操作

print products:也是一个生成器,但这次是列表理解的结果。正如您使用的() 在表达式周围,你问了一个发电机。如果您使用[o for o in objects if o ['type' == 'name']],您将直接获得列表。解决方案与objects一样:使用 值,例如list(products)

请注意,一旦您从生成器中消耗了一个值(或所有值),它们就会消失 生成器保持其私有内部状态,每次调用都会改变它。

有关详情,请参阅SO问题Convert generator object to list for debugging