Python requests sometimes returns an empty list

Time: 2017-07-27 22:07:56

Tags: python selenium xpath lxml screen-scraping

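I have been trying to scrape the "2005 - 2013" out of the "Drink between 2005 - 2013" text. This code worked for me at first, but now it only returns an empty list, even though my request still comes back with a 200 status code.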

import requests, lxml.html, csv
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}
page = requests.get('http://www.cellartracker.com/wine.asp?iWine=91411', headers=headers)
print(page.status_code)
html = lxml.html.fromstring(page.content)
content_divs = html.xpath('//a[@title="Source: Community"]/text()')
print(content_divs)

I'm not sure whether I should switch to selenium for this scrape, since it's a JS site. If so, I don't know how to do that, so some basic help would be useful! Thanks!

1 answer:

Answer 0: (score: 1)

Use Selenium

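from selenium import webdriver
from selenium.webdriver.common.by import By

url = "https://www.cellartracker.com/wine.asp?iWine=91411"

# executable_path is accepted by Selenium 3 and early 4.x releases;
# later 4.x releases expect a Service object instead
driver = webdriver.Chrome(executable_path="chromedriver2.25")
driver.get(url)
# find_elements(By.XPATH, ...) also works where find_elements_by_xpath is gone
items = driver.find_elements(By.XPATH, "//li[contains(.,'review')]")
for item in items:
    print(item.text)
    print("---")
driver.quit()  # close the browser when done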

Output:

Options
1/4/2014 - REUBENSHAPCOTT WROTE:
91 Points
Delicious! Had no idea that Australia made port this good, and affordable. Terrific, smooth fig and plum. Aged and neither sharp nor grapey. If you see it, buy it.
Do you find this review helpful? Yes - No / Comment
---
Options
1/20/2013 - LISAADAM WROTE:
85 Points
The wine looks Tawny colored.
Do you find this review helpful? Yes - No / Comment
---
Options
12/22/2012 - WINEAGGREGATE LIKES THIS WINE:
90 Points
Molasses, pepper, butterscotch candy that's been roasted a bit. Very nice.
Do you find this review helpful? Yes - No / Comment
---
Options
10/30/2011 - GTI2TON WROTE:
87 Points
Sweeter than average tawny and straightforward, but still has nice richness in its raisin and light carmel notes. Good value.
Do you find this review helpful? Yes - No / Comment
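
The loop above prints whole review blocks, while the question asks only for the year range. Once some rendered text is available, the range could be pulled out with a regular expression. This is a minimal sketch, assuming the text contains a phrase such as "Drink between 2005 - 2013"; `extract_year_range` is a hypothetical helper written for illustration, not part of Selenium or lxml, and the sample string is taken from the question rather than fetched from the site.

```python
import re

# Hypothetical helper: matches two four-digit years separated by a hyphen.
YEAR_RANGE = re.compile(r"(\d{4})\s*-\s*(\d{4})")

def extract_year_range(text):
    """Return (start, end) as ints, or None when no range is present."""
    match = YEAR_RANGE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Sample input assumed from the question, not scraped live.
print(extract_year_range("Drink between 2005 - 2013"))
```

Returning None instead of raising makes the "element was not rendered" case easy to spot, which is the situation the question describes: a 200 response whose HTML simply lacks the expected text. With Selenium, the input could be the `.text` of whichever element holds the drinking window.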