Href value is not returned

Date: 2015-03-20 05:45:40

Tags: python html selenium selenium-webdriver

I am trying to extract the URL of each restaurant from this page, and I wrote a Python script to do it:

import time

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

browser = webdriver.Firefox()
browser.get("http://www.delyver.com/Partners/partner/HSR%20Layout,%20Bengaluru,%20Karnataka,%20India/12.9081357/77.64760799999999")

time.sleep(1)

elem = browser.find_element_by_tag_name("body")

no_of_pagedowns = 40

while no_of_pagedowns:
    elem.send_keys(Keys.PAGE_DOWN)
    time.sleep(0.2)
    no_of_pagedowns-=1

post1 = browser.find_elements_by_css_selector("Parwrsp.Parwrsp-Ado")


for post in post1:
    print post.get('href')  

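When I run the script, the browser window opens, I maximize it so that it has focus, and it scrolls down automatically. But nothing is printed. I set up Selenium by following this link.

What am I doing wrong?

1 Answer:

Answer 0 (score: 0)

Your current CSS selector does not match any element. Parwrsp is a class, but without a leading dot the selector treats it as a tag name, so it looks for a <Parwrsp> element that has the class Parwrsp-Ado.

To match elements that carry several classes, prefix each class with a dot: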

.Parwrsp.Parwrsp-Ado
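
To make the difference between the two selectors concrete, here is a minimal sketch in plain Python 3 (no Selenium needed). The helper `parse_compound_selector` is hypothetical and written only for this illustration; it handles just an optional tag name followed by `.class` parts, not the full CSS grammar.

```python
import re


def parse_compound_selector(selector):
    """Split a simple compound selector into (tag, classes).

    Illustrative only: supports an optional tag name followed by
    .class parts. IDs, attributes and combinators are not handled.
    """
    match = re.fullmatch(r"([A-Za-z][\w-]*)?((?:\.[\w-]+)*)", selector)
    if match is None:
        raise ValueError("unsupported selector: %r" % selector)
    tag = match.group(1)  # None when the selector starts with a dot
    classes = [name for name in match.group(2).split(".") if name]
    return tag, classes


# The selector from the question: "Parwrsp" is read as a tag name.
print(parse_compound_selector("Parwrsp.Parwrsp-Ado"))
# The corrected selector: both parts are classes, any tag matches.
print(parse_compound_selector(".Parwrsp.Parwrsp-Ado"))
```

The first call reports a tag name plus one class, which is why the original selector finds nothing on a page with no such tag; the second reports no tag and two classes.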

Also, WebElement instances have no get() method; you meant to use get_attribute():

posts = browser.find_elements_by_css_selector(".Parwrsp.Parwrsp-Ado")
for post in posts:
    print post.get_attribute('href')

A demo showing that the above works:

>>> from selenium import webdriver
>>> 
>>> browser = webdriver.Firefox()
>>> browser.get("http://www.delyver.com/Partners/partner/HSR%20Layout,%20Bengaluru,%20Karnataka,%20India/12.9081357/77.64760799999999")
>>> for post in browser.find_elements_by_css_selector(".Parwrsp.Parwrsp-Ado"):
...     print post.get_attribute('href')
... 
http://www.delyver.com/Partners/partnerdetailsview/947/Purnabramha,-HSR
http://www.delyver.com/Partners/partnerdetailsview/916/Moti-Mahal-Deluxe,-HSR-Layout
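
The code above uses Python 2 and the `find_elements_by_css_selector` helper, which was removed in later Selenium 4 releases. A rough equivalent for Python 3 and Selenium 4 might look like the sketch below. The names `collect_hrefs` and `main` are hypothetical, the string `"css selector"` is the value of `By.CSS_SELECTOR`, and the sketch assumes the page still serves the same class names, which may no longer be true.

```python
# Same value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"

# URL taken from the question; the site may have changed since 2015.
PAGE_URL = ("http://www.delyver.com/Partners/partner/"
            "HSR%20Layout,%20Bengaluru,%20Karnataka,%20India/"
            "12.9081357/77.64760799999999")


def collect_hrefs(driver, selector):
    """Return the href of every element matching a CSS selector.

    `driver` is anything with a Selenium-style find_elements(by, value).
    Elements without an href attribute are skipped.
    """
    elements = driver.find_elements(CSS_SELECTOR, selector)
    hrefs = [element.get_attribute("href") for element in elements]
    return [href for href in hrefs if href]


def main():
    # Imported here so the helper above can be used without a browser.
    from selenium import webdriver

    browser = webdriver.Firefox()
    try:
        browser.get(PAGE_URL)
        for href in collect_hrefs(browser, ".Parwrsp.Parwrsp-Ado"):
            print(href)
    finally:
        browser.quit()  # always close the browser, even on errors
```

Calling `main()` opens Firefox and prints the links. Keeping the lookup in `collect_hrefs` means it can be exercised with a stub driver, without launching a browser.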