Web scraping td data

Time: 2014-10-20 18:25:03

Tags: python selenium selenium-webdriver web-scraping

Can someone explain why my code isn't picking up the PriorSettle td's? I get the months, but for whatever reason the PriorSettle column doesn't show up.

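from datetime import date
import time

from selenium import webdriver

# Assumption: localtime is not defined in the original snippet;
# a formatted timestamp string is used here so the code runs.
localtime = time.strftime('%Y-%m-%d %H:%M:%S')

lc_result = {}

url = "http://www.cmegroup.com/trading/agricultural/livestock/live-cattle.html"

driver = webdriver.Chrome()
driver.set_window_size(2, 2)
driver.get(url)  # load the page at the URL above
print('     Live Cattle Futures' + localtime.center(50))
table = driver.find_element_by_id('quotesFuturesProductTable1')
# Skip the first two rows, then read the month and prior-settle cells
for row in table.find_elements_by_tag_name('tr')[2:]:
    cells = row.find_elements_by_tag_name('td')
    month = cells[0].text
    priorsettle = cells[4].text

    print(month, priorsettle)
    lc_result[month] = [priorsettle]

driver.close()
print(str(date.today()))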

1 answer:

Answer 0 (score: 1)

You need to wait for the table to load. Simply adding a delay made it work for me:

import time

driver.get(url)
time.sleep(3)  # give the page time to fill in the table cells
table = driver.find_element_by_id('quotesFuturesProductTable1')
...

Prints:

DEC 2014 168.025
FEB 2015 166.900
APR 2015 164.775
JUN 2015 154.800
AUG 2015 152.900
OCT 2015 154.100
DEC 2015 154.250
FEB 2016 153.850
APR 2016 0.000

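FYI, a fixed delay with time.sleep() is neither a reliable nor a recommended way to wait for an element: the page may need more than the chosen number of seconds, or far less. Selenium has a built-in Waits mechanism.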
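As a rough sketch of what a built-in wait could look like here, the code below polls the table with Selenium's WebDriverWait until the prior-settle cells contain text, instead of sleeping for a fixed time. The names TABLE_ID, parse_rows and scrape_prior_settle, the 10-second timeout, and the readiness rule (every data row has a non-empty fifth cell) are assumptions made for illustration; they do not come from the question or the answer. Selenium is imported inside the scraping function only so that the parsing helper can be tried without a browser.

```python
TABLE_ID = "quotesFuturesProductTable1"


def parse_rows(rows):
    """Turn a list of rows (each a list of cell strings) into {month: prior_settle}.

    Returns None while the table is not ready, i.e. when no data row is
    present yet or any prior-settle cell is still empty (assumed rule).
    """
    quotes = {}
    for cells in rows[2:]:          # skip the first two rows, as in the question
        if len(cells) > 4:
            quotes[cells[0]] = cells[4]
    if quotes and all(quotes.values()):
        return quotes
    return None


def scrape_prior_settle(url, timeout=10):
    """Open the page and wait up to `timeout` seconds for the table to fill in."""
    # Imported here so parse_rows can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    def table_loaded(driver):
        # Re-read the table on every poll; a missing table raises
        # NoSuchElementException, which WebDriverWait ignores by default.
        table = driver.find_element(By.ID, TABLE_ID)
        rows = [
            [td.text for td in tr.find_elements(By.TAG_NAME, "td")]
            for tr in table.find_elements(By.TAG_NAME, "tr")
        ]
        return parse_rows(rows)

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # until() keeps calling table_loaded(driver) until it returns a truthy
        # value, and raises TimeoutException if `timeout` seconds pass first.
        return WebDriverWait(driver, timeout).until(table_loaded)
    finally:
        driver.quit()
```

Calling scrape_prior_settle(url) with the URL from the question would then return the month-to-price mapping as soon as the cells are populated, or raise TimeoutException if they never are. If some contracts legitimately have an empty prior-settle cell, the all(...) check would need to be relaxed.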