Beautiful Soup returns nothing

Time: 2013-06-04 11:53:55

Tags: python python-2.7 beautifulsoup urllib2

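Hello, I am working on a project for my school that involves scraping HTML.

However, when I look for the table, I get None back. Below is the segment of code that has the problem.

If you need more information, I will be happy to provide it.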

from bs4 import BeautifulSoup
import urllib2
import datetime

# This section determines the date of the next Saturday, which goes onto the end of the URL
d = datetime.date.today() 
while d.weekday() != 5:
    d += datetime.timedelta(1)

#temporary logic for testing when next webpage isn't out
d = "2013-06-01"

#Section that scrapes the data off the webpage
url = "http://www.sydgram.nsw.edu.au/co-curricular/sport/fixtures/" + str(d) + ".php"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
print soup
#Section that grabs the table with stuff in it
table = soup.find('table', {"class": "excel1"})
print table

1 Answer:

Answer 0: (score: 0)

BeautifulSoup is normally given an HTML string. You are passing it a response object.

Get the HTML out of the response first:

 html = page.read()

Then pass html to BeautifulSoup, or pass page.read() to it directly.
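Put together, the read-then-parse step might look like the minimal sketch below. To keep it runnable without network access, an io.BytesIO object stands in for the response returned by urllib2.urlopen(url), and the sample markup is hypothetical: it only assumes the real page contains a table with class "excel1", which has not been verified here. The "html.parser" argument is also an addition; it simply names Python's built-in parser explicitly.

```python
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for page = urllib2.urlopen(url): a file-like
# object with a read() method. The markup is invented for illustration.
page = io.BytesIO(
    b'<html><body><table class="excel1">'
    b'<tr><td>U13A</td></tr></table></body></html>'
)

# Read the raw HTML out of the response before parsing it.
html = page.read()

# Parse the HTML with Python's built-in parser.
soup = BeautifulSoup(html, "html.parser")

# Look up the table by its class attribute, as in the question.
table = soup.find("table", {"class": "excel1"})
print(table is not None)
```

If table is still None against the real page, printing html and searching it for "excel1" would show whether the downloaded markup contains that class at all.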

Also, I recommend reading the following two links:

urllib2 documentation

BeautifulSoup documentation