Using BeautifulSoup and printing strings from the findAll function

Asked: 2011-10-08 19:26:04

Tags: python beautifulsoup

I am trying to get the thread titles from /r/AskReddit. The code below prints None instead of the thread title.

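from BeautifulSoup import BeautifulSoup
import urllib2

site = 'http://www.reddit.com/r/AskReddit/'

# Fetch the page and parse it (Python 2, BeautifulSoup 3)
soup = BeautifulSoup(urllib2.urlopen(site))

# Select every <p class="title"> element
questions = soup.findAll('p', {"class": "title"})

for i in questions:
    print i.string  # prints None
    break  # only look at the first match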

1 Answer:

Answer 0 (score: 1)

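The title is in the string attribute of the a tag, not the p tag. A tag's string is None when the tag has more than one child, and the p here contains both the a and a span. Also, note the trailing space after title: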

questions=soup.findAll('a',{"class":"title "})

I found this by looking at this HTML snippet:

<p class="title"><a class="title " href="http://www.reddit.com/r/AskReddit/comments/l5157/whats_the_best_face_you_can_pull_before_and_after/">What's the best face you can pull? Before and after please.</a> <span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">self.AskReddit</a>)</span></p>
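
The difference between the two tags can be checked offline against that snippet. The following is a minimal sketch, not the answerer's code: it assumes Python 3 and the newer bs4 package (with its find and class_ arguments) instead of the BeautifulSoup 3 import used in the question, and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup  # assumption: bs4, not the BeautifulSoup 3 module from the question

# The HTML snippet quoted in the answer, copied verbatim.
html = '''<p class="title"><a class="title " href="http://www.reddit.com/r/AskReddit/comments/l5157/whats_the_best_face_you_can_pull_before_and_after/">What's the best face you can pull? Before and after please.</a> <span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">self.AskReddit</a>)</span></p>'''

soup = BeautifulSoup(html, "html.parser")

# The <p> has several children (the <a>, a space, the <span>),
# so .string cannot pick a single string and is None.
p_tag = soup.find("p", class_="title")
p_string = p_tag.string

# The <a> has exactly one child, a text node, so .string is that text.
link = soup.find("a", class_="title")
title = link.string

print(p_string)
print(title)
```

In bs4 the class attribute is split on whitespace, so the trailing space in "title " should not need to be matched there; the exact-match form shown in the answer applies to the BeautifulSoup 3 API used in the question.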