Python: finding the index of a tag in a string

Asked: 2012-01-13 20:01:52

Tags: python beautifulsoup

HTML

<div class="productDescriptionWrapper">
<p>A worm worth getting your hands dirty over. With over six feet of crawl space, Playhut&rsquo;s Wiggly Worm is a brightly colored and friendly play structure.
</p>
<ul>  
   <li>6ft of crawl through fun</li>    
   <li>18&rdquo; diameter for easy crawl through</li>    
   <li>Bright colorful design</li>    
   <li>Product Measures: 18&quot;&quot;Diam x 60&quot;&quot;L</li>    
   <li>Recommended Ages: 3 years &amp; up<br />    &nbsp;</li>
</ul>
<p><strong>Intended for Indoor Use</strong></p>

Code

def GetBullets(self, Soup):
    bulletList = []
    bullets = str(Soup.findAll('div', {'class': 'productDescriptionWrapper'}))
    bullets_re = re.compile('<li>(.*)</li>')
    bullets_pat = str(re.findall(bullets_re, bullets))
    index = bullets_pat.findall('</li>')
    print index

How can I extract the p tags and the li tags? Thanks!

2 answers:

Answer 0 (score: 3)

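Consider the following session, which queries the parsed tree directly instead of converting it to a string and applying a regex: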

>>> from BeautifulSoup import BeautifulSoup
>>> html = """ <what you have above> """
>>> Soup = BeautifulSoup(html)
>>> bullets = Soup.findAll('div', {'class': 'productDescriptionWrapper'})
>>> ptags = bullets[0].findAll('p')
>>> print ptags
[<p>A worm worth getting your hands dirty over. With over six feet of crawl space,      Playhut&rsquo;s Wiggly Worm is a brightly colored and friendly play structure.
</p>, <p><strong>Intended for Indoor Use</strong></p>]
>>> print ptags[0].text
A worm worth getting your hands dirty over. With over six feet of crawl space, Playhut&rsquo;s Wiggly Worm is a brightly colored and friendly play structure.

You can get the contents of the li tags in a similar way.
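As a rough sketch of that "similar way", the snippet below collects the text of each li tag. It assumes Python 3 and the newer bs4 package (where findAll is spelled find_all) rather than the BeautifulSoup 3 import used in the session above. The function name get_bullets and the shortened HTML sample are illustrative only, not from the question.

```python
from bs4 import BeautifulSoup  # assumption: bs4 (Beautiful Soup 4) is installed

# Shortened, hypothetical version of the HTML from the question.
html = """<div class="productDescriptionWrapper">
<p>A worm worth getting your hands dirty over.</p>
<ul>
   <li>6ft of crawl through fun</li>
   <li>Bright colorful design</li>
</ul>
<p><strong>Intended for Indoor Use</strong></p>
</div>"""


def get_bullets(soup):
    """Return the text of every <li> inside the product description div."""
    wrapper = soup.find('div', {'class': 'productDescriptionWrapper'})
    if wrapper is None:
        # The div is missing, so there are no bullets to report.
        return []
    # get_text(strip=True) drops the surrounding whitespace of each item.
    return [li.get_text(strip=True) for li in wrapper.find_all('li')]


soup = BeautifulSoup(html, 'html.parser')  # built-in parser, no extra dependency
bullets = get_bullets(soup)
print(bullets)
```

Working on the tag objects this way avoids the str() conversion and the regex from the question, so no character index into the markup is needed.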

Answer 1 (score: 0)

Use Beautiful Soup.
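One possible reading of this suggestion, sketched under the assumption of Python 3 and the bs4 package: passing a list of tag names to find_all returns the p and li tags together, in document order. The shortened HTML sample and the variable names are illustrative, not from the original post.

```python
from bs4 import BeautifulSoup  # assumption: bs4 (Beautiful Soup 4) is installed

# Shortened, hypothetical version of the HTML from the question.
html = """<div class="productDescriptionWrapper">
<p>A worm worth getting your hands dirty over.</p>
<ul>
   <li>6ft of crawl through fun</li>
   <li>Bright colorful design</li>
</ul>
<p><strong>Intended for Indoor Use</strong></p>
</div>"""

soup = BeautifulSoup(html, 'html.parser')
wrapper = soup.find('div', {'class': 'productDescriptionWrapper'})

# A list of names matches any of them; results come back in document order.
tags = wrapper.find_all(['p', 'li'])
names = [tag.name for tag in tags]
texts = [tag.get_text(strip=True) for tag in tags]

for name, text in zip(names, texts):
    print(name, '->', text)
```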
