Why

Question

I'm new to BeautifulSoup and Python, and I'm trying to work out how to get the first tag from an HTML page. Can anyone tell me what is wrong with my code?

HTML

<th width="10%">1365 m</th>
<th width="15%">Rating 25-0</th>
<th width="10%">12h45</th>

I only want to retrieve the first th with width 10%, the one containing 1365 m. Here is my code:

print('Track '+soup.findAll('th',{'width':'10%'})[3])

I also tried find('th',{'width':'10%'})[3], but it throws an index out of bounds exception. Any help? With my code I get the second tag, i.e. 12h45.

Answer 1

print(soup.findAll('th')[0])

This is the first one.

Computers count from zero: 0, 1, 2, 3, ..., n. If you want to print the last one:

print(soup.findAll('th')[-1])

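soup.findAll('th',{'width':'10%'})[3] cannot work.

findAll returns every th whose width is 10%, and there are only two of them in this HTML, so the only valid indexes are 0 and 1: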

      <th width="10%">1365 m</th>
      <th width="15%">Rating 25-0</th>
      <th width="10%">12h45</th>

The best way to check what you are getting is to print every match:

for i in soup.findAll('th',{'width':'10%'}):
    print(i)
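
To see the index problem end to end, here is a minimal sketch that can be run on its own. It assumes the three cells from the question sit inside a table row (the wrapper markup is my assumption, the question only shows the th lines) and it uses find_all, which is the current spelling of findAll in bs4.

```python
from bs4 import BeautifulSoup

# The three cells from the question; the <table><tr> wrapper is assumed.
html = """
<table><tr>
<th width="10%">1365 m</th>
<th width="15%">Rating 25-0</th>
<th width="10%">12h45</th>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Every th whose width attribute is exactly "10%".
matches = soup.find_all("th", {"width": "10%"})
print(len(matches))  # 2

# Show each match next to the index you would use to reach it.
for index, cell in enumerate(matches):
    print(index, cell.get_text())

# Index 3 does not exist in a list of two items.
try:
    matches[3]
except IndexError as exc:
    print("IndexError:", exc)
```

With only two matches, matches[0] is the 1365 m cell and matches[1] (or matches[-1]) is the 12h45 cell.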

Answer 2

soup.findAll('th',{'width':'10%'})[3] should be:

# Get every 'th' whose 'width' is '10%', then take the first match.
# get_text() returns the cell's text, which can be joined to 'Track '.
print('Track ' + soup.findAll('th', {'width': '10%'})[0].get_text())

Or, if you only want to access the first match:

# Get the first 'th' with 'width' '10%'
soup.find('th',{'width':'10%'})
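
Note that find returns a single tag rather than a list, and returns None when nothing matches, so it should not be indexed with [3]. A rough sketch of how that might be used, again assuming the question's cells are wrapped in a table row (the wrapper and the 99% lookup are only for illustration):

```python
from bs4 import BeautifulSoup

# The cells from the question; the <table><tr> wrapper is assumed.
html = (
    '<table><tr>'
    '<th width="10%">1365 m</th>'
    '<th width="15%">Rating 25-0</th>'
    '<th width="10%">12h45</th>'
    '</tr></table>'
)
soup = BeautifulSoup(html, "html.parser")

# find gives back the first matching tag, or None if there is no match.
first = soup.find("th", {"width": "10%"})
if first is not None:
    track = "Track " + first.get_text()
else:
    track = "Track not found"
print(track)

# A width that does not occur in the markup, to show the None case.
missing = soup.find("th", {"width": "99%"})
print(missing)
```

Checking for None before calling get_text() avoids an AttributeError on pages where the cell is absent.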