Using multiple find_all calls

Time: 2018-09-02 14:15:16

Tags: python beautifulsoup findall

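I have just started learning Python, and I need to scrape hundreds of congressional bills from https://www.congress.gov/bill/112th-congress. For example, I need to access H.R.6729 below. The HTML structure of the page that holds the text is: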

BILL 1. H.R.6729 - 112th Congress (2011-2012)

So it is nested inside an "li" and then inside a "span". This pattern repeats for each of the 100 congressional bills on the page.

The code I wrote is:

import requests
from bs4 import BeautifulSoup
res = requests.get('https://www.congress.gov/bill/112th-congress', headers = {'User-agent': 'Chrome'})
soup = BeautifulSoup(res.text, 'html.parser')
bills = soup.find_all("li", {"class" : "expanded"})
len(bills) # this is 100 as there are 100 bills in the page
for bill in bills:
    bill_number = bill.find_all("span", {"class":"result-heading"})
len(bill_number) # this is giving me 1

I think the problem is with the second find_all. Why does the output contain only 1 element?

1 Answer:

Answer 0: (score: 0)

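You have to change

bill_number = bill.find_all("span", {"class":"result-heading"})

to

bill_number += bill.find_all("span", {"class":"result-heading"})

and initialize bill_number = [] before the loop. Each find_all call searches only inside the current li, so assigning with = keeps just the last bill's single span, while += accumulates the spans from every bill.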
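The fix can be sketched as a small script that runs without network access. The SAMPLE_HTML string is a hypothetical stand-in for the page's markup (the real congress.gov page may be structured differently), and the names bill_numbers and headings are chosen here only for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, assumed to mirror the "li" > "span" nesting
# described in the question; the live page may differ.
SAMPLE_HTML = """
<ol>
  <li class="expanded"><span class="result-heading">H.R.6729 - 112th Congress (2011-2012)</span></li>
  <li class="expanded"><span class="result-heading">H.R.6728 - 112th Congress (2011-2012)</span></li>
  <li class="expanded"><span class="result-heading">H.R.6727 - 112th Congress (2011-2012)</span></li>
</ol>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
bills = soup.find_all("li", {"class": "expanded"})

# Create the accumulator once, before the loop.
bill_numbers = []
for bill in bills:
    # find_all here searches only inside this one <li>,
    # so extend the list instead of overwriting it.
    bill_numbers += bill.find_all("span", {"class": "result-heading"})

# Pull the visible text out of each collected <span>.
headings = [span.get_text(strip=True) for span in bill_numbers]

print(len(bills), len(bill_numbers))
for heading in headings:
    print(heading)
```

With the list built up this way, len(bill_numbers) matches len(bills) whenever each li contains exactly one matching span, which is what the question's page structure suggests.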