Getting links in Beautiful Soup

Time: 2017-07-13 03:18:56

Tags: beautifulsoup

I am trying to parse the following links with Beautiful Soup, and I am not sure what the best way to do it is. Any suggestions would be appreciated.

Thanks


1 answer:

Answer 0 (score: 0)

In case anyone is interested, I figured out how to do this:

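import requests
from bs4 import BeautifulSoup
from fnmatch import fnmatch

# Download the page to parse.
html = requests.get("http://www.realclearpolitics.com/epolls/2010/governor/2010_elections_governor_map.html").text

def find_governor_races(html):
    soup = BeautifulSoup(html, 'html.parser')
    # Shell-style wildcards: '?' matches one character, '*' matches any run of characters.
    pattern = "http://www.realclearpolitics.com/epolls/????/governor/??/*-*.html"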

    links = []

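    for option in soup.find_all('option'):
        # Skip <option> tags without a value attribute instead of raising KeyError.
        value = option.get('value')
        if value:
            links.append(value)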

    matched_links = []

    for link in links:
        if fnmatch(link, pattern):
            matched_links.append(link)

    return matched_links
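
To see what the pattern accepts without fetching the page, here is a minimal sketch that runs the same `fnmatch` check on a few URLs. The sample URLs are made up for illustration and are not taken from the actual page; only the pattern comes from the code above.

```python
from fnmatch import fnmatch

# Same pattern as in the answer: '?' matches exactly one character,
# '*' matches any run of characters.
PATTERN = "http://www.realclearpolitics.com/epolls/????/governor/??/*-*.html"

# Hypothetical sample URLs, for illustration only.
samples = [
    # Two-letter state segment and a hyphenated file name.
    "http://www.realclearpolitics.com/epolls/2010/governor/ca/california_governor_whitman_vs_brown-1113.html",
    # The map page itself: no two-letter segment, no hyphen.
    "http://www.realclearpolitics.com/epolls/2010/governor/2010_elections_governor_map.html",
    # A senate race: the 'governor' path segment is missing.
    "http://www.realclearpolitics.com/epolls/2010/senate/ca/california_senate_boxer_vs_fiorina-1094.html",
]

# Keep only the URLs that fit the pattern.
matched = [url for url in samples if fnmatch(url, PATTERN)]

for url in matched:
    print(url)
```

Only the first sample should pass the filter. `fnmatch` is meant for file names, so it is a loose check here; a regular expression would be stricter if, say, the year needs to be digits only.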