Question

This is the layout of the webpage:

<h2>Featured Ads</h2>
<a href=""></a>

<h2>Ads</h2>
<a href=""></a>

The regular ads have no class I can use to tell them apart from the featured ones. What would be an efficient way to return only the <a href> links that appear after <h2>Ads</h2>?

Update:

Here's the final code:

h2 = soup.find("h2", text="Ads")
articles = h2.find_next_siblings("article")

for article in articles:
    for div in article.find_all('div', {'class': 'address'}):
        for link in div.find_all('a', href=True):
            print(link['href'])

Update 2: had to refactor...

articles = soup.find("h2", text="Ads").find_next_siblings("article")
for article in articles:
    ad_url = article.find('a', href=True)['href']
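
For anyone who wants to try the refactored version end to end, here is a minimal sketch. The HTML string is hypothetical markup made up to resemble the page (the real page is not shown in the question), and `string=` is used as the newer name for the `text=` argument in recent Beautiful Soup 4 releases. It also guards against an article that has no link, since `article.find(...)` returns None in that case.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an assumption about what the real page looks like.
html = """
<section>
<h2>Featured Ads</h2>
<article><div class="address"><a href="/featured/1">F</a></div></article>
<h2>Ads</h2>
<article><div class="address"><a href="/ads/1">A</a></div></article>
<article><div class="address"><a href="/ads/2">B</a></div></article>
</section>
"""

soup = BeautifulSoup(html, "html.parser")

# Exact match on the heading text, so "Featured Ads" is not selected.
articles = soup.find("h2", string="Ads").find_next_siblings("article")

ad_urls = []
for article in articles:
    link = article.find("a", href=True)
    if link is not None:  # skip articles without a link instead of raising TypeError
        ad_urls.append(link["href"])

print(ad_urls)
```

Only the articles that follow the "Ads" heading are visited; the featured article comes before it and is never a "next" sibling.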

Answer 1

Find the h2 element, then find the next a sibling:

h2 = soup.find("h2", text="Ads")
a = h2.find_next_sibling("a")
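
A self-contained sketch of the same idea, in case it helps. The HTML below is hypothetical and only mirrors the layout from the question (the hrefs are made up). It also shows `find_next_siblings` (plural), which may be what you want if several links follow the heading; `string=` is the newer spelling of `text=` in recent Beautiful Soup 4 releases.

```python
from bs4 import BeautifulSoup

# Hypothetical markup mirroring the layout in the question.
html = """
<div>
<h2>Featured Ads</h2>
<a href="/featured/1">Featured</a>
<h2>Ads</h2>
<a href="/ads/1">First ad</a>
<a href="/ads/2">Second ad</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Locate the heading by its exact text, then walk forward through its siblings.
h2 = soup.find("h2", string="Ads")

first_link = h2.find_next_sibling("a")             # only the first <a> after the heading
all_links = h2.find_next_siblings("a", href=True)  # every later <a> sibling that has an href

ad_urls = [a["href"] for a in all_links]
print(first_link["href"])
print(ad_urls)
```

Note that `find_next_siblings` keeps going past any later heading, so if another <h2> section followed "Ads" on the real page, its links would be included as well and would need to be filtered out.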

