Question

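I have the code below, which works except that I can't figure out how to retrieve the description of each result. I've tried several different divs, but none of them seem to work. Can anyone help? What is the right div to pass to soup.find_all to get data_descr?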

    import requests
    from bs4 import BeautifulSoup

    # Assumed context (not shown in the original post): s is a requests.Session(),
    # headers_Get is a dict of request headers, and q is the search query string.
    q = '+'.join(q.split())
    url = 'https://www.google.com/search?q=' + q + '&ie=utf-8&oe=utf-8'
    r = s.get(url, headers=headers_Get)

    soup = BeautifulSoup(r.text, 'lxml')
    #print(soup.prettify())
    data_text = soup.find_all('div', attrs={'class': 'BNeawe vvjwJb AP7Wnd'})
    data_link = soup.find_all('div', attrs={'class': 'kCrYT'})
    data_descr = soup.find_all('div', attrs={'class': 'VwiC3b yXK7lf MUxGbd yDYNvb lyLwlc'})  #select(".s3v9rd.AP7Wnd")
    print(data_descr)

Answer 1

For the description, you have selected the correct div class; you just need to extract its text, like this:

    description_divs = soup.find_all('div', {'class': 'VwiC3b yXK7lf MUxGbd yDYNvb lyLwlc'})
    for description in description_divs:
        # get_text() returns the text of the div and of everything nested in it
        dscrpt = description.get_text()
        print(dscrpt)

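Keep in mind that this div also contains span tags, and the text actually lives inside those spans. Since get_text() collects the text of all descendants, you can call it directly on the div.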
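To see this without hitting Google, here is a minimal sketch that runs against a hand-written HTML fragment. The markup, the helper name extract_descriptions, and the sample text are all made up for illustration; only the class string comes from the question, and Google's real markup and class names change over time. It uses the built-in html.parser instead of lxml so it has no extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup standing in for one result snippet.
# This is NOT real Google output; only the class string is from the question.
html = """
<div class="VwiC3b yXK7lf MUxGbd yDYNvb lyLwlc">
  <span>First part of the description.</span>
  <span>Second part.</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def extract_descriptions(soup):
    """Return the text of every description div, nested spans included."""
    # A class value containing spaces is matched against the exact
    # class attribute string, so the order of the classes matters here.
    divs = soup.find_all("div", {"class": "VwiC3b yXK7lf MUxGbd yDYNvb lyLwlc"})
    # separator=" " and strip=True stop the text of adjacent spans
    # from running together and drop whitespace-only pieces.
    return [div.get_text(separator=" ", strip=True) for div in divs]


descriptions = extract_descriptions(soup)
print(descriptions)
```

If the list comes back empty on a live page, the class names in the response probably differ from the ones in your browser; printing soup.prettify() and checking the actual markup is a reasonable next step.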
