Question

我正在使用漂亮的汤来尝试解析网页中的信息：

url='https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'
req=requests.get(url)

要求返回<Response [403]>

Python requests. 403 Forbidden提示存在用户代理问题，但在我的实例中找不到。

有什么建议

Answer 1

在这种情况下，请使用包含user-agent

的标题

from bs4 import BeautifulSoup
import requests


url = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
}

html_page = requests.get(url, headers=headers).text
soup = BeautifulSoup(html_page, "html.parser")

print(soup.text)

请求在python beautifulsoup中返回403

1 个答案: