request returns 403 in python beautifulsoup

【Question title】: request returns 403 in python beautifulsoup
【Posted】: 2020-02-11 12:45:59
【Question description】:

I am using Beautiful Soup to try to parse information from a web page:

import requests

url = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'
req = requests.get(url)

req returns <Response [403]>

The question "Python requests. 403 Forbidden" suggests this is a user-agent problem, but I can't see where that applies in my case.

Any suggestions?

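【Question discussion】:

I noticed that the request sets the header cookie: logglytrackingsession=<MY-COOKIE>. The server may be rejecting requests that lack the tracking cookie, which is set when the page loads in a browser.
It could be what @JammyDodger mentioned, or it could be the user agent as you said. Check which headers your browser sends when it visits the site.
@luis, it was the headers. Thanks

Tags: python beautifulsoup request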

【Solution 1】:

In this case, send headers that include a User-Agent:

from bs4 import BeautifulSoup
import requests


url = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
}

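# Send the request with browser-like headers and take the HTML body
html_page = requests.get(url, headers=headers).text
soup = BeautifulSoup(html_page, "html.parser")

# Print the text content of the parsed page
print(soup.text)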
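
To see why the plain requests.get(url) call might be rejected, it can help to look at the headers requests would send before anything goes over the network. The sketch below is only an illustration: the helper name headers_to_be_sent is made up for this example, and it uses a session's prepare_request to build the request without sending it.

```python
import requests

URL = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'


def headers_to_be_sent(url, extra_headers=None):
    """Return the headers requests would send for a GET, without contacting the server.

    Hypothetical helper for illustration only.
    """
    session = requests.Session()
    request = requests.Request("GET", url, headers=extra_headers)
    # prepare_request merges the session's default headers with the ones passed in
    prepared = session.prepare_request(request)
    return dict(prepared.headers)


# With no custom headers, the User-Agent identifies the client as python-requests
default_headers = headers_to_be_sent(URL)
print(default_headers["User-Agent"])  # a "python-requests/<version>" string

# A header passed explicitly replaces the default value
custom_headers = headers_to_be_sent(URL, {"User-Agent": "Mozilla/5.0"})
print(custom_headers["User-Agent"])
```

A server that filters on User-Agent can recognise the default value as a script, which is presumably why a browser-like value gets through here.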

【Discussion】:

It isn't always a user-agent problem. What should you do then?
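
One thing that could be tried when the User-Agent alone does not help, following the cookie remark in the question discussion, is to reuse a requests.Session so that cookies set by an earlier page are sent with the next request. This is a minimal sketch, not a guaranteed fix: the function name fetch_with_session and the idea of visiting a landing page first are assumptions for illustration, and a site may check other things entirely.

```python
import requests


def fetch_with_session(session, landing_url, target_url, headers):
    """Visit landing_url first so the server can set cookies, then fetch target_url.

    Hypothetical helper: `session` is expected to behave like requests.Session.
    """
    # Headers set on the session are sent with every request made through it
    session.headers.update(headers)
    # Cookies returned by this first response are stored on the session
    session.get(landing_url, timeout=10)
    # The second request carries those cookies automatically
    response = session.get(target_url, timeout=10)
    return response
```

It could be called as fetch_with_session(requests.Session(), 'https://www.onthemarket.com/', url, headers), where the landing URL is assumed to be the site root. Printing response.status_code and session.cookies.get_dict() afterwards shows whether any cookies were picked up and whether the status changed.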