【Question title】: requests.get(url) not returning for this specific url
【Posted】: 2021-08-20 05:01:57
【Question description】:

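I am trying to fetch the HTML of this site with requests.get(url).text. However, when requests.get(url) is called with this particular URL, it never returns, no matter how long I wait. The same call works for other URLs; only this one gives me trouble. The code is below: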

from bs4 import BeautifulSoup
import requests

source = requests.get('https://www.carmax.com/cars/all', allow_redirects=True).text

soup = BeautifulSoup(source, 'lxml')

print(soup.prettify().encode('utf-8'))

Thanks for your help!

【Question discussion】:

    Tags: python web python-requests screen-scraping


    【Solution 1】:

    Try:

    import requests
    from bs4 import BeautifulSoup
    
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36",
        "Upgrade-Insecure-Requests": "1",
        "DNT": "1",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Accept-Encoding": "gzip, deflate",
    }
    html = requests.get("https://www.carmax.com/cars/all", headers=headers)
    soup = BeautifulSoup(html.content, 'html.parser')
    print(soup.prettify())
    

    【Comments】:

    • Thank you very much. Could you briefly explain where I went wrong?
    • You need to define the request headers.
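To expand a little on that last comment: by default, requests identifies itself with a `python-requests/<version>` User-Agent, and one plausible reason for the hang (an assumption, not something confirmed in this thread) is that the site stalls or ignores clients that do not look like a browser. The sketch below sends browser-like headers and also sets a timeout so the call raises an error instead of waiting forever. The names `BROWSER_HEADERS` and `fetch_html` and the timeout values are illustrative, not part of the original answer.

```python
import requests

# Illustrative browser-like headers (hypothetical values; adjust as needed).
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def fetch_html(url, timeout=(5, 15)):
    """Fetch a page with browser-like headers.

    timeout is (connect seconds, read seconds); without it, requests
    waits indefinitely when the server never answers.
    """
    response = requests.get(url, headers=BROWSER_HEADERS, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.text


# What requests would send if no User-Agent were supplied.
default_agent = requests.utils.default_user_agent()
```

With this in place, `fetch_html("https://www.carmax.com/cars/all")` should either return the page text or fail quickly with `requests.exceptions.Timeout` or `requests.exceptions.HTTPError`, which makes the cause far easier to diagnose than a call that never returns.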