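【Posted】: 2018-05-24 04:50:06
【Question】:
I am trying to scrape a page with Beautiful Soup (bs4), but the request fails before I get any data. I even set the header suggested in the answer to this Stack Overflow question. Here is my code: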
from bs4 import BeautifulSoup
import requests
headers = {
'Referer': 'hello',
}
r = requests.get('https://www.doamin.com/bangalore/restaurants', headers=headers)
print(r.status_code)
This is the error I get:
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
And also this:
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
I also tried sending a User-Agent header:
import requests
url = 'https://www.example.com/bangalore/restaurants'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'}
response = requests.get(url, headers=headers)
print(response.content)
But I still get the same error!
Can anyone help me?
【Discussion】:
-
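It looks like the server is aborting your request. You may need to send some additional headers, such as User-Agent. Also, please don't post the domain name you are trying to scrape.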
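One way to try the "additional headers" idea is to send a fuller, browser-like header set from a requests.Session and to catch the dropped connection instead of letting it crash the script. This is only a rough sketch: the helper names make_session and fetch are made up for illustration, the Accept and Accept-Language values are assumptions about what a browser would send, and nothing here is known to be what this particular server expects.

```python
import requests

# The User-Agent is the one from the question; the Accept* values are
# assumed, illustrative defaults and may need adjusting for a given site.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def make_session(headers=None):
    """Hypothetical helper: a Session that sends the given headers on every request."""
    session = requests.Session()
    session.headers.update(headers or BROWSER_HEADERS)
    return session


def fetch(session, url, timeout=10):
    """Hypothetical helper: return the Response, or None if the connection is dropped."""
    try:
        return session.get(url, timeout=timeout)
    except requests.exceptions.ConnectionError as exc:
        # RemoteDisconnected surfaces here, wrapped by requests.
        print(f"Connection failed: {exc}")
        return None
```

Calling fetch(make_session(), url) with the page URL then gives either a response whose status_code can be checked or None. If the server still closes the connection with these headers, it is probably filtering on something other than headers, and this sketch will not help.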
Tags: python web-scraping beautifulsoup python-requests