【Question title】: Why do I get a "Connection aborted" error when trying to crawl a specific website?
【Posted】: 2016-04-24 16:17:38
【Question】:

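I wrote a web crawler in Python 2.7, but there are certain sites that I can view in a browser yet cannot download with the crawler.

My code is as follows: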

# -*- coding: utf-8 -*-

import requests

# OK
url = 'http://blog.ithome.com.tw/'
url = 'http://7club.ithome.com.tw/'
url = 'https://member.ithome.com.tw/'
url = 'http://ithome.com.tw/'
url = 'http://weekly.ithome.com.tw'

# NOT OK
url = 'http://download.ithome.com.tw'
url = 'http://apphome.ithome.com.tw/'
url = 'http://ithelp.ithome.com.tw/'

try:
    response = requests.get(url)
    print 'OK!'
    print 'response.status_code: %s' %(response.status_code)

except Exception, e:
    print 'NOT OK!'
    print 'Error: %s' %(e)
print 'DONE!'
print 'response.status_code: %s' %(response.status_code)

Every time I try, I get this error:

C:\Python27\python.exe "E:/python crawler/test_ConnectionFailed.py"
NOT OK!
Error: ('Connection aborted.', BadStatusLine("''",))
DONE!
Traceback (most recent call last):
  File "E:/python crawler/test_ConnectionFailed.py", line 29, in <module>
    print 'response.status_code: %s' %(response.status_code)
NameError: name 'response' is not defined

Process finished with exit code 1

Why does this happen, and how can I fix it?
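
Separately from the connection failure, the NameError at the end of the output above comes from the final print statement: it runs even when requests.get raised, so response was never assigned. Below is a minimal sketch of one way to structure the check so that cannot happen. It assumes Python 3 and the standard library's urllib.request rather than the Python 2.7 and requests setup used above; check_url and its opener parameter are hypothetical names introduced only for illustration.

```python
# Python 3 sketch (assumption: the question itself uses Python 2.7 + requests).
import urllib.request


def check_url(url, opener=urllib.request.urlopen):
    """Try to open url and return (ok, detail).

    `opener` is a hypothetical injection point so the function can be
    exercised without network access.
    """
    try:
        response = opener(url, timeout=10)
    except Exception as exc:  # broad on purpose, mirroring the question
        return False, 'Error: %s' % (exc,)
    else:
        # Only reached when the request succeeded, so `response` exists.
        return True, 'status: %s' % response.status
```

The else branch is the point of the sketch: anything that needs response lives there, so a failed request can never reach code that assumes the name is bound.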

Solved! I simply switched to a different proxy program, and then it worked.

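【Question comments】:

Possible duplicate of Python Requests getting ('Connection aborted.', BadStatusLine("''",)) error
@MarcoFerrari Nice edit, but where did those comments in the code come from?
@M4rtini, thanks for the edit, but the answer to that question did not solve my problem.

Tags: python python-2.7 web-crawler python-requests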

【Solution 1】:

The connection to these domains cannot be resolved. Running an ordinary ping against the url gives the result below.

Command run:

ping http://download.ithome.com.tw

Result:

The host could not be resolved

There is no response, and therefore no status line, which is what would normally carry the status code.
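
One thing worth checking with a test like this: ping expects a bare host name rather than a URL, so a string that still contains http:// may fail to resolve regardless of whether the site is up. The following is a small sketch, assuming Python 3, of extracting the host name from a URL and asking the local resolver about it. hostname_of and resolves are hypothetical helper names, not part of any library.

```python
# Python 3 sketch; helper names are made up for illustration.
import socket
from urllib.parse import urlparse


def hostname_of(url):
    """Return the bare host name that ping and DNS lookups expect."""
    return urlparse(url).hostname


def resolves(host):
    """Return True if the local resolver finds an address for host."""
    try:
        socket.getaddrinfo(host, 80)
        return True
    except socket.gaierror:
        return False
```

For example, hostname_of('http://download.ithome.com.tw') yields 'download.ithome.com.tw', which is the form to pass to ping or to resolves. What resolves returns depends on the network the script runs on.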

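【Comments】:

Thanks for your answer! But I ran the ping test against "ithome.com.tw" (which my Python crawler can reach) and got the same error.
The url you listed under NOT OK was weekly.ithome.com.tw, but now you mention ithome.com.tw; the latter opens but the former does not.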

【Solution 2】:

I have found that the urllib2 library works better than requests.

import urllib2


def get_page(url):
    # Build the request, open it, and return the raw response body.
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    data = response.read()
    return data


url = "http://blog.ithome.com.tw/"
print get_page(url)
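
Note that urllib2 exists only in Python 2. For readers on Python 3, a rough equivalent of the function above might look like the following sketch, which assumes the standard urllib.request module. The timeout value is an arbitrary choice and not something taken from the answer.

```python
# Python 3 sketch of the same helper (assumption: urllib.request replaces urllib2).
import urllib.request


def get_page(url, timeout=10):
    """Fetch url and return the raw response body as bytes."""
    request = urllib.request.Request(url)
    # The context manager closes the connection once the body is read.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Unlike the Python 2 version, this returns bytes, so the caller has to decode the result before treating it as text.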

Have a nice day.

【Comments】:

Thanks for your answer! But I tested it with "ithelp.ithome.com.tw" and got a similar error: httplib.BadStatusLine: ''
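
Since requests and urllib2 fail with the same BadStatusLine, the library is probably not the cause, and the asker's own resolution was to switch proxy software. As a hedged sketch of how a script can control which proxy it goes through, assuming Python 3's urllib.request: build_opener_with_proxy is a hypothetical helper name, and any proxy address passed to it is the caller's own, not something from this thread.

```python
# Python 3 sketch; the helper name is made up for illustration.
import urllib.request


def build_opener_with_proxy(proxy_url=None):
    """Build an opener that uses proxy_url, or no proxy at all if None."""
    if proxy_url is None:
        # Empty mapping: no proxy, even if the environment defines one.
        proxies = {}
    else:
        proxies = {'http': proxy_url, 'https': proxy_url}
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)
```

A caller would then use opener.open(url, timeout=10) in place of urllib.request.urlopen. Trying the failing sites once with no proxy and once with a known-good one is a quick way to tell whether the proxy is what drops the connection.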