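[Posted]: 2011-08-10 14:12:36
[Question]:
In Python, I can use urllib2 (and urllib) to open external URLs such as Google. However, I am having trouble opening localhost URLs. I have a Python SimpleHTTPServer running on port 8280, and I can browse it successfully at http://localhost:8280/.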
python -m SimpleHTTPServer 8280
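It is also worth noting that I am running Ubuntu, which runs CNTLM to handle authentication to our corporate web proxy. As it turns out, wget does not work with localhost either, so I don't think this is a urllib problem!
Test script (test_urllib2.py):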
import urllib2
print "Opening Google..."
google = urllib2.urlopen("http://www.google.com/")
print google.read(100)
print "Google opened."
print "Opening localhost..."
localhost = urllib2.urlopen("http://localhost:8280/")
print localhost.read(100)
print "localhost opened."
Output:
$ ./test_urllib2.py
Opening Google...
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><
Google opened.
Opening localhost...
Traceback (most recent call last):
  File "./test_urllib2.py", line 10, in <module>
    localhost = urllib2.urlopen("http://localhost:8280/")
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 397, in open
    response = meth(req, response)
  File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.6/urllib2.py", line 429, in error
    result = self._call_chain(*args)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 605, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1161, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1134, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 986, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 355, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine
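Solution: The problem was indeed caused by my using CNTLM behind our corporate web proxy (I could not pin down exactly why this causes the problem). The solution is to use a ProxyHandler with an empty dictionary, which tells urllib2 not to use any proxy: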
proxy_support = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_support)
print opener.open("http://localhost:8280/").read(100)
Thanks to loki2302 for pointing me here.
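For readers on Python 3, where urllib2 was merged into urllib.request, the same workaround might look roughly like the sketch below. It is an illustration under assumptions and not part of the original post: the helper name open_without_proxy and the _HelloHandler class are hypothetical, and a throwaway local server on an OS-chosen port stands in for the SimpleHTTPServer on port 8280.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def open_without_proxy(url, timeout=5):
    """Fetch url while ignoring any proxy configured in the environment.

    Hypothetical helper: ProxyHandler({}) disables proxy autodetection,
    so http_proxy and similar variables are not consulted.
    """
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as response:
        return response.status, response.read(100)


class _HelloHandler(BaseHTTPRequestHandler):
    """Stand-in for the local SimpleHTTPServer (assumption for this demo)."""

    def do_GET(self):
        body = b"hello from localhost"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the OS pick a free port for the demo server.
server = HTTPServer(("127.0.0.1", 0), _HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/" % server.server_address[1]
status, body = open_without_proxy(url)
print(status, body)

server.shutdown()
server.server_close()
```

Building a dedicated opener keeps the proxy bypass local to these requests; other code that calls urlopen directly still goes through the proxy settings it finds in the environment.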
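[Comments]:
- Don't use a bare except: without naming an exception; please show us the exception raised by urllib2.urlopen.
- The BadStatusLine exception indicates a malformed response header from the server. Can you take a look at what is being returned?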
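One way to look at what is actually returned is to read the first response line over a raw socket. The sketch below is written for Python 3 and is only an illustration: read_status_line and _HelloHandler are hypothetical names, and the demo server on an OS-chosen port is a stand-in for the real one. Because it connects directly, it shows what the server itself sends; pointing host and port at the proxy instead would show what the proxy sends back.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_status_line(host, port, timeout=5):
    """Send a minimal GET over a raw socket and return the first response line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        request = "GET / HTTP/1.0\r\nHost: %s:%d\r\n\r\n" % (host, port)
        sock.sendall(request.encode("ascii"))
        with sock.makefile("rb") as stream:
            # An empty or non-HTTP first line is the kind of reply that
            # makes httplib raise BadStatusLine.
            return stream.readline().decode("iso-8859-1").rstrip("\r\n")


class _HelloHandler(BaseHTTPRequestHandler):
    """Stand-in for the local server (assumption for this demo)."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the OS pick a free port for the demo server.
server = HTTPServer(("127.0.0.1", 0), _HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status_line = read_status_line("127.0.0.1", server.server_address[1])
print(repr(status_line))

server.shutdown()
server.server_close()
```

If the direct connection returns a well-formed status line, the malformed reply is more likely coming from something between the client and the server, such as the proxy.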