【Posted】: 2019-05-26 15:55:58
【Question】:
I am trying to scrape a page, without success:
>> scrapy shell "XXXXXX"
...
2018-12-28 17:23:32 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET XXXXXXXX> (failed 1 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2018-12-28 17:23:32 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET XXXXXXXXXXXXX> (failed 2 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2018-12-28 17:23:33 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET XXXXXXXXXXXXXXXXX> (failed 3 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
Traceback (most recent call last):
  File "/home/joaquin/Repos/extruct/env/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/scrapy/cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/scrapy/cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/scrapy/cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/scrapy/commands/shell.py", line 73, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/scrapy/shell.py", line 48, in start
    self.fetch(url, spider, redirect=redirect)
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/scrapy/shell.py", line 115, in fetch
    reactor, self._schedule, request, spider)
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "/home/joaquin/Repos/extruct/env/lib/python3.7/site-packages/twisted/python/failure.py", line 467, in raiseException
    raise self.value.with_traceback(self.tb)
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
When I try an SSL connection directly:
>> openssl s_client XXXXX.XXXX.XXXX:443
CONNECTED(00000003)
140087350686208:error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure:ssl/record/rec_layer_s3.c:1528:SSL alert number 40
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 7 bytes and written 323 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
The same thing happens when I try this page with curl:
curl -i XXXX.XXXX.XXXX
curl: (35) error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure
I tried specifying -servername in openssl, but that does not solve the problem. Specifying -tls1_2 does not work either. TLS information:
Update
>> openssl version
OpenSSL 1.1.1a 20 Nov 2018
【Comments】:
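-
As written, RSA/3DES_EDE_CBC/HMAC-SHA1 is obsolete. Your browser may be lenient enough to accept it, but your openssl version apparently is not. At a minimum you need to state your openssl version; running openssl ciphers will show whether 3DES-EDE-CBC is available. Or try an online scanner such as ssllabs.com/ssltest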
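The check suggested in this comment can also be sketched from Python. This is a minimal sketch with several assumptions: the helper names `local_cipher_names` and `triple_des_ciphers` are hypothetical, the substring match on "3DES" and "DES-CBC3" is only a heuristic for spotting 3DES suites, and the result describes the OpenSSL library that the Python interpreter is linked against, which may not be the same build used by the openssl command or by Scrapy.

```python
import ssl


def local_cipher_names(cipher_string="ALL"):
    """List the cipher suite names the linked OpenSSL offers for cipher_string."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.set_ciphers(cipher_string)  # raises ssl.SSLError if nothing matches
    return [c["name"] for c in ctx.get_ciphers()]


def triple_des_ciphers(names):
    """Keep names that look like 3DES suites (heuristic substring match)."""
    return [n for n in names if "3DES" in n or "DES-CBC3" in n]


if __name__ == "__main__":
    print(ssl.OPENSSL_VERSION)
    found = triple_des_ciphers(local_cipher_names())
    print(found if found else "no 3DES suites enabled in this build")
```

If the list comes back empty, the local library was probably built without 3DES, and no client-side option will make it offer that suite.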
-
I updated my question; it is OpenSSL 1.1.1a
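-
Since 1.1.0, OpenSSL by default no longer includes or enables "weak" ciphersuites, defined as 3DES and RC4. If your build was done with 'enable-weak-ssl-ciphers', try s_client ... -cipher DES-CBC3-SHA. @PatrickMevzek+ When they are supported, OpenSSL names most 3DES suites with the 3 "swapped" to the right-hand side: DES-CBC3-SHA, ECDHE-RSA-DES-CBC3-SHA, etc.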
-
@dave_thompson_085 FWIW, my openssl ciphers does still show some names with the 3 as a prefix, such as PSK-3DES-EDE-CBC-SHA
-
@PatrickMevzek: that is why I said "most"
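On the Scrapy side, the equivalent of passing -cipher to s_client might look like the settings fragment below. This is a sketch under stated assumptions, not a confirmed fix: it assumes a Scrapy release that supports the DOWNLOADER_CLIENT_TLS_CIPHERS setting (the version in the traceback above is not known), that the server really does require a 3DES suite, and that the OpenSSL build Scrapy uses was compiled with weak ciphers enabled. If that build lacks 3DES, the extra entry has no effect.

```python
# settings.py of the Scrapy project (sketch; see the assumptions above)

# OpenSSL cipher-list string: keep the default suites and additionally
# offer the RSA/3DES suite named in the comments.
DOWNLOADER_CLIENT_TLS_CIPHERS = "DEFAULT:DES-CBC3-SHA"
```

Offering 3DES weakens the connection, so it seems reasonable to limit a setting like this to the one spider that needs it.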
Tags: ssl curl web-scraping openssl scrapy