【Posted】: 2021-09-09 10:19:55
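【Question】:
I really can't figure out what is causing this error...
I'm trying to fetch some information from a site, connecting to it through proxies. Every request made from code fails, while the equivalent curl requests succeed.
Here is the actual request code: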
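import requests

proxies = {"https": next_ip, "http": next_ip}
logger.debug(proxies)
try:
    result: requests.Response = requests.get(
        url, *args, proxies=proxies, timeout=3, **kwargs
    )
except requests.RequestException as exc:
    # Handler not shown in the original post; reconstructed from the log format.
    logger.debug(f"failed for {next_ip}. due to {exc}")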
args and kwargs are empty...
The corresponding log output:
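For reference, in the `proxies` mapping that requests accepts, the keys are matched against the scheme of the target URL, while the scheme inside each value tells requests what kind of proxy it is talking to. Below is a minimal sketch of building that mapping with an explicit proxy scheme. `build_proxies` and its `default_scheme` parameter are hypothetical names for illustration, not part of requests or of the code in the question.

```python
from urllib.parse import urlsplit


def build_proxies(address: str, default_scheme: str = "http") -> dict:
    """Return a requests-style proxies mapping for one proxy address.

    Hypothetical helper: `default_scheme` is added only when the
    address carries no scheme of its own.
    """
    if "://" not in address:
        address = f"{default_scheme}://{address}"
    parts = urlsplit(address)
    if not parts.hostname or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {address!r}")
    # Use the same proxy for plain-HTTP and HTTPS targets, as in the question.
    return {"http": address, "https": address}
```

With this, `build_proxies("176.235.131.232:9090")` yields the same shape of mapping that appears in the log output, with the proxy scheme spelled out rather than implied.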
r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'https://213.14.32.73:9090', 'http': 'https://213.14.32.73:9090'}
r4h.infrastructure.adapter.free_proxy_scrapper:request:79 - failed for https://213.14.32.73:9090. due to HTTPSConnectionPool(host='www.somehost.com', port=443): Max retries exceeded with url: /inner/site/url?date=3day (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f09572275e0>: Failed to establish a new connection: [Errno 111] Connection refused')))
r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'socks4://88.248.14.51:5678', 'http': 'socks4://88.248.14.51:5678'}
r4h.infrastructure.adapter.free_proxy_scrapper:request:79 - failed for socks4://88.248.14.51:5678. due to SOCKSHTTPSConnectionPool(host='www.somehost.com', port=443): Read timed out. (read timeout=3)
r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'http://176.235.131.232:9090', 'http': 'http://176.235.131.232:9090'}
r4h.infrastructure.adapter.free_proxy_scrapper:request:79 - failed for http://176.235.131.232:9090. due to HTTPSConnectionPool(host='www.somehost.com', port=443): Max retries exceeded with url: /inner/site/url?date=3day (Caused by ProxyError('Cannot connect to proxy.', timeout('timed out')))
r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'socks5://188.132.241.162:56109', 'http': 'socks5://188.132.241.162:56109'}
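The three failures above are not the same kind: one proxy refused the connection, one timed out while connecting, and one was reached but the read timed out. A rough sketch for tallying such lines by proxy scheme and failure type, using only the standard library, might look like this. The bucket names and the `summarize` helper are illustrative assumptions, not something the logging code in the question provides.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the "failed for <proxy>. due to <reason>" lines shown above.
FAILED = re.compile(r"failed for (?P<proxy>\S+?)\. due to (?P<reason>.*)")


def failure_kind(reason: str) -> str:
    """Rough bucket for a failure message; the buckets are illustrative only."""
    if "Connection refused" in reason:
        return "proxy refused connection"
    if "Cannot connect to proxy" in reason:
        return "proxy unreachable"
    if "Read timed out" in reason:
        return "read timeout"
    return "other"


def summarize(lines):
    """Count failures per (proxy scheme, failure kind)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            scheme = urlsplit(match["proxy"]).scheme
            counts[(scheme, failure_kind(match["reason"]))] += 1
    return counts
```

Run over a longer log, the counts would show whether failures cluster on one proxy scheme or are spread across all of them.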
I took random proxies that had produced errors and tried them with curl, and it worked every time :-/
My package versions:
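One way to narrow down a difference between curl and requests is to try the same proxy address under each proxy scheme, since curl treats a proxy given without a protocol prefix as an HTTP proxy. This is a rough sketch under stated assumptions: `scheme_variants` and `probe` are hypothetical helpers, and because the curl command used here is not shown, it does not claim to reproduce it.

```python
from urllib.parse import urlsplit

SCHEMES = ("http", "https", "socks4", "socks5")


def scheme_variants(proxy: str) -> list:
    """Return the same host:port under each proxy scheme (hypothetical helper)."""
    netloc = urlsplit(proxy if "://" in proxy else f"//{proxy}").netloc
    return [f"{scheme}://{netloc}" for scheme in SCHEMES]


def probe(url: str, proxy: str, timeout: float = 3.0) -> dict:
    """Request `url` once per scheme variant and record the status or the error."""
    # Imported lazily so scheme_variants works without requests installed.
    # The socks4/socks5 variants additionally need PySocks.
    import requests

    results = {}
    for candidate in scheme_variants(proxy):
        try:
            response = requests.get(
                url, proxies={"http": candidate, "https": candidate}, timeout=timeout
            )
            results[candidate] = response.status_code
        except requests.RequestException as exc:
            results[candidate] = type(exc).__name__
    return results
```

Calling `probe(url, "213.14.32.73:9090")` returns a mapping from each variant to either a status code or the name of the exception class. If only one variant succeeds, that suggests the scheme stored with the proxy address may not match what the proxy actually speaks.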
$ pipenv run pip show requests pysocks
Name: requests
Version: 2.26.0
Summary: Python HTTP for Humans.
Home-page: https://requests.readthedocs.io
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Location: somepath.venv/lib/python3.9/site-packages
Requires: charset-normalizer, urllib3, certifi, idna
Required-by: infi.clickhouse-orm, clickhouse-sqlalchemy
---
Name: PySocks
Version: 1.7.1
Summary: A Python SOCKS client module. See https://github.com/Anorov/PySocks for more information.
Home-page: https://github.com/Anorov/PySocks
Author: Anorov
Author-email: anorov.vorona@gmail.com
License: BSD
Location: somepath.venv/lib/python3.9/site-packages
Requires:
Required-by:
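Since requests delegates connection and proxy handling to urllib3, its version may be worth listing as well. A small sketch using the standard library's `importlib.metadata` (Python 3.8+) follows. `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("requests", "urllib3", "PySocks"):
        print(name, installed_version(name))
```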
【Comments】:
-
Is it possible that the site is simply blocking proxies?
-
Then curl wouldn't work either.
Tags: python proxy python-requests