【Question title】: Python: all requests via proxy fail
【Posted】: 2021-09-09 10:19:55
【Question】:

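I really have no idea what is causing the error...

I am trying to fetch some information from a site, connecting to it through a proxy. Every request made from code fails, while the equivalent curl request succeeds.

Here is the actual request code: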

            proxies = {"https": next_ip, "http": next_ip}
            logger.debug(proxies)
            try:
                result: requests.Response = requests.get(
                    url, *args, proxies=proxies, timeout=3, **kwargs
                )

args and kwargs are empty...

The corresponding log output:

r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'https://213.14.32.73:9090', 'http': 'https://213.14.32.73:9090'}
r4h.infrastructure.adapter.free_proxy_scrapper:request:79 - failed for https://213.14.32.73:9090. due to HTTPSConnectionPool(host='www.somehost.com', port=443): Max retries exceeded with url: /inner/site/url?date=3day (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f09572275e0>: Failed to establish a new connection: [Errno 111] Connection refused')))
r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'socks4://88.248.14.51:5678', 'http': 'socks4://88.248.14.51:5678'}
r4h.infrastructure.adapter.free_proxy_scrapper:request:79 - failed for socks4://88.248.14.51:5678. due to SOCKSHTTPSConnectionPool(host='www.somehost.com', port=443): Read timed out. (read timeout=3)
r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'http://176.235.131.232:9090', 'http': 'http://176.235.131.232:9090'}
r4h.infrastructure.adapter.free_proxy_scrapper:request:79 - failed for http://176.235.131.232:9090. due to HTTPSConnectionPool(host='www.somehost.com', port=443): Max retries exceeded with url: /inner/site/url?date=3day (Caused by ProxyError('Cannot connect to proxy.', timeout('timed out')))
r4h.infrastructure.adapter.free_proxy_scrapper:request:67 - {'https': 'socks5://188.132.241.162:56109', 'http': 'socks5://188.132.241.162:56109'}

I took random proxies from the ones that failed and tried them with curl, and it worked every time :-/

Versions of my packages:

$ pipenv run pip show requests pysocks

Name: requests
Version: 2.26.0
Summary: Python HTTP for Humans.
Home-page: https://requests.readthedocs.io
Author: Kenneth Reitz
Author-email: me@kennethreitz.org
License: Apache 2.0
Location: somepath.venv/lib/python3.9/site-packages
Requires: charset-normalizer, urllib3, certifi, idna
Required-by: infi.clickhouse-orm, clickhouse-sqlalchemy
---
Name: PySocks
Version: 1.7.1
Summary: A Python SOCKS client module. See https://github.com/Anorov/PySocks for more information.
Home-page: https://github.com/Anorov/PySocks
Author: Anorov
Author-email: anorov.vorona@gmail.com
License: BSD
Location: somepath.venv/lib/python3.9/site-packages
Requires: 
Required-by: 

【Comments】:

  • Could the site simply be blocking the proxies?
  • Then curl would not work either.

Tags: python proxy python-requests


【Solution 1】:

As usual, the cause was simple: the target site refuses connections that arrive without a User-Agent header...
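
A minimal sketch of the fix, assuming the site accepts a browser-like User-Agent. Note that requests does send a User-Agent by default (`python-requests/<version>`), so the site presumably rejects that default value as well as a missing header; this is an inference, not something the answer states. `BROWSER_UA` and `build_session` are hypothetical names introduced for illustration, and the proxy address is one of those from the question's log.

```python
import requests

# Assumption: any browser-like string; the site's actual filtering rules are unknown.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/93.0 Safari/537.36"
)


def build_session(proxy_url: str, user_agent: str = BROWSER_UA) -> requests.Session:
    """Return a Session that routes traffic through proxy_url and sends user_agent."""
    session = requests.Session()
    # Same proxy for both schemes, as in the question's `proxies` dict.
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    # Override the default "python-requests/<version>" User-Agent.
    session.headers["User-Agent"] = user_agent
    return session


if __name__ == "__main__":
    # What requests sends when no User-Agent is set explicitly.
    print(requests.utils.default_headers()["User-Agent"])

    session = build_session("socks5://188.132.241.162:56109")
    # Prepare (but do not send) a request to inspect the outgoing headers.
    prepared = session.prepare_request(
        requests.Request("GET", "https://www.somehost.com/inner/site/url?date=3day")
    )
    print(prepared.headers["User-Agent"])

    # The real call would then be, for example:
    # result = session.get("https://www.somehost.com/inner/site/url?date=3day", timeout=3)
```

The same effect is available without a Session by passing `headers={"User-Agent": ...}` to `requests.get` next to `proxies=proxies`. Comparing the prepared headers with the output of `curl -v` is a quick way to spot which header differs.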

【Comments】:
