Crawlera: 407 "Bad Auth" error message

【Question title】: Crawlera: 407 "Bad Auth" error message
【Posted】: 2018-03-26 11:28:09
【Question】:

I am using Crawlera's sample code for a GET request through the proxy.

import requests

url = "http://httpbin.org/ip"
proxy_host = "proxy.crawlera.com"
proxy_port = "8010"
proxy_auth = "<APIKEY>:" # Make sure to include ':' at the end
proxies = {
      "https": "https://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port),
      "http": "http://{}@{}:{}/".format(proxy_auth, proxy_host, proxy_port)
}

r = requests.get(url, proxies=proxies, verify=False)

I get a 407 Bad Proxy Auth error. I have triple-checked that the API key is correct.

Response headers:

{
   'Proxy-Connection': 'close',
   'Proxy-Authenticate': 'Basic realm="Crawlera"',
   'Transfer-Encoding': 'chunked',
   'Connection': 'close',
   'Date': 'Mon, 26 Mar 2018 11:18:05 GMT',
   'X-Crawlera-Error': 'bad_proxy_auth',
   'X-Crawlera-Version': '1.32.0-07c786'
}
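
The headers above are enough to tell this failure apart from other proxy errors. A minimal sketch, assuming the status code and the `X-Crawlera-Error` value shown in this response; `is_proxy_auth_failure` is a hypothetical helper name, not part of requests or Crawlera:

```python
def is_proxy_auth_failure(status_code, headers):
    """Return True when a response looks like the proxy-auth rejection above."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    return (
        status_code == 407
        and normalised.get("x-crawlera-error") == "bad_proxy_auth"
    )
```

With the request from the question, this could be called as `is_proxy_auth_failure(r.status_code, r.headers)`.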

requests has been updated:

$ pip freeze |grep requests
requests==2.8.1

【Comments】:

Tags: python python-requests scrapinghub

【Solution 1】:

I managed to get it working by adding a Proxy-Authorization header.

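import base64

proxy_auth = "<APIKEY>:"  # keep the trailing ':'
headers = {
   # other headers ...
   # On Python 3, b64encode() needs bytes and returns bytes, so encode
   # the credentials first and decode the result back to str.
   "Proxy-Authorization": "Basic " + base64.b64encode(proxy_auth.encode()).decode()
}

# The credentials now travel in the header, so the proxy URLs omit them.
proxies = {
      "https": "https://{}:{}/".format(proxy_host, proxy_port),
      "http": "http://{}:{}/".format(proxy_host, proxy_port)
}

r = requests.get(url, headers=headers, proxies=proxies, verify=False)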

【Comments】:

This doesn't work for me: calling base64.b64encode(proxy_auth) with a str raises TypeError: a bytes-like object is required, not 'str'
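
The TypeError in this comment is what Python 3 raises when `base64.b64encode` is given a `str`, since it accepts only bytes-like input. A minimal sketch of building the header value in a way that avoids it; `proxy_authorization` is a hypothetical helper name, and the "API key as user name, empty password" layout is taken from the question's `"<APIKEY>:"` string:

```python
import base64


def proxy_authorization(api_key):
    """Build a Basic Proxy-Authorization value from an API key."""
    # API key as the user name, empty password, hence the trailing ':'.
    credentials = "{}:".format(api_key)
    # b64encode takes bytes and returns bytes; convert on the way in and out.
    token = base64.b64encode(credentials.encode("ascii")).decode("ascii")
    return "Basic " + token
```

The result can be passed as `headers={"Proxy-Authorization": proxy_authorization("<APIKEY>")}`.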

【Solution 2】:

If you want to keep the Crawlera way, you can try upgrading your requests client:

pip install requests --upgrade

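I ran into the same problem and your solution worked, but after searching further I found this solution:

Upgrading the requests client to 2.19... worked for me, and I can keep using the Crawlera sample script.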
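
To see whether an installed version is older than the one reported to work, a rough sketch follows. The `(2, 19)` threshold comes only from this answer, not from Crawlera's documentation, and `version_tuple` and `needs_upgrade` are hypothetical helper names:

```python
def version_tuple(version):
    """Turn a string such as '2.8.1' into (2, 8, 1), ignoring suffixes like 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumption: taken from the answer above ("2.19..."), not an official minimum.
MINIMUM = (2, 19)


def needs_upgrade(installed):
    """Return True when the installed version is older than MINIMUM."""
    return version_tuple(installed) < MINIMUM
```

For the version in the question, `needs_upgrade("2.8.1")` is true; in a live session the argument could be `requests.__version__`.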
