【Question title】: Redirect with no auth
【Posted】: 2018-01-12 01:01:31
【Question】:

According to the docs, this should be simple:



request.add_unredirected_header(key, header)
Add a header that will not be added to a redirected request.

But I cannot seem to find any example of how to actually do this.
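For reference, here is a minimal sketch of how `add_unredirected_header` is used with the standard library's `urllib.request`. Note that this method belongs to `urllib.request.Request`, not to urllib3, so it only illustrates the documented behaviour. The URL and the credentials are placeholders, not values from this question.

```python
import base64
import urllib.request


def build_request(url, username, password):
    """Build a Request whose Authorization header is not copied to redirects."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url)
    # Stored in req.unredirected_hdrs, which the redirect handler does not copy
    # into the follow-up request it creates.
    req.add_unredirected_header("Authorization", "Basic " + token)
    return req


# Placeholder URL and credentials, for illustration only.
req = build_request("https://example.com/downloads/keys.gz", "user", "secret")
```

Passing `req` to `urllib.request.urlopen` would send the header to the first host only.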

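I am using pyupdater to download updates from bitbucket and launch the latest version of the exe. With this library I built a script that connects to bitbucket just fine, but the request is then redirected to Amazon with authorization: Basic <redacted>\r\n\r\n still attached (that is the bitbucket auth), which means I get 'HTTP/1.1 400 Bad Request\r\n'. Amazon does not support basic authentication. This should be easy to fix, but I cannot find much on the problem.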

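The solution proposed here requires manually recreating the request for each redirect. If I have to do that for every new file I upload, it becomes an ever-growing list and gets tedious quickly. It also does not continue with the rest of the script; it just downloads into the same directory.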
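The manual approach does not have to be written once per file. The sketch below follows redirects by hand for any URL and drops the Authorization header as soon as the redirect leaves the original host. The function name and the `max_redirects` parameter are hypothetical; `pool` is assumed to behave like a `urllib3.PoolManager`, whose `request(..., redirect=False)` and `get_redirect_location()` calls are used here.

```python
from urllib.parse import urljoin, urlparse


def get_without_forwarding_auth(pool, url, auth_headers, max_redirects=5):
    """GET url, following redirects by hand and dropping auth across hosts."""
    headers = dict(auth_headers)
    for _ in range(max_redirects + 1):
        # redirect=False makes the pool return the 3xx response instead of following it.
        response = pool.request("GET", url, headers=headers, redirect=False)
        location = response.get_redirect_location()
        if not location:
            return response  # final response, whether success or error
        next_url = urljoin(url, location)  # Location may be relative
        if urlparse(next_url).hostname != urlparse(url).hostname:
            # Different host: do not leak the credentials to it.
            headers = {k: v for k, v in headers.items() if k.lower() != "authorization"}
        url = next_url
    raise RuntimeError("too many redirects")
```

With a real pool this would be called as `get_without_forwarding_auth(urllib3.PoolManager(), file_url, {'authorization': 'Basic ...'})`. It does not plug into pyupdater's downloader by itself.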

This is how PyUpdater handles the download, so the issue can probably be resolved here.

Line 366 of downloader.py:

data = self.http_pool.urlopen('GET', file_url,
                                              preload_content=False,
                                              retries=max_download_retries)

Any ideas on how to fix this so that it no longer produces this error?

Full error output (Ctrl+F -> 400):

Python main.py
DEBUG:root:Version - 2.5.1
DEBUG:pyupdater.client:PyUpdater Version 2.5.1
Current version is  1.3
{'authorization': 'Basic <redacted>'}
DEBUG:pyupdater.client:Setting up directories...
DEBUG:pyupdater.client:Downloading key file
DEBUG:pyupdater.client.downloader:Url for request: https://api.bitbucket.org/2.0/repositories/ brofewfefwefewef/eee/downloads/keys.gz
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.bitbucket.org
send: b'GET /2.0/repositories/ brofewfefwefewef/eee/downloads/keys.gz HTTP/1.1\r\nHost: api.bitbucket.org\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
DEBUG:urllib3.connectionpool:https://api.bitbucket.org:443 "GET /2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz HTTP/1.1" 302 0
DEBUG:urllib3.util.retry:Incremented Retry for (url='https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz'): Retry(total=2, connect=None, read=None, redirect=None, status=None)
INFO:urllib3.poolmanager:Redirecting https://api.bitbucket.org/2.0/repositories/ brofewfefwefewef/eee/downloads/keys.gz -> https://bbuseruploads.s3.amazonaws.com/a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/3fc0be6d-ca69-42d3-9711-fbb5cfd2bc38/keys.gz?Signature=<redacted>&Expires=1515976464&AWSAccessKeyId=<redacted>&versionId=n.ymY11KRkq36Xozy25aChvfUT.YzTf5&response-content-disposition=attachment%3B%20filename%3D%22keys.gz%22
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): bbuseruploads.s3.amazonaws.com
header: Server header: Vary header: Content-Type header: X-OAuth-Scopes header: Strict-Transport-Security header: Date header: Location header: X-Served-By header: ETag header: X-Static-Version header: X-Content-Type-Options header: X-Accepted-OAuth-Scopes header: X-Credential-Type header: X-Render-Time header: Connection header: X-Request-Count header: X-Frame-Options header: X-Version header: Content-Length send: b'GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/3fc0be6d-ca69-42d3-9711-fbb5cfd2bc38/keys.gz?Signature=<redacted>&Expires=1515976464&AWSAccessKeyId=<redacted>&versionId=n.ymY11KRkq36Xozy25aChvfUT.YzTf5&response-content-disposition=attachment%3B%20filename%3D%22keys.gz%22 HTTP/1.1\r\nHost: bbuseruploads.s3.amazonaws.com\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
reply: 'HTTP/1.1 400 Bad Request\r\n'
DEBUG:urllib3.connectionpool:https://bbuseruploads.s3.amazonaws.com:443 "GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/3fc0be6d-ca69-42d3-9711-fbb5cfd2bc38/keys.gz?Signature=<redacted>&Expires=1515976464&AWSAccessKeyId=<redacted>&versionId=n.ymY11KRkq36Xozy25aChvfUT.YzTf5&response-content-disposition=attachment%3B%20filename%3D%22keys.gz%22 HTTP/1.1" 400 None
DEBUG:pyupdater.client.downloader:Resource URL: https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz
DEBUG:pyupdater.client.downloader:Got content length of: None
DEBUG:pyupdater.client.downloader:Content-Length not in headers
DEBUG:pyupdater.client.downloader:Callbacks will not show time left or percent downloaded.
DEBUG:pyupdater.client.downloader:Using file as storage since the file is too large
DEBUG:pyupdater.client.downloader:Block size: 1036
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'downloading', 'percent_complete': '-.-%', 'time': '--:--'}
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'finished', 'percent_complete': '-.-%', 'time': '00:00'}
DEBUG:pyupdater.client.downloader:Download Complete
DEBUG:pyupdater.client.downloader:No hash to verify
WARNING:pyupdater.client.downloader:Downloaded file is very large, reading it in to memory may crash the app
DEBUG:pyupdater.client:Failed to decompress gzip file
DEBUG:pyupdater.client:Version file download failed
header: x-amz-request-id header: x-amz-id-2 header: Content-Type header: Transfer-Encoding header: Date header: Connection header: Server {'authorization': 'Basic <redacted>'}
DEBUG:pyupdater.client:Not a gzipped file (b'<?')
Traceback (most recent call last):
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\pyupdater\client\__init__.py", line 440, in _get_key_data
    decompressed_data = _gzip_decompress(data)
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\dsdev_utils\helpers.py", line 58, in gzip_decompress
    data = decompressed_file.read()
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 276, in read
    return self._buffer.read(size)
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 411, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'<?')
DEBUG:pyupdater.client:Loading version file...
DEBUG:pyupdater.client:Downloading online version file
DEBUG:pyupdater.client.downloader:Url for request: https://api.bitbucket.org/2.0/repositories/ brofewfefwefewef/eee/downloads/versions.gz
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.bitbucket.org
send: b'GET /2.0/repositories/ brofewfefwefewef/eee/downloads/versions.gz HTTP/1.1\r\nHost: api.bitbucket.org\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
DEBUG:urllib3.connectionpool:https://api.bitbucket.org:443 "GET /2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz HTTP/1.1" 302 0
DEBUG:urllib3.util.retry:Incremented Retry for (url='https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz'): Retry(total=2, connect=None, read=None, redirect=None, status=None)
INFO:urllib3.poolmanager:Redirecting https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz -> https://bbuseruploads.s3.amazonaws.com/a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/0b04c4a8-dd59-49d2-9cd7-95d22379a5e6/versions.gz?Signature=<redacted>&Expires=1515976465&AWSAccessKeyId=<redacted>&versionId=jLhOcIbVAU4xRghD3kB2NfB4iLqUr7PM&response-content-disposition=attachment%3B%20filename%3D%22versions.gz%22
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): bbuseruploads.s3.amazonaws.com
header: Server header: Vary header: Content-Type header: X-OAuth-Scopes header: Strict-Transport-Security header: Date header: Location header: X-Served-By header: ETag header: X-Static-Version header: X-Content-Type-Options header: X-Accepted-OAuth-Scopes header: X-Credential-Type header: X-Render-Time header: Connection header: X-Request-Count header: X-Frame-Options header: X-Version header: Content-Length send: b'GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/0b04c4a8-dd59-49d2-9cd7-95d22379a5e6/versions.gz?Signature=<redacted>&Expires=1515976465&AWSAccessKeyId=<redacted>&versionId=jLhOcIbVAU4xRghD3kB2NfB4iLqUr7PM&response-content-disposition=attachment%3B%20filename%3D%22versions.gz%22 HTTP/1.1\r\nHost: bbuseruploads.s3.amazonaws.com\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
DEBUG:urllib3.connectionpool:https://bbuseruploads.s3.amazonaws.com:443 "GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/0b04c4a8-dd59-49d2-9cd7-95d22379a5e6/versions.gz?Signature=<redacted>&Expires=1515976465&AWSAccessKeyId=<redacted>&versionId=jLhOcIbVAU4xRghD3kB2NfB4iLqUr7PM&response-content-disposition=attachment%3B%20filename%3D%22versions.gz%22 HTTP/1.1" 400 None
reply: 'HTTP/1.1 400 Bad Request\r\n'
DEBUG:pyupdater.client.downloader:Resource URL: https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz
DEBUG:pyupdater.client.downloader:Got content length of: None
DEBUG:pyupdater.client.downloader:Content-Length not in headers
DEBUG:pyupdater.client.downloader:Callbacks will not show time left or percent downloaded.
DEBUG:pyupdater.client.downloader:Using file as storage since the file is too large
DEBUG:pyupdater.client.downloader:Block size: 1036
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'downloading', 'percent_complete': '-.-%', 'time': '--:--'}
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'finished', 'percent_complete': '-.-%', 'time': '00:00'}
DEBUG:pyupdater.client.downloader:Download Complete
DEBUG:pyupdater.client.downloader:No hash to verify
WARNING:pyupdater.client.downloader:Downloaded file is very large, reading it in to memory may crash the app
DEBUG:pyupdater.client:Failed to decompress gzip file
DEBUG:pyupdater.client:Version file download failed
DEBUG:pyupdater.client:Not a gzipped file (b'<?')
Traceback (most recent call last):
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\pyupdater\client\__init__.py", line 417, in _get_manifest_from_http
    decompressed_data = _gzip_decompress(data)
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\dsdev_utils\helpers.py", line 58, in gzip_decompress
    data = decompressed_file.read()
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 276, in read
    return self._buffer.read(size)
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 411, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'<?')
DEBUG:dsdev_utils.paths:Changing to Directory --> C:\Users\Django\AppData\Local\any\main
DEBUG:pyupdater.client:Found version file on file system
DEBUG:pyupdater.client:Loaded version file from file system
DEBUG:dsdev_utils.paths:Moving back to Directory --> C:\Users\Django\privacy 4
DEBUG:pyupdater.client:Data type: <class 'bytes'>
DEBUG:pyupdater.client:App key is None
DEBUG:pyupdater.client:Version Data:
{'latest': {'main': {'stable': {'win': '1.4.0.2.0'}}}, 'updates': {'main': {'1.3.0.2.0': {'win': {'file_hash': '807c743b8c29f0053f4f9d9e6a8895b0e037f77480e7065c1470c2aba1cb08a0', 'file_size': 12194381, 'filename': 'main-win-1.3.zip', 'patch_hash': '29fec1006c2736eb78cc859f89e165af942daae6d9ac994a1a686d9b7b418ef6', 'patch_name': 'main-win-5', 'patch_size': 147}}, '1.4.0.2.0': {'win': {'file_hash': 'd59a22a95229f0a9c64909c646bfba31daf6bf8689dc16c9c93180c1602e9d3c', 'file_size': 12195571, 'filename': 'main-win-1.4.zip', 'patch_hash': 'baf3eba3a4b3184919ed9e57c3e8be9494a50862b40b1590ecb64e39e71a4ce3', 'patch_name': 'main-win-6', 'patch_size': 479625}}}}, 'signature': '<redacted>'}
DEBUG:dsdev_utils.helpers:Version str: 1.3
DEBUG:pyupdater.client:Failed version file verification

For anyone who wants to reproduce the error, I have written up the exact steps I took.

【Comments】:

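  • curl has the same problem: even when I try the URL with curl, curl sends the header on to the next request
  • @TarunLalwani Any ideas on how to drop the basic auth header for the redirect? Everything else works fine. That auth is meant for bitbucket, not Amazon
  • Yes, I posted an answer for exactly that. Give it a try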

Tags: python amazon-s3 python-requests bitbucket urllib


【Answer 1】:

Edit-1:

Use the following code for your main.py; no changes to downloader.py are needed:

from __future__ import print_function
import urllib3.poolmanager

orig_urlopen = urllib3.poolmanager.PoolManager.urlopen


def new_urlopen(self, method, url, redirect=True, **kw):
    if "s3.amazonaws.com" in url and 'authorization' in self.headers:
        self.headers.pop('authorization')
    return orig_urlopen(self, method, url, redirect, **kw)


urllib3.poolmanager.PoolManager.urlopen = new_urlopen


import logging

from selenium import webdriver

logging.basicConfig(level=logging.DEBUG)
from client_config import ClientConfig
from pyupdater.client import Client, AppUpdate

import http.client as http_client

http_client.HTTPConnection.debuglevel = 1


def check_for_update():
    client = Client(ClientConfig(), refresh=True, headers={'basic_auth': '<username>:<password>'})
    app_update = client.update_check(ClientConfig.APP_NAME, ClientConfig.APP_VERSION, channel='stable')
    if app_update is not None:
        if app_update.download():
            if isinstance(app_update, AppUpdate):
                app_update.extract_restart()
                return True
            else:
                app_update.extract()
                return True
    return False


def main():
    print('Current version is ', ClientConfig.APP_VERSION)
    if check_for_update():
        print('there\'s a new update :D')
    # driver = webdriver.Firefox()
    # driver.get('http://stackoverflow.com')


if __name__ == "__main__":
    main()

Original answer:

You need a monkey patch for this. The patch below should do the job:

import urllib3.poolmanager

orig_urlopen = urllib3.poolmanager.PoolManager.urlopen


def new_urlopen(self, method, url, redirect=True, **kw):
    if "s3.amazonaws.com" in url and 'Authorization' in self.headers:
        self.headers.pop('Authorization')
    return orig_urlopen(self, method, url, redirect, **kw)


urllib3.poolmanager.PoolManager.urlopen = new_urlopen

A sample test with which the patch above worked for me:

import urllib3

pool = urllib3.PoolManager()

pool.headers.update({'Authorization': 'Basic XYZ=='})
r = pool.urlopen('GET', 'https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz')
print(r.data)

The patch code has to run before import pyupdater.
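On urllib3 1.23 or later a patch may not be needed at all. As far as I know, those releases added `Retry.remove_headers_on_redirect`, which strips the listed headers when a redirect goes to a different host and includes Authorization by default. A minimal sketch, with placeholder credentials and an arbitrary retry count:

```python
import urllib3
from urllib3.util.retry import Retry

# Strip Authorization when a redirect leaves the original host.
# This is already the default on recent urllib3; it is spelled out for clarity.
retries = Retry(total=3, remove_headers_on_redirect=["Authorization"])

# Placeholder credentials; make_headers builds the 'authorization' header.
pool = urllib3.PoolManager(
    headers=urllib3.util.make_headers(basic_auth="user:secret"),
    retries=retries,
)
```

Since pyupdater passes its own `retries` value on each call, upgrading urllib3 and relying on the default is probably the simpler route.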

【Comments】:

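  • I had to post a pastebin. With the monkey patch added, what would data = self.http_pool.urlopen('GET', file_url, preload_content=False, retries=max_download_retries) look like? Downloading a single file works, but I need it to update the exe
  • Can you set a breakpoint in the monkey-patch function and check whether it gets executed?
  • I can try adding a breakpoint, but I have never done that before
  • Can you post the complete code you used? Looking at your logs, the header problem is fixed, but you have another issue
  • Sure. I have posted main.py as well as downloader.py and the error: pastebin.com/SeLeUZNv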
【Answer 2】:

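I just looked into this, and I believe the problem is in pyupdater (I do not know what it is; I have never used it).

It seems to assume that the body of every response is GZIP-compressed. I could not find a flag that turns this assumption off. The content actually returned is not compressed at all.

Here is some relevant code from pyupdater: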

pyupdater/client/__init__.py:

def _get_manifest_from_http(self):
    log.debug('Downloading online version file')
    try:
        fd = _FD(self.version_file, self.update_urls, verify=self.verify,
                 urllb3_headers=self.urllib3_headers)
        data = fd.download_verify_return()
        try:
            import ipdb
            ipdb.set_trace()
            decompressed_data = _gzip_decompress(data)
        except IOError:
            log.debug('Failed to decompress gzip file')
            # Will be caught down below.
            # Just logging the error
            raise
        log.debug('Version file download successful')
        # Writing version file to application data directory
        self._write_manifest_2_filesystem(decompressed_data)
        return decompressed_data
    except Exception as err:
        log.debug('Version file download failed')
        log.debug(err, exc_info=True)
        return None

Here is a sample of the data I received:

ipdb> data
b'{"type": "error", "error": {"message": "keys.gz"}}'
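A defensive check for this case could look at the two-byte gzip magic number (0x1f 0x8b) before decompressing, so that an error body is reported as such instead of failing inside the gzip module. This is only a sketch; `decompress_if_gzip` is a hypothetical helper, not part of pyupdater.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decompress_if_gzip(data):
    """Decompress gzip data, or fail with a message that shows the actual body."""
    if data[:2] != GZIP_MAGIC:
        # Most likely an error document (JSON or XML) rather than the file.
        raise ValueError("expected gzip data, got: %r" % data[:60])
    return gzip.decompress(data)
```

With the body shown above, this raises a `ValueError` that includes the start of the JSON error message.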

I think you should open a ticket at https://github.com/JMSwag/PyUpdater and see whether they can help you further.

【Comments】:

【Answer 3】:

Requests is a really great library; do not waste time on anything else unless you have a very good reason:

    import requests
    import zlib
    
    def download(url, username, password):
        r = requests.get(url, auth=requests.auth.HTTPBasicAuth(username, password))
        r.raise_for_status()
        return zlib.decompress(r.content, 15 + 32)
    
    download('https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz', 'brofewfefwefewef', your_password)
    

Also worth noting: the credentials shown here should not be used any more. Basic auth can be decoded trivially.
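To illustrate how little protection the header offers: the value after `Basic ` is just base64 of `username:password`. A minimal sketch, using a made-up header value rather than anything from this thread:

```python
import base64


def decode_basic_auth(header_value):
    """Return (username, password) from a 'Basic <token>' header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Made-up example value: base64 of "user:secret".
print(decode_basic_auth("Basic dXNlcjpzZWNyZXQ="))
```

Anyone who sees the header in a log can therefore recover the password, which is why credentials that have appeared in debug output should be rotated.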

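【Comments】:

• Worth noting that pyupdater uses requests under the hood.
• It fails to create the resource URL, and it needs a bytes-like object, not 'NoneType'. The 400 error is gone, though :S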