【Question title】: Error extracting an archive using `tarfile`
【Posted】: 2019-09-11 14:46:27
【Question description】:

I get an error when trying to extract a .tar.gz archive using the tarfile library.

Here is the relevant code snippet:

# `gzip_archive_bytes_content` is the content of the gzip archive, in "bytes" format
repo_sources_file_object = io.BytesIO(gzip_archive_bytes_content)
repo_sources_tar_object = tarfile.TarFile(fileobj=repo_sources_file_object)
repo_sources_tar_object.extractall(path="/tmp/")

This is the error I get:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/tarfile.py", line 186, in nti
    s = nts(s, "ascii", "strict")
  File "/usr/local/lib/python3.7/tarfile.py", line 170, in nts
    return s.decode(encoding, errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9a in position 1: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/tarfile.py", line 2289, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/local/lib/python3.7/tarfile.py", line 1095, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/usr/local/lib/python3.7/tarfile.py", line 1037, in frombuf
    chksum = nti(buf[148:156])
  File "/usr/local/lib/python3.7/tarfile.py", line 189, in nti
    raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/site-packages/my-package/__main__.py", line 87, in <module>
    function(**function_args)
  File "/usr/local/lib/python3.7/site-packages/my-package/chart.py", line 107, in reinstall
    install()
  File "/usr/local/lib/python3.7/site-packages/my-package/chart.py", line 89, in install
    repo_sources_tar_object = tarfile.TarFile(fileobj=repo_sources_file_object)
  File "/usr/local/lib/python3.7/tarfile.py", line 1484, in __init__
    self.firstmember = self.next()
  File "/usr/local/lib/python3.7/tarfile.py", line 2301, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header

Python version: 3.7

【Question comments】:

    Tags: python python-3.x archive tarfile gunzip


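    【Solution 1】:

    I switched from instantiating the tarfile.TarFile object directly to using the tarfile.open() constructor, and that fixed it. tarfile.open() detects the gzip compression automatically, whereas TarFile(fileobj=...) reads the data as an uncompressed tar stream, which is why the header looked invalid: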

    repo_sources_tar_object = tarfile.open(fileobj=repo_sources_file_object)
    

    The documentation actually has a warning about this:

    Do not use this class directly: use tarfile.open() instead.
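The difference can be reproduced without any external file. The following is a minimal sketch, not the asker's actual code: the helper `make_tar_gz` and the member name `hello.txt` are hypothetical and only serve to build a small gzip-compressed archive in memory. It then shows that direct instantiation raises `tarfile.ReadError`, while `tarfile.open()` reads the same bytes.

```python
import io
import tarfile


def make_tar_gz(files):
    """Build a gzip-compressed tar archive in memory from {name: bytes}.

    Hypothetical helper, used only to produce test data for this sketch.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


gzip_archive_bytes_content = make_tar_gz({"hello.txt": b"hello\n"})

# Direct instantiation expects an uncompressed tar stream, so gzip data fails.
try:
    tarfile.TarFile(fileobj=io.BytesIO(gzip_archive_bytes_content))
    direct_result = "ok"
except tarfile.ReadError as exc:
    direct_result = f"ReadError: {exc}"

# tarfile.open() defaults to mode "r", which detects the compression itself.
with tarfile.open(fileobj=io.BytesIO(gzip_archive_bytes_content)) as archive:
    member_names = archive.getnames()

print(direct_result)
print(member_names)
```

The exact error message from the direct call can differ with the archive size and Python version, so the sketch only relies on the exception type.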

    【Comments】:

      【Solution 2】:

      Best practice is to use a context manager, so that the file is closed automatically once the work is done.

      You can write:

      import contextlib
      import io
      import tarfile
      
      gzip_archive_bytes_content = b"..."
      repo_sources_file_object = io.BytesIO(gzip_archive_bytes_content)
      
      with contextlib.closing(tarfile.open(fileobj=repo_sources_file_object)) as arch:
          arch.extractall(path="/tmp/")
      

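      Note that TarFile objects support the context manager protocol themselves (since Python 3.2), and tarfile.open() returns a TarFile object, so contextlib.closing() is not strictly required. Instantiating tarfile.TarFile directly would bring back the "invalid header" error for a gzip-compressed archive, so you can simply write:

      with tarfile.open(...) as arch:
          ...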
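As a hedged, self-contained sketch of the context-manager approach: `tarfile.open()` returns a TarFile object, which can itself be used in a `with` statement. The archive contents, the member name `repo/README.txt`, and the temporary target directory are assumptions made for illustration; the original code extracts to `/tmp/` instead.

```python
import io
import os
import tarfile
import tempfile

# Build a small .tar.gz in memory to stand in for the real archive bytes.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as writer:
    payload = b"example\n"
    info = tarfile.TarInfo(name="repo/README.txt")  # hypothetical member name
    info.size = len(payload)
    writer.addfile(info, io.BytesIO(payload))
gzip_archive_bytes_content = buffer.getvalue()

repo_sources_file_object = io.BytesIO(gzip_archive_bytes_content)

# The "filter" argument only exists on Python versions that provide
# tarfile.data_filter, so it is passed conditionally.
extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

with tempfile.TemporaryDirectory() as target_dir:
    # The archive is closed automatically when the inner block exits.
    with tarfile.open(fileobj=repo_sources_file_object) as arch:
        arch.extractall(path=target_dir, **extract_kwargs)
    with open(os.path.join(target_dir, "repo", "README.txt"), "rb") as extracted:
        extracted_content = extracted.read()

print(extracted_content)
```

Extracting into a temporary directory keeps the sketch from leaving files behind; replace `target_dir` with the real destination as needed.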
      

      【讨论】:
