PIL: image from url, cannot identify image file

【Question title】: PIL: image from url, cannot identify image file
【Posted】: 2013-09-01 17:28:38
【Question description】:

I am trying to open an image from a URL:

http://www.lifeasastrawberry.com/wp-content/uploads/2013/04/IMG_1191-1024x682.jpg

However, it fails at the last step with IOError("cannot identify image file"). I don't know what is going on or how to fix it. The same code has worked with many other image URLs.

    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    opener.addheaders = [('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8')]
    opener.addheaders = [('Accept-Encoding', 'gzip,deflate,sdch')]

    response = opener.open(image_url,None,5)
    img_file = cStringIO.StringIO(response.read())  

    image = Image.open(img_file)

This URL fails as well:

http://www.canadianliving.com/img/photos/biz/Greek-Yogurt-Ceaser-Salad-Dressi1365783448.jpg

【Question discussion】:

It works for me if I don't add the headers to the opener.

Tags: python http encoding gzip python-imaging-library

【Solution 1】:

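The problem is that you are telling the URL retriever to request a gzip-encoded result from the server, so the image data you receive is gzip-encoded and PIL cannot recognize it as an image. You can fix this either by removing the Accept-Encoding header from the request or by decompressing the gzip-encoded result manually: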

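    from PIL import Image
    import urllib2
    import gzip
    import cStringIO

    # Python 2 code, as in the question; `url` is the image URL to fetch.
    url = 'http://www.lifeasastrawberry.com/wp-content/uploads/2013/04/IMG_1191-1024x682.jpg'

    opener = urllib2.build_opener()
    # Set all headers in one list; assigning addheaders repeatedly keeps only the last one.
    opener.addheaders = [
        ('User-agent', 'Mozilla/5.0'),
        ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
        ('Accept-Encoding', 'gzip,deflate,sdch'),
    ]

    # Assumes the server answered with gzip: decompress before handing the data to PIL.
    gzipped_file = cStringIO.StringIO(opener.open(url, None, 5).read())
    image = Image.open(gzip.GzipFile(fileobj=gzipped_file))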

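The catch with this approach is that if you accept several encodings in the HTTP request, you have to inspect the HTTP headers of the response to see which encoding you actually got, and then decode the body manually according to that encoding.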
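
The header check described above could look roughly like the following. This is only a sketch in Python 3 (the thread itself uses Python 2), and `decode_body` is a hypothetical helper name, not part of PIL or urllib. It assumes the caller has already read the raw body and the value of the response's Content-Encoding header, and it handles only gzip, deflate, and identity; anything else (such as sdch) is rejected.

```python
import gzip
import zlib


def decode_body(body: bytes, content_encoding: str = "") -> bytes:
    """Undo the Content-Encoding applied to an HTTP response body.

    Hypothetical helper: `content_encoding` is the value of the response's
    Content-Encoding header, or an empty string if the header is absent.
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # Standard form: a deflate stream inside a zlib wrapper.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the wrapper.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    if encoding == "identity":
        # No transformation was applied.
        return body
    raise ValueError("unsupported Content-Encoding: %r" % content_encoding)
```

The decoded bytes can then be wrapped in `io.BytesIO` and passed to `Image.open`. Raising on unknown encodings is a deliberate choice here: handing still-encoded bytes to PIL would only reproduce the original "cannot identify image file" error.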

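I think it would be easier to set the Accept-Encoding header to a value that accepts only one encoding (for example, 'identity;q=1, *;q=0' or something similar), or to go ahead and use the requests package for your HTTP.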
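
As a rough illustration of the single-encoding option, the sketch below builds a request that asks for an uncompressed body. It is written for Python 3's `urllib.request` rather than the `urllib2` used in the question, and `build_image_request` and `fetch_image_bytes` are hypothetical names chosen for this example. Whether a given server honors the header is an assumption that has to be checked per site.

```python
import urllib.request


def build_image_request(url: str) -> urllib.request.Request:
    """Build a request that asks the server not to compress the body."""
    headers = {
        "User-Agent": "Mozilla/5.0",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        # identity means "no transformation"; *;q=0 rejects every other coding.
        "Accept-Encoding": "identity;q=1, *;q=0",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_image_bytes(url: str, timeout: float = 5.0) -> bytes:
    """Download the raw response body for `url` (performs network I/O)."""
    request = build_image_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

The bytes returned by `fetch_image_bytes` would then go to `Image.open(io.BytesIO(data))`. Keeping the request construction separate from the download makes the headers easy to inspect without touching the network.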

【Discussion】:

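Or use the requests library, which automatically decompresses gzip-encoded responses.
@PauloAlmeida Yes, that's a good suggestion. Hopefully this answer clarifies the root cause of the OP's problem.
Sure, I think your answer is correct; I was just adding information :)
If I remove the Accept-Encoding header, don't I run the risk that some images insist on being sent that way, so that I won't get them?
@user984003 Yes, that's right. HTTP is a minefield in that regard. The best option is probably to take Paulo's suggestion and start using the requests package.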