Pandas python: what is the workbook encoding type? (Answers)

【Question title】: Pandas python, what is the workbook encoding type?
【Posted】: 2016-10-21 14:35:34
【Question description】:

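I am new to Python and also new to the pandas library. The documentation does not describe or explain this well. I want to save a DataFrame in Excel format, in memory, and I found the following explanation: [Pandas excel to the memory]

I need an explanation of workbook. The value of this variable looks encoded; how can I see its real value? How do I decode it? What should its value be?

Edit:

How do I pass it as the attachment content in the Mandrill API? https://mandrillapp.com/api/docs/messages.python.html

This is the attachments part for the Excel file: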

'attachments': [
    {
        'content': content,
        'name': 'fraud_report.xlsx',
        'type': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
    }
]

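I cannot open the Excel file; Microsoft Excel keeps showing an error saying the file format is not valid!... Any help would be appreciated. Thanks.

【Question discussion】:

Tags: python pandas mandrill

【Solution 1】:

To explain, I paste the example from your link here again: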

# Safe import for either Python 2.x or 3.x
try:
    from io import BytesIO
except ImportError:
    from cStringIO import StringIO as BytesIO

import pandas as pd

# df is assumed to be an existing DataFrame that you want to export
bio = BytesIO()

# Write into the in-memory buffer by setting the 'engine' in the ExcelWriter constructor.
writer = pd.ExcelWriter(bio, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')

# Save the workbook (writer.save() was removed in pandas 2.0; call writer.close() there)
writer.save()

# Seek to the beginning and read to copy the workbook to a variable in memory
bio.seek(0)
workbook = bio.read()

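The writer.save() method saves the data into the BytesIO object (bio) rather than into an Excel file on disk. In other words, the variable bio holds the bytes of the Excel file. (In pandas 2.0 and later, writer.close() does this job.)

bio.seek(0) sets the current position of bio (used for reading, writing, ...) to 0, so that the next call, bio.read(), reads the data in bio from the beginning.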
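
The position behavior described above can be seen with the standard library alone. This is a minimal sketch; the payload is a stand-in byte string, not real Excel data.

```python
from io import BytesIO

bio = BytesIO()
bio.write(b"stand-in workbook bytes")  # placeholder payload, not a real .xlsx

# After writing, the position sits at the end, so read() returns nothing.
position_after_write = bio.tell()
read_without_seek = bio.read()

# Rewind to the start; now read() returns everything that was written.
bio.seek(0)
workbook = bio.read()

# getvalue() returns the full contents regardless of the current position.
same_bytes = bio.getvalue()

print(position_after_write, read_without_seek, workbook == same_bytes)
```

If you only need the bytes, bio.getvalue() is an alternative to the seek-then-read pair.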

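The variable workbook holds the byte string of the Excel file (the Excel workbook). If you read an Excel file in binary mode, you get the same data. You can also write it out to an Excel file: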

with open("my_excel_file.xlsx", "wb") as f:
    f.write(workbook)
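
On the question of how to "see the real value": the bytes are not encoded text. An .xlsx file is a ZIP archive of XML parts, so one way to inspect it is with the zipfile module. The sketch below assumes a hypothetical helper name, list_workbook_parts, and builds a tiny stand-in archive so that it runs without pandas; with real data you would pass your own workbook bytes.

```python
import zipfile
from io import BytesIO


def list_workbook_parts(workbook: bytes) -> list:
    """Return the member names inside an .xlsx byte string (hypothetical helper)."""
    buffer = BytesIO(workbook)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("not a ZIP container, so not a valid .xlsx file")
    with zipfile.ZipFile(buffer) as archive:
        return archive.namelist()


# Stand-in for the real `workbook` bytes: a tiny ZIP with one xlsx-style member.
stand_in = BytesIO()
with zipfile.ZipFile(stand_in, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
workbook = stand_in.getvalue()

print(workbook[:2])                   # ZIP data starts with the bytes b'PK'
print(list_workbook_parts(workbook))
```

A check like this can also help narrow down a "file format is not valid" error: if the bytes you are about to send are not a ZIP container, the problem lies before the sending step.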

To read the data from bio into a DataFrame, you do not need bio.read():

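bio.seek(0)
# The original answer used engine="xlrd"; xlrd 2.0+ no longer reads .xlsx, so openpyxl is used here
df = pd.read_excel(bio, sheet_name="Sheet1", engine="openpyxl")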
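
To tie the write and read steps together, here is a sketch of a full in-memory round trip. It assumes pandas and openpyxl are installed. The helper name dataframe_to_xlsx_bytes and the sample data are made up for illustration. The answer's xlsxwriter engine can be used for the writing step in the same way.

```python
from io import BytesIO

import pandas as pd


def dataframe_to_xlsx_bytes(df: pd.DataFrame) -> bytes:
    """Serialize df to .xlsx bytes in memory (hypothetical helper)."""
    bio = BytesIO()
    # Leaving the with-block closes the writer, which writes the workbook into bio.
    with pd.ExcelWriter(bio, engine="openpyxl") as writer:
        df.to_excel(writer, sheet_name="Sheet1", index=False)
    return bio.getvalue()


df = pd.DataFrame({"amount": [10, 20], "label": ["a", "b"]})  # sample data
workbook = dataframe_to_xlsx_bytes(df)

# Reading back needs a file-like object, so wrap the bytes in a fresh BytesIO.
restored = pd.read_excel(BytesIO(workbook), sheet_name="Sheet1", engine="openpyxl")
print(restored)
```

Using the writer as a context manager avoids having to choose between save() and close() across pandas versions.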

Regarding the Mandrill question:

In Mandrill's example you will see:

{'attachments': [{'content': 'ZXhhbXBsZSBmaWxl',
                      'name': 'myfile.txt',
                      'type': 'text/plain'}],...

The documentation also says:

content: the content of the attachment as a base64-encoded string
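
As a quick illustration of what "base64-encoded" means here, the sample string from the Mandrill example above can be decoded with the standard library. The variable names are chosen for this sketch only.

```python
import base64

sample_content = "ZXhhbXBsZSBmaWxl"  # the string from the Mandrill example
decoded = base64.b64decode(sample_content)
print(decoded)  # b'example file'
```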

You should encode workbook as base64 and send that:

import base64
content = base64.b64encode(workbook)

P.S.: workbook and content are both of type bytes. You may need to convert content to str before sending.

{'attachments': [{'content': content.decode('utf-8'),
                          'name': 'myfile.xlsx',
                          'type': 'text/plain'}],...

Addendum: if the file is an Excel file, you should change type to application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
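
Putting the encoding step and the Excel MIME type together, a minimal sketch of building one attachment entry might look like this. The helper build_attachment and the placeholder bytes are assumptions for illustration and are not part of the Mandrill client. Only the dictionary keys come from the example above.

```python
import base64

XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_attachment(workbook: bytes, name: str) -> dict:
    """Build one entry for the 'attachments' list (hypothetical helper)."""
    return {
        # b64encode returns bytes; decode to str so the value is plain text
        "content": base64.b64encode(workbook).decode("ascii"),
        "name": name,
        "type": XLSX_MIME,
    }


workbook = b"PK\x03\x04 stand-in bytes"  # placeholder for the real workbook bytes
attachment = build_attachment(workbook, "fraud_report.xlsx")

# Decoding the content must give back exactly the original bytes.
round_trip = base64.b64decode(attachment["content"])
print(round_trip == workbook)
```

The round-trip check is a cheap way to confirm that the content field still represents the same file before it is sent.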

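【Discussion】:

I want to add that I am using mandrillapp to send messages, as in https://mandrillapp.com/api/docs/messages.python.html, so I do not want to use the write method. I just want to store it in a variable and then send it as the attachment's content. Do you have any other suggestion for me?
I don't know mandrillapp. But do you want to send the contents of the DataFrame or the Excel file?
Mandrill only accepts a variable and then builds the file itself. The app does not read files, so I cannot give it a file.
Do you want to use mandrillapp to send mail through mailchimp with an Excel file as the attachment?
Yes, exactly! As far as I know it is not possible to pass a file, so I want to read it and then have Mandrill create the Excel file. Also, in your explanation, is df = pd.read_excel(bio, "Sheet1", engine="xlrd") supposed to replace workbook = bio.read()? It is unclear how to read it and what the preceding lines should be.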