【Question title】: Padding added in Base64 even when string length is multiple of three
【Posted】: 2015-10-15 12:12:54
【Question】:

I have the following string and its Base64-encoded version:

temp = "Last Star Wars 'not for children'\n\nThe sixth and final Star Wars movie may not be suitable for young children, film-maker George Lucas has said.\n\nHe told US TV show 60 Minutes that Revenge of the Sith would be the darkest and most violent of the series. \"I don't think I would take a five or six-year-old to this,\" he told the CBS programme, to be aired on Sunday. Lucas predicted the film would get a US rating advising parents some scenes may be unsuitable for under-13s. It opens in the UK and US on 19 May. He said he expected the film would be classified PG-13 - roughly equivalent to a British 12A rating.\n\nThe five previous Star Wars films have all carried less restrictive PG - parental guidance - ratings in the US. In the UK, they have all been passed U - suitable for all - with the exception of Attack of The Clones, which got a PG rating in 2002. Revenge of the Sith - the third prequel to the original 1977 Star Wars film - chronicles the transformation of the heroic Anakin Skywalker into the evil Darth Vader as he travels to a Hell-like planet composed of erupting volcanoes and molten lava. \"We're going to watch him make a pact with the devil,\" Lucas said. \"The film is much more dark, more emotional. It's much more of a tragedy.\"\n"

temp_enc = "TGFzdCBTdGFyIFdhcnMgJ25vdCBmb3IgY2hpbGRyZW4nXG5cblRoZSBzaXh0aCBhbmQgZmluYWwgU3RhciBXYXJzIG1vdmllIG1heSBub3QgYmUgc3VpdGFibGUgZm9yIHlvdW5nIGNoaWxkcmVuLCBmaWxtLW1ha2VyIEdlb3JnZSBMdWNhcyBoYXMgc2FpZC5cblxuSGUgdG9sZCBVUyBUViBzaG93IDYwIE1pbnV0ZXMgdGhhdCBSZXZlbmdlIG9mIHRoZSBTaXRoIHdvdWxkIGJlIHRoZSBkYXJrZXN0IGFuZCBtb3N0IHZpb2xlbnQgb2YgdGhlIHNlcmllcy4gXCJJIGRvbid0IHRoaW5rIEkgd291bGQgdGFrZSBhIGZpdmUgb3Igc2l4LXllYXItb2xkIHRvIHRoaXMsXCIgaGUgdG9sZCB0aGUgQ0JTIHByb2dyYW1tZSwgdG8gYmUgYWlyZWQgb24gU3VuZGF5LiBMdWNhcyBwcmVkaWN0ZWQgdGhlIGZpbG0gd291bGQgZ2V0IGEgVVMgcmF0aW5nIGFkdmlzaW5nIHBhcmVudHMgc29tZSBzY2VuZXMgbWF5IGJlIHVuc3VpdGFibGUgZm9yIHVuZGVyLTEzcy4gSXQgb3BlbnMgaW4gdGhlIFVLIGFuZCBVUyBvbiAxOSBNYXkuIEhlIHNhaWQgaGUgZXhwZWN0ZWQgdGhlIGZpbG0gd291bGQgYmUgY2xhc3NpZmllZCBQRy0xMyAtIHJvdWdobHkgZXF1aXZhbGVudCB0byBhIEJyaXRpc2ggMTJBIHJhdGluZy5cblxuVGhlIGZpdmUgcHJldmlvdXMgU3RhciBXYXJzIGZpbG1zIGhhdmUgYWxsIGNhcnJpZWQgbGVzcyByZXN0cmljdGl2ZSBQRyAtIHBhcmVudGFsIGd1aWRhbmNlIC0gcmF0aW5ncyBpbiB0aGUgVVMuIEluIHRoZSBVSywgdGhleSBoYXZlIGFsbCBiZWVuIHBhc3NlZCBVIC0gc3VpdGFibGUgZm9yIGFsbCAtIHdpdGggdGhlIGV4Y2VwdGlvbiBvZiBBdHRhY2sgb2YgVGhlIENsb25lcywgd2hpY2ggZ290IGEgUEcgcmF0aW5nIGluIDIwMDIuIFJldmVuZ2Ugb2YgdGhlIFNpdGggLSB0aGUgdGhpcmQgcHJlcXVlbCB0byB0aGUgb3JpZ2luYWwgMTk3NyBTdGFyIFdhcnMgZmlsbSAtIGNocm9uaWNsZXMgdGhlIHRyYW5zZm9ybWF0aW9uIG9mIHRoZSBoZXJvaWMgQW5ha2luIFNreXdhbGtlciBpbnRvIHRoZSBldmlsIERhcnRoIFZhZGVyIGFzIGhlIHRyYXZlbHMgdG8gYSBIZWxsLWxpa2UgcGxhbmV0IGNvbXBvc2VkIG9mIGVydXB0aW5nIHZvbGNhbm9lcyBhbmQgbW9sdGVuIGxhdmEuIFwiV2UncmUgZ29pbmcgdG8gd2F0Y2ggaGltIG1ha2UgYSBwYWN0IHdpdGggdGhlIGRldmlsLFwiIEx1Y2FzIHNhaWQuIFwiVGhlIGZpbG0gaXMgbXVjaCBtb3JlIGRhcmssIG1vcmUgZW1vdGlvbmFsLiBJdCdzIG11Y2ggbW9yZSBvZiBhIHRyYWdlZHkuXCJcbg=="

>>> len(temp)
1251
>>> len(temp_enc)
1688
>>> len(temp)/3
417
>>> (len(temp)/3)*4
1668

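The length of the string is divisible by 3. Since every 3 bytes of input produce 4 bytes of encoded output, why is the encoded string longer than expected? Why is padding added to the encoding?

【Comments】:

  • How did you do the encoding? I took your string temp and encoded it like this: import base64; temp_enc = base64.b64encode(temp). When I then run len(temp_enc) I get 1668, which your math shows is the correct value.
  • len(temp_enc)%4 must be zero ...

Tags: python python-2.7 base64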


【Answer 1】:

temp_enc is not the Base64 encoding of temp:

In [61]: import base64
In [62]: base64.b64encode(temp) == temp_enc
Out[62]: False

If you decode temp_enc, the decoded string has length 1264, not 1251:

In [57]: temp_dec = base64.b64decode(temp_enc)

In [58]: len(temp_dec)
Out[58]: 1264

In [59]: len(temp)
Out[59]: 1251

temp contains newline characters (\n), whereas temp_dec contains literal backslashes followed by the letter n:

In [67]: temp[:50]
Out[67]: "Last Star Wars 'not for children'\n\nThe sixth and f"

In [66]: temp_dec[:50]
Out[66]: "Last Star Wars 'not for children'\\n\\nThe sixth and"
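
To make the difference concrete, here is a minimal sketch using a short hypothetical sample rather than the full text. It assumes Python 3 (so strings are encoded to bytes before calling base64.b64encode), and the variable names are illustrative only. The last step uses the unicode_escape codec to undo the escaping, which is just one possible repair and is only safe here because the sample is plain ASCII.

```python
import base64

# Hypothetical short sample, not the asker's full text.
real = "children'\n\nThe sixth"       # two actual newline characters
escaped = r"children'\n\nThe sixth"   # backslash + 'n' twice instead

real_enc = base64.b64encode(real.encode("ascii")).decode("ascii")
escaped_enc = base64.b64encode(escaped.encode("ascii")).decode("ascii")

# Each escaped newline costs one extra character before encoding.
print(len(real), len(escaped))

# The two encodings differ; the escaped one contains "XG5cbl",
# the same fragment that appears in the question's temp_enc.
print(real_enc)
print(escaped_enc)

# One way to turn the escape sequences back into real newlines (ASCII only).
repaired = escaped.encode("ascii").decode("unicode_escape")
print(repaired == real)
```

If the escaped form came from somewhere else (for example a JSON or repr round trip), the appropriate repair is to reverse that specific step instead.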

If you treat temp = base64.b64decode(temp_enc) as the real temp, then

In [56]: math.ceil(len(base64.b64decode(temp_enc))/3.0)*4
Out[56]: 1688.0

equals

In [49]: len(temp_enc)
Out[49]: 1688

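This is consistent with the claim that every 3 bytes of temp are converted into 4 bytes of temp_enc. Because 1264 is not a multiple of 3, the final group holds only one byte and is padded with ==, which is the padding seen in the question.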
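
The arithmetic can be checked with a small sketch. The helper names b64_len and b64_padding are hypothetical (they are not part of the base64 module), and the code assumes Python 3 and the standard padded alphabet produced by base64.b64encode.

```python
import base64

def b64_len(n):
    """Length of standard padded Base64 output for n input bytes."""
    return 4 * ((n + 2) // 3)   # integer form of 4 * ceil(n / 3)

def b64_padding(n):
    """Number of '=' characters appended for n input bytes (0, 1 or 2)."""
    return (3 - n % 3) % 3

for n in (1251, 1264):
    encoded = base64.b64encode(b"x" * n)   # dummy payload of the same size
    assert len(encoded) == b64_len(n)
    print(n, b64_len(n), b64_padding(n))
# prints:
# 1251 1668 0
# 1264 1688 2
```

So an input of 1251 bytes needs no padding, while the 1264-byte escaped string ends in two = characters, matching the tail of temp_enc.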

