Padding is added to Base64 output even though the string length is a multiple of 3

Date: 2015-07-24 18:04:30

Tags: python python-2.7 base64

I have the following string and its Base64-encoded version:

temp = "Last Star Wars 'not for children'\n\nThe sixth and final Star Wars movie may not be suitable for young children, film-maker George Lucas has said.\n\nHe told US TV show 60 Minutes that Revenge of the Sith would be the darkest and most violent of the series. \"I don't think I would take a five or six-year-old to this,\" he told the CBS programme, to be aired on Sunday. Lucas predicted the film would get a US rating advising parents some scenes may be unsuitable for under-13s. It opens in the UK and US on 19 May. He said he expected the film would be classified PG-13 - roughly equivalent to a British 12A rating.\n\nThe five previous Star Wars films have all carried less restrictive PG - parental guidance - ratings in the US. In the UK, they have all been passed U - suitable for all - with the exception of Attack of The Clones, which got a PG rating in 2002. Revenge of the Sith - the third prequel to the original 1977 Star Wars film - chronicles the transformation of the heroic Anakin Skywalker into the evil Darth Vader as he travels to a Hell-like planet composed of erupting volcanoes and molten lava. \"We're going to watch him make a pact with the devil,\" Lucas said. \"The film is much more dark, more emotional. It's much more of a tragedy.\"\n"

temp_enc = "TGFzdCBTdGFyIFdhcnMgJ25vdCBmb3IgY2hpbGRyZW4nXG5cblRoZSBzaXh0aCBhbmQgZmluYWwgU3RhciBXYXJzIG1vdmllIG1heSBub3QgYmUgc3VpdGFibGUgZm9yIHlvdW5nIGNoaWxkcmVuLCBmaWxtLW1ha2VyIEdlb3JnZSBMdWNhcyBoYXMgc2FpZC5cblxuSGUgdG9sZCBVUyBUViBzaG93IDYwIE1pbnV0ZXMgdGhhdCBSZXZlbmdlIG9mIHRoZSBTaXRoIHdvdWxkIGJlIHRoZSBkYXJrZXN0IGFuZCBtb3N0IHZpb2xlbnQgb2YgdGhlIHNlcmllcy4gXCJJIGRvbid0IHRoaW5rIEkgd291bGQgdGFrZSBhIGZpdmUgb3Igc2l4LXllYXItb2xkIHRvIHRoaXMsXCIgaGUgdG9sZCB0aGUgQ0JTIHByb2dyYW1tZSwgdG8gYmUgYWlyZWQgb24gU3VuZGF5LiBMdWNhcyBwcmVkaWN0ZWQgdGhlIGZpbG0gd291bGQgZ2V0IGEgVVMgcmF0aW5nIGFkdmlzaW5nIHBhcmVudHMgc29tZSBzY2VuZXMgbWF5IGJlIHVuc3VpdGFibGUgZm9yIHVuZGVyLTEzcy4gSXQgb3BlbnMgaW4gdGhlIFVLIGFuZCBVUyBvbiAxOSBNYXkuIEhlIHNhaWQgaGUgZXhwZWN0ZWQgdGhlIGZpbG0gd291bGQgYmUgY2xhc3NpZmllZCBQRy0xMyAtIHJvdWdobHkgZXF1aXZhbGVudCB0byBhIEJyaXRpc2ggMTJBIHJhdGluZy5cblxuVGhlIGZpdmUgcHJldmlvdXMgU3RhciBXYXJzIGZpbG1zIGhhdmUgYWxsIGNhcnJpZWQgbGVzcyByZXN0cmljdGl2ZSBQRyAtIHBhcmVudGFsIGd1aWRhbmNlIC0gcmF0aW5ncyBpbiB0aGUgVVMuIEluIHRoZSBVSywgdGhleSBoYXZlIGFsbCBiZWVuIHBhc3NlZCBVIC0gc3VpdGFibGUgZm9yIGFsbCAtIHdpdGggdGhlIGV4Y2VwdGlvbiBvZiBBdHRhY2sgb2YgVGhlIENsb25lcywgd2hpY2ggZ290IGEgUEcgcmF0aW5nIGluIDIwMDIuIFJldmVuZ2Ugb2YgdGhlIFNpdGggLSB0aGUgdGhpcmQgcHJlcXVlbCB0byB0aGUgb3JpZ2luYWwgMTk3NyBTdGFyIFdhcnMgZmlsbSAtIGNocm9uaWNsZXMgdGhlIHRyYW5zZm9ybWF0aW9uIG9mIHRoZSBoZXJvaWMgQW5ha2luIFNreXdhbGtlciBpbnRvIHRoZSBldmlsIERhcnRoIFZhZGVyIGFzIGhlIHRyYXZlbHMgdG8gYSBIZWxsLWxpa2UgcGxhbmV0IGNvbXBvc2VkIG9mIGVydXB0aW5nIHZvbGNhbm9lcyBhbmQgbW9sdGVuIGxhdmEuIFwiV2UncmUgZ29pbmcgdG8gd2F0Y2ggaGltIG1ha2UgYSBwYWN0IHdpdGggdGhlIGRldmlsLFwiIEx1Y2FzIHNhaWQuIFwiVGhlIGZpbG0gaXMgbXVjaCBtb3JlIGRhcmssIG1vcmUgZW1vdGlvbmFsLiBJdCdzIG11Y2ggbW9yZSBvZiBhIHRyYWdlZHkuXCJcbg=="

>>> len(temp)
1251
>>> len(temp_enc)
1688
>>> len(temp)/3
417
>>> (len(temp)/3)*4
1668

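The length of the string is divisible by 3. Since every 3 input bytes produce 4 encoded bytes, the encoded string should be 1668 characters long. Why is it longer than expected, and why was padding added to the encoding?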

1 Answer:

Answer 0: (score: 1)

temp_enc is not the Base64 encoding of temp:

In [61]: import base64
In [62]: base64.b64encode(temp) == temp_enc
Out[62]: False

If you decode temp_enc, the decoded string has length 1264, not 1251:

In [57]: temp_dec = base64.b64decode(temp_enc)

In [58]: len(temp_dec)
Out[58]: 1264

In [59]: len(temp)
Out[59]: 1251

temp contains real newline characters (\n), whereas temp_dec contains a literal backslash followed by the letter n:

In [67]: temp[:50]
Out[67]: "Last Star Wars 'not for children'\n\nThe sixth and f"

In [66]: temp_dec[:50]
Out[66]: "Last Star Wars 'not for children'\\n\\nThe sixth and"
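The effect can be reproduced with a much shorter string. The sketch below is only an illustration: the sample text and the names real, escaped, enc_real and enc_escaped are made up for this example and do not come from the question. It uses bytes so that it also runs on Python 3; on Python 2.7 the calls behave the same way for ASCII text.

```python
import base64

# Hypothetical short sample, not the article text from the question.
real = "Star Wars\n\n\"dark\"\n"        # real newlines and plain double quotes
escaped = r'Star Wars\n\n\"dark\"\n'    # the same text with the backslashes kept literally

enc_real = base64.b64encode(real.encode("ascii"))
enc_escaped = base64.b64encode(escaped.encode("ascii"))

# Each escape sequence (\n or \") costs one extra byte in the escaped form.
extra = len(escaped) - len(real)

print(len(real), len(enc_real))        # 18 bytes -> 24 characters, no padding
print(len(escaped), len(enc_escaped))  # 23 bytes -> 32 characters, one '='
print(extra)                           # 5 escape sequences
```

In the question's text the same thing happens on a larger scale: it contains 7 \n escapes and 6 \" escapes, which accounts for the 13-byte gap between 1264 and 1251.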

If you treat temp = base64.b64decode(temp_enc) as the real temp, then

In [56]: math.ceil(len(base64.b64decode(temp_enc))/3.0)*4
Out[56]: 1688.0

which equals

In [49]: len(temp_enc)
Out[49]: 1688

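This is consistent with the claim that every 3 bytes of temp are converted into 4 bytes of temp_enc. Because 1264 is not a multiple of 3, the final group is incomplete and gets padded with "=".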
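The length arithmetic can be written down as two small helpers. This is a minimal sketch assuming standard Base64 with padding and no line breaks; the function names b64_len and b64_padding are hypothetical and are not part of the base64 module.

```python
import base64

def b64_len(n):
    """Length of padded Base64 output for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)

def b64_padding(n):
    """Number of '=' characters appended for n input bytes (0, 1 or 2)."""
    return (3 - n % 3) % 3

for n in (1251, 1264):
    encoded = base64.b64encode(b"x" * n)   # dummy payload of the right size
    assert len(encoded) == b64_len(n)
    print(n, b64_len(n), b64_padding(n))
```

For 1251 bytes this gives 1668 characters and no padding, which is what the question expected. For 1264 bytes it gives 1688 characters with two "=" characters, matching the "==" at the end of temp_enc.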