Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding.
Base64 encoding schemes are commonly used when there is a need to encode binary data that needs to be stored and transferred over media that are designed to deal with textual data. This is to ensure that the data remain intact without modification during transport. Base64 is commonly used in a number of applications including email via MIME, and storing complex data in XML.
In JavaScript there are two functions for decoding and encoding Base64 strings, respectively:
The atob() function decodes a string of data which has been encoded using Base64 encoding. Conversely, the btoa() function creates a Base64-encoded ASCII string from a "string" of binary data.
Both atob() and btoa() work on strings. If you want to work on ArrayBuffers, please read this paragraph.
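As a minimal sketch of the two functions, assuming an environment where atob() and btoa() are available as globals (browsers, or a recent Node.js release), an ASCII-only string can be encoded and decoded like this. The variable names are illustrative only.

```javascript
// Encode a plain ASCII string to Base64, then decode it back.
const original = 'Hello, world';

const encoded = btoa(original); // "SGVsbG8sIHdvcmxk"
const decoded = atob(encoded);  // "Hello, world"

console.log(encoded);
console.log(decoded === original); // true
```

Every character in this input fits in a single byte, which is why no extra conversion step is needed here.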
The "Unicode Problem"
Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception if a character exceeds the range of an 8-bit byte (0x00–0xFF). There are two possible methods to solve this problem:
- the first one is to escape the whole string (with UTF-8, see encodeURIComponent) and then encode it;
- the second one is to convert the UTF-16 DOMString to a UTF-8 array of characters and then encode it.
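The failure described above can be reproduced with a short sketch. The helper name canBtoa is hypothetical (it is not part of any standard API), and the exact exception type and message vary between environments, so the sketch only checks whether btoa() throws.

```javascript
// Hypothetical helper, not a standard API: reports whether btoa()
// accepts the given string without throwing.
function canBtoa(str) {
  try {
    btoa(str);
    return true;
  } catch (err) {
    // Thrown when a character's code unit is above 0xFF.
    return false;
  }
}

console.log(canBtoa('a la mode'));   // true: plain ASCII
console.log(canBtoa('à la mode'));   // true: U+00E0 still fits in one byte
console.log(canBtoa('✓ à la mode')); // false: U+2713 exceeds 0xFF
```

Note that a string such as 'à la mode' is accepted, but btoa() treats each character as a single byte rather than as UTF-8, so the result differs from what a UTF-8-aware encoder would produce.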
Here are the two possible methods.
Solution #1 – escaping the string before encoding it
function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="
To decode the Base64-encoded value back into a String:
function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"
Unibabel is a library which includes common conversions using this strategy.
Solution #2 – rewrite the DOM's atob() and btoa() using JavaScript's TypedArrays and UTF-8
Use a TextEncoder polyfill such as TextEncoding (also includes legacy windows, mac, and ISO encodings), TextEncoderLite, or Buffer and a Base64 polyfill such as base64-js.
The simplest, most lightweight solution would be to use TextEncoderLite and base64-js.
This function assumes that base64-js has been loaded as a minified script: <script type="text/javascript" src="base64js.min.js"></script>
function base64EncodingUTF8(str) {
    var encoded = new TextEncoderLite('utf-8').encode(str);
    var b64Encoded = base64js.fromByteArray(encoded);
    return b64Encoded;
}
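Where the built-in TextEncoder and TextDecoder are available (an assumption; older environments need the polyfills listed above), the same idea can be sketched without external libraries. The function names base64EncodeUtf8 and base64DecodeUtf8 are hypothetical, and this is one possible arrangement rather than the only one.

```javascript
// Hypothetical helpers: encode a string as UTF-8 bytes, then as Base64.
function base64EncodeUtf8(str) {
  const bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one character per byte (0x00 to 0xFF)
  }
  return btoa(binary);
}

// Reverse direction: Base64 to bytes, then bytes decoded as UTF-8.
function base64DecodeUtf8(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, function (c) {
    return c.charCodeAt(0);
  });
  return new TextDecoder().decode(bytes);
}

console.log(base64EncodeUtf8('✓ à la mode'));          // "4pyTIMOgIGxhIG1vZGU="
console.log(base64DecodeUtf8('4pyTIMOgIGxhIG1vZGU=')); // "✓ à la mode"
```

Building the intermediate string one byte at a time keeps every character within the range btoa() accepts, which is the point of converting to a UTF-8 byte array first.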