We often need to encode binary data into ASCII strings. The standard ways to do so (e.g., in email) include base16, base32 and base64.
There are some research papers on fast base64 encoding and decoding: "Base64 encoding and decoding at almost the speed of a memory copy" and "Faster Base64 Encoding and Decoding using AVX2 Instructions".
For the most part, these base64 techniques are applicable to base32. Base32 works in the following manner: you use 8 ASCII characters to encode 5 bytes. Each ASCII character carries 5 bits of information: it can be one of 32 characters. For reference, base64 uses 64 different ASCII characters, so each character carries 6 bits of information. However, base64 requires both upper-case and lower-case letters as well as other special characters, so it is less portable. Base32 can be case insensitive.
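To make the 5-bytes-to-8-characters arithmetic concrete, here is a minimal sketch in C. It only splits 40 bits into eight 5-bit values; mapping each value to a character depends on the chosen alphabet. The function name `split_into_5bit_groups` is made up for this illustration, and the most-significant-bit-first order is the convention used by RFC 4648.

```c
#include <stdint.h>

// Split 5 input bytes (40 bits) into eight 5-bit values,
// most significant bits first. Illustrative helper, not a full encoder.
static void split_into_5bit_groups(const uint8_t in[5], uint8_t out[8]) {
    uint64_t bits = 0;
    // Pack the 5 bytes into the low 40 bits of a 64-bit word.
    for (int i = 0; i < 5; i++) {
        bits = (bits << 8) | in[i];
    }
    // Extract the eight 5-bit groups, starting from the top.
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)((bits >> (35 - 5 * i)) & 0x1F);
    }
}
```

For the five bytes of the string "hello", this yields the values 13, 1, 18, 22, 24, 27, 3, 15. Each value is below 32, so each one can be written with a single character from a 32-character alphabet.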
There are different variations, but we can consider Base 32 Encoding with Extended Hex Alphabet, which uses the digits 0 to 9 for the values 0 to 9 and the letters A to V for the values 10 to 31. So each character represents a value between 0 and 31. If required, you can pad the encoded string with the ‘=’ character so that its length is a multiple of 8 characters. However, that is not always required. Instead, you may simply stop decoding as soon as an out-of-range character is found.
character | value |
‘0’ | 0 |
‘1’ | 1 |
‘2’ | 2 |
… | … |
‘4’ | 4 |
‘9’ | 9 |
‘A’ | 10 |
… | … |
‘V’ | 31 |
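As a point of reference, the mapping in the table and the "stop at the first out-of-range character" rule can be sketched as follows. This is a deliberately simple version that looks each character up in the alphabet string with `strchr`; it is not meant to be fast. The names `base32hex_value` and `base32hex_decode` are hypothetical, and the sketch assumes upper-case input and a null-terminated string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Map one character to its 5-bit value, or return -1 if it is out of range.
static int base32hex_value(char ch) {
    static const char alphabet[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";
    if (ch == '\0') return -1; // strchr would otherwise match the terminator
    const char *p = strchr(alphabet, ch);
    return p ? (int)(p - alphabet) : -1;
}

// Decode until the first out-of-range character ('=', end of string, etc.).
// Returns the number of bytes written; out must have room for
// 5 * strlen(in) / 8 bytes. Leftover bits that do not fill a byte are dropped.
static size_t base32hex_decode(const char *in, uint8_t *out) {
    uint32_t buffer = 0; // bits accumulated but not yet written out
    int bits = 0;        // number of valid bits in buffer
    size_t n = 0;
    for (;; in++) {
        int v = base32hex_value(*in);
        if (v < 0) break; // stop at the first out-of-range character
        buffer = (buffer << 5) | (uint32_t)v;
        bits += 5;
        if (bits >= 8) {
            bits -= 8;
            out[n++] = (uint8_t)(buffer >> bits);
            buffer &= (1u << bits) - 1; // keep only the pending bits
        }
    }
    return n;
}
```

With this sketch, decoding "D1IMOR3F" writes the 5 bytes of "hello", and decoding "D1IMOR3F====" gives the same result because decoding stops at the first ‘=’.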
A conventional decoder might use branchy code:
if (ch >= '0' && ch <= '9') d