Base64 (C)
From LiteratePrograms
Some working (albeit rather minimal) code for Base 64 decoding (RFC 3548), originally hacked together as a quick fix on a box without a decent mail reader.
theory
This program is almost not worth writing: the core of the Base 64 conversion is just the isomorphism between two views of a 24-bit bitstring, presented on input as 4 groups of 6 bits and emitted on output as 3 groups of 8 bits.
The slight complications are that:
- encoded data is not necessarily a multiple of 24 bits long; we must be prepared to handle a runt conversion for the last few characters.
- each 6-bit group arrives as a regular 8-bit character, and non-coding characters (such as line breaks) may be interleaved, so it may take an arbitrary number of source characters to accumulate 4 coding characters.
Note that this is easier than the general metamorphism; we may not have 1:1 correspondence between source characters and groups, but each group is translated independently — we never have to "carry" bits from the computation of one group to that of the next.
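To make the regrouping concrete, here is a minimal sketch that goes through an explicit 24-bit intermediate word. The helper name sextets_to_octets is hypothetical and is not part of the program developed below, which shifts the bits directly without building the word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: regroup four 6-bit values
   into three 8-bit bytes by way of one 24-bit word. */
void sextets_to_octets(const unsigned char six[4], unsigned char eight[3])
{
    uint32_t word;
    int i;

    for (i = 0; i < 4; i++)
        assert(six[i] < 64);            /* each input carries only 6 bits */

    /* Concatenate the four groups, first group in the highest bits. */
    word = (uint32_t)six[0] << 18
         | (uint32_t)six[1] << 12
         | (uint32_t)six[2] << 6
         | (uint32_t)six[3];

    /* Read the same 24 bits back as three 8-bit groups. */
    eight[0] = (unsigned char)((word >> 16) & 0xFF);
    eight[1] = (unsigned char)((word >> 8) & 0xFF);
    eight[2] = (unsigned char)(word & 0xFF);
}
```

For example, the characters TGl0 sit at positions 19, 6, 37 and 52 of the Base 64 alphabet, and passing those four values in yields the three bytes of "Lit".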
practice
Translating a 24-bit chunk is a simple matter of shifting all the bits into the appropriate places. (We use nbytes[phase] to handle runt conversions.)
<<translate chunk>>=
xlate(in,phase);

<<define translation>>=
void xlate(unsigned char *in, int phase)
{
    unsigned char out[3];
    out[0] = in[0] << 2 | in[1] >> 4;
    out[1] = in[1] << 4 | in[2] >> 2;
    out[2] = in[2] << 6 | in[3] >> 0;
    fwrite(out, nbytes[phase], 1, stdout);
}
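The runt case may be easier to see with a variant that decodes into a buffer instead of writing to stdout. The following is only a sketch: xlate_buf and its count parameter are hypothetical names, and count is the number of coding characters actually collected (2, 3 or 4), not the phase index used by xlate. Since n groups of 6 bits contain 6n/8 whole bytes, 2 characters give 1 byte, 3 give 2, and 4 give 3. A lone character carries only 6 bits, so this sketch returns 0 for it, whereas the nbytes table above lists 1 for that phase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer-based variant of xlate, for illustration only.
   in[]  holds `count` valid 6-bit values (entries beyond count are ignored),
   out[] receives the decoded bytes; the return value is how many are valid. */
size_t xlate_buf(const unsigned char in[4], int count, unsigned char out[3])
{
    unsigned char s[4] = {0, 0, 0, 0};
    int i;

    assert(in != NULL && out != NULL);
    if (count < 2 || count > 4)
        return 0;                       /* fewer than 12 bits: no whole byte */

    for (i = 0; i < count; i++)
        s[i] = in[i] & 0x3F;            /* keep only the 6 coding bits */

    /* Same shifts as xlate; missing groups contribute zero bits. */
    out[0] = (unsigned char)(s[0] << 2 | s[1] >> 4);
    out[1] = (unsigned char)(s[1] << 4 | s[2] >> 2);
    out[2] = (unsigned char)(s[2] << 6 | s[3]);

    return (size_t)(count * 6 / 8);     /* 2 -> 1, 3 -> 2, 4 -> 3 */
}
```

With the two values 28 and 48 (the characters cw) and count set to 2, the function returns 1 and out[0] holds 's'.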
We must process the input a character, c, at a time even though most of the time we will be translating it in four-character chunks.
This is because, as mentioned above:
- the last chunk may be less than four characters (signalled by an equals sign, '=')
- arbitrary non-coding characters are ignored, so accumulating a chunk may require more than four characters of source text.
<<process input>>=
if(c == '=') {
    <<translate chunk>>
    break;
}
/* strchr also matches the terminating NUL of b64, so reject c == 0 */
p = c ? strchr(b64, c) : NULL;
if(p) {
    in[phase] = p - b64;
    phase = (phase + 1) % 4;
    if(phase == 0) {
        <<translate chunk>>
        in[0]=in[1]=in[2]=in[3]=0;
    }
}
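The lookup step can also be isolated in a small function, which makes the "ignore anything that is not a coding character" rule easy to check on its own. This is a sketch under assumed names: b64_value, alphabet and b64_value_selfcheck are hypothetical and do not appear in the program.

```c
#include <assert.h>
#include <string.h>

static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: return the 6-bit value (0..63) of a coding
   character, or -1 for anything else, including '=' and whitespace. */
int b64_value(int c)
{
    const char *p;

    if (c == '\0')
        return -1;      /* strchr would otherwise match the terminator */
    p = strchr(alphabet, c);
    return p ? (int)(p - alphabet) : -1;
}

/* A few spot checks; the position in the alphabet is the value. */
void b64_value_selfcheck(void)
{
    assert(b64_value('A') == 0);
    assert(b64_value('T') == 19);
    assert(b64_value('/') == 63);
    assert(b64_value('\n') == -1);
    assert(b64_value('=') == -1);
}
```

A caller would simply skip every character for which b64_value returns -1, which is why a chunk may consume more than four characters of source text.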
wrapping up
Finally, we put it all together in a small filter program:
<<base64.c>>=
#include <stdio.h>
#include <string.h>

char b64[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
int nbytes[] = { 3, 1, 1, 2 };

<<define translation>>

int main()
{
    int c, phase;
    unsigned char in[4];
    char *p;

    phase = 0;
    while((c = getchar()) != EOF) {
        <<process input>>
    }
    return 0;
}
and verify that
TGl0ZXJhdGVQcm9ncmFtcw==
decodes to
LiteratePrograms
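The same check can be done in code rather than by eye. The sketch below is a string-to-buffer decoder, not the filter itself: the name b64_decode is hypothetical, it stops at the first '=' or at the end of the string, and unlike the filter it also flushes a runt chunk when the padding is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory decoder, for illustration only.
   dst must have room for 3 bytes per 4 coding characters in src.
   Returns the number of bytes written to dst. */
size_t b64_decode(const char *src, unsigned char *dst)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    unsigned char in[4] = {0, 0, 0, 0};
    int phase = 0;
    size_t n = 0;
    const char *p;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0' && *src != '='; src++) {
        p = strchr(alphabet, *src);
        if (p == NULL)
            continue;                   /* non-coding character: ignore */
        in[phase++] = (unsigned char)(p - alphabet);
        if (phase == 4) {               /* full chunk: emit 3 bytes */
            dst[n++] = (unsigned char)(in[0] << 2 | in[1] >> 4);
            dst[n++] = (unsigned char)(in[1] << 4 | in[2] >> 2);
            dst[n++] = (unsigned char)(in[2] << 6 | in[3]);
            phase = 0;
            in[0] = in[1] = in[2] = in[3] = 0;
        }
    }

    /* Runt chunk: 2 characters hold 1 byte, 3 characters hold 2 bytes. */
    if (phase >= 2)
        dst[n++] = (unsigned char)(in[0] << 2 | in[1] >> 4);
    if (phase == 3)
        dst[n++] = (unsigned char)(in[1] << 4 | in[2] >> 2);
    return n;
}
```

Calling b64_decode on TGl0ZXJhdGVQcm9ncmFtcw== fills the buffer with the 16 bytes of LiteratePrograms; a line break inserted anywhere in the input makes no difference.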