# Base64 (C)

Some working (albeit rather minimal) code for Base 64 decoding (RFC 3548), originally hacked together as a quick fix on a box without a decent mail reader.

## theory

This program is almost not worth writing: the core of the Base 64 conversion is just the isomorphism between two views of the same 24-bit bitstring, presented on input as 4 groups of 6 bits and output as 3 groups of 8 bits.
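To make the regrouping concrete, here is a minimal sketch that is separate from the program developed in this article. The names `pack_sextets` and `unpack_octets` are hypothetical, introduced only for illustration. It builds the 24-bit string from four 6-bit values and reads it back as three bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack four 6-bit values (each 0..63) into one
   24-bit word, with the first group in the most significant position. */
uint32_t pack_sextets(const unsigned char s[4])
{
    assert(s[0] < 64 && s[1] < 64 && s[2] < 64 && s[3] < 64);
    return (uint32_t)s[0] << 18 | (uint32_t)s[1] << 12
         | (uint32_t)s[2] << 6  | (uint32_t)s[3];
}

/* Hypothetical helper: read the same 24-bit word as three 8-bit bytes. */
void unpack_octets(uint32_t word, unsigned char out[3])
{
    out[0] = (word >> 16) & 0xFF;
    out[1] = (word >> 8) & 0xFF;
    out[2] = word & 0xFF;
}
```

For instance, the coding characters `TWFu` stand for the 6-bit values 19, 22, 5 and 46, which regroup into the three bytes of `Man`.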

The slight complications are that:

• strings to be coded are not necessarily multiples of 24 bits; we must be prepared to handle a runt conversion for the last few characters.
• each 6-bit group arrives in a regular 8-bit character, and non-coding characters (line breaks, for example) may be interleaved, so it may take an arbitrary number of source characters to accumulate 4 coding characters.

Note that this is easier than the general metamorphism: we may not have a 1:1 correspondence between source characters and groups, but each group is translated independently, so we never have to "carry" bits from the computation of one group to that of the next.

## practice

Translating a 24-bit chunk is a simple matter of shifting all the bits into the appropriate places. (We use `nbytes[phase]` to handle runt conversions: it gives the number of bytes to write for the current chunk.)

```
<<translate chunk>>=
xlate(in, phase);

<<define translation>>=
/* Repack four 6-bit values into three bytes and write the first
   nbytes[phase] of them (fewer than three for a runt chunk). */
void xlate(unsigned char *in, int phase)
{
    unsigned char out[3];
    out[0] = in[0] << 2 | in[1] >> 4;
    out[1] = in[1] << 4 | in[2] >> 2;
    out[2] = in[2] << 6 | in[3] >> 0;
    fwrite(out, nbytes[phase], 1, stdout);
}
```
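The entries of `nbytes` (defined in the final program as `{ 3, 1, 1, 2 }`) follow from a little arithmetic: `phase` counts the 6-bit groups accumulated so far, and k groups carry 6k bits, hence ⌊6k/8⌋ whole bytes. The sketch below spells this out; `whole_bytes` is a hypothetical helper, not part of the program.

```c
#include <assert.h>

/* Hypothetical helper: number of complete bytes carried by
   `sextets` 6-bit groups, for 0..4 groups. */
int whole_bytes(int sextets)
{
    assert(sextets >= 0 && sextets <= 4);
    return sextets * 6 / 8;   /* integer division rounds down */
}
```

Two groups give 1 byte and three give 2, matching `nbytes[2]` and `nbytes[3]`. Phase 0 is reached when the counter wraps around after a fourth group, so `nbytes[0]` is 3. The formula gives 0 for a single group, whereas the table stores 1 at index 1; well-formed input never reaches an `'='` in that state, since one group cannot hold a whole byte.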

We must process the input a character, c, at a time even though most of the time we will be translating it in four-character chunks.

This is because, as mentioned above,

• the last chunk may be fewer than four characters (signalled by an equals sign, `'='`)
• arbitrary non-coding characters are ignored, so accumulating a chunk may require more than four characters of source text.

```
<<process input>>=
if(c == '=')    {
    /* padding: translate the runt chunk and stop */
    <<translate chunk>>
    break;
}
p = strchr(b64, c);
/* strchr also matches the terminating NUL of b64, so reject c == 0 */
if(p && c)    {
    in[phase] = p - b64;
    phase = (phase + 1) % 4;
    if(phase == 0)    {
        <<translate chunk>>
        in[0]=in[1]=in[2]=in[3]=0;
    }
}
```
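The same loop can be exercised without standard input. The following is a minimal sketch of a buffer-to-buffer variant, assuming a NUL-terminated source string. `b64_decode` and `alphabet` are hypothetical names, and unlike the loop above it also flushes a runt chunk when the string ends without `'='` padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical variant of the loop above: decodes src into dst and
   returns the number of bytes written.  dst must have room for at
   least 3 * strlen(src) / 4 bytes. */
size_t b64_decode(const char *src, unsigned char *dst)
{
    unsigned char in[4] = { 0, 0, 0, 0 };
    int phase = 0;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0' && *src != '='; src++) {
        const char *p = strchr(alphabet, *src);
        if (p == NULL)
            continue;                 /* non-coding character: skip it */
        in[phase++] = (unsigned char)(p - alphabet);
        if (phase == 4) {             /* full chunk: emit three bytes */
            dst[n++] = (unsigned char)(in[0] << 2 | in[1] >> 4);
            dst[n++] = (unsigned char)(in[1] << 4 | in[2] >> 2);
            dst[n++] = (unsigned char)(in[2] << 6 | in[3]);
            memset(in, 0, sizeof in);
            phase = 0;
        }
    }
    /* runt chunk: 2 groups carry 1 byte, 3 groups carry 2 bytes */
    if (phase >= 2)
        dst[n++] = (unsigned char)(in[0] << 2 | in[1] >> 4);
    if (phase == 3)
        dst[n++] = (unsigned char)(in[1] << 4 | in[2] >> 2);
    return n;
}
```

Writing into a caller-supplied buffer keeps the sketch easy to check in isolation; the filter in this article streams to `stdout` instead, which avoids having to size a buffer at all.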

## wrapping up

Finally, we put it all together in a small filter program

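```
<<base64.c>>=
#include <stdio.h>
#include <string.h>

/* the Base 64 alphabet: a character's index is its 6-bit value */
char b64[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
/* bytes to write for each phase; phase 0 means a full chunk */
int nbytes[] = { 3, 1, 1, 2 };

<<define translation>>

int main()
{
    int c, phase;
    /* zeroed so that a runt first chunk never reads indeterminate values */
    unsigned char in[4] = { 0, 0, 0, 0 };
    char *p;

    phase = 0;
    while((c = getchar()) != EOF)    {
        <<process input>>
    }
    return 0;
}
```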

and verify that

`TGl0ZXJhdGVQcm9ncmFtcw==`

decodes to

`LiteratePrograms`