Befunge interpreter (C)

From LiteratePrograms

Jump to: navigation, search

Contents

Introduction

This article describes a simple interpreter for the Befunge-93 esoteric programming language.

The language, which, by design, is extremely difficult to compile to machine code, is actually very easy to interpret. A program in Befunge is actually a two-dimensional grid of one-byte commands. This makes it very different from most other languages, which are programmed with a sequence of tokens. The program execution can go in any direction (up, down, left or right). This complicates the task of compiling it to a machine code, which only allows one direction. Another feature causing trouble for a compiler is Befunge's ability to change its own code in runtime.

These complications are not a problem for an interpreter. In fact, the very simple syntax of Befunge, with all commands being exactly one byte long, makes the task of writing an interpreter simple.

Implementation

Reading the program code

This implementation reads a program from stdin or a file specified on the command line. The code is stored in the prog array.

<<main>>=
int main(int argc, char *argv[])
{
	char prog[NLINES][NCOLS+1]={{0}};
	int ncol=0, nline=0;
	int cmd=0;
	int dir=dir_right;
	long tmp;
	int textmode=0;
	int step=1;
	int y, x;
	FILE *fp;
	if(argc>1) {
		if(!(fp=fopen(argv[1], "r"))) {
			fprintf(stderr, "ERROR: Can\'t open file %s\n", argv[1]);
			exit(EXIT_FAILURE);
		}
	} else fp=stdin;
	while(nline<NLINES && fgets(prog[nline], NCOLS+1, fp)) ++nline;
	if(fp!=stdin) fclose(fp);
init random

Handling commands

A Befunge program starts in the upper-left corner of the code. Initially, the execution direction is right. We use nline and ncol as the "program counter". These are zero-based indexes into the prog array containing the program.

The main loop is terminated by @, which is the Befunge command to end a program.

<<main>>=
	nline=0;
	while((cmd=prog[nline][ncol])!='@' || textmode) {

A "-character is the Befunge command for entering text mode. In this mode, the ASCII-value of any character encountered until the next '"' is pushed on the stack.

<<main>>=
		if(textmode) {
			if(cmd=='\"') textmode=0;
			else push(cmd);
		} else {

Arithmetic and logic commands

As Befunge is a stack-based language, most commands involve stack-manipulation. The basic arithmetic command will pop two values from the stack, evaluate the specific arithmetic operation, and push the result back.

<<main>>=
			switch(cmd) {
			case '+': push(pop()+pop()); break;
			case '-': tmp=pop(); push(pop()-tmp); break;
			case '*': push(pop()*pop()); break;
			case '/': tmp=pop(); push(pop()/tmp); break;
			case '%': tmp=pop(); push(pop()%tmp); break;

The negation command (!) works in the same way as the corresponding C operator, resulting in 0 for all non-zero values, or 1 for 0.

<<main>>=
			case '!': push(!pop()); break;

The only comparison command in Befunge is `, which will pop two values, and push 1 if the second is larger than the first (0 otherwise).

<<main>>=
			case '`': tmp=pop(); push(pop()>tmp); break;

Program direction commands

There are two kinds of predictive commands for changing program direction. The first kind (_ and |) will first pop a value, and change direction based on the value. This can be used to implement branching and loops.

The other kind will change the direction unconditionally.

<<main>>=
			case '_': dir=pop()?dir_left:dir_right; break;
			case '|': dir=pop()?dir_up:dir_down; break;
			case '<': dir=dir_left; break;
			case '>': dir=dir_right; break;
			case '^': dir=dir_up; break;
			case 'v': dir=dir_down; break;

Stack operations

Befunge also supports the common operations (push, pop, dup and swap) that are available in all stack-based language. The push operation is implicit for any number.

<<main>>=
			case ':': dup(); break;
			case '\\': swap(); break;
			case '$': pop(); break;

Output

A very basic functionality for writing to stdout is provided with the . and , commands.

<<main>>=
			case '.': printf("%ld", pop()); break;
			case ',': putchar(pop()); break;

Special commands

Befunge has some special commands that influence how the code is handled. " is used to enter text mode, where all commands are pushed on the stack (by ASCII-value). The # command is used to "jump" over another command. Finally, the @ command will just stop program execution.

<<main>>=
			case '"': textmode=1; break;
			case '#': step=2; break;
			case '@': exit(EXIT_SUCCESS);

Self-modifying code

The p and g commands provide the facilities that make Befunge programs really difficult to compile. These commands allow the user to change the program-code itself while the program is running.

<<main>>=
			case 'p': y=pop(); x=pop(); prog[y][x]=pop(); break;
			case 'g': y=pop(); x=pop(); push(prog[y][x]); break;

Input

Befunge has two commands for reading from stdin. These will read either a character or a number, and push it on the stack.

<<main>>=
			case '&': scanf("%ld", &tmp); push(tmp); break;
			case '~': push(getchar()); break;

Random

The random command will just change the program direction randomly.

<<init random>>=
	srand(time(NULL));
<<main>>=
			case '?': dir=(rand()/7)&3; break;

Pushing numbers

When a digit is encountered, its corresponding numeric value is pushed.

<<main>>=
			default:
				if(cmd>='0' && cmd<='9') push(cmd-'0');
			}
		}

Program flow

The ncol and nline variables makes up the "program counter". These variables are manipulated according to the current program direction stored in dir. The step variable (normaloly 1) is set to 2 by the # command to indicate a "jump" over the next command.

When we reach the end of code in any direction, we continue from the other end.

<<main>>=
		while(step) {
			switch(dir) {
			case dir_left: 
				if(--ncol<0) ncol=NCOLS-1;
				break;
			case dir_right: 
				if(++ncol>=NCOLS) ncol=0;
				break;
			case dir_up: 
				if(--nline<0) nline=NLINES-1;
				break;
			case dir_down: 
				if(++nline>=NLINES) nline=0;
				break;
			}
			--step;
		}
		++step;
	}
	return EXIT_SUCCESS;
}

Stack functions

The stack manipulation functions are like they are in most other stack-based language, with one exception: When poping from an empty stack, we get 0.

<<pop>>=
long pop(void)
{
	if(nstack>0) return stack[--nstack];
	else return 0;
}

The push() function uses a dynamic array, which is realloced by INITSTACK words when needed. This stack will grow as big as needed (unless we are out of memory), but will never release any memory when not needed anymore.

<<stack functions>>=
void push(long v)
{
	if(!stack) {
		if(!maxstack) maxstack=INITSTACK;
		if(!(stack=malloc(sizeof *stack * maxstack))) {
			fprintf(stderr, "ERROR: Not enough memory.\n");
			exit(EXIT_FAILURE);
		}
	} else if(nstack<maxstack) stack[nstack++]=v;
	else {
		maxstack+=INITSTACK;
		if(!(stack=realloc(stack, sizeof *stack * maxstack))) {
			fprintf(stderr, "ERROR: Not enough memory.\n");
			exit(EXIT_FAILURE);
		}
	}
}
pop

The dup() and swap() functions are implemented with push() and pop(). This doesn't give optimal performance, but ensures that the special case of an empty stack is handled the same way as for pop().

<<stack functions>>=
void dup(void)
{
	long tmp;
	tmp=pop();
	push(tmp);
	push(tmp);
}
void swap(void)
{
	long tmp, tmp2;
	tmp=pop();
	tmp2=pop();
	push(tmp);
	push(tmp2);
}

Wrapping up

Befunge-93 is limited to a program-size of 25 lines with maximum 80 columns in each.

<<limits>>=
#define NCOLS         80
#define NLINES        25

To avoid lots of reallocs, we use a big(4096 words) initial stack.

<<limits>>=
#define INITSTACK   4096

To compile the interpreter, it must have some includes and global variables.

<<befunge.c>>=
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#include<time.h>
limits
enum {dir_left=0, dir_right=1, dir_up=2, dir_down=3};
long *stack=NULL;
int maxstack=0;
int nstack=0;
stack functions
main
Views