Hello World (IBM PC bootstrap)
From LiteratePrograms
This article describes a small IBM PC bootstrap to flat 32-bit protected mode — just enough to display the classic "Hello, World!".
theory
Ontogeny recapitulates phylogeny — Ernst Haeckel
IBM PCs have traditionally recapitulated their development, powering up in a mode very similar to that of a 1981-era machine, then enabling the "recently acquired" features during the bootstrap process.
practice
32 bit protected mode application
As expected, there is not much to Hello World itself:
<<hello.c>>=
extern void cls(), at();

void rawmain()
{
    cls();
    at(32,12,"Hello, world!");
}
device driver
Not having stdio available, we must provide some kind of device driver for output. The PC boots into a text display mode with a memory-mapped screen buffer at a fixed address, so we simply use C to place the required data onto the screen.
<<screen.c>>=
#define VIDEOMEM (char *)0xb8000
#define SCRX 80
#define SCRY 25
#define ATTRIB 0x71 /* blue on grey */

void cls()
{
    char *p = VIDEOMEM;
    int n;
    for(n = 0; n < SCRX*SCRY; ++n) {
        *p++ = ' ';
        *p++ = ATTRIB;
    }
}

void at(int x, int y, char *m)
{
    char *p = VIDEOMEM + 2*(x+SCRX*y);
    while(*m) {
        *p++ = *m++;
        *p++ = ATTRIB;
    }
}
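The address arithmetic in at() and the value of ATTRIB can be checked on an ordinary host machine. The following is a minimal sketch and not part of the bootstrap: the helper names cell_offset and make_attrib are hypothetical, and it assumes the standard 80x25 colour text mode, in which each screen cell is a character byte followed by an attribute byte (foreground colour in the low nibble, background colour in bits 4-6).

```c
#include <assert.h>

#define SCRX 80
#define SCRY 25

/* Byte offset of the cell at column x, row y from the start of video
   memory. Each cell occupies two bytes: character, then attribute. */
unsigned cell_offset(unsigned x, unsigned y)
{
    assert(x < SCRX && y < SCRY);
    return 2 * (x + SCRX * y);
}

/* Pack a foreground colour (bits 0-3) and background colour (bits 4-6)
   into one attribute byte. */
unsigned char make_attrib(unsigned fg, unsigned bg)
{
    assert(fg < 16 && bg < 8);
    return (unsigned char)((bg << 4) | fg);
}
```

Under these assumptions cell_offset(32, 12) is 1984 (hex 7c0), so the greeting starts at 0xb87c0, and make_attrib(1, 7), blue on grey, gives the 0x71 used above.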
runtime library
In order to call the functions above, we must first have a working stack. This assembly code initializes the stack and some of the segment descriptors so the C code will run in the proper environment. Traditionally, start is the assembly-language entry point to a C program, but here we will just begin with the first instruction of the binary application, and arrange the link so this code appears in the right place.
<<crt0.s>>=
    xor %eax, %eax      # boot sets CS,DS,ES,SS
    mov %eax, %fs
    mov %eax, %gs
    mov $0x20000, %esp  # set up a stack
    call _rawmain       # and enter the C code
The C program might return, but we don't have any continuation at this point. In the absence of better options, we wait for a keypress (with a small polling device driver), then reboot.
<<crt0.s>>=
wkbd:
    mov $0x64, %edx     # if it returns, wait for any keypress
    inb %dx, %al
    andl $1, %eax
    jz wkbd
    mov $0x64, %edx     # then reboot the machine
    mov $0xfe, %eax
    outb %al, %dx
    cli
loop:
    jmp loop            # (or at least hang)
Exercise: implement exit()
16 bit real mode bootstrap
Now, if the PC starts out in 16-bit real mode, ca. 1981, how do we establish a large flat address space for the C application? The PC looks for a boot sector on the peripherals when it starts up, and while hard drives work slightly differently, CD-ROMs and USB keys (neither of which was available in 1981) can be viewed as floppy-compatible. The boot sector code must be only a few hundred bytes, but this suffices to load additional code and enable a more recent CPU configuration. We will use debug to assemble this (largely 8086-compatible) bootstrap code.
resources
First, a few pieces of data:
- the application itself is, to the boot sector, just data to be loaded. We will place it in the two sectors following the boot sector.
- each floppy carries some metadata in its boot sector.
Exercise: Here we copy the data from a disk image which has been formatted by FREEDOS — figure out a better way of providing, or better yet, calculating, this information
Exercise: Unfortunately this approach, while it appears to have a FAT filesystem, does not respect it, and can be easily corrupted — rearrange the disk image so that it is useful both as a FAT disk and as boot disk
<<boot.src>>=
f 100,700 0
a
;;;; first we bring in our C program (300-700) ;;;;

n a.bin
l 300
a
;;;; then the floppy parameter table (103-13F) ;;;;
; parameters taken from a FREEDOS formatted 1.44M ;

e 103 46 52 44 4F 53
e 108 34 2E 31 00 02 01 01 00
e 110 02 E0 00 40 0B f0 09 00
e 118 12 00 02 00 00 00 00 00
e 120 00 00 00 00 00 00 29 12
e 128 15 5D 33 45 4D 50 54 59
e 130 44 49 53 4B 20 20 46 41
e 138 54 31 32 20 20 20 31 C0
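As a first step toward the exercise of providing this information less opaquely, the bytes can be decoded on a host machine. The sketch below is not part of the bootstrap. It assumes the conventional FAT12 boot-sector parameter layout (little-endian 16-bit fields at fixed offsets), copies only the bytes entered at debug offsets 103 to 11B, and uses hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Boot-sector bytes 0x03..0x1B, i.e. what the script enters at 103..11B. */
static const uint8_t params[] = {
    0x46, 0x52, 0x44, 0x4F, 0x53, 0x34, 0x2E, 0x31, /* OEM name "FRDOS4.1" */
    0x00, 0x02,  /* 0x0B: bytes per sector          */
    0x01,        /* 0x0D: sectors per cluster       */
    0x01, 0x00,  /* 0x0E: reserved sectors          */
    0x02,        /* 0x10: number of FATs            */
    0xE0, 0x00,  /* 0x11: root directory entries    */
    0x40, 0x0B,  /* 0x13: total sectors             */
    0xF0,        /* 0x15: media descriptor          */
    0x09, 0x00,  /* 0x16: sectors per FAT           */
    0x12, 0x00,  /* 0x18: sectors per track         */
    0x02, 0x00   /* 0x1A: number of heads           */
};

/* Little-endian 16-bit field at boot-sector offset off (0x03..0x1A). */
unsigned field16(unsigned off)
{
    assert(off >= 0x03 && off + 1 <= 0x1B);
    return params[off - 3] | (params[off - 2] << 8);
}

unsigned bytes_per_sector(void)  { return field16(0x0B); }
unsigned reserved_sectors(void)  { return field16(0x0E); }
unsigned root_entries(void)      { return field16(0x11); }
unsigned total_sectors(void)     { return field16(0x13); }
unsigned sectors_per_fat(void)   { return field16(0x16); }
unsigned sectors_per_track(void) { return field16(0x18); }
unsigned heads(void)             { return field16(0x1A); }
```

Read this way, the table describes 2880 sectors of 512 bytes, with 18 sectors per track and 2 heads. The reserved-sector count of 1 covers only the boot sector itself, which is why application sectors placed directly after it overlap the area a FAT filesystem expects to own.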
disk input
While we are in 16-bit real mode, we can use the BIOS to read the application off the drive. The BIOS loads the boot code at 0000:7C00-0000:7E00, so we load the application immediately following, at 0000:7E00-0000:8200.
<<boot.src>>=
a
;;;; now we assemble the boot sector ;;;;

a 100
jmp 140 ; skip over the floppy parameters

a 140
; == floppy I/O ==
xor dx,dx
mov es,dx
mov cx,02
mov bx,7e00
mov ax,0202
int 13 ; read the next (al=2) sectors...
mov dx,03f2
xor ax,ax
out dx,al ; ... then turn off the floppy
jmp 0:7c80

(The far jump just ensures that we enter the next code with CS:IP of 0000:7C80, not 07C0:0080 or some other segment:offset combination that names the same address.)
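Different CS:IP pairs can name the same byte because a real-mode address is formed as segment times 16 plus offset. A minimal host-side sketch of that rule follows; the function name is hypothetical, and wrap-around at the top of the address range is ignored.

```c
#include <stdint.h>

/* Real-mode address formation: the 16-bit segment is shifted left by
   four bits and the 16-bit offset is added to it. */
uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

Both real_mode_linear(0x0000, 0x7C80) and real_mode_linear(0x07C0, 0x0080) evaluate to 0x7C80, so without the far jump the code could be running at the right address under a CS value other than the one later steps assume.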
At this point we are done with the BIOS and ready to switch to protected mode — we ask the Sherpas to turn back, and make a dash for the summit.
mode switch
Debug, while still available on XP, was originally an 8086 application. Luckily, the machine does not care whether its instructions were assembled from mnemonics, so we enter some of the more unusual instructions as strings of data. (At least we are not toggling them in on front panel switches.)
<<boot.src>>=
a 180
; == enter protected mode ==
cli ; turn off interrupts
; load GDT (using CS override)
db 2e, 0f, 01, 16, 00, 7d
; set pmode bit in cr0
db 0f, 20, c0
or al, 1
db 0f, 22, c0
; load 32-bit data segment registers, and...
mov dx, 10
mov es, dx
mov ds, dx
mov ss, dx
; ...far jump to load 32-bit code segment
jmp 18:7e00

If all has gone well, we have loaded the application to 0000:7e00, which is identically 00007e00 in the flat mapping, and this last jump will take us into the runtime library startup code from crt0.s. If not, the machine will, at best, reboot or, more likely, execute some random code: caveat lusor!
tables
We can't turn off segmentation in IA32, but we can effectively avoid it by providing a Global Descriptor Table setting each segment to an identity map over the entire address space.
00: always null -- the space here is used for the descriptor needed by lgdt
08: (unused)
10: data segment (0-4G)
18: code segment (0-4G)
These descriptor tables are themselves backwards compatible with 16-bit descriptor tables, so the bitfields are split up in odd ways. Here we treat the entire table as an opaque resource, and again just enter the required data rather than worrying about how to construct it.
<<boot.src>>=
a 200
; // GDT //
db 1f, 00, 00, 7d, 00, 00, 00, 00
db 00, 00, 00, 00, 00, 00, 00, 00
db ff, ff, 00, 00, 00, 92, cf, 00
db ff, ff, 00, 00, 00, 9a, cf, 00

a 2fe
; // signature for boot record //
db 55, aa

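For readers who would rather not treat the table as opaque, the two segment entries can be rebuilt from their logical fields. The sketch below runs on a host machine and is not part of the bootstrap. It assumes the standard IA-32 segment descriptor layout, and the function names are hypothetical. A descriptor is returned as a 64-bit value whose little-endian byte order matches the db lines above.

```c
#include <stdint.h>

/* Pack base, 20-bit limit, access byte and 4-bit flags into a segment
   descriptor. The odd splitting of base and limit is the 16-bit
   backwards compatibility mentioned in the text. */
uint64_t make_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             /* bytes 0-1: limit 15:0    */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* bytes 2-4: base 23:0     */
    d |= (uint64_t)access << 40;                 /* byte 5: access byte      */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* byte 6 low: limit 19:16  */
    d |= (uint64_t)(flags & 0xF) << 52;          /* byte 6 high: flags       */
    d |= (uint64_t)(base >> 24) << 56;           /* byte 7: base 31:24       */
    return d;
}

/* Byte i (0..7) of the descriptor in memory order. */
uint8_t descriptor_byte(uint64_t d, unsigned i)
{
    return (uint8_t)(d >> (8 * i));
}

/* Table index selected by a segment selector: the low three bits hold
   the privilege level and table indicator, the rest is the index. */
unsigned selector_index(unsigned selector)
{
    return selector >> 3;
}
```

With base 0, limit 0xFFFFF and flags 0xC (4 KB granularity, 32-bit default size), an access byte of 0x92 reproduces the data-segment row and 0x9a the code-segment row. Selectors 10 and 18 hex pick entries 2 and 3 of the table.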
output
Having built an in-memory image of the boot sector and the following 2 application sectors, we save it to produce boot.bin.
<<boot.src>>=
a 300
;;;; finally, we produce the binary (3 sectors) ;;;;

n boot.bin
r cx
600
w
q
Question: suppose the size of the application changes. What numbers above must be modified?
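As a partial hint, several of the numbers in boot.src follow from the sector count by simple arithmetic. The sketch below is a host-side calculation only. The helper names are hypothetical, and it assumes 512-byte sectors and that debug offset 100 holds the first byte of the image, as in the script above.

```c
#include <assert.h>

#define SECTOR   0x200u   /* bytes per sector                          */
#define BOOTADDR 0x7C00u  /* where the BIOS loads the boot sector      */
#define DBGBASE  0x100u   /* debug offset of the first image byte      */

/* Run-time address of a byte assembled at a given debug offset. */
unsigned load_address(unsigned debug_offset)
{
    assert(debug_offset >= DBGBASE);
    return BOOTADDR + (debug_offset - DBGBASE);
}

/* Size of the image (boot sector plus application): the value for CX. */
unsigned image_bytes(unsigned app_sectors)
{
    return (1 + app_sectors) * SECTOR;
}

/* Debug offset just past the image: the end of the initial fill range. */
unsigned image_end_offset(unsigned app_sectors)
{
    return DBGBASE + image_bytes(app_sectors);
}

/* First address past the loaded application. */
unsigned app_end_address(unsigned app_sectors)
{
    return BOOTADDR + image_bytes(app_sectors);
}
```

For the two application sectors used here this gives 600 hex for CX, 700 hex for the end of the fill, and 8200 hex for the end of the loaded code. load_address maps debug offsets 180, 200 and 300 to 7c80, 7d00 and 7e00. The sector count passed to int 13 is a separate number that has to change as well.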
wrapping up
Finally, we set up compilation
- without any standard libraries (they are meant for the host environment, not this one)
- mapping code and data to where they will be loaded by the boot sector, at hex addresses 00007e00 and 00007f00 respectively
- and arranging for the code from crt0.s to be located at the start of the text section (it is listed first among the sources)
<<build.bat>>=
set OBJS=a.exe a.bin
set SRCS=crt0.s hello.c screen.c
set ARGS=-nostdlib -Wl,-Ttext -Wl,0x7e00 -Wl,-Tdata -Wl,0x7f00
and give a batch file to build boot.bin with the GNU toolchain and Microsoft's debug.
<<build.bat>>=
del *.bin
gcc %ARGS% %SRCS%
objcopy -O binary a.exe a.bin
if not exist a.bin goto err
debug < boot.src
del %OBJS%
Optional features:
- Download Image:Boot floppy tail.img to produce a 1.44 MB floppy image
- Download Image:bochsrc.bxrc to use with bochs in software-emulation.
<<build.bat>>=
if not exist tail.img goto end
copy /b boot.bin + tail.img 144.img
del boot.bin
if not exist bochsrc.bxrc goto end
bochsrc.bxrc
goto end
:ERR
@echo off
echo -
echo -
echo gcc and objcopy failed or missing from PATH
pause
:END