I've recently become interested in how the i386 boot loader works. There is an excellent example of a boot loader here, and another here. Some simple protected-mode code that implements a kernel capable of writing a line of text is here. FemtoOS is the result of combining code from all of those into a simple boot loader plus a protected-mode kernel that outputs text. In my case, I'm compiling the kernel on FreeBSD.
The bootloader is in boot.asm. Like any i386 boot loader, it's loaded by the BIOS at address 0x7C00, in 16-bit real mode.
It starts by clearing the screen, outputting a message, and then loading kernel.bin from the disk into memory at address 0x1000. The bootloader then loads a GDT, enters protected mode and passes control to kernel.bin at address 0x1000. Looking at the documentation for BIOS interrupt 13h here, it's clear that a single sector has 512 bytes. boot.bin is exactly 512 bytes long, so kernel.bin starts at the second sector on the disk. The floppy image was created using this code from the Makefile:
cat boot.bin kernel.bin /dev/zero | dd bs=512 count=2880 of=floppy.img
This code from boot.asm reads 18 sectors, starting at sector 2, into RAM at 0x1000. The largest kernel.bin can be is therefore 512 * 18 = 9216 bytes (9 KB).
mov ax, 0
mov es, ax
mov bx, 0x1000     ; Destination address = 0000:1000
mov ah, 02h        ; READ SECTOR-command
mov al, 12h        ; Number of sectors to read (0x12 = 18 sectors)
mov dl, [drive]    ; Load boot disk
mov ch, 0          ; Cylinder = 0
mov cl, 2          ; Starting Sector = 2
mov dh, 0          ; Head = 0
int 13h            ; Call interrupt 13h
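To make the sector arithmetic concrete, here is a minimal C sketch that maps the cylinder/head/sector values passed to int 13h onto byte offsets in floppy.img. The geometry (2 heads, 18 sectors per track) is an assumption based on the standard 1.44 MB floppy that the 2880-sector image suggests, and the function names are hypothetical; none of this is part of FemtoOS.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SECTOR_SIZE       = 512, /* bytes per sector, as stated above */
    HEADS             = 2,   /* assumed: standard 1.44 MB floppy */
    SECTORS_PER_TRACK = 18   /* assumed: standard 1.44 MB floppy */
};

/* Convert a cylinder/head/sector triple to a zero-based linear sector
 * index. CHS sector numbers start at 1, which is why boot.bin lives in
 * sector 1 and kernel.bin starts in sector 2. */
static uint32_t chs_to_lba(uint32_t cylinder, uint32_t head, uint32_t sector)
{
    assert(head < HEADS);
    assert(sector >= 1 && sector <= SECTORS_PER_TRACK);
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1);
}

/* Byte offset of a CHS address inside the flat floppy image. */
static uint32_t chs_to_image_offset(uint32_t cylinder, uint32_t head,
                                    uint32_t sector)
{
    return chs_to_lba(cylinder, head, sector) * SECTOR_SIZE;
}

/* Upper bound on the kernel size for a given sector count in AL. */
static uint32_t max_kernel_bytes(uint32_t sectors_read)
{
    return sectors_read * SECTOR_SIZE;
}
```

Under these assumptions, cylinder 0, head 0, sector 2 maps to byte offset 512, immediately after boot.bin, and max_kernel_bytes(18) gives 9216. With the same geometry, sectors 2 through 18 of head 0 hold only 17 sectors, so an 18-sector read would continue onto the next head; whether a particular BIOS handles that in a single call is worth checking.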
kernel.bin is linked from the object files created from loader.asm, main.c and video.c. When the bootloader passes control to kernel.bin, execution starts in loader.asm, which in turn passes control to main() from main.c. Note that while boot.asm contains both 16-bit and 32-bit code, loader.asm contains only 32-bit code; it is called after boot.asm has put the host in 32-bit protected mode.
This code loads the GDT
xor ax, ax          ; Clear AX register
mov ds, ax          ; Set DS-register to 0 - used by lgdt
lgdt [gdt_desc]     ; Load the GDT descriptor
and this code puts the machine into protected mode, then passes control to loader.asm. Note that immediately after protected mode is enabled, a far jmp is issued to the label "kernel_segments", which marks the first 32-bit code executed on boot. From this point on, we are in 32-bit protected mode.
mov eax, cr0               ; Copy the contents of CR0 into EAX
or eax, 1                  ; Set bit 0 (PE) to enable protected mode
mov cr0, eax               ; Copy the contents of EAX into CR0
jmp 08h:kernel_segments    ; Jump to code segment, offset kernel_segments
[BITS 32]                  ; We now need 32-bit instructions
kernel_segments:
mov ax, 10h                ; Load the data segment selector
mov ds, ax                 ; Move a valid data segment into the data segment register
mov ss, ax                 ; Move a valid data segment into the stack segment register
mov esp, 090000h           ; Move the stack pointer to 090000h
jmp 08h:0x1000             ; Jump to code segment 08h, offset 01000h
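The constants 08h and 10h in that listing are segment selectors, and bit 0 of CR0 is the protection enable (PE) flag. The C sketch below shows how those values are built up. The GDT entries themselves are not shown in this post, so the flat 4 GB code and data descriptors used in the usage note are an assumption about what boot.asm defines, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE 0x1u /* bit 0 of CR0: protection enable */

/* What "or eax, 1" does to the value read from CR0. */
static uint32_t cr0_enable_protection(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* A selector is (index << 3) | (table << 2) | rpl, where table is
 * 0 for the GDT and rpl is the requested privilege level (0..3). */
static uint16_t make_selector(uint16_t index, unsigned table, unsigned rpl)
{
    assert(index < 8192 && table <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (table << 2) | rpl);
}

/* Recover the descriptor table index from a selector. */
static uint16_t selector_index(uint16_t selector)
{
    return (uint16_t)(selector >> 3);
}

/* Pack one 8-byte segment descriptor from a 32-bit base, a 20-bit
 * limit, the access byte and the 4-bit flags nibble. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu && flags <= 0xFu);
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);            /* limit bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* base bits 0..23   */
    d |= (uint64_t)access << 40;                 /* access byte       */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* limit bits 16..19 */
    d |= (uint64_t)flags << 52;                  /* granularity, size */
    d |= (uint64_t)(base >> 24) << 56;           /* base bits 24..31  */
    return d;
}
```

Here make_selector(1, 0, 0) yields 08h and make_selector(2, 0, 0) yields 10h, so the far jumps use GDT entry 1 and the data registers use GDT entry 2. If those entries are flat segments, gdt_entry(0, 0xFFFFF, 0x9A, 0xC) and gdt_entry(0, 0xFFFFF, 0x92, 0xC) would be the usual code and data descriptors.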
The code in loader.asm that calls the C function main() is pretty simple:
start: call main ; Call our kernel's main() function
There is good documentation for the protected-mode text console here. A simple implementation of it is in video.c.
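As a rough illustration of what such a text console routine involves, here is a minimal C sketch. It assumes the standard VGA colour text mode (80x25 cells at physical address 0xB8000, each cell a character byte plus an attribute byte). The names are hypothetical and this is not the actual code in video.c. The buffer is passed in as a pointer so the same function can be exercised against an ordinary array on a development machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };

/* Assumed location of the text buffer in protected mode. */
#define VGA_TEXT_BASE 0xB8000u

/* One screen cell: attribute in the high byte, character in the low. */
static uint16_t vga_cell(char c, uint8_t attr)
{
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)c);
}

/* Write text starting at column 0 of the given row, truncating at the
 * right edge. Returns the number of characters written. */
static size_t vga_put_line(volatile uint16_t *buf, unsigned row,
                           const char *text, uint8_t attr)
{
    assert(row < VGA_ROWS);
    size_t col = 0;
    while (text[col] != '\0' && col < VGA_COLS) {
        buf[row * VGA_COLS + col] = vga_cell(text[col], attr);
        col++;
    }
    return col;
}
```

Inside the kernel, the call would look something like vga_put_line((volatile uint16_t *)VGA_TEXT_BASE, 0, "Hello", 0x07), where 0x07 is light grey on black.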
There is a floppy disk image here, which boots in both QEMU and VMware.