When you’re writing your own toy operating system, the first thing you’ll need is a boot sector. It’s a piece of code (the boot loader) that lives in the first sector of a (floppy) disk. This code gets called by the BIOS as soon as the computer starts up, and is responsible for setting everything up for your operating system’s kernel to be loaded and executed.
This article is part of a series on toy operating system development.
Note that you can actually start developing other components of your toy operating system before writing boot code, since you can use GRUB (GNU Grand Unified Boot Loader) or LILO to start your kernel. Using one of these tools brings advantages, since they’ll switch the processor to protected mode for you, and allow you to load kernels that are placed beyond cylinder 1024 of a hard disk.
However, writing your own boot code can be a very interesting exercise in assembly programming, and you’ll have full control over what your boot loader actually does. Plus, you get to try and do it better than the people who wrote the DOS/Win95 boot loaders (which isn’t saying a lot as you’ll see below).
Boot loader requirements
The boot code lives in the first sector of a floppy disk, which typically has a size of 512 bytes. However, 61 of those bytes are occupied by data, placed on the disk when it is formatted: 59 bytes of disk parameters plus a 2-byte signature at the end of the sector. The parameter data includes the size of a disk sector, the number of FATs, the number of sectors per track, the volume ID, and more. This leaves 451 bytes available for code, which is not a whole lot. That’s one reason we’ll use assembly language to write our code.
The DOS/Windows bootloader and its limitations
Let’s consider the boot loader that most of us have used many times: the boot loader that comes with DOS or Windows (up to Windows 95). What does it do?
- Reset the floppy disk system
- Read the first sector of the root directory from the disk
- Verify that the first file found there is IO.SYS (the kernel)
- Load IO.SYS into memory
- Transfer control to IO.SYS
Since the space available for actual code in the boot sector is limited, the author of the DOS boot loader introduced an important requirement: the file IO.SYS must be the first file in the root directory. The DOS code does not scan the entire root directory looking for the required file. If IO.SYS is not the first file found, then the boot code fails.
This is why DOS/Windows comes with the SYS.COM program, which is used to make a disk bootable. This program actually cleans the root directory of a floppy disk and copies IO.SYS into it as the first entry, effectively removing all the other files. It would have been much nicer if it had been possible to copy IO.SYS to the root directory of a disk, at any position. Then any disk could be made bootable without sacrificing the files on it. This can actually be done, but it requires more assembly code, something the DOS developers apparently did not find any space for. We can do better.
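To make the "any position" idea concrete, here is a minimal sketch in Python (the real boot code would be assembly). It assumes the standard FAT directory layout: 32-byte entries, an 11-byte space-padded 8.3 name at offset 0, the first cluster at offset 26 and the file size at offset 28. The names `make_entry` and `find_file` are hypothetical helpers invented for this illustration.

```python
import struct

ENTRY_SIZE = 32  # each FAT directory entry is 32 bytes


def make_entry(name83: bytes, first_cluster: int, size: int) -> bytes:
    """Build a minimal directory entry (hypothetical helper for the demo)."""
    entry = bytearray(ENTRY_SIZE)
    entry[0:11] = name83                              # 8.3 name, space padded
    struct.pack_into("<H", entry, 26, first_cluster)  # first cluster
    struct.pack_into("<I", entry, 28, size)           # file size in bytes
    return bytes(entry)


def find_file(root_dir: bytes, name83: bytes):
    """Return (index, first_cluster, size) for name83, or None if absent."""
    for offset in range(0, len(root_dir), ENTRY_SIZE):
        entry = root_dir[offset:offset + ENTRY_SIZE]
        if entry[0] == 0x00:      # 0x00 marks the end of the directory
            break
        if entry[0] == 0xE5:      # 0xE5 marks a deleted entry
            continue
        if entry[0:11] == name83:
            first_cluster, size = struct.unpack_from("<HI", entry, 26)
            return offset // ENTRY_SIZE, first_cluster, size
    return None


# A root directory where the kernel is the third entry, not the first.
root = (make_entry(b"README  TXT", 2, 100)
        + make_entry(b"GAME    EXE", 3, 2048)
        + make_entry(b"IO      SYS", 7, 40000)
        + bytes(ENTRY_SIZE))      # zeroed entry terminates the listing

print(find_file(root, b"IO      SYS"))
```

The loop is only a handful of comparisons per entry, which suggests that the equivalent assembly can fit in a boot sector, although every byte spent on it is a byte not spent elsewhere.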
At any rate, modern operating systems will switch the processor to protected mode, which allows us to address up to 4 GB of memory in a flat model (not segmented), and switch on paging to protect processes from one another. This wasn’t part of the DOS/Windows 95 boot loader, but we’ll need to do it.
How a boot loader gets called
When the computer starts up, it executes a power-on self test (POST). It then performs the following actions:
- Determine which device (drive) to use for booting, using preferences stored in the CMOS.
- Try to load the first sector (and only the first sector) from the boot drive into memory at address 0:0x7C00.
- Verify that the first sector is in fact bootable by checking for the presence of a magic number (see below).
- Store the number of the drive used in register DL.
- Point the CPU’s instruction pointer to 0:0x7C00, and start execution from there.
The computer knows how to do these things, and does them automatically, because the code for this is in its BIOS ROM. In other words, we get these procedures for free with any computer.
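The hand-off can be modelled in a few lines of Python. This is only a sketch of the sequence, not real firmware: `bios_boot` and the flat `memory` array are made up for the illustration, and the code assumes the conventional details that the boot drive number is passed in register DL and that the magic number 0xAA55 is stored little-endian in the last two bytes of the sector.

```python
SECTOR_SIZE = 512
LOAD_ADDRESS = 0x7C00        # where the BIOS places the boot sector


def bios_boot(sector: bytes, drive_number: int, memory: bytearray):
    """Mimic the BIOS steps: load, verify the signature, hand over control.

    Returns the register state at the jump, or None if the sector
    is not bootable.
    """
    if len(sector) != SECTOR_SIZE:
        raise ValueError("a boot sector is exactly 512 bytes")
    # Copy the sector to 0:0x7C00.
    memory[LOAD_ADDRESS:LOAD_ADDRESS + SECTOR_SIZE] = sector
    # 0xAA55 little-endian: 0x55 at offset 510, 0xAA at offset 511.
    if sector[510:512] != b"\x55\xAA":
        return None
    # Drive number goes in DL; execution starts at 0:0x7C00.
    return {"cs": 0x0000, "ip": LOAD_ADDRESS, "dl": drive_number}


memory = bytearray(1024 * 1024)       # 1 MB of real-mode address space
blank = bytearray(SECTOR_SIZE)
unsigned_result = bios_boot(bytes(blank), 0x00, memory)

blank[510:512] = b"\x55\xAA"          # add the magic number
signed_result = bios_boot(bytes(blank), 0x00, memory)

print(unsigned_result)
print(signed_result)
```

A sector without the signature is rejected, while the same sector with the two signature bytes in place is handed control at 0x7C00.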
What a boot loader should do
Here’s a list of things that a modern boot loader should do in order to load and start your operating system’s kernel (we’ll cover concepts like the A20-line, IDT and GDT tables later):
- Reset the floppy disk system
- Write a “loading” message to the screen
- Find the kernel in the root directory of the disk (at any position)
- Read the kernel from disk into memory
- Enable the A20-line
- Set up the IDT and GDT tables
- Switch to protected mode
- Clear the processor prefetch queue
- Run the kernel
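One detail hides behind "read the kernel from disk": the classic BIOS disk services address a floppy by cylinder, head and sector, whereas file system structures count sectors linearly from zero. The sketch below shows the usual conversion in Python. The function name `lba_to_chs` is made up here, and the default geometry (18 sectors per track, 2 heads) is an assumption matching a 1.44 MB disk with 2880 sectors.

```python
def lba_to_chs(lba: int, sectors_per_track: int = 18, heads: int = 2):
    """Convert a linear sector number to (cylinder, head, sector)."""
    cylinder, rest = divmod(lba, sectors_per_track * heads)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector + 1   # sector numbers start at 1


for lba in (0, 19, 2879):
    print(lba, lba_to_chs(lba))
```

Linear sector 0 is the boot sector itself at cylinder 0, head 0, sector 1; the last sector of the assumed disk lands on cylinder 79, head 1, sector 18.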
Boot Sector Layout
The boot sector of a floppy disk has a very specific layout, because the BIOS requires access to certain data which it needs to find in the place it expects it to be. Also, an operating system will need to access this data to determine how large the disk is, what file system it uses, what its volume label is and so on. For this article, we’ll assume a floppy disk formatted with a FAT16 file system. The layout of the boot sector is then:
|Offset||Size (bytes)||Description||Value|
|0000||3||Code||Jump to rest of code|
|0011||2||Bytes per sector||512|
|0013||1||Number of sectors per cluster||1|
|0014||2||Number of reserved sectors||1|
|0016||1||Number of FAT tables||2|
|0017||2||Number of root directory entries (usually 224)||224|
|0019||2||Total number of sectors||2880|
|0022||2||Number of sectors per FAT||9|
|0024||2||Number of sectors/track||18|
|0026||2||Number of heads||2|
|0028||2||Number of hidden sectors||0|
|0030||2||Number of hidden sectors (high word); EBPB starts here||0|
|0032||4||Total number of sectors in filesystem|
|0036||1||Logical drive number||0|
Required elements of the boot sector are the BIOS parameter block (BPB) and the extended BIOS parameter block (EBPB, for FAT16). Together they must be placed at offset 3 and take up 59 bytes. Also, the boot sector must end with the magic number 0xaa55: (some) BIOSes will check whether this value is present at offset 510. If not, the BIOS will refuse to boot from the disk. All other bytes are available for us to fill in, which adds up to 512 - 59 - 2 = 451 bytes. However, the first three bytes are separated from the rest and should only be used to jump to the rest of the code, which leaves 448 bytes for interesting code.
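To show why these fields matter to a boot loader, here is a sketch in Python that reads the BPB fields at offsets 11 to 27 and derives where the FATs, the root directory and the data area start. The names `parse_bpb` and `disk_layout` are hypothetical. The media descriptor byte at offset 21 does not appear in the table above; it is included only because it sits between two listed fields, and the sample value 0xF0 is the usual one for a 1.44 MB floppy. The sample geometry of 18 sectors per track is likewise an assumption and does not affect the result.

```python
import struct

BPB_FORMAT = "<HBHBHHBHHH"   # little-endian fields from offset 11 to 27
BPB_OFFSET = 11


def parse_bpb(sector: bytes) -> dict:
    """Decode the BPB fields between offsets 11 and 27."""
    names = ("bytes_per_sector", "sectors_per_cluster", "reserved_sectors",
             "fat_count", "root_entries", "total_sectors", "media_descriptor",
             "sectors_per_fat", "sectors_per_track", "heads")
    values = struct.unpack_from(BPB_FORMAT, sector, BPB_OFFSET)
    return dict(zip(names, values))


def disk_layout(bpb: dict) -> dict:
    """Compute the first sector of each on-disk region."""
    root_start = (bpb["reserved_sectors"]
                  + bpb["fat_count"] * bpb["sectors_per_fat"])
    root_bytes = bpb["root_entries"] * 32          # 32 bytes per entry
    root_sectors = -(-root_bytes // bpb["bytes_per_sector"])  # ceiling
    return {"fat_start": bpb["reserved_sectors"],
            "root_start": root_start,
            "root_sectors": root_sectors,
            "data_start": root_start + root_sectors}


# Sample sector carrying the values from the table.
sector = bytearray(512)
struct.pack_into(BPB_FORMAT, sector, BPB_OFFSET,
                 512, 1, 1, 2, 224, 2880, 0xF0, 9, 18, 2)
layout = disk_layout(parse_bpb(bytes(sector)))
print(layout)
```

With the table's values, the root directory starts at sector 19 and occupies 14 sectors, so file data begins at sector 33. A boot loader needs the same arithmetic before it can find the kernel.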
Here is a typical hex dump of a boot sector without any code. Note the BPB and EBPB as described above, which occupy the 59 bytes from position 0x3 to position 0x3d, and the magic number at the end. All other bytes are available for code.
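A comparable dump can be produced with a short script. The sketch below, in Python, builds an otherwise empty sector holding a jump instruction, a few BPB values from the table, and the magic number, then prints it sixteen bytes per line. The helpers `blank_boot_sector` and `hexdump` are made up for this illustration, and the jump bytes EB 3C 90 (a short jump to offset 0x3E followed by a NOP) are an assumption modelled on common DOS boot sectors, not something this article prescribes.

```python
import struct

SECTOR_SIZE = 512
BPB_START, BPB_SIZE = 3, 59          # BPB plus EBPB
SIGNATURE_OFFSET = 510


def blank_boot_sector() -> bytes:
    """Build a boot sector with parameters and signature but no code."""
    sector = bytearray(SECTOR_SIZE)
    sector[0:3] = b"\xEB\x3C\x90"    # jmp short to offset 0x3E, then nop
    # Bytes per sector, sectors per cluster, reserved sectors,
    # FAT count, root directory entries, total sectors.
    struct.pack_into("<HBHBHH", sector, 11, 512, 1, 1, 2, 224, 2880)
    struct.pack_into("<H", sector, 22, 9)            # sectors per FAT
    struct.pack_into("<H", sector, SIGNATURE_OFFSET, 0xAA55)
    return bytes(sector)


def hexdump(data: bytes, width: int = 16) -> str:
    """Format data as offset-prefixed rows of hex bytes."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_bytes = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{offset:04x}  {hex_bytes}")
    return "\n".join(lines)


sector = blank_boot_sector()
code_bytes = SECTOR_SIZE - BPB_SIZE - 2   # includes the 3-byte jump
print(hexdump(sector))
print(code_bytes, "bytes available for code")
```

The last row of the dump ends in `55 aa`, because the 16-bit value 0xAA55 is stored low byte first.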
This article described how a computer bootstraps. After the Power-On Self Test (POST), the BIOS reads the first sector of a (floppy) disk into memory. This boot sector contains information about the disk and (at most) 451 bytes of code. In the next part of this guide, we’ll see how we can write assembly code to roll our own boot loader.