If you’re writing a kernel in C for your own toy operating system, you may run into trouble compiling and linking it. Assuming you’re using GRUB to load your kernel, or you’ve rolled your own boot sector, you’ll want to compile your kernel code to a flat binary. The toolchain provided by MinGW (gcc and ld) is well suited for this, as long as you know a few tricks.

This article is part of a series on toy operating system development.

Let’s start with a very simple kernel.c program just to see if we can get things working:

int main(void)
{
mylabel:
  goto mylabel;
}

We’ll compile this with gcc, switching on all warnings (the compiler is our friend):

$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe

This will yield a working program that we can actually execute at the command prompt. It’ll pause indefinitely, as desired. However, there are a number of problems with the resulting binary:

First, the binary includes a PE header, which specifies how Windows must load and execute the program. We’re writing a kernel, so we don’t want any of this header data. We must find a way to remove it.

Second, the program is relocatable. The operating system (i.e. Windows) will load the code into memory wherever it wants, then use the information contained in the PE header to make sure that all references are correct. The references are stored relative to the load address, that is, they can be relocated. For our kernel, this is not what we want: we want to load our kernel at a specific address (say 0x20000) and have all references resolved statically for exactly that address.
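To make the idea of relocation a little more concrete, here is a minimal sketch of what a loader conceptually does when it moves an image. This is a simplified model, not the real PE relocation format: the function names and the flat list of fixup offsets are assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value to a byte buffer in little-endian order. */
void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Hypothetical, simplified relocation pass: every entry in fixups[] is the
 * offset of a 32-bit absolute address stored inside image[]. Each of those
 * addresses is shifted by the distance between the address the image was
 * linked for and the address it was actually loaded at.
 */
void apply_fixups(uint8_t *image, const size_t *fixups, size_t count,
                  uint32_t preferred_base, uint32_t actual_base)
{
    uint32_t delta = actual_base - preferred_base; /* wraps modulo 2^32 */
    size_t i;

    for (i = 0; i < count; i++) {
        uint8_t *slot = image + fixups[i];
        write_le32(slot, read_le32(slot) + delta);
    }
}
```

A kernel loaded by a boot sector has nobody to run such a pass for it, which is why every address must already be correct for the load address when the linker is done.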

We can inspect the binary by running objdump:

$ objdump -f kernel.exe
kernel.exe: file format pei-i386
architecture: i386, flags 0x00000132:
EXEC_P, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00401160

Objdump’s output shows that a PE header is present (pei-i386 file format) and that the linker has chosen a default start address of 0x00401160. Let’s see what we can do about the start address. Since we want our kernel to always run at 0x20000, we can instruct the linker to use that address to place the code. Linker options can be passed to gcc:

Hint: do not use gcc to compile but not link, then ld to do the linking separately. Strange error messages will ensue. It’s easier to simply pass the linking options to gcc and let gcc call ld for you.

$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -Wl,-Ttext=0x20000
$ objdump -f kernel.exe
kernel.exe: file format pei-i386
architecture: i386, flags 0x00000132:
EXEC_P, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00020160

Oh look: our start address is now 0x00020160. The extra 0x160 bytes are most likely not the PE header itself but runtime startup code that the linker placed in .text ahead of the entry point; either way, we don’t want them. We can try to pass the option --oformat binary to the linker, which should make it link a flat binary for us. Unfortunately (under MinGW), we get this:

c:/mingw/bin/../lib/gcc/mingw32/4.5.2/../../../../mingw32/bin/ld.exe:
cannot perform PE operations on non PE output file 'kernel.exe'.
collect2: ld returned 1 exit status

This can be resolved though: let the linker create the kernel.exe executable, then pass it through objcopy to create the flat binary:

$ objcopy -O binary -j .text kernel.exe kernel.bin

This finally yields a flat binary. Unfortunately, it’s 3376 bytes in size! About 10 bytes would be closer to the mark. Obviously, code is being included that we didn’t write: startup code and references to the standard libraries. Since we don’t have any standard libraries in our fledgling operating system, we’ll need to remove this. This can be done by passing the -nostdlib argument to gcc:

$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -nostdlib -Wl,-Ttext=0x20000
C:\Users\AppData\Local\Temp\cc5nshHf.o:kernel.c:(.text+0x7):
  undefined reference to `__main'
collect2: ld returned 1 exit status

Foiled again! Now that we have no standard libraries, ld can’t find __main, which we never wrote. On this toolchain, gcc inserts a call to __main into any function named main; it is an initialization routine normally supplied by the standard libraries. Let’s try a different approach: we’ll rename our main function.

int start(void)
{
mylabel:
  goto mylabel;
}

Now our code compiles, and we’re down to a flat binary of 2011 bytes. It turns out that we must also pass -nostdlib to the linker:

$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -nostdlib -Wl,-Ttext=0x20000,-nostdlib

After running objcopy again, we get a flat binary of 24 bytes. In fact, on my system I get:

00000000h: 55 89 e5 eb fe 90 90 90 ff ff ff ff 00 00 00 00
00000010h: ff ff ff ff 00 00 00 00

When disassembled, this yields:

push ebp
mov ebp, esp
jmp .-2

This corresponds exactly to the code we wrote: a stack frame is created for the start function (even though we are not interested in it – a C program must always start with a function), then an infinite loop is entered (which we wrote using a label and a goto statement).
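To see why the two bytes eb fe form an infinite loop, it helps to work out where the jump lands. The sketch below computes the target of an x86 short jump; the function name is made up for illustration, and the address 0x20003 assumes the image is loaded at 0x20000, so that the jump follows the 1-byte push and the 2-byte mov.

```c
#include <stdint.h>

/*
 * Target of an x86 short jump (opcode 0xEB): the signed 8-bit displacement
 * is added to the address of the instruction that follows the 2-byte jump.
 */
uint32_t short_jmp_target(uint32_t jmp_address, uint8_t displacement)
{
    /* Sign-extend the displacement by hand: 0xFE becomes -2. */
    int32_t d = displacement >= 0x80 ? (int32_t)displacement - 256
                                     : (int32_t)displacement;

    return jmp_address + 2u + (uint32_t)d; /* unsigned wrap-around is intended */
}
```

Calling short_jmp_target(0x20003, 0xFE) gives 0x20003 again: the displacement of -2 exactly cancels the length of the jump instruction, so the jump targets itself.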

Wait… this code only occupies 5 bytes. So why are there 24 bytes in the flat binary image? The first three extra bytes have a value of 0x90, which corresponds to NOP instructions; they are probably padding to reach an 8-byte boundary. The remaining 16 bytes form two pairs of 32-bit words, each 0xffffffff followed by 0x00000000. That pattern matches the empty constructor and destructor tables (__CTOR_LIST__ and __DTOR_LIST__) that the default MinGW linker script places in the .text section, which is most likely where they come from.
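To make the layout of those 24 bytes easier to see, here is a small host-side sketch that splits the image into its parts. The byte values are copied from the hex dump above; the helper names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* The 24 bytes of the flat binary, copied from the hex dump. */
const uint8_t kernel_image[24] = {
    0x55, 0x89, 0xe5, 0xeb, 0xfe, 0x90, 0x90, 0x90,
    0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
    0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
};

/* Count consecutive NOP (0x90) bytes starting at offset `start`. */
size_t count_nops(const uint8_t *image, size_t size, size_t start)
{
    size_t n = 0;

    while (start + n < size && image[start + n] == 0x90)
        n++;
    return n;
}

/* Interpret four bytes as a signed 32-bit little-endian integer. */
int32_t le32_as_signed(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

    /* Convert without relying on implementation-defined casts. */
    if (v > 0x7fffffffu)
        return -(int32_t)(~v) - 1;
    return (int32_t)v;
}
```

With these helpers, offsets 0 to 4 hold the code, count_nops(kernel_image, 24, 5) finds the three padding bytes, and the four words at offsets 8, 12, 16 and 20 read as -1, 0, -1 and 0.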

Nevertheless, we have now produced a flat binary that can be launched by our boot sector or second stage boot loader. It can be placed at 0x20000 and includes no undesired headers. Just the code, please, ma’am.
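If you want to double-check that no header slipped through, a quick host-side test is to look for the "MZ" signature with which every PE executable begins. The following is a minimal sketch, not part of the build described above; the function names are assumptions chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if the buffer begins with the "MZ" signature of a DOS/PE file. */
int has_mz_signature(const uint8_t *data, size_t size)
{
    return size >= 2 && data[0] == 'M' && data[1] == 'Z';
}

/*
 * Checks the first two bytes of a file.
 * Returns 1 if they are "MZ", 0 if not, and -1 if the file cannot be opened.
 */
int file_has_mz_signature(const char *path)
{
    uint8_t head[2];
    size_t got;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    got = fread(head, 1, sizeof head, f);
    fclose(f);
    return has_mz_signature(head, got);
}
```

Run against the files from this article, file_has_mz_signature("kernel.exe") should report a header, while kernel.bin starts with 0x55 (push ebp) and should not.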