This article was posted by Independent Software, a website and database application development company based in Maputo, Mozambique.

If you’re trying to compile a kernel written in C for your own toy operating system, you may run into trouble compiling/linking your code. Assuming you’re using GRUB to load your kernel, or you’ve rolled your own boot sector, you’ll now want to compile your kernel code (written in C) to a flat binary. The toolchain provided by MinGW (gcc and ld) is well suited for this, as long as you know a few tricks.

Let’s start with a very simple kernel.c program just to see if we can get things working:
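A minimal sketch of such a program, matching the description later in the article (a main function whose body is an infinite loop built from a label and a goto), could look like this. The label name hang is an assumption.

```c
/* kernel.c: a do-nothing kernel stub.
 * Sketch only: the label name "hang" is an assumption. */
int main(void)
{
hang:
    goto hang;   /* spin forever; a kernel has nowhere to return to */
}
```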

We’ll compile this with gcc, switching on all warnings (the compiler is our friend):

This will yield a working program that we can actually execute at the command prompt. It’ll pause indefinitely, as desired. However, there are a number of problems with the resulting binary:

First, the binary includes a PE header, which specifies how Windows must load and execute the program. We’re writing a kernel, so we don’t want any of this header data. We must find a way to remove it.

Second, the program is relocatable. The operating system (i.e. Windows) will load the code into memory wherever it wants, then use the information contained in the PE header to make sure that all references are correct. The references are relative, that is, they can be relocated. For our kernel, this is not what we want: we want to load our kernel at a specific address (say 0x20000) and make all references work precisely (statically) there.

This can be illustrated by running objdump:

Objdump’s output shows that a PE header is present (pei-i386 file format) and that an arbitrary default start address of 0x00401160 has been defined. Let’s see what we can do about the start address. Since we want our kernel to always run at 0x20000, we can instruct the linker to use that address to place the code. Linker options can be passed to gcc:

Hint: do not compile with gcc without linking and then run ld separately to do the linking. Strange error messages will ensue. It’s easier to simply pass the linking options to gcc and let gcc call ld for you.

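Oh look: our start address is now 0x00020160. The excess 0x160 bytes are unwanted: they are most likely startup code that the toolchain places ahead of the entry point, rather than the PE header itself, since objdump reports a memory address and not a file offset. We can try to pass the option --oformat binary to the linker, which will make it link a flat binary for us. Unfortunately (under MinGW), we get this: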

This can be resolved though: let the linker create the kernel.exe executable, then pass it through objcopy to create the flat binary:

This will yield, finally, a flat binary. Unfortunately, it’s 3376 bytes in size! About 10 bytes would be closer to the mark. Obviously, code is being included that we didn’t write: references to standard libraries. Since we don’t have any standard libraries in our fledgling operating system, we’ll need to remove this. This can be done by passing the -nostdlib argument to gcc:

Foiled again! Now that we have no standard libraries, ld is looking for startup code that doesn’t exist. We did write a main function, but it’s actually looking for a wrapper to that main function normally supplied by the standard libraries. Let’s try a different approach: we’ll rename our main function.
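With the rename applied, kernel.c might look like the sketch below. The name start is taken from the description of the disassembly further down; the void signature, the label name, the output name kernel.bin and the build commands in the comment are assumptions pieced together from the options this article names, not a record of the exact invocation.

```c
/* kernel.c with the entry function renamed from main to start.
 *
 * Assumed build steps under MinGW (illustrative only):
 *   gcc -Wall -nostdlib -Wl,-Ttext,0x20000 kernel.c -o kernel.exe
 *   objcopy -O binary kernel.exe kernel.bin
 */
void start(void)
{
hang:
    goto hang;   /* infinite loop, same as before */
}
```

Because nothing is called main any more, the standard startup wrapper has nothing to hook into.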

Now our code compiles, and we’re down to a flat binary of 2011 bytes. It turns out that we must also pass -nostdlib to the linker:

Now we get a flat binary of 24 bytes. In fact, on my system I get:

When disassembled, this yields:

This corresponds exactly to the code we wrote: a stack frame is created for the start function (even though we are not interested in it – a C program must always start with a function), then an infinite loop is entered (which we wrote using a label and a goto statement).
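As a rough cross-check of those 5 bytes, here is how the three instructions are normally encoded on 32-bit x86. This is a sketch from the standard encodings, assuming the compiler emitted exactly this sequence; the array and function names are made up for illustration.

```c
/* Standard IA-32 encodings for the instructions described above.
 * Assumption: the compiler emitted exactly this sequence. */
static const unsigned char kernel_code[] = {
    0x55,        /* push ebp                 (1 byte)  */
    0x89, 0xE5,  /* mov  ebp, esp            (2 bytes) */
    0xEB, 0xFE   /* jmp  short, rel8 = 0xFE  (2 bytes) */
};

/* Target offset of a 2-byte short jump located at jmp_offset.
 * The displacement is signed and relative to the next instruction. */
int short_jump_target(int jmp_offset, unsigned char rel8)
{
    int displacement = (rel8 >= 0x80) ? (int)rel8 - 0x100 : (int)rel8;
    return jmp_offset + 2 + displacement;
}
```

Calling short_jump_target(3, kernel_code[4]) gives 3: the jump at offset 3 lands on itself, which is the infinite loop.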

Wait… this code only occupies 5 bytes. So why are there 24 bytes in the flat binary image? We can see that the first three unneeded bytes have a value of 0x90, which corresponds to NOP instructions. This is probably added to get at least an 8-byte boundary. However, why an additional 16 bytes are added, I actually don’t know. If anyone can explain, I’d be grateful.
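One explanation, given by Paolo in the comments below, is that the PE linker script appends constructor and destructor list markers to the .text section: four LONG values of 4 bytes each. Under that assumption, and assuming the code is first padded to an 8-byte boundary, the arithmetic can be sketched as follows. The function and parameter names are illustrative only.

```c
#include <stddef.h>

/* Round size up to the next multiple of alignment (alignment must be > 0). */
size_t round_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) / alignment * alignment;
}

/* Expected flat binary size: padded code plus the 4-byte LONG markers. */
size_t flat_binary_size(size_t code_bytes, size_t alignment, size_t long_markers)
{
    return round_up(code_bytes, alignment) + 4 * long_markers;
}
```

With 5 bytes of code, an alignment of 8 and 4 markers, flat_binary_size(5, 8, 4) comes to 8 + 16 = 24 bytes, which matches the size observed.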

Nevertheless, we have now produced a flat binary that can be launched by our boot sector or second stage boot loader. It can be placed at 0x20000 and includes no undesired headers. Just the code, please, ma’am.
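A quick way to confirm that the header really is gone is to look at the first two bytes of the image: PE executables begin with the DOS signature MZ, while a flat binary begins directly with code. A small sketch of that check follows; the function name is an assumption, and reading the file into a buffer is left to the caller.

```c
#include <stddef.h>

/* Returns 1 if the buffer begins with the "MZ" signature found at the
 * start of every PE executable, 0 otherwise. */
int starts_with_mz(const unsigned char *image, size_t length)
{
    return length >= 2 && image[0] == 'M' && image[1] == 'Z';
}
```

Run against the contents of kernel.exe this should return 1; run against the flat binary it should return 0.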

Comments

5 Responses to “Linking a flat binary from C with MinGW”
  1. Verdell says:

    I just added this site to my google reader, great stuff. Can not get enough!

  2. Irvan says:

    I get 528 bytes after executing objcopy. But it can be fixed by adding these options to the linker
    --section-alignment 1 --file-alignment 1 🙂

  3. Paolo says:

    PE stores global constructors and destructors at the end of the .text section, using the following linker script snippet:

    ___CTOR_LIST__ = .; __CTOR_LIST__ = . ;
    LONG (-1);*(.ctors); *(.ctor); *(SORT(.ctors.*)); LONG (0);
    ___DTOR_LIST__ = .; __DTOR_LIST__ = . ;
    LONG (-1); *(.dtors); *(.dtor); *(SORT(.dtors.*)); LONG (0);

    Each of the LONG directives adds 4 bytes. You have to pass a linker script like this with the -T option:

    SECTIONS { . = 0x20000; .text : { *(.text) *(.text.$) } }
    ENTRY(start)

    to remove those 16 bytes.
