In the previous section of this tutorial on writing your own toy operating system, we discussed memory and focused on the 21st address line (the “A20 line”), which must be enabled before we can access the full 4 GB of memory, something we want in place before entering protected mode. Now it’s time to jump to protected mode.

In fact, all we’ve done in the last few articles is prepare for entering protected mode. We’ve set up a global descriptor table (GDT), an interrupt descriptor table (IDT) and enabled the A20 line. All that remains is actually jumping to protected mode where we’ll finally be able to execute 32-bit code so we can focus on our kernel.

This article is part of a series on toy operating system development.

View the series index

Control registers

What we’ve seen so far, while rolling our own first-stage and second-stage boot loaders, are the familiar processor registers: AX, BX, CX and DX, segment registers like CS, DS, ES and SS, the instruction pointer IP and the stack pointer SP. The 80386 and later processors introduce some new registers that will become important when we switch to 32-bit programming.

For one thing, existing registers get wider. Where we used to have access to AX (16 bits wide), we will soon have access to EAX (32 bits wide), as well as EBX, ECX and EDX. We’ll gain additional segment registers as well (FS and GS). Similarly, the instruction pointer becomes EIP (32 bits again) and so on. That’s great, and needs little explanation.

However, we gain other registers as well. The Intel 80386 processor comes armed with a set of control registers, and we’ll need one of them to switch to protected mode, so we might as well talk about it now. These control registers change or control the behavior of the CPU. This includes interrupt control, switching addressing mode, paging and coprocessor control. On the 80386 the usable control registers are CR0, CR2 and CR3; CR1 is reserved, and CR4 was only added on later processors.

The first control register, CR0, has various control flags that modify the basic operation of the processor.

Bit  Name  Full name               Description
31   PG    Paging                  If 1, enable paging and use the CR3 register, else disable paging
30   CD    Cache disable           Globally enables/disables the memory cache
29   NW    Not-write through       Globally enables/disables write-back caching
18   AM    Alignment mask          Alignment check enabled if AM set, AC flag (in EFLAGS register) set, and privilege level is 3
16   WP    Write protect           Determines whether the CPU can write to pages marked read-only
5    NE    Numeric error           Enable internal x87 floating point error reporting when set, else enables PC style x87 error detection
4    ET    Extension type          On the 386, it allowed specifying whether the external math coprocessor was an 80287 or 80387
3    TS    Task switched           Allows saving x87 task context upon a task switch only after x87 instruction used
2    EM    Emulation               If set, no x87 floating point unit present, if clear, x87 FPU present
1    MP    Monitor co-processor    Controls interaction of WAIT/FWAIT instructions with TS flag in CR0
0    PE    Protected mode enable   If 1, system is in protected mode, else system is in real mode

Some of these bits will become important for us later on, but for now, we’re interested in bit 0, the lowest bit of the register: the PE bit. It enables protected mode.

Switching to protected mode

To make the switch to protected mode, all we have to do is set the PE bit in the CR0 register while leaving the other bits as they are, like so:

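.macro mGoProtected
  mov    eax, cr0      # Read the current control flags
  or     eax, 1        # Set bit 0 (PE), keep all other bits
  mov    cr0, eax      # Write back: protected mode is now on
.endm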
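
The effect of this macro on the register value can be modelled on an ordinary host. The following C sketch is not boot loader code (CR0 itself can only be accessed with privileged machine instructions); it merely reproduces the bit arithmetic using the bit numbers from the table. The names CR0_PE, CR0_WP, CR0_PG, cr0_set_pe and cr0_is_protected are made up for this illustration.

```c
#include <stdint.h>

/* Bit masks for a few CR0 flags, derived from the bit numbers in the table. */
#define CR0_PE (UINT32_C(1) << 0)   /* protected mode enable */
#define CR0_WP (UINT32_C(1) << 16)  /* write protect */
#define CR0_PG (UINT32_C(1) << 31)  /* paging */

/* Models "or eax, 1": set PE and leave every other bit untouched. */
uint32_t cr0_set_pe(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* Nonzero if the given CR0 value has protected mode enabled. */
int cr0_is_protected(uint32_t cr0)
{
    return (cr0 & CR0_PE) != 0;
}
```

For example, cr0_set_pe(0x60000010) yields 0x60000011: only bit 0 changes, which is why the macro uses an or instruction rather than loading a fixed value into CR0.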

Clearing the prefetch queue

By setting the PE bit in the CR0 register, we have just switched to protected mode. Once the code segment register has been reloaded with a 32-bit code segment, which happens with the long jump at the end of this article, instructions are decoded in 32-bit format. As a result, some of them are encoded differently. Some instructions take up more bytes in their binary form, some fewer, and others remain unchanged. At any rate, we can’t continue executing any more code just yet, because of the prefetch queue.

You see, CPUs are built to be fast. One of the tricks of the trade that makes CPUs faster is to have the CPU fetch the next few instructions from memory ahead of time, rather than just the one it is about to execute. This is called prefetching. After all, the Intel 80386 processor can read 4 bytes (32 bits) from memory at once, and that might well be more than one instruction. For technical reasons, even more might be read and decoded before it is actually executed by the CPU.

The consequence of this is that the CPU may have read some instructions from memory while it was still in real mode, decoded them, and is now ready to execute them. Those stale instructions can no longer be relied upon, because the processor is now in protected mode!

Luckily, there is a trick to make the processor discard the instructions it has already prefetched, and that trick is jumping. Whenever the processor executes a jump instruction, any instructions it had read past that instruction become worthless and must be discarded. Consequently, jumping clears the prefetch queue:

.macro mClearPrefetchQueue
    jmp clear_prefetch_queue   # Taking the jump discards prefetched instructions
    nop
    nop
  clear_prefetch_queue:
.endm

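The nop instructions after the jump are never executed; they are only padding that the jump skips over, put there to make doubly sure that the prefetch queue is fully emptied. Because the macro defines a fixed label, it can only be used once per source file.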

Setting up the 80386’s registers

We’ve talked the talk, now the time has come to walk the walk. Next, we’ll set up the memory segments that our future kernel code will use. This is no longer done by loading memory addresses into the segment registers, but by loading selectors. We’ll set all our data segments (ds, es, fs and gs) as well as the stack segment (ss) to selector 0x10, which refers to entry 2 of the global descriptor table, the data segment that we defined in our GDT:

.macro mSetup386Segments
    mov    ax, 0x10      # Selector 0x10: byte offset of GDT entry 2
    mov    ds, ax        # (remember, each descriptor is 8 bytes)
    mov    es, ax
    mov    fs, ax
    mov    gs, ax
    mov    ss, ax
    mov    esp, 0x2ffff  # Stack grows downwards from just below 0x30000
.endm
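
The relation between a selector value such as 0x10 and a GDT entry number can be checked with a little arithmetic. The C sketch below runs on a normal host and only models the selector layout: the low two bits hold the requested privilege level, bit 2 chooses between the GDT (0) and the LDT (1), and the remaining bits are the descriptor index. The function names make_selector, selector_index and descriptor_offset are invented for this example.

```c
#include <stdint.h>

/* Build a selector from a descriptor index, the table indicator
   (0 = GDT, 1 = LDT) and the requested privilege level (0..3). */
uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((ti & 1u) << 2) | (rpl & 3u));
}

/* Index of the descriptor that a selector refers to. */
unsigned selector_index(uint16_t selector)
{
    return (unsigned)selector >> 3;
}

/* Byte offset of that descriptor from the start of its table
   (each descriptor is 8 bytes long). */
unsigned descriptor_offset(uint16_t selector)
{
    return selector_index(selector) * 8u;
}
```

With the table indicator and privilege level both zero, as in our boot loader, the selector is numerically equal to the byte offset of the descriptor: make_selector(2, 0, 0) gives 0x10 and make_selector(1, 0, 0) gives 0x08.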

Jumping to the kernel

Yes! Assuming that our second-stage boot loader has previously loaded our kernel image into memory at linear address 0x20000 (using the same FAT and file reading routines we had already developed for the first-stage boot loader), we can now jump to it and start executing it.

The jump to the kernel must be done with a 32-bit long jump instruction. Here we face a small snag. All the code in our second-stage boot loader is 16-bit code, because that’s the way it’s assembled. Therefore, a long jump written the usual way would be assembled as a 16-bit jump, whose offset is too small to hold 0x20000. To get around this, we’ll encode the long jump instruction ourselves, just like a 32-bit assembler would do it: the operand-size prefix 0x66, followed by the far jump opcode 0xEA, a 32-bit offset and a 16-bit selector.

Our long jump instruction will jump to offset 0x20000 through selector 0x08, which refers to the first descriptor after the null entry in the GDT (our code segment):

.macro mJumpToKernel
  .byte 0x66               # operand-size prefix: use a 32-bit offset
  .byte 0xEA               # opcode for a direct far jump
  .int  0x20000            # offset
  .word 0x0008             # selector word
.endm
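
To see exactly which bytes this macro places in the boot loader image, the encoding can be reproduced on a host machine. The C sketch below is an illustration only; the function name encode_far_jmp32 is invented here, and the code simply lays out the prefix, the opcode, the offset and the selector in little-endian order, as the x86 expects them.

```c
#include <stddef.h>
#include <stdint.h>

/* Write the 8-byte encoding of a far jump with a 32-bit offset, as it
   has to appear inside 16-bit code. Returns the number of bytes written. */
size_t encode_far_jmp32(uint8_t out[8], uint32_t offset, uint16_t selector)
{
    out[0] = 0x66;                              /* operand-size prefix */
    out[1] = 0xEA;                              /* direct far jump opcode */
    out[2] = (uint8_t)(offset & 0xFF);          /* offset, lowest byte first */
    out[3] = (uint8_t)((offset >> 8) & 0xFF);
    out[4] = (uint8_t)((offset >> 16) & 0xFF);
    out[5] = (uint8_t)((offset >> 24) & 0xFF);
    out[6] = (uint8_t)(selector & 0xFF);        /* selector, lowest byte first */
    out[7] = (uint8_t)(selector >> 8);
    return 8;
}
```

Calling encode_far_jmp32(buf, 0x20000, 0x0008) fills buf with the bytes 66 EA 00 00 02 00 08 00, which is the same sequence the .byte, .int and .word directives emit.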

This will transfer control to the kernel code, which we have yet to write. If you’re feeling adventurous, why not write a small 32-bit assembly program that places the value 0x41 at linear address 0xb8000? That will show the letter “A” at the top-left corner of the screen and can be executed in protected mode (you can’t use the BIOS interrupts to write to the screen anymore).

Actually, we’ll do that in the next part of this tutorial anyway!
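
In the meantime, the layout of the screen memory can be explored on an ordinary host. The following C sketch assumes the standard 80-column colour text mode, where every screen cell occupies two bytes: the character code followed by a colour attribute. It writes into a plain array standing in for the memory at 0xb8000, and the names VGA_COLUMNS, vga_cell and vga_put are invented for this illustration.

```c
#include <stdint.h>

#define VGA_COLUMNS 80u   /* assumption: standard 80x25 colour text mode */

/* Combine a character and a colour attribute into one 16-bit screen cell:
   character in the low byte, attribute in the high byte. */
uint16_t vga_cell(uint8_t ch, uint8_t attribute)
{
    return (uint16_t)(ch | ((uint16_t)attribute << 8));
}

/* Store one cell at (row, col). In a real kernel, buffer would point at
   linear address 0xb8000; here it can be any array that is large enough. */
void vga_put(volatile uint16_t *buffer, unsigned row, unsigned col,
             uint8_t ch, uint8_t attribute)
{
    buffer[row * VGA_COLUMNS + col] = vga_cell(ch, attribute);
}
```

Writing 0x41 with attribute 0x07 (light grey on black) to row 0, column 0 stores the 16-bit value 0x0741 in the first cell. Storing only the byte 0x41, as suggested above, changes the character and leaves whatever attribute was already there.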

Summary

Whew! It’s been quite a trip, but we have now reached protected mode and are ready to write a simple kernel. From this point on, all of the machine’s memory and protected mode features are at our disposal.

In this tutorial, we’ve wrapped up the final bits necessary to enter protected mode:

  • We’ve enabled the PE-bit in the CR0 register, thus switching to protected mode
  • We’ve cleared the prefetch queue so that no stale 16-bit instructions remain in the CPU
  • We’ve set up the registers for use by the 32-bit kernel program
  • We’ve executed a long jump to the kernel code

Continue on to the next part of this guide!