How to write your first kernel

Prerequisites

This post is aimed at programmers who are interested in, or at least curious about, kernel development.

It does require some knowledge of computer architecture, C and assembly (for any architecture). You’ll also need to understand the pipeline a C program goes through to become an executable, particularly the compilation and linking phases. If this is not the case, here’s an article that can help you learn about it.

You will also need to be knowledgeable about the ELF format, or at least about sections.

No prior kernel development experience is required. This guide is here to unravel the intricacies of x86 kernel development, breaking down the process into manageable pieces for beginners.

Introduction

The goal of this article is to go through the first steps every kernel programmer went through, and make our computer write a simple message to the screen.

We will walk together through the first few steps of this adventure, setting up the environment, booting for the first time and interacting with your first peripheral: the computer’s screen.

But first, what does a kernel do exactly?

According to osdev:

Its responsibilities include managing memory and devices and also providing an interface for software applications to use those resources

Kernel/Hardware relationship

Kernels exist for all kinds of architectures, ranging from the most common ones such as x86-64 or ARM, to lesser-known ones like MIPS. Some kernels even support multiple architectures, the most famous example being the Linux kernel. To keep things easy for us, we will be using a standard and “easy” architecture, commonly used by beginners: Intel’s 32-bit x86.

Cross Compilation

If you know about compilation and the different computer architectures, you know that an executable compiled on a given computer won’t necessarily work on someone else’s machine (without even taking into account other aspects of the compilation process such as libraries). This is because the instruction sets used by the two computers can be different. But what if I want to compile my program, say on my brand new Intel i5 machine, for a friend who uses the latest M3 MacBook?

One solution: Cross-Compilation

We can split the previous situation into 2 distinct parts: the computer we compile the program on (HOST), and the one on which the program will be executed (TARGET). To solve the problem, we simply need a compiler that runs on the HOST machine and produces a binary executable for the TARGET machine. That’s exactly what cross-compiling is.
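
To make the HOST/TARGET distinction concrete, here is a minimal sketch. The function name target_arch is hypothetical and not part of this article’s kernel. It relies on the architecture macros that GCC and Clang predefine for the TARGET they generate code for, whatever the HOST running the compiler is:

```c
/*
 * Returns a human-readable name for the architecture this file was
 * compiled FOR (the TARGET), not the one the compiler runs on (the HOST).
 * The macros below are predefined by GCC and Clang.
 */
const char *target_arch(void)
{
#if defined(__x86_64__)
    return "x86-64";
#elif defined(__i386__)
    return "x86 (32-bit)";
#elif defined(__aarch64__)
    return "AArch64";
#elif defined(__arm__)
    return "ARM (32-bit)";
#else
    return "unknown";
#endif
}
```

Built with the native compiler of an x86-64 machine, the first branch is selected. Built from the same machine with a cross-compiler targeting 32-bit x86, the preprocessor would select the __i386__ branch instead, even though the HOST has not changed.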

Cross-Compilation overview

In this article, we’ll be using GCC. A given GCC build only generates code for the target it was configured for, so the compiler shipped with our system cannot be used as is. We will need to build our own version of it, one that runs on our machine (x86-64 in my case) and produces code for the 32-bit x86 architecture.

Compiling your cross-compiler

The first step for building our own GCC cross-compiler is building the toolchain used by our custom GCC, namely binutils. Similarly to the cross-compiler, the toolchain will be runnable on our system, but handle code for another architecture.

# Setup
export PREFIX="$HOME/opt/cross"
export TARGET="i686-elf"
export PATH="$PREFIX/bin:$PATH" # Add the result of our compilation to the path

# Get our own copy of the source code (binutils is hosted in the binutils-gdb repository)
git clone git://sourceware.org/git/binutils-gdb.git binutils

# Build outside of the source tree to keep it clean
mkdir build-binutils && cd build-binutils
../binutils/configure --target=$TARGET --prefix="$PREFIX" --with-sysroot --disable-nls --disable-werror
make
make install
cd ..

Now that we have the correct toolchain available in our PATH, we can finally compile our own cross-compiler.

git clone git://gcc.gnu.org/git/gcc.git gcc

# GCC is best built outside of its source tree
mkdir build-gcc && cd build-gcc
../gcc/configure --target=$TARGET --prefix="$PREFIX" --disable-nls --enable-languages=c,c++ --without-headers
make all-gcc
make all-target-libgcc # GCC expects this library to be available when it compiles our code
make install-gcc
make install-target-libgcc

And here we go! We now have our own cross-compiler for our desired architecture!

Booting into our kernel

As with any regular kernel, be it Windows NT or Linux, the first step after powering on our PC is to find the kernel’s code and execute it. This is the responsibility of the bootloader.

For our kernel to be bootable we need to make it so that GRUB (our bootloader) can detect it and run it correctly.

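Thankfully for us, there exists a common standard for bootable programs: Multiboot.
To put it simply, if our bootable piece of software contains a Multiboot header (which we will place in a .multiboot section) describing what it expects from the bootloader, any Multiboot-compliant bootloader will be able to boot it. This includes GRUB. The specification requires this header to be 4-byte aligned and to sit within the first 8192 bytes of the image, which is why we will later make sure it comes first in our binary.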

So … let’s define these headers inside a first assembly file:

#include "multiboot.h"

#define MULTIBOOT_HEADER_FLAGS (MULTIBOOT_PAGE_ALIGN | MULTIBOOT_MEMORY_INFO)

.section .multiboot
.align MULTIBOOT_HEADER_ALIGN
.long MULTIBOOT_HEADER_MAGIC
.long MULTIBOOT_HEADER_FLAGS
.long -(MULTIBOOT_HEADER_FLAGS + MULTIBOOT_HEADER_MAGIC)

The included multiboot.h header is available here

If you’re curious about what these values mean, you can check out the official reference.
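
The third field of the header is a checksum: the bootloader adds the magic number, the flags and the checksum together as 32-bit values and expects the result to be zero. As a small sketch of that rule, the C functions below recompute and verify it on the host. The constant values are those of the Multiboot (version 1) specification; the function names are hypothetical and only used for illustration.

```c
#include <stdint.h>

/* Values defined by the Multiboot (version 1) specification. */
#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u
#define MULTIBOOT_PAGE_ALIGN   0x00000001u /* Align loaded modules on 4KiB */
#define MULTIBOOT_MEMORY_INFO  0x00000002u /* Ask for memory information */

/* Same computation as the `.long -(FLAGS + MAGIC)` line of the header. */
uint32_t multiboot_checksum(uint32_t flags)
{
    return (uint32_t)(0u - (MULTIBOOT_HEADER_MAGIC + flags));
}

/* The check a bootloader performs: the three fields must sum to 0 (mod 2^32). */
int multiboot_header_is_valid(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return magic == MULTIBOOT_HEADER_MAGIC
        && (uint32_t)(magic + flags + checksum) == 0u;
}
```

With the flags used in this article (MULTIBOOT_PAGE_ALIGN | MULTIBOOT_MEMORY_INFO, that is 3), multiboot_checksum returns 0xE4524FFB.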

Once a bootable program has been compiled with these headers, we can create an ISO out of it.

KERNEL_BIN="kernel.bin" # Our bootable executable
KERNEL_ISO="kernel.iso"

echo "Generating $KERNEL_ISO from $KERNEL_BIN ..."

# Create a minimal boot partition for grub to be able to generate an iso file
mkdir -p ./iso/boot/grub
cp "$KERNEL_BIN" ./iso/boot

# Add a custom multiboot entry for grub to be able to boot our kernel
cat <<EOF > ./iso/boot/grub/grub.cfg
menuentry "Kernel - kernel" {
    multiboot /boot/$KERNEL_BIN
}
EOF

# Create the ISO file
grub-mkrescue -o "$KERNEL_ISO" ./iso

We can now start a virtual machine that boots from this freshly generated ISO file. For that, we will use QEMU, an emulator available on Linux that supports many different architectures and peripherals, including ours.

qemu-system-i386 -cdrom ./kernel.iso

QEMU preview

A first entry point

Now that we know how to boot into an already existing program, what’s left to do is … well, actually writing our kernel.

When we compile an executable on Linux, the toolchain takes care of laying out the necessary sections during the linking phase, using a default linker script. Unfortunately, we cannot rely on this when building our own kernel. We now have to define these different sections, as well as the entry point for our kernel’s code, ourselves.
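
As a reminder of what these sections hold, here is a small sketch showing where GCC typically places each kind of C object. The variable and function names are hypothetical and not part of the kernel written in this article.

```c
#include <stdint.h>

const char banner[] = "kernel"; /* Read-only data             -> .rodata */
uint32_t boot_count = 1;        /* Initialized data (RW)      -> .data   */
uint32_t tick_count;            /* Zero-initialized data (RW) -> .bss    */

uint32_t next_tick(void)        /* Machine code               -> .text   */
{
    return ++tick_count;
}
```

Compiling this file with -c and running objdump -h on the resulting object file lists these sections along with their sizes. Note that GCC versions older than 10 default to -fcommon and emit uninitialized globals such as tick_count as COMMON symbols rather than in .bss, so a linker script usually collects both.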

When compiling a regular program, the compiler links in a particular file, crt0.o, which contains the _start function. This function is the real entry point of the program: it sets up the execution environment, then calls our standard main function. Sadly, this file is not linked when using our cross-compiler, so we have to write it and link it ourselves.

To summarize, we need to write 2 files:

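crt0.S, which sets up the environment and calls our main function (the capital S matters: it tells GCC to run the C preprocessor on the file, which our #define needs)
a linker script (linker.ld), to define the different sections within our kernel’s code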

crt0.S:

/*
 * The stack, required by C functions and not defined by default.
 * Should be placed inside the stack pointer register (esp) inside
 * the start function.
 *
 * The x86 ABI requires the stack to be 16-byte aligned.
 */

#define STACK_SIZE 16384

.section .bss
.align 16
stack_bottom:
    .skip STACK_SIZE
stack_top:

/*
 * The entry point of our kernel (as defined in our linker script).
 *
 * This is the function called by the bootloader after detecting the
 * previously defined multiboot header (c.f. linker.ld). 
 */

.section .text
.global _kernel_start
.type _kernel_start, @function

_kernel_start:

    // Initialize stack pointer
    mov $stack_top, %esp

    // Call our main function (yet to be implemented)
    call kernel_main 

    // If we exit from the kernel, loop infinitely.

    cli
_kernel_exit_loop:
    hlt
    jmp _kernel_exit_loop

// Set the size of the _kernel_start symbol for debugging purposes
.size _kernel_start, . - _kernel_start

linker.ld:

/*
 * The entry point of our kernel.
 * The bootloader will jump to this symbol to boot our kernel.
 */
ENTRY(_kernel_start)

SECTIONS
{
    
    /* We place our kernel at 1MiB,
     * a conventional load address for x86 kernels.
     * (linker scripts only accept C-style comments) */
    . = 1M;

    .text BLOCK(4K) : ALIGN(4K)
    {
        *(.multiboot)
        *(.text)
    }

    /* Read-only data */
    .rodata BLOCK(4K) : ALIGN(4K)
    {
        *(.rodata)
    }

    /* Initialized data (RW) */
    .data BLOCK(4K) : ALIGN(4K)
    {
        *(.data)
    }

    /* Uninitialized data and stack (RW) */
    .bss BLOCK(4K) : ALIGN(4K)
    {
        *(COMMON)
        *(.bss)
    }
}

These few lines of code can seem scary, but there is nothing to be afraid of. Let’s analyze them together!

For the code generated by our C compiler to execute correctly, we MUST respect a certain set of rules, known as the ABI (Application Binary Interface). The _kernel_start function only makes sure we respect those conventions before calling the main function. This includes:

The stack must be aligned on 16 bytes when calling a function
The stack grows downwards (&stack_top > &stack_bottom), which is why esp is initialized to stack_top
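
These two rules can be checked on the host with a short sketch. The array below is a hypothetical C equivalent of the memory reserved between the stack_bottom and stack_top labels in crt0.S, and push32 only simulates what a 32-bit push does to the stack pointer. The aligned attribute is a GCC/Clang extension.

```c
#include <stdint.h>

#define STACK_SIZE 16384

/* Hypothetical C counterpart of the .bss stack reserved in crt0.S. */
static uint8_t stack[STACK_SIZE] __attribute__((aligned(16)));

/* Lowest address of the reserved area. */
uintptr_t stack_bottom(void)
{
    return (uintptr_t)&stack[0];
}

/* One past the highest address: the initial value of the stack pointer. */
uintptr_t stack_top(void)
{
    return (uintptr_t)&stack[0] + STACK_SIZE;
}

/* Simulates a 32-bit push: the stack pointer moves DOWN by 4 bytes. */
uintptr_t push32(uintptr_t sp)
{
    return sp - 4;
}
```

Since STACK_SIZE is a multiple of 16 and the area itself is 16-byte aligned, stack_top is 16-byte aligned as well, and every push moves the pointer towards stack_bottom.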

Let’s define an empty kernel_main function inside the kernel.c file, and … that’s all! We can just compile and link these 3 files together and voilà, we have our first kernel up and running!

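# Our cross-compiler is named after its target: i686-elf-gcc
i686-elf-gcc -ffreestanding -c crt0.S -o crt0.o
# -lgcc links libgcc (built earlier), -o names the output expected by our ISO script
i686-elf-gcc -ffreestanding crt0.o kernel.c -T ./linker.ld -nostdlib -lgcc -o kernel.bin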

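We use the -ffreestanding flag to tell GCC that we are compiling for an environment without a standard library, and the -nostdlib flag to prevent it from linking the standard library and the default startup files (c.f. bibliography)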
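
One visible effect of -ffreestanding is the value of the standard __STDC_HOSTED__ macro, which GCC defines to 0 in freestanding mode and to 1 otherwise. The sketch below (the function name is hypothetical) uses it to report how a file was compiled, and only includes headers that a freestanding C99 implementation is required to provide.

```c
/* Available even without a C library: these headers ship with the compiler. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when this file was compiled in freestanding mode. */
bool is_freestanding_build(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 0
    return true;
#else
    return false;
#endif
}
```

This is why our kernel can include stdint.h and stddef.h, but not stdio.h or string.h: the former only describe types and constants, while the latter declare functions that a C library would have to implement.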

Writing to the screen

If you built an ISO file using this kernel and booted it using QEMU, you would quickly have seen … nothing. But fear not, that’s normal! We never actually did anything with our kernel, apart from booting it.

You are now going to interact with your very first hardware peripheral (apart from the CPU itself). And what’s more interactive and fulfilling to start with than writing onto the computer’s display?

What are we working with?

To write to the screen, we have to talk to the VGA hardware. Fortunately for us, on older machines like ours, VGA offers a text mode whose buffer is mapped in memory (starting at address 0xB8000), which lets us write characters directly to the screen. This buffer contains 25 rows of 80 cells, each holding a character and its colors encoded on 16 bits.

The character format is as follows:

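#include <stdint.h>

enum vga_color {
    VGA_COLOR_BLACK = 0,
    // Some colors we will not use ...
    VGA_COLOR_WHITE = 15,
};

// One cell of the text buffer: 16 bits in total
struct vga_character {
    uint8_t character;     // Character code (low byte)
    enum vga_color fg : 4; // Foreground color (low nibble of the high byte)
    enum vga_color bg : 4; // Background color (high nibble of the high byte)
} __attribute__((__packed__));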

So, to write to the screen, we have to write characters to this buffer. Simple enough right?
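
To double-check this layout, the same cell can be built by hand with shifts and masks. The sketch below is standalone and its function names (vga_entry, vga_index) are hypothetical; it shows how a character and its two colors fit in 16 bits, and how a (row, column) position maps to an index in the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_WIDTH  80
#define VGA_HEIGHT 25

/* Packs a character and its colors into one 16-bit cell. */
uint16_t vga_entry(unsigned char c, uint8_t fg, uint8_t bg)
{
    /* High byte: background in the upper nibble, foreground in the lower one */
    uint8_t attribute = (uint8_t)((fg & 0x0F) | ((bg & 0x0F) << 4));

    /* Low byte: the character itself */
    return (uint16_t)(c | ((uint16_t)attribute << 8));
}

/* The buffer is stored row by row, each row holding VGA_WIDTH cells. */
size_t vga_index(size_t row, size_t column)
{
    return row * VGA_WIDTH + column;
}
```

For instance, a white (15) 'A' (0x41) on a black (0) background is encoded as 0x0F41, and the last cell of the screen (row 24, column 79) sits at index 1999.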

Implementation

First, let’s clear the terminal. This is equivalent to filling the buffer with black, empty (space) characters.

#include <stddef.h> // For size_t
#include <stdint.h> // For uint8_t

#define VGA_WIDTH  80
#define VGA_HEIGHT 25

static uint8_t terminal_row = 0;
static uint8_t terminal_column = 0;

void terminal_putchar(char c, enum vga_color fg, enum vga_color bg)
{
    // volatile: every store must actually reach the memory-mapped buffer
    static volatile struct vga_character *const terminal_buffer =
        (volatile struct vga_character *)0xB8000;

    struct vga_character character = { .character = c, .fg = fg, .bg = bg };
    terminal_buffer[terminal_row * VGA_WIDTH + terminal_column] = character;

    // Move the cursor forward, wrapping at the end of a row and of the screen
    if (++terminal_column == VGA_WIDTH) {
        terminal_column = 0;
        if (++terminal_row == VGA_HEIGHT)
            terminal_row = 0;
    }
}

void terminal_reset(void)
{
    terminal_row = 0;
    terminal_column = 0;

    // Filling every cell also wraps the cursor back to the top-left corner
    for (size_t y = 0; y < VGA_HEIGHT; y++) {
        for (size_t x = 0; x < VGA_WIDTH; x++) {
            terminal_putchar(' ', VGA_COLOR_BLACK, VGA_COLOR_BLACK);
        }
    }
}
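
Code that writes to 0xB8000 cannot run outside of the kernel, but the cursor logic can be isolated and exercised on the host. The sketch below mirrors the wrap-around rule of terminal_putchar; struct cursor and both function names are hypothetical and only exist for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_WIDTH  80
#define VGA_HEIGHT 25

struct cursor {
    uint8_t row;
    uint8_t column;
};

/* Same rule as terminal_putchar: wrap at the end of a row, then of the screen. */
struct cursor cursor_advance(struct cursor cur)
{
    if (++cur.column == VGA_WIDTH) {
        cur.column = 0;
        if (++cur.row == VGA_HEIGHT)
            cur.row = 0;
    }
    return cur;
}

/* Position of the cursor after writing `count` characters. */
struct cursor cursor_after(struct cursor cur, size_t count)
{
    while (count--)
        cur = cursor_advance(cur);
    return cur;
}
```

Starting from the top-left corner, writing VGA_WIDTH characters moves the cursor to the beginning of the second row, and writing VGA_WIDTH * VGA_HEIGHT characters brings it back to where it started. This is why terminal_reset does not need to reposition the cursor after clearing the screen.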

Then, let’s print our desired string:

void terminal_write(const char *s, size_t len)
{
    // We write each character in white over black
    for (size_t i = 0; i < len; ++i)
        terminal_putchar(s[i], VGA_COLOR_WHITE, VGA_COLOR_BLACK);
}

Now, we simply have to call this function to print any string we want to the terminal:

int kernel_main(void)
{
    const char string[] = "Hello, GISTRE!";
    terminal_reset();
    // sizeof also counts the terminating '\0', which we do not want to print
    terminal_write(string, sizeof(string) - 1);
    return 0;
}
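
A word on the length passed to terminal_write: for a string stored in an array, sizeof counts the terminating '\0' as well, so the visible text is one byte shorter than the array. For strings that are only reachable through a pointer, sizeof is of no help, and since no C library is linked into our kernel there is no strlen either. A minimal replacement could look like the sketch below (kstrlen is a hypothetical name, not a function from this article’s source code).

```c
#include <stddef.h>

/* Freestanding equivalent of strlen: counts bytes up to the first '\0'. */
size_t kstrlen(const char *s)
{
    size_t len = 0;

    while (s[len] != '\0')
        ++len;
    return len;
}
```

For "Hello, GISTRE!", kstrlen returns 14 while sizeof yields 15. A call such as terminal_write(s, kstrlen(s)) would then print any NUL-terminated string without its terminator.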

If you recompile and boot from the ISO, you should get the following:

QEMU preview

And that’s all! Congratulations, you have now successfully finished your very first kernel!

What’s next

To summarize, in this article we went over:

How to build your own cross-compiler
How to boot a kernel
How to write a kernel
Interacting with a simple peripheral

But this is only the beginning of your journey to becoming a fully-fledged kernel developer! You now have all the keys to start learning more about this fascinating field.

If you enjoyed the process, and want to learn more, here are a few hints on where to go next:

Set up the Global Descriptor Table and learn about memory segmentation
- This table defines memory segments
- Each segment defines a privilege level field: learn how kernel rings are used!
Set up the Interrupt Descriptor Table to encounter your first hectic bugs
- Learn how interrupts are triggered by, or sent to the CPU
- Define your own handlers for each interrupt
- Handle keyboard inputs!
More generally, read through the osdev wiki; it is a great source for learning system-related concepts
Go look at what others are doing, and find which peripheral interests you the most next! You ought to have fun before anything else :)

The full source code for this article is available on GitHub.