1. Kernel 101 – Let’s write a Kernel

    by Arjun Sreedharan

    Hello World,

    Let us write a simple kernel which could be loaded with the GRUB bootloader on an x86 system. This kernel will display a message on the screen and then hang.

    One does simply write a kernel



    How does an x86 machine boot

    Before we think about writing a kernel, let’s see how the machine boots up and transfers control to the kernel:

    The x86 CPU is hardwired to begin execution at the physical address [0xFFFFFFF0], which marks the start of the last 16 bytes of the 32-bit address space. This address just contains a jump instruction to the address in memory where the BIOS has copied itself.

    Thus, the BIOS code starts its execution. The BIOS first searches for a bootable device in the configured boot device order. It checks for a certain magic number to determine whether the device is bootable: the last two bytes of the device’s 512-byte first sector must be [0x55] and [0xAA].

    Once the BIOS has found a bootable device, it copies the contents of the device’s first sector into RAM starting at physical address [0x7c00], then jumps to that address and executes the code just loaded. This code is called the bootloader.

    The bootloader then loads the kernel at the physical address [0x100000]. The address [0x100000] is used as the start-address for all big kernels on x86 machines.
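
    To make the BIOS’s magic number check a bit more concrete, here is a minimal sketch in C of how one could test a sector for the boot signature (the bytes 0x55 and 0xAA at the end of a 512-byte sector). The names SECTOR_SIZE and has_boot_signature are made up for this illustration; a real BIOS does this in firmware, not in C.

```c
#include <stdint.h>

#define SECTOR_SIZE 512  /* size of the first sector (assumed name) */

/* Returns 1 if the sector ends with the boot signature 0x55, 0xAA,
 * and 0 otherwise. Hypothetical helper, for illustration only. */
int has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[SECTOR_SIZE - 2] == 0x55 &&
           sector[SECTOR_SIZE - 1] == 0xAA;
}
```

    You could feed it the first 512 bytes read from a disk image to see whether the BIOS would consider that image bootable.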



    What all do we need?

    * An x86 computer (of course)
    * Linux
    * NASM assembler
    * gcc
    * ld (GNU Linker)
    * grub



    Source Code

    Source code is available in my GitHub repository - mkernel



    The entry point using assembly

    We like to write everything in C, but we cannot avoid a little bit of assembly. We will write a small file in x86 assembly-language that serves as the starting point for our kernel. All our assembly file will do is invoke an external function which we will write in C, and then halt the program flow.

    How do we make sure that this assembly code will serve as the starting point of the kernel?

    We will use a linker script that links the object files to produce the final kernel executable (explained in more detail later). In this linker script, we will explicitly specify that we want our binary to be loaded at the address [0x100000]. This address, as I said earlier, is where the kernel is expected to be. Thus, the bootloader will take care of firing the kernel’s entry point.

    Here’s the assembly code:

    ;;kernel.asm
    bits 32			;nasm directive - 32 bit
    section .text
    
    global start
    extern kmain	        ;kmain is defined in the c file
    
    start:
      cli 			;block interrupts
      call kmain
      hlt		 	;halt the CPU
    

    The first line, bits 32, is not an x86 assembly instruction. It’s a directive to the NASM assembler that tells it to generate code for a processor operating in 32-bit mode. It is not strictly required in our example, but it is included here because it’s good practice to be explicit.

    The second line begins the text section (aka code section). This is where we put all our code.

    global is another NASM directive that marks symbols from the source code as global. This way, the linker knows where the symbol start is, which happens to be our entry point.

    kmain is our function that will be defined in our kernel.c file. extern declares that the function is defined elsewhere.

    Then, we have the start function, which calls the kmain function and halts the CPU using the hlt instruction. Interrupts can wake the CPU from an hlt instruction, so we disable interrupts beforehand using the cli instruction. cli is short for clear-interrupts.

     

    The kernel in C

    In kernel.asm, we made a call to the function kmain(). So our C code will start executing at kmain():

    /*
    *  kernel.c
    */
    void kmain(void)
    {
    	char *str = "my first kernel";
    	char *vidptr = (char*)0xb8000; 	//video mem begins here.
    	unsigned int i = 0;
    	unsigned int j = 0;
    	//clear all
    	while(j < 80 * 25 * 2) {
    		//blank character
    		vidptr[j] = ' ';
    		//attribute-byte: light grey on black screen	
    		vidptr[j+1] = 0x07; 		
    		j = j + 2;
    	}
    	j = 0;
    	while(str[j] != '\0') {
    		vidptr[i] = str[j];
    		vidptr[i+1] = 0x07;
    		++j;
    		i = i + 2;
    	}
    	return;
    }
    

    All our kernel will do is clear the screen and write to it the string “my first kernel”.

    First we make a pointer vidptr that points to the address [0xb8000]. This address is the start of video memory in protected mode. The screen’s text memory is simply a chunk of memory in our address space. The memory-mapped input/output for the screen starts at [0xb8000] and supports 25 lines, each containing 80 ASCII characters.

    Each character element in this text memory is represented by 16 bits (2 bytes), rather than the 8 bits (1 byte) we are used to. The first byte holds the character’s ASCII code. The second byte is the attribute-byte, which describes the formatting of the character, including attributes such as color.
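
    Since every character cell takes 2 bytes and the cells are laid out row after row, the byte offset of any cell can be worked out with a little arithmetic. Here is a small sketch; the names VGA_COLUMNS, BYTES_PER_CELL and cell_offset are my own for this example and are not part of kernel.c.

```c
#include <stddef.h>

#define VGA_COLUMNS    80  /* characters per line */
#define VGA_ROWS       25  /* lines on the screen */
#define BYTES_PER_CELL 2   /* character byte + attribute byte */

/* Byte offset, from the start of text memory, of the cell at (row, col).
 * The character goes at this offset and its attribute-byte at offset + 1.
 * Hypothetical helper; it does not check that row and col are on screen. */
size_t cell_offset(size_t row, size_t col)
{
    return (row * VGA_COLUMNS + col) * BYTES_PER_CELL;
}
```

    For instance, the first cell of the second line sits 160 bytes after [0xb8000], and the whole screen occupies 80 * 25 * 2 = 4000 bytes.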

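    To print the character s in green on a black background, we store the character s in the first byte of the video memory address and the value [0x02] in the second byte.
    The upper four bits of the attribute-byte hold the background color and the lower four bits hold the foreground color, so 0 represents the black background and 2 represents the green foreground.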


    Have a look at the table below for the different colors:

    0 - Black, 1 - Blue, 2 - Green, 3 - Cyan, 4 - Red, 5 - Magenta, 6 - Brown, 7 - Light Grey, 8 - Dark Grey, 9 - Light Blue, 10/a - Light Green, 11/b - Light Cyan, 12/c - Light Red, 13/d - Light Magenta, 14/e - Light Brown, 15/f - White.



    In our kernel, we will use light grey character on a black background. So our attribute-byte must have the value [0x07].
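
    If you would rather not build attribute-bytes by hand, the background and foreground values can be combined with a shift and an OR. The following is only a sketch: the enum names and make_attribute are invented for this example, and I am assuming the plain 16-color text mode described above (on some setups the top bit of the attribute-byte means blink rather than a bright background).

```c
#include <stdint.h>

/* A few of the color values listed above (names are my own). */
enum vga_color {
    VGA_BLACK      = 0,
    VGA_BLUE       = 1,
    VGA_GREEN      = 2,
    VGA_LIGHT_GREY = 7,
    VGA_WHITE      = 15
};

/* Background goes in the upper four bits, foreground in the lower four. */
uint8_t make_attribute(enum vga_color fg, enum vga_color bg)
{
    return (uint8_t)((((unsigned)bg & 0x0F) << 4) | ((unsigned)fg & 0x0F));
}
```

    With this, make_attribute(VGA_LIGHT_GREY, VGA_BLACK) gives the [0x07] used in our kernel, and make_attribute(VGA_GREEN, VGA_BLACK) gives the [0x02] from the green example.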

    In the first while loop, the program writes the blank character with the [0x07] attribute across all 80 columns of the 25 lines. This clears the screen.

    In the second while loop, the characters of the null-terminated string “my first kernel” are written to the chunk of video memory, each with an attribute-byte of [0x07].

    This should display the string on the screen.
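
    Before booting anything, the two loops can be tried out on an ordinary machine by pointing them at a plain array instead of [0xb8000]. Below is a rough sketch of that idea. clear_screen and write_string are hypothetical names, the buffer is just a stand-in for video memory, and the bounds check in write_string is my addition, not something kernel.c does.

```c
#include <stddef.h>

#define SCREEN_BYTES (80 * 25 * 2)  /* 80 columns, 25 lines, 2 bytes per cell */

/* Fill every cell with a blank character and the given attribute-byte. */
void clear_screen(char *vid, char attr)
{
    for (size_t j = 0; j < SCREEN_BYTES; j += 2) {
        vid[j] = ' ';
        vid[j + 1] = attr;
    }
}

/* Write a null-terminated string starting at the first cell.
 * Stops at the end of the screen; returns the number of characters written. */
size_t write_string(char *vid, const char *str, char attr)
{
    size_t i = 0;  /* byte index into the screen buffer */
    size_t j = 0;  /* index into the string */

    while (str[j] != '\0' && i + 1 < SCREEN_BYTES) {
        vid[i] = str[j];
        vid[i + 1] = attr;
        ++j;
        i += 2;
    }
    return j;
}
```

    In a test program you would declare char buf[SCREEN_BYTES], call clear_screen(buf, 0x07) and write_string(buf, "my first kernel", 0x07), and then inspect buf. In the kernel itself, the same functions would be handed (char*)0xb8000 instead.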



    The linking part

    We will assemble kernel.asm with NASM into an object file, and then compile kernel.c with GCC into another object file. Now, our job is to get these objects linked into a bootable kernel executable.

    For that, we use an explicit linker script, which can be passed as an argument to ld (our linker).

    /*
    *  link.ld
    */
    OUTPUT_FORMAT(elf32-i386)
    ENTRY(start)
    SECTIONS
     {
       . = 0x100000;
       .text : { *(.text) }
       .data : { *(.data) }
       .bss  : { *(.bss)  }
     }
    

    First, we set the output format of our executable to the 32-bit Executable and Linkable Format (ELF). ELF is the standard binary file format for Unix-like systems on the x86 architecture.

    ENTRY takes one argument. It specifies the symbol name that should be the entry point of our executable.

    SECTIONS is the most important part for us. Here, we define the layout of our executable. We can specify how the different sections are to be merged and at what location each of them is to be placed.

    Within the braces that follow the SECTIONS statement, the period character (.) represents the location counter.
    The location counter is always initialized to [0x0] at the beginning of the SECTIONS block. It can be modified by assigning a new value to it.

    Remember, earlier I told you that the kernel’s code should start at the address [0x100000]. So, we set the location counter to [0x100000].

    Have a look at the next line: .text : { *(.text) }

    The asterisk (*) is a wildcard character that matches any file name. The expression *(.text) thus means all .text input sections from all input files.

    So, the linker merges all text sections of the object files to the executable’s text section, at the address stored in the location counter. Thus, the code section of our executable begins at [0x100000].

    After the linker places the text output section, the value of the location counter will become
    0x100000 + the size of the text output section.

    Similarly, the data and bss sections are merged and placed at whatever value the location counter holds at that point.
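
    The way the location counter advances can be mimicked with plain arithmetic. The sketch below is only a model of the bookkeeping: struct layout and place_sections are names I made up, the section sizes are whatever you pass in, and the real linker may also insert padding to satisfy section alignment, which this model ignores.

```c
#include <stdint.h>

/* Start address of each output section, plus the final counter value. */
struct layout {
    uint32_t text_start;
    uint32_t data_start;
    uint32_t bss_start;
    uint32_t end;
};

/* Place .text, .data and .bss one after another, starting at base. */
struct layout place_sections(uint32_t base, uint32_t text_size,
                             uint32_t data_size, uint32_t bss_size)
{
    struct layout l;
    uint32_t dot = base;  /* plays the role of "." in link.ld */

    l.text_start = dot;
    dot += text_size;
    l.data_start = dot;
    dot += data_size;
    l.bss_start = dot;
    dot += bss_size;
    l.end = dot;
    return l;
}
```

    For example, with a base of 0x100000 and made-up sizes of 0x200, 0x40 and 0x80 bytes, the text section starts at 0x100000, data at 0x100200 and bss at 0x100240.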



    Grub and Multiboot

    Now, we have all our files ready to build the kernel. But since we want to boot our kernel with the GRUB bootloader, there is one step left.

    There is a standard for loading various x86 kernels using a boot loader, called the Multiboot specification.

    GRUB will only load our kernel if it complies with the Multiboot spec.

    According to the spec, the kernel must contain a header (known as the Multiboot header) within its first 8 kilobytes.

    Further, this Multiboot header must be 4-byte aligned and must contain 3 fields of 4 bytes each, namely:

    • a magic field: containing the magic number [0x1BADB002], to identify the header.
    • a flags field: We will not care about this field. We will simply set it to zero.
    • a checksum field: the checksum field, when added to the fields ‘magic’ and ‘flags’ as a 32-bit unsigned sum, must give zero.

    So our kernel.asm will become:

    ;;kernel.asm
    
    ;nasm directive - 32 bit
    bits 32
    section .text
            ;multiboot spec
            align 4
            dd 0x1BADB002            ;magic
            dd 0x00                  ;flags
            dd - (0x1BADB002 + 0x00) ;checksum. m+f+c should be zero
    
    global start
    extern kmain	        ;kmain is defined in the c file
    
    start:
      cli 			;block interrupts
      call kmain
      hlt		 	;halt the CPU
    

    dd defines a double word, which is 4 bytes in size.
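
    To see what the checksum rule means in practice, here is a sketch in C that computes the checksum for a given flags value and scans a buffer for a valid Multiboot header, roughly the way a bootloader might. The function names are hypothetical and this is not GRUB’s actual code. It only relies on the rules above: the header lies within the first 8192 bytes, is 4-byte aligned, and magic + flags + checksum wraps around to zero.

```c
#include <stddef.h>
#include <stdint.h>

#define MULTIBOOT_MAGIC        0x1BADB002u
#define MULTIBOOT_SEARCH_LIMIT 8192  /* header must lie within the first 8 KiB */

/* The value that makes magic + flags + checksum equal zero (mod 2^32). */
uint32_t multiboot_checksum(uint32_t flags)
{
    return (uint32_t)0 - (uint32_t)(MULTIBOOT_MAGIC + flags);
}

/* Read a 32-bit little-endian value, as x86 stores it. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset of the first valid header, or -1 if none is found. */
long find_multiboot_header(const uint8_t *image, size_t len)
{
    size_t limit = len < MULTIBOOT_SEARCH_LIMIT ? len : MULTIBOOT_SEARCH_LIMIT;

    for (size_t off = 0; off + 12 <= limit; off += 4) {
        uint32_t magic = read_le32(image + off);
        uint32_t flags = read_le32(image + off + 4);
        uint32_t checksum = read_le32(image + off + 8);

        if (magic == MULTIBOOT_MAGIC &&
            (uint32_t)(magic + flags + checksum) == 0)
            return (long)off;
    }
    return -1;
}
```

    With flags set to zero, multiboot_checksum(0) works out to 0xE4524FFE, which is the same value NASM produces for dd - (0x1BADB002 + 0x00).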

    Building the kernel

    We will now create object files from kernel.asm and kernel.c and then link them using our linker script.

    nasm -f elf32 kernel.asm -o kasm.o
    

    will run the assembler to create the object file kasm.o in ELF-32 bit format.

    gcc -m32 -c kernel.c -o kc.o
    

    The ‘-c’ option makes sure that after compiling, linking doesn’t implicitly happen.

    ld -m elf_i386 -T link.ld -o kernel kasm.o kc.o
    

    will run the linker with our linker script and generate the executable named kernel.
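
    If you want to sanity-check the result, you can look at the first few bytes of the ELF header: the magic bytes, the 32-bit class, the machine type and the entry point address. readelf -h kernel shows all of this, but here is a small C sketch of the same idea. The struct and function names are invented for this example, and it only handles little-endian 32-bit ELF files.

```c
#include <stddef.h>
#include <stdint.h>

/* The few ELF header fields we care about (hypothetical summary type). */
struct elf32_summary {
    int valid;         /* 1 if buf looks like a 32-bit little-endian ELF */
    uint16_t machine;  /* e_machine: 3 means Intel 80386 */
    uint32_t entry;    /* e_entry: address of the entry point */
};

struct elf32_summary inspect_elf32(const uint8_t *buf, size_t len)
{
    struct elf32_summary s = {0, 0, 0};

    if (len < 28)                       /* need bytes up to the end of e_entry */
        return s;
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return s;                       /* not an ELF file */
    if (buf[4] != 1 || buf[5] != 1)     /* class: 32-bit, data: little-endian */
        return s;

    s.machine = (uint16_t)(buf[18] | (buf[19] << 8));
    s.entry = (uint32_t)buf[24] | ((uint32_t)buf[25] << 8) |
              ((uint32_t)buf[26] << 16) | ((uint32_t)buf[27] << 24);
    s.valid = 1;
    return s;
}
```

    Read the first 52 bytes of the kernel file into a buffer and pass them in. For our kernel, machine should be 3 and entry should be an address just above [0x100000].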



    Configure your grub and run your kernel

    GRUB requires your kernel to be of the name pattern kernel-<version>. So, rename the kernel. I renamed my kernel executable to kernel-701.

    Now place it in the /boot directory. You will require superuser privileges to do so.

    In your GRUB configuration file grub.cfg you should add an entry, something like:

    title myKernel
    	root (hd0,0)
    	kernel /boot/kernel-701 ro
    



    Don’t forget to remove the directive hiddenmenu if it exists.

    Reboot your computer, and you’ll get a list selection with the name of your kernel listed.

    Select it and you should see:

    [Screenshot: a cleared screen with the text “my first kernel” in the top-left corner]

    That’s your kernel!!



    PS:
    * It’s always advisable to get yourself a virtual machine for all kinds of kernel hacking.

    * To run this on GRUB 2, which is the default bootloader on newer distros, your config should look like this (thanks to Rubén Laguna from the comments for the config):

    menuentry 'kernel 701' {
    	set root='hd0,msdos1'
    	multiboot /boot/kernel-701 ro
    }
    



    * Also, if you want to run the kernel on the qemu emulator instead of booting with GRUB, you can do so by:

    qemu-system-i386 -kernel kernel
    








  2. Disclaimer: The views expressed here are solely those of the author in his private capacity and do not in any way represent the views of the author's employer or any organization associated with the author.

©
Arjun Sreedharan 2013