A simple way to write standalone C programs for i386

luke8086/boot2c

Introduction

This repo demonstrates how to create C programs for the x86 platform, which:

  • boot directly from a USB drive / SD card
  • don't require any operating system code
  • don't require writing any custom drivers
  • only require a small, fixed amount of assembly code
  • can access the BIOS API directly from C

To achieve this, they have several limitations:

  • they don't have access to the standard C library
  • they only work in the real-address mode
  • the final binary is limited to ~64KB
  • the available RAM is limited to ~640KB
  • the boot-loader is not guaranteed to work on every PC

Prerequisites

  • GCC with support for i386 targets
  • GNU binutils
  • QEMU / Bochs (for testing and debugging)
  • A PC supporting USB boot in BIOS ("legacy") mode
  • A spare USB disk / SD card

On a Mac, I recommend installing the i386-elf-gcc and i386-elf-binutils packages from MacPorts and updating the Makefile accordingly.

Compiling:

$ make

Testing in emulators:

$ make qemu
$ make bochs

Installing on a USB disk / SD card:

Be careful to pick the right device: this will overwrite your data!

$ make disk
$ sudo dd if=build/disk.img of=/dev/<USB DISK>

Code overview

Overall structure

The final app consists of two binaries: a boot loader and the actual program. Both are created by compiling the source code with gcc and GNU assembler, linking the program with GNU ld, and converting the resulting ELF files to flat binaries using objcopy.

To ensure that our main() is always the first code in the program binary, we move it to a separate .start section using the ENTRY_POINT macro, and emit that section at the top of the file using a custom linker script.
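
As a rough illustration of the section trick, here is a minimal sketch of what such a macro could look like. The definition is an assumption (the repo's own ENTRY_POINT may be written differently), and the function is named app_start, a hypothetical name, so that the snippet can also be compiled next to a hosted main().

```c
/* Assumed definition; the repo's own ENTRY_POINT may differ. */
#define ENTRY_POINT __attribute__((section(".start")))

/* Hypothetical entry function. GCC emits it into the .start section
 * instead of .text, so a linker script can place it first. */
ENTRY_POINT int app_start(void)
{
    return 0;
}
```

The linker script then only needs to list .start before the other code sections.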

16 vs 32 bit

It is possible to use 32-bit instructions in real-address mode; they just need to be marked with address-size and operand-size prefixes. The -m16 option for gcc and the .code16 directive in GNU assembler do exactly that. The resulting code is 32-bit code with those prefixes applied throughout, so it needs a 386 or later and is not compatible with actual 16-bit CPUs.

Unfortunately, the 32-bit offsets still cannot exceed the segment limit (0xFFFF, i.e. 65535); an access beyond it triggers an exception. QEMU doesn't emulate this behaviour, so it's useful to occasionally test with Bochs.
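
To make the limit concrete, the check below is a small sketch (not code from the repo) that tells whether an access of len bytes starting at a given offset stays inside one segment. The names SEGMENT_LIMIT and fits_in_segment are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEGMENT_LIMIT 0xFFFFu  /* highest valid offset in real-address mode */

/* True when the byte range [offset, offset + len) stays inside one segment. */
bool fits_in_segment(uint32_t offset, uint32_t len)
{
    if (offset > SEGMENT_LIMIT)
        return false;
    if (len == 0)
        return true;
    /* Compare against the remaining room to avoid overflowing offset + len. */
    return len - 1 <= SEGMENT_LIMIT - offset;
}
```

For example, a 2-byte access at offset 0xFFFE fits, while the same access at 0xFFFF crosses the limit.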

Bootloader

The boot loader (boot.s) loads the main program from the startup disk into the memory segment starting at physical address 0x10000 (segment 0x1000), and jumps to the starting point at offset 0. It assumes that the BIOS will emulate the USB disk either as a HDD or as a floppy. To make this more likely, it includes a basic MBR partition table. Just in case, we install it both to the main boot sector and to the boot sector of the first active partition.
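
The two facts used here, real-mode address translation and the classic MBR layout, can be sketched in C as follows. This is an illustration only: the boot loader itself is written in assembly, and the struct and function names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* One 16-byte entry of the classic MBR partition table. */
struct mbr_partition {
    uint8_t  status;        /* 0x80 = active (bootable), 0x00 = inactive */
    uint8_t  chs_first[3];  /* CHS address of the first sector */
    uint8_t  type;          /* partition type ID */
    uint8_t  chs_last[3];   /* CHS address of the last sector */
    uint32_t lba_first;     /* LBA of the first sector */
    uint32_t sector_count;  /* number of sectors in the partition */
} __attribute__((packed));

/* A complete 512-byte boot sector with a partition table. */
struct mbr_sector {
    uint8_t              code[446];      /* boot code */
    struct mbr_partition partitions[4];  /* partition table */
    uint8_t              signature[2];   /* 0x55, 0xAA */
} __attribute__((packed));

_Static_assert(sizeof(struct mbr_partition) == 16, "partition entry is 16 bytes");
_Static_assert(sizeof(struct mbr_sector) == 512, "boot sector is 512 bytes");
```

With this translation, segment 0x1000 and offset 0 give the physical address 0x10000.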

Since USB booting is not a standardized process, it may not work on every PC. I only really tested on mine, and it still behaved in two different ways for a USB stick and an SD card. In case it doesn't work for you, you can try an alternative loader.

Calling BIOS services

The services provided by the BIOS are primarily accessed by loading their function number and arguments into CPU registers and triggering a software interrupt. Some of them store return values back in the registers.

To avoid writing separate assembly code for every service, we define a generic function (intr.h, intr.s), taking the interrupt number and a pointer to a struct holding register values.
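
A sketch of how such a generic call might be used is shown below. The struct layout and the function-pointer signature are assumptions made for illustration; the real declarations live in intr.h and may differ. The interrupt routine is passed in as a pointer so the example can be compiled and checked on a normal host with a stand-in.

```c
#include <stdint.h>

/* Hypothetical register block; the real layout in intr.h may differ. */
struct regs {
    uint32_t eax, ebx, ecx, edx, esi, edi;
};

/* Assumed shape of the generic interrupt routine. */
typedef void (*intr_fn)(uint8_t int_no, struct regs *r);

/* Print one character via INT 10h, AH=0Eh (teletype output). */
static void bios_putc(intr_fn intr, char c)
{
    struct regs r = {0};

    r.eax = 0x0E00u | (uint8_t)c;  /* AH = 0x0E, AL = character */
    r.ebx = 0x0007u;               /* BH = page 0, BL = colour (graphics modes) */
    intr(0x10, &r);
}

/* Host-side stand-in that records the call instead of invoking the BIOS. */
static uint8_t last_int;
static struct regs last_regs;

static void fake_intr(uint8_t int_no, struct regs *r)
{
    last_int = int_no;
    last_regs = *r;
}
```

On real hardware the stand-in would be replaced by the assembly routine, and any return values would be read back from the same struct after the call.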

Memory segmentation

Modern compilers don't have a concept of far pointers, so we can't seamlessly access memory outside of the current code / data segments. This is the main factor limiting our binary to 64K.

Fortunately, gcc has support for "address spaces" relative to FS and GS. We include set_fs and get_fs functions for working with these registers, for example to access the text-mode video memory at b800:0000 (see bios.h).

Standard library

The standard C library depends on the operating system, so we can't use it in standalone programs (hence the -ffreestanding and -nostdlib flags). However, for certain operations, like initialising a struct on the stack, the compiler may still generate implicit calls to standard functions, like memcpy. In such cases, we just need to provide our own versions (see util.h and util.c).
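
A replacement for such a function can be very small. The version below is a sketch rather than the code from util.c; it is named util_memcpy (a hypothetical name) so it can be tested on a hosted system without clashing with the C library, whereas in the standalone build it would have to be exported as memcpy.

```c
#include <stddef.h>

/* Byte-wise copy with the same contract as the standard memcpy:
 * the regions must not overlap, and dst is returned. */
void *util_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

Depending on the optimisation level, gcc can recognise a loop like this and turn it back into a memcpy call; the -fno-tree-loop-distribute-patterns flag is one way to prevent that.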

Troubleshooting

More likely than not, working on standalone programs will require some tinkering. Below are some hints:

Installing the syslinux boot loader

In case the provided boot loader doesn't work, you may experiment with the one from syslinux:

$ wget https://mirrors.edge.kernel.org/pub/linux/utils/boot/syslinux/6.xx/syslinux-6.03.tar.gz
$ tar zxf syslinux-6.03.tar.gz
$ make disk
$ dd if=syslinux-6.03/bios/mbr/mbr.bin of=build/disk.img conv=notrunc

Disassembling files

By default, objdump will disassemble our binaries with no complaints, but the output will be completely incorrect. Since the code is compiled to run in 16-bit mode, we need to add -m i8086:

$ objdump -D -m i8086 build/app.o
$ objdump -D -m i8086 build/app.elf
$ objdump -D -b binary -m i8086 build/app.bin

Debugging with Bochs

Bochs is one of the slowest emulators, but often more accurate than others. Its debugger seems to handle 16-bit code slightly better than GDB with QEMU. The xchg %bx, %bx instruction can be used to set a breakpoint; from C it's available through the BOCHS_BREAKPOINT macro. The system clock in Bochs is completely inaccurate, so it's not that useful for testing games / animations.
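
For reference, a breakpoint macro of this kind can be a one-line inline-assembly wrapper. The definition below is an assumption about its shape, not a copy of the repo's macro, and checked_add is a made-up function that only shows where the macro would be placed.

```c
/* Assumed definition; the repo's BOCHS_BREAKPOINT may be written differently. */
#if defined(__i386__) || defined(__x86_64__)
#define BOCHS_BREAKPOINT() __asm__ volatile ("xchg %bx, %bx")
#else
#define BOCHS_BREAKPOINT() ((void)0)  /* no-op on non-x86 hosts */
#endif

/* Hypothetical function that stops in the Bochs debugger before adding. */
int checked_add(int a, int b)
{
    BOCHS_BREAKPOINT();
    return a + b;
}
```

Outside Bochs the instruction has no visible effect, and Bochs only stops on it when magic_break is enabled in its configuration.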

Running in VirtualBox

The easiest way to test in VirtualBox is by attaching disk.img as a raw floppy image. However, this limits the number of sectors that can be read with a single BIOS call to 0x48, so you'll need to replace mov $0x027f, %ax with mov $0x0248, %ax in boot.s.
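
The constant in that instruction packs two values: AH holds the INT 13h function number (02h, read sectors) and AL holds the sector count. The helper below is a hypothetical illustration of that encoding, not part of the repo.

```c
#include <stdint.h>

/* AX for INT 13h, AH=02h (read sectors): AH = function, AL = sector count. */
uint16_t read_sectors_ax(uint8_t count)
{
    return (uint16_t)(0x0200u | count);
}
```

A count of 0x7f gives 0x027f, and the VirtualBox floppy limit of 0x48 gives 0x0248.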

Writing assembly functions

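In case you want to write any custom assembly function, be sure to use a 32-bit ret (i.e. retl in GNU as). Code compiled with -m16 uses 32-bit calls, which push a 4-byte return address; a 16-bit ret pops only 2 of those bytes and leaves the stack shifted by 2 bytes.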

References