Introducing Metalkit

Metalkit is another of my random side-projects. It’s a very simple library for writing programs that run on IA32 (x86) machines on the bare metal. It isn’t an operating system, but it does contain some of the low-level pieces you might use to create one.

I created it partly for fun and for the challenge, and partly to use as a framework for low-level hardware testing at work. It is open source, released under an MIT-style license.

Features currently include:

  • A 512-byte bootloader that works either as a floppy disk MBR or a GNU Multiboot image. When you build a program with Metalkit, the same binary image can be used either as a raw floppy disk image or as a “kernel” image in GRUB. This makes it easy to use your programs on virtual machines (VMware, QEMU), emulators (Bochs), or real machines.
  • Basic PCI bus support. You can scan for PCI devices, find out what resources (I/O ports, memory, IRQs) they have, and poke at their configuration registers.
  • VGA text mode.
  • A very tiny zlib-compatible decompressor, the “puff” reference implementation of DEFLATE.
  • Low-level support for the PIT timer.
  • A small, efficient, and powerful interrupt subsystem. ISR trampolines are assembled at runtime, saving space in the binary. Any ISR can execute the equivalent of a longjmp(3) on return, making simple thread context-switching very easy. Includes basic PIC interrupt routing. Includes default fault handlers which dump CPU registers and the stack any time an unhandled fault occurs.
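To make the first bullet a little more concrete, here is a minimal sketch of what the Multiboot (version 1) specification asks of an image: a header, 4-byte aligned and within the first 8192 bytes, whose first three 32-bit fields sum to zero. The type and function names below are made up for illustration and are not taken from Metalkit. How Metalkit actually fits this header into its 512-byte boot sector is not shown here.

```c
#include <stdint.h>

/* Magic value defined by the Multiboot (version 1) specification. */
#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u

/* The three mandatory fields at the start of a Multiboot header.
 * (Hypothetical type name, for illustration only.) */
typedef struct {
    uint32_t magic;     /* must equal MULTIBOOT_HEADER_MAGIC */
    uint32_t flags;     /* features requested from the bootloader */
    uint32_t checksum;  /* chosen so magic + flags + checksum == 0 */
} MultibootHeader;

/* Build a header whose three fields sum to zero modulo 2^32. */
static MultibootHeader multiboot_make_header(uint32_t flags)
{
    MultibootHeader h;
    h.magic = MULTIBOOT_HEADER_MAGIC;
    h.flags = flags;
    h.checksum = (uint32_t)(0u - (MULTIBOOT_HEADER_MAGIC + flags));
    return h;
}

/* The same check a Multiboot bootloader performs before trusting a header. */
static int multiboot_header_is_valid(const MultibootHeader *h)
{
    return h->magic == MULTIBOOT_HEADER_MAGIC &&
           (uint32_t)(h->magic + h->flags + h->checksum) == 0u;
}
```

A boot sector that the BIOS will also accept additionally needs the bytes 0x55, 0xAA at offsets 510 and 511, which is why one image can plausibly satisfy both loaders.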

Metalkit could be useful for educational purposes, because programs written with Metalkit are extremely small and self-contained. This example is a complete Metalkit program which lists all devices on the PCI bus:

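#include "types.h"
#include "vgatext.h"
#include "pci.h"
#include "intr.h"

int
main(void)
{
    /* Zero-initialized state, so the scan starts at the first device. */
    PCIScanState busScan = {};

    VGAText_WriteString("Scanning for PCI devices:\n\n");

    /* Print bus:device.function and vendor:device IDs for each device. */
    while (PCI_ScanBus(&busScan)) {
        VGAText_Format(" %2x:%2x.%1x  %4x:%4x\n",
                       busScan.addr.bus, busScan.addr.device,
                       busScan.addr.function, busScan.vendorId,
                       busScan.deviceId);  /* field name assumed */
    }

    return 0;
}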

This example compiles to a 2962-byte image and uses only about 1500 lines of library code. This is great for educational purposes, because it is practical to understand the purpose of every byte in that compiled image, and when this example is running, that’s the only code running on your computer.
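As one example of the kind of detail that becomes easy to follow at this scale, here is a rough sketch of how PCI configuration registers are conventionally addressed on x86 using "configuration mechanism #1": a 32-bit address is written to I/O port 0xCF8 and the data is then read from port 0xCFC. The post does not say that Metalkit's PCI code works exactly this way, and the function names are invented for illustration. The port I/O itself is left out so the snippet compiles and runs on an ordinary host.

```c
#include <stdint.h>

/* Standard x86 I/O ports for PCI configuration mechanism #1. */
#define PCI_CONFIG_ADDRESS_PORT 0x0CF8
#define PCI_CONFIG_DATA_PORT    0x0CFC

/* Compose the value written to PCI_CONFIG_ADDRESS_PORT.
 * Bit 31 enables the access, bits 23..16 select the bus, 15..11 the
 * device, 10..8 the function, and 7..2 the dword-aligned register.
 * (Hypothetical helper name, not a Metalkit API.) */
static uint32_t pci_config_address(uint8_t bus, uint8_t device,
                                   uint8_t function, uint8_t offset)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(device & 0x1Fu) << 11)
         | ((uint32_t)(function & 0x07u) << 8)
         | (uint32_t)(offset & 0xFCu);
}

/* Register 0 holds the vendor ID in its low 16 bits and the device ID
 * in its high 16 bits. A vendor ID of 0xFFFF means nothing responded. */
static uint16_t pci_vendor_id(uint32_t reg0) { return (uint16_t)(reg0 & 0xFFFFu); }
static uint16_t pci_device_id(uint32_t reg0) { return (uint16_t)(reg0 >> 16); }
static int pci_device_present(uint32_t reg0) { return pci_vendor_id(reg0) != 0xFFFFu; }
```

A bus scan is then essentially a loop over bus, device and function numbers that reads register 0 for each combination and skips the ones where nothing is present.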

Another example included with the source is a simple pre-emptive thread scheduler implemented in 152 lines of C. Metalkit itself doesn’t know anything about threads or multitasking, but it’s possible to use Metalkit’s interrupt trampoline as a thread context switch. This example creates two busy-looping threads. Each thread prints its name, and the “Task 2” thread also increments a counter. The example switches threads round-robin style on every timer interrupt.

[Screenshot: the thread example running in Bochs]
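The round-robin policy described above can be sketched in portable C. This is only a host-side simulation under stated assumptions: the Task and Scheduler types and both functions are hypothetical, and the real example performs an actual context switch (saving and restoring CPU state through the interrupt trampoline), which is not reproduced here.

```c
#include <stddef.h>

#define NUM_TASKS 2

/* Hypothetical per-task bookkeeping; a real task also needs a saved
 * register context and its own stack. */
typedef struct {
    const char   *name;
    unsigned long counter;
} Task;

typedef struct {
    Task   tasks[NUM_TASKS];
    size_t current;          /* index of the task that is running */
} Scheduler;

/* What the timer interrupt handler decides: the next task, in order. */
static Task *scheduler_on_timer_tick(Scheduler *s)
{
    s->current = (s->current + 1) % NUM_TASKS;
    return &s->tasks[s->current];
}

/* Simulate `ticks` time slices. Only the second task ("Task 2")
 * increments its counter, mirroring the example's description. */
static void scheduler_simulate(Scheduler *s, unsigned ticks,
                               unsigned long work_per_slice)
{
    for (unsigned t = 0; t < ticks; t++) {
        if (s->current == 1)
            s->tasks[1].counter += work_per_slice;
        scheduler_on_timer_tick(s);
    }
}
```

With two tasks, each one gets every other time slice, so after four ticks "Task 2" has run exactly twice.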

If you want to play with Metalkit, all you need is an x86-compatible PC and a copy of the GNU toolchain (GCC and Binutils). Source code is now at
Also, if you’re interested in OS development or just hacking on the bare metal, the Wiki is an invaluable resource.


Playstation controller extender

The little hardware project I started almost 2 months ago is finally done. Completely finished. Bug free! Well, almost. It is, however, in a fully assembled state with firmware that is actually pretty usable.


The Unicone2 is the result of my mini-quest to extend Playstation 2 controllers over long lengths of cat5 cable. A while back I did something similar, extending GameCube and N64 controllers over Ethernet. That was the original Universal Controller Emulator, or Unicone. It used an FPGA and a pair of Linux boxes. This time the design focus is low latency, no PC required, and full support for Playstation 2 Dual Shock controllers and the Guitar Hero controller.

The Unicone2 went through a few redesigns before I settled on something I liked. The final version is based on a pair of microcontrollers and an asymmetric RS-422 link over cat5 cable. The remote end, where the physical controllers live, runs a PIC16F877A microcontroller. The base station, where all the consoles sit, uses the awesome Parallax Propeller, an 8-core microcontroller.

The remote unit’s job is pretty simple. It initializes and polls its two controllers, attempting to maintain a constant polling rate. The controller data is streamed over a very low-latency 250 kbps RS-422 link. In its spare time, the PIC reads lower-bandwidth data (LED status, force feedback state) over an incoming 19200-baud RS-422 link.
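To put the two link speeds in perspective, here is a small back-of-the-envelope calculation of how long one byte spends on the wire. It assumes ordinary 8N1 asynchronous framing (one start bit, eight data bits, one stop bit, so 10 bit times per byte). The post does not state the actual framing, so treat that as an assumption; the function name is also just for illustration.

```c
#include <stdint.h>

/* Bit times per byte under the ASSUMED 8N1 framing:
 * 1 start bit + 8 data bits + 1 stop bit. */
#define BITS_PER_FRAME_8N1 10u

/* Time on the wire for one framed byte, in nanoseconds (rounded down). */
static uint64_t uart_byte_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    return ((uint64_t)bits_per_frame * 1000000000ull) / baud;
}

/* Maximum payload bytes per second at a given baud rate (rounded down). */
static uint32_t uart_bytes_per_second(uint32_t baud, uint32_t bits_per_frame)
{
    return baud / bits_per_frame;
}
```

Under that assumption a byte takes 40 microseconds at 250 kbps and roughly 521 microseconds at 19200 baud, which fits the split between latency-sensitive controller data in one direction and slow status data in the other.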

The base station has a much harder job. It needs to emulate four controllers and receive data from two remote units simultaneously. These are all high-speed (250 to 500 kbps) asynchronous serial streams, with no flow control. It’s pretty much an intractable workload for most microcontrollers. I spent a long time trying to solve this problem with an FPGA, just like I did with the original Unicone project. This was working okay, but it was tedious. Playstation 2 controllers are even more complex than N64 controllers, and the problem doesn’t map well to the hardware domain.

When I first read about the Propeller, it seemed like a perfect solution to this problem. Just write one controller emulator, and run four copies of it simultaneously on different processors. The controller emulators and the RS-422 receivers communicate over shared memory, making the whole device quite low-latency. This also leaves plenty of CPU power and memory to emulate other flavours of video game controller, as well as leaving space to implement special effects like controller mixing and synchronized record/replay of controller events.


I’ll end the hardware rambling for today. Paul or I might have to post soon with some more info about the gameplay implications of the Unicone2. Besides acting as an extender, it can already act as a controller crossbar switch and mixer, which will let us experiment with improvised multiplayer action on single-player games.
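For readers wondering what a crossbar switch and mixer might look like in code, here is a rough sketch in C. Everything in it is hypothetical: the PadState layout (active-high button bits, 8-bit axes centered at 128), the function names, and the mixing rule are assumptions chosen for illustration. The real firmware runs on the Propeller and PIC and may work quite differently.

```c
#include <stdint.h>

#define NUM_PADS    4    /* four emulated ports, four physical controllers */
#define NUM_AXES    4    /* hypothetical: left X/Y, right X/Y */
#define AXIS_CENTER 128  /* hypothetical resting value of an 8-bit axis */

/* Hypothetical controller snapshot as it might sit in shared memory. */
typedef struct {
    uint16_t buttons;         /* one bit per button, 1 = pressed */
    uint8_t  axes[NUM_AXES];
} PadState;

/* Distance of an axis value from its resting position. */
static int axis_deflection(uint8_t v)
{
    int d = (int)v - AXIS_CENTER;
    return d < 0 ? -d : d;
}

/* Mixer: a button counts as pressed if either player presses it, and
 * each axis follows whichever stick is pushed further from center. */
static PadState pad_mix(const PadState *a, const PadState *b)
{
    PadState out;
    out.buttons = (uint16_t)(a->buttons | b->buttons);
    for (int i = 0; i < NUM_AXES; i++) {
        out.axes[i] = axis_deflection(a->axes[i]) >= axis_deflection(b->axes[i])
                    ? a->axes[i] : b->axes[i];
    }
    return out;
}

/* Crossbar: emulated port i is fed by physical controller route[i]. */
static void crossbar_route(const PadState in[NUM_PADS],
                           const uint8_t route[NUM_PADS],
                           PadState out[NUM_PADS])
{
    for (int i = 0; i < NUM_PADS; i++)
        out[i] = in[route[i] % NUM_PADS];
}
```

Mixing two controllers into one emulated port is what would let two people share a single-player game, while the routing table decides which physical controller drives which console port.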

Firmware and schematics are in Subversion.