Tag Archives: reverse engineering

Temporal Hex Dump

After building some hardware to trace and inject data on the Nintendo DSi’s RAM bus, it became obvious pretty fast that there’s a lot of data there, and (as far as I know) no good tools for analyzing these sorts of logs.

The RAM tracer has already given us a lot of insight into how the DSi works by virtue of letting us inspect the boot process, the inter-processor communication, and most of the code that runs on the system. But all of that knowledge comes in an indirect way, from using the RAM tracer as a platform to run other experiments. I’ve been interested in figuring out whether there’s a way to use the RAM trace itself to help understand a system’s dynamic behaviour.

The RAM is on a packet-oriented bus, so it would make sense to have a tool that looks kind of like a packet-based protocol analyzer. Think Wireshark, but for memory.

But there are also a lot of complex patterns that show up over time. As the DS loads a file, or initializes itself, or renders frame after frame of a UI, there are obvious patterns that emerge. So it also might make sense to have a visual tool, like vusb-analyzer.

Unfortunately, both of these approaches ignore the spatial organization of memory. The bus is a stream of packets that say ‘read’ or ‘write’, but the contents of RAM as a whole is more like a file that’s changing over time. Like in a version control system.

So the tool I’ve been imagining is kind of a hybrid of these. It would have a graphical timeline that helps you visually navigate through large datasets and identify timing patterns. It would have a packet-by-packet listing of the reads and writes. And most importantly, it would be a hex dump tool. But instead of showing a hex dump of a static file, it would be a two-dimensional hex dump. The hex dump shows space, but you can also scrub forward or backward in time, and watch the hex dump change. The hex dump could be annotated with colors, to show which data is about to change, or which data recently changed. You could right click on a byte, and see hyperlinks to the memory transactions that are responsible for that byte’s previous and next values.

As far as I know, nobody’s written a tool like this. So I have no idea how useful it will actually be for reverse engineering or performance optimization, but it seems like a promising experiment at least. So far I’ve been working on an indexing and caching infrastructure to make it possible to interactively browse these huge memory dumps, and I’ve been working on the visual timeline widget. Here’s a quick screencast:

The top section shows read/write/zero activity binned by address, with each vertical pixel representing about 64 kB. The horizontal axis is time, with continuous zooming. The bottom section of the graph shows bandwidth, color-coded according to read/write/zero. Blue pixels are reads, reds are write, and orange is a write of a zero byte.

This log file is about a gigabyte of raw data, or about 2 minutes of wallclock time. It shows the Opera browser on the Nintendo DSi loading a very large web page, then crashing. You can see its heap growing, and you can watch the memory access patterns of code, data, and inter-processor communication.

There’s a lot of room for improvement, but I’m optimistic that this will be at least a useful tool for understanding the DSi, and maybe even a more generally applicable tool for reverse engineering and optimization.

As usual, the source is in svn if anyone’s interested. It’s implemented with C++, wxWidgets, sqlite3, and Boost. I’ve only tested it on Linux, but it “should” be portable.

DSi RAM tracing

It seems like a lot of people have been seeing my Flickr photostream and wondering what I must be up to- especially after my photos got linked on hackaday and reddit. Well, I’ve been meaning to write a detailed blog post explaining it all- but I keep running out of time. So I guess a “short” blog post is better than nothing.

The executive summary is that I’ve been working on making it possible to run homebrew games on the DSi. If you’ve used The Homebrew Channel on the Wii, you know what I’m talking about- we’d basically like to have the same thing for the DSi. And as it turns out, the DSi has a lot in common with the Wii.

I haven’t been working on this project alone. I’ve been collaborating with the excellent software and hardware engineers in Team Twiizers, the group responsible for The Homebrew Channel. Bushing has written a couple blog posts about our DSi efforts so far: one on the RAM tracing effort in general, and one with a sample trace.

My role in this project lately has been building hardware and software for a tool which is effectively a low-level CPU debugger for the DSi. It attaches to the RAM bus between the RAM and CPU, it’s capable of tracing all traffic in ‘real-time’ (after slowing down the system clock so the required bandwidth fits in USB 2.0), and just this week I got RAM injection working. The hardware interposes on one of the chip select signals between the CPU and RAM. If an address floats by that we want to patch, the hardware can quickly disable the real RAM chip and drive new data onto the bus as the CPU continues to read. It’s a very effective technique, and my hardware has a pretty capable system for storing and retrieving up to 16 kB of patch data now.

So why is all the complex hardware needed? Well, we hit two big dead-ends:

  • Several of us managed to run code from within a DSi-enhanced game using modified save files, but most of the interesting hardware on the DSi (like the SD card slot, and the internal flash memory) is disabled when these games are running. The earlier Team Twiizers Classic Word Games hack, WinterMute’s cookhack, and my own cookinject all suffer from this problem.
  • I’ve been able to read and write flash memory for a while. This isn’t as interesting as it may sound, since nearly all of the flash is encrypted. However, by running a lot of tests where we modified or compared the encrypted data, we were still able to gain a lot of knowledge about the layout of the flash memory, and some information about how the system boots.

What we’d really like to do is understand more about how the console’s main menu works, and how software is installed and loaded on the DSi. The best way to do this seemed to be RAM tracing. It required an FPGA and a lot of crazy soldering under a microscope, but now it’s working and we’re well on our way to unlocking the secrets of DSi homebrew.

If you’re interested in following our efforts to make homebrew possible on the DSi, definitely subscribe to Bushing’s blog at hackmii.com. He understands the Wii (and by extension, the DSi) better than I ever will, and he’s great at writing about it.

I have a Flickr set with lots more photos, but these are two of my favorite. The first is a photo of the entire setup as it’s currently configured. The second is a picture of the craziest of the microscopic solder joints in this project. This is where I separated the chip-select signal I mentioned above, giving the FPGA the ability to sit between the CPU and RAM. The trace I’m tapping into is buried under four other PCB layers. For a sense of scale, the orange enameled wire is 32 AWG, or about 0.2mm in diameter. Normal 30 AWG wire-wrapping wire is actually only the third smallest size of wire I’ve been using in this project 🙂

Debugging setup with new FPGA

Via excavation closeup

If you’re interested in the C and Verilog source code, it’s in Subversion as usual.

DSi hacking, BGA rework

I got a Nintendo DSi earlier this month. It’s a really cute little console, and a nice revision to the DS Lite. Screens are slightly larger, CPU is about twice as fast, 4x the RAM. And of course it has these camera thingies. Other folks have teardowns online with internal photos aplenty 🙂

So, I spent a week or so tooling around with the DSi doing normal consumer and/or homebrew things with it. Played some The World Ends With You, got an Acekard 2i and tried running some homebrew. Its DS compatibility mode seems to be pretty good. The only compatibility bug I bumped into was running my work-in-progress Robot Odyssey port. It mostly worked, but there was a little bit of crackling in the audio. Not surprising, I’m doing the audio in a somewhat unconventional way, and it doesn’t work properly in any DS emulator either 🙂

The DSi is pretty cute, but it would be nice if we could use its new features for homebrew software. It would be especially cool if you could run homebrew directly from its built-in SD card reader, without the need for any special-purpose hardware. That would really open the door to some cool indie games that nearly anyone could play.

Need to satisfy my curiosity, then. Out come the jewler’s screwdrivers, antistatic mat, and SMD rework tools…

There are a few other people working on hacking the DSi from other avenues. Team Twiizers just released a video of a save game exploit that they’re using to run code in DSi mode. This is pretty cool, and as far as I know it’s the first time a hobbyist has been able to run code in DSi mode. I’m not sure how much if any of the new hardware they’ve managed to reverse engineer through this hack.

There are also a handful of Wii hackers that have noticed that the DSi has pretty much the same file format for its downloadable games, so they’re trying to come up with a way to install homebrew DSiWare apps. Unfortunately, they need the system’s private key- and brute-forcing it doesn’t seem too promising. One way to get this private key would be to extract it from the DSi’s firmware.

The thing that interests me the most right now is figuring out how the DSi boots. There are plenty of other layers of software that you can exploit from the outside- but the bootloader and built-in firmware seems like the best starting point for any serious effort to reverse engineer the DSi’s hardware.

The DSi has upgradeable firmware, and there’s only one visible place that it could be stored: a 256 MB NAND flash chip. This is actually an embedded MMC (eMMC) chip from Samsung. It’s electrically pretty much the same as an MMC card, but in a tiny BGA form factor. It is wired directly up to the CPU, which would indicate that there is some kind of hardware or software bootloader that knows how to pull code off of the eMMC chip during early boot. This probably means there’s a mask ROM in the same package as the CPU. It would also make sense that this ROM would know how to write an initial firmware image to the eMMC chip during manufacturing, since there is no other way to access the eMMC chip after it’s soldered to the DSi’s main board.

I was hoping that the MMC data lines would be accessible via some of the many test points. None of them were obviously related to the NAND, though. I tried probing the SPI bus, where the original DS kept its firmware, but the eMMC chip was definitely on a separate bus. It didn’t help at all that the DSi refuses to boot unless every single one of its peripherals are attached, so this makes it rather hard to poke around the board’s test points while it’s running.

At this point I assumed that the traces running from eMMC chip to CPU were completely buried. My last recourse: desolder the eMMC chip, and hook it up to some kind of test jig that would let me read back its contents. Then perhaps I could copy it to a normal MMC card, and wire up an MMC socket on the DSi.

After practicing my BGA rework technique a bit, I desoldered the eMMC chip. If you power the DSi on without the eMMC chip, it displays a nice hexadecimal error code. This would seem to indicate that there is indeed a mask ROM in the CPU package- a hardware bootloader wouldn’t be smart enough to display a “nice” error like this.

Under the NAND flash

There was a pleasant surprise underneath the eMMC chip: All of the data signals were in fact accessible on nearby components or vias. I didn’t find them earlier, since I couldn’t really poke around with the ‘scope and have the board access the eMMC at the same time. But, armed with this knowledge, I could switch to a simpler approach: Solder the eMMC chip back on, then build a passive sniffer that tells me exactly what reads and writes the DSi’s CPU makes.

By some miracle, I managed to get the eMMC chip soldered back on successfully. The DSi’s mainboard is a huge pain to do BGA rework on, at least with the tools I have. It has so many ground/power planes and fills that the board is really hard to keep hot with just a hot-air rework tool.

My poor DSi then is ‘mostly’ back in one piece. Now I’m working on the next step: Hooking up an FPGA that can dump all the memory reads/writes over USB 🙂

More stuff: There are a few more pictures on flickr, and I started a thread with a few more technical details on the gbadev forums.

Robot Odyssey DS: First screenshots

This is nowhere near ready for prime-time, but:



Yep, it’s Robot Odyssey for the Nintendo DS. I literally just got this working yesterday, so please don’t ask for any precompiled binaries. If you don’t already know where the source code is, you really don’t want to see it 🙂

Before you ask, this is not a general-purpose DOS emulator for the DS. It’s actually a static binary translator which does most of the work in porting a DOS game to the DS, but there’s still an awful lot of manual intervention required. I only really developed the translator with Robot Odyssey in mind, so there are sure to be features missing that you’d need in order to use it with other games.

My intent is to turn this into a real port of Robot Odyssey to the DS, not just an emulation. In particular, I’d like to re-do the load/save game UI to support timestamps and thumbnails, and I’d like a soldering mode that makes use of the DS’s touchscreen.

Robot Odyssey Chip Disassembler

I’ve been spending more time hacking on Robot Odyssesy lately. Most of it has had a specific purpose… I’ll write a separate blog post on that project once it’s a bit more fully baked. In the mean time, the reverse engineering has had some useful side-effects.

Chip Simulation

If you haven’t heard of Robot Odyssey, it’s a game where you build logic circuits that control robots in order to solve puzzles. You can build circuits directly inside the robots, but you can also program them into ‘chips’- small 8-pin integrated circuits. The game also comes with several chips: a clock generator, 4-bit counter, and the infamous “wallhugger”, a chip you can stick in your robot to have it follow the walls of a maze.

Once you ‘burn’ a circuit into a chip, you can load and save it but you can’t see inside. So this has made the built-in chips a bit of a mystery. There’s no way in-game to see how they work, and for all we know they could be ‘magic’ in some way. I had speculated a bit on how they store the compiled chips. Was it some kind of super-optimized machine code? Or maybe some kind of encoded Programmable Array Logic that was especially quick to simulate on an 8086 processor?

Well, it turns out that the code for simulating chips is clever in places, but the file format is quite straightforward. It’s sort of halfway between an electrical netlist and a bytecode language. On every in-game clock tick, every bytecode instruction in the chip executes in order. Instructions can represent logic gates (AND, OR, XOR, NOT, RS flip-flop), or they can enter/exit a nested chip. Bytecode parameters can be one of two data types: The state of a pin, or a list of pin addresses.

Here’s a really simple circuit inside the Robot Odyssey prototype chip:

And this is the top of the .CSV (Chip Save) file that the game produces:

00000000  00 00 00 00 00 00 00 00  01 00 00 00 01 ff 07 00  |................|
00000010  02 00 09 ff 00 03 00 0a  ff ff ff ff ff 00 00 00  |................|
  • First 8 bytes: Pin states. All pins are off.
  • First opcode (01). This is an AND gate.
    • Pin states for the AND’s inputs. Both are off. (00 00)
    • A list of 16-bit addresses that will receive this gate’s output.
      • Address of Pin 2 (00 01)
      • End of list (FF)
  • Second opcode (07). This exits a chip. Since this is the outermost chip, after this opcode is the end of the chip’s electrical data. The parameters for this opcode are a list of lists which describes ‘nodes’, or places where we need to copy a pin state directly from one place to another.
    • First list:
      • Source address (00 02). This is Pin 3.
      • First destination (00 09). This is the first pin of the AND gate.
      • End of list (FF)
    • Second list:
      • Source address (00 03). This is Pin 4.
      • First destination (00 0a). This is the second pin of the AND gate.
      • End of list (FF)
    • End of list (FF)
  • Afterwards is garbage. In this case, (FF FF FF). This is probably left over in memory at the time the chip was compiled, and doesn’t mean anything. The chip interpreter doesn’t read this data.

Chip Disassembler

So, I figured out (I think) the entire file format, and wrote a Python script to disassemble it into something a little more human-readable. The above example chip disassembles to:

Sample Chip Disassembly

Chip1 {
    HELP       '1'
    HELP       '2 OUTPUT'
    HELP       '3 INPUT 1'
    HELP       '4 INPUT 2'
    HELP       '5'
    HELP       '6'
    HELP       '7'
    HELP       '8'
    PIN        Chip1_pin2<0> 'out'
    PIN        Chip1_pin3<0> 'in'
    PIN        Chip1_pin4<0> 'in'
    AND        AND1_in0<0> AND1_in1<0> => [Chip1_pin2<0>]
    Node       Chip1_pin3<0> => [AND1_in0<0>]
    Node       Chip1_pin4<0> => [AND1_in1<0>]

Wallhugger Disassembly

For a more exciting example, now we can finally see inside the wallhugger chip! Converting this to a graphical schematic is left as an exercise to the reader 🙂

Chip1 {
    HELP       'Wall Hugger'
    HELP       '1 Top thruster    ^ Hook up pins'
    HELP       '2 Left bumper     ^ as described.'
    HELP       '3 Left thruster   ^ This chip will'
    HELP       '4 Bottom bumper   ^ cause a robot'
    HELP       '5 Bottom thruster ^ to follow(hug)'
    HELP       '6 Right bumper    ^ the walls of a'
    HELP       '7 Right thruster  ^ room.'
    HELP       '8 Top bumper      ^'
    PIN        Chip1_pin1<0> 'out'
    PIN        Chip1_pin2<0> 'in'
    PIN        Chip1_pin3<0> 'out'
    PIN        Chip1_pin4<0> 'in'
    PIN        Chip1_pin8<1> 'in'
    PIN        Chip1_pin7<1> 'out'
    PIN        Chip1_pin6<0> 'in'
    PIN        Chip1_pin5<0> 'out'
    OR         OR1_in0<0> OR1_in1<0> => [Chip1_pin5<0>]
    OR         OR2_in0<0> OR2_in1<0> => [OR1_in1<0>]
    OR         OR3_in0<0> OR3_in1<0> => [Chip1_pin3<0>]
    OR         OR4_in0<0> OR4_in1<0> => [OR3_in1<0>]
    AND        AND1_in0<0> AND1_in1<1> => [OR2_in1<0>, OR4_in1<0>]
    OR         OR5_in0<1> OR5_in1<0> => [Chip1_pin7<1>]
    OR         OR6_in0<0> OR6_in1<0> => [OR5_in1<0>]
    AND        AND2_in0<0> AND2_in1<0> => [OR2_in0<0>, OR6_in1<0>]
    AND        AND3_in0<0> AND3_in1<0> => [OR6_in0<0>, OR8_in0<0>]
    OR         OR7_in0<0> OR7_in1<0> => [Chip1_pin1<0>]
    OR         OR8_in0<0> OR8_in1<0> => [OR7_in1<0>]
    AND        AND4_in0<0> AND4_in1<0> => [OR4_in0<0>, OR8_in1<0>]
    NOT        NOT1_in0<1> => [AND1_in0<0>, AND3_in0<0>, AND4_in1<0>, AND2_in0<0>]
    OR         OR9_in0<1> OR9_in1<0> => [NOT1_in0<1>]
    OR         OR10_in0<0> OR10_in1<0> => [OR9_in1<0>]
    OR         OR11_in0<0> OR11_in1<1> => [OR9_in0<1>]
    Chip2 {
        PIN        Chip2_pin1<1>
        PIN        Chip2_pin2<0>
        PIN        Chip2_pin3<0>
        PIN        Chip2_pin4<0>
        PIN        Chip2_pin8<1>
        PIN        Chip2_pin7<0>
        PIN        Chip2_pin6<0>
        PIN        Chip2_pin5<0>
        FF<01>     FF1_in0<0> FF1_in1<1> => [Chip2_pin5<0>] []
        OR         OR12_in0<0> OR12_in1<0> => [FF1_in0<0>]
        OR         OR13_in0<1> OR13_in1<0> => [FF1_in1<1>]
        OR         OR14_in0<0> OR14_in1<0> => [OR13_in1<0>]
        OR         OR15_in0<0> OR15_in1<0> => [OR12_in0<0>]
        OR         OR16_in0<0> OR16_in1<0> => [OR15_in1<0>]
        FF<01>     FF2_in0<0> FF2_in1<1> => [Chip2_pin6<0>] []
        OR         OR17_in0<0> OR17_in1<0> => [FF2_in0<0>]
        OR         OR18_in0<0> OR18_in1<1> => [FF2_in1<1>]
        OR         OR19_in0<0> OR19_in1<0> => [OR17_in0<0>]
        OR         OR20_in0<0> OR20_in1<0> => [OR19_in1<0>]
        OR         OR21_in0<0> OR21_in1<0> => [OR18_in0<0>]
        FF<01>     FF3_in0<0> FF3_in1<1> => [Chip2_pin7<0>] []
        OR         OR22_in0<0> OR22_in1<0> => [FF3_in0<0>]
        OR         OR23_in0<0> OR23_in1<1> => [FF3_in1<1>]
        OR         OR24_in0<0> OR24_in1<0> => [OR22_in0<0>]
        OR         OR25_in0<0> OR25_in1<0> => [OR24_in1<0>]
        OR         OR26_in0<0> OR26_in1<0> => [OR23_in0<0>]
        FF<10>     FF4_in0<1> FF4_in1<0> => [Chip2_pin8<1>] []
        OR         OR27_in0<1> OR27_in1<0> => [FF4_in0<1>]
        OR         OR28_in0<0> OR28_in1<0> => [FF4_in1<0>]
        OR         OR29_in0<0> OR29_in1<1> => [OR27_in0<1>]
        OR         OR30_in0<1> OR30_in1<0> => [OR29_in1<1>]
        OR         OR31_in0<0> OR31_in1<0> => [OR28_in0<0>]
        Node       Chip2_pin1<1> => [OR30_in0<1>, OR13_in0<1>, OR18_in1<1>, OR23_in1<1>]
        Node       Chip2_pin2<0> => [OR25_in0<0>, OR31_in1<0>, OR21_in0<0>, OR14_in1<0>]
        Node       Chip2_pin3<0> => [OR20_in0<0>, OR28_in1<0>, OR26_in0<0>, OR14_in0<0>]
        Node       Chip2_pin4<0> => [OR16_in0<0>, OR31_in0<0>, OR21_in1<0>, OR26_in1<0>]
    Node       Chip1_pin2<0> => [OR7_in0<0>, OR10_in0<0>, Chip2_pin2<0>]
    Node       Chip1_pin4<0> => [OR3_in0<0>, OR10_in1<0>, Chip2_pin3<0>]
    Node       Chip1_pin8<1> => [OR5_in0<1>, OR11_in1<1>, Chip2_pin1<1>]
    Node       Chip1_pin6<0> => [OR1_in0<0>, OR11_in0<0>, Chip2_pin4<0>]
    Node       Chip2_pin8<1> => [AND1_in1<1>]
    Node       Chip2_pin7<0> => [AND2_in1<0>]
    Node       Chip2_pin6<0> => [AND3_in1<0>]
    Node       Chip2_pin5<0> => [AND4_in0<0>]

Download it

If you want to try it out yourself, or you want more detailed info about the file format, grab the latest source code here:


Robot Odyssey Mouse Hack 1

Yesterday I spent some more time reverse engineering Robot Odyssey. This was a great game, and it’s kind of a nostalgic pleasure for me to read and figure out all of this old 16-bit assembly. So far I’ve reverse engineered nearly all of the drawing code, big chunks of the world file format, and most of the code that’s responsible for moving around objects on the screen.

So, I thought I’d try manipulating some of that data. I extended my existing binary patch to add even more code to the game’s per-frame blit function. It sets up the mouse using DOS Int 33h, and lets you manipulate the game’s table of sprite locations by dragging objects with the mouse.

I took a video to show the results. It’s still obviously a hack, but it may actually be kind of useful for assembling circuits more quickly.

The source code is available, but it’s a bit rough. This patch is getting kind of unwieldy in its current state, and I think my next step will be coming up with a better tool for doing these kinds of patches. I’m thinking either:

  • A pre/post-processor for NASM, which has binary patching oriented features.It would use signatures (byte mask + md5sum) to identify interesting code and data regions in the original binary, and allow you to replace or invoke snippets of code in the original binary using this mechanism. This should make it easier to do very invasive patches which can apply against slightly different versions of a game.
  • Or, scripting support for DOSBox.This could be a lot like the Lua support in FCEUX which made Super Mario Drag & Drop possible. As much as I like the idea of self-contained binary patches that operate in the same environment the game itself runs in, a Python plugin interface for DOSBox could be extremely powerful. Imagine a real-time level editor, or various kinds of real-time memory comparison tools…

A Binary Patch for Robot Odyssey

Robot Odyssey is one of the games that I have the fondest childhood memories of. It’s both a high-quality educational game, and a gentle (but very challenging) introduction to digital logic.

There’s a Wikipedia article on the game. There’s also DroidQuest which is a Java-based clone of Robot Odyssey. The DroidQuest site also contains some good info on Robot Odyssey itself, including the only walkthrough I’ve ever seen.

So, I recently got inspired to try playing through Robot Odyssey again. As a kid, I never managed to beat the game. For a long time, it was nearly impossible to run it on a modern machine. It required a 5.25″ disk drive due to the ancient copy protection, it has CGA graphics, assumes you’re using an IBM XT keyboard, and all of the timing is based on the 4.77 MHZ CPU frequency of the original XT.

Thankfully there’s DOSBox, a really high quality emulator that can run old games like this quite faithfully. I started trying to play Robot Odyssey on DOSBox, but there were still two big problems:

  • Copy Protection.Robot Odyssey checks to make sure you’re running from the original 5.25″ disks, which have a “flaky bit” on them. If the flaky bit isn’t detected, the game will still load but your soldering iron doesn’t work!
  • Inconsistent speed.DOSBox is really good at slowing down the CPU, but this isn’t an exact science. Some things that were really really slow on the original XT (like writing to the CGA card) are fairly fast on DOSBox, and other things are comparatively too slow. This means you’re constantly futzing with the speed of DOSBox’s CPU emulation, depending on what level you’re in, how many robots are on the screen, etc.

The Patch:

So, I decided to solve these problems (and a few others) by binary patching the game itself. Since there are a bunch of user-tweakable knobs, I figured the best way to distribute this patch was as a Python script which patches the original binaries. You can grab the script from:

If you’re interested in the technical details of how the patch works, the source is pretty well commented. I won’t bore you with that here. This is a list of the patcher’s features:

  • Disables copy protection.This is necessary to run the game on any modern machine, even assuming you have the original disks.
  • Installs a frame rate limiter.Instead of adjusting the CPU speed, this is a real and fairly accurate frame rate limiter. You can specify a desired frame rate on the command line when applying the patch. By default it runs at 8 FPS, which feels about right based on memory. (I don’t have an IBM XT handy for checking what the speed is supposed to be…)
  • Halts the CPU when idle.When the frame rate limiter is sleeping, it yields the CPU. This will help a lot if you’re running the game in a multitasking environment or a virtual machine.
  • Compatibility with Windows XP’s built-in DOS emulation.You need the “-p” flag for this, and the frame rate limiter won’t be as precise- but the game will be playable just by double-clicking it in Windows!
  • “Fast” mode.This is an optional feature, enabled with the “-f” flag, which speeds up the game when keyboard input is waiting. This makes it feel a lot more responsive, and makes it faster to navigate around the level. You can also hold down any repeating key as a very simple “turbo” button.
  • Keyboard compatibility patch.Normally, Robot Odyssey is totally unplayable on any computer without a numeric keypad, including laptops, due to a bug in its keyboard handler. If you enable the “-k” flag, the patcher will rewrite the game’s keyboard mapper to be fully compatible with AT keyboards. This also removes the need to play with Caps Lock on.


To use the patch, you’ll need:

  • Python
  • NASM, a spiffy assembler
  • Original Robot Odyssey binaries.Make sure the binaries you have aren’t already patched or cracked in any way. I won’t distribute these myself (so don’t ask!) but there are numerous abandonware sites on the web which should have this game. I’m not sure if multiple revisions of this game were produced. This patcher tries to be pretty lenient, but I’ve only tested it with one version. For reference, these are the SHA-1 hashes from my copy of Robot Odyssey:
    756a92e6647a105695ac61e374fd2e9edbe8d935  GAME.EXE
    692a9bb5caca7827eb933cc3e88efef4812b30c5  LAB.EXE
    360e983c090c95c99e39a7ebdb9d6649b537d75f  MENU2.EXE
    a6293df401a3d4b8b516aa6a832b9dd07f782a39  MENU.EXE
    12df28e9c3998714feaa81b99542687fc36f792f  PLAY.EXE
    bb7b45761d84ddbf0a9e561c3c3603c7f65fd36d  SETUP.EXE
    e4a1e59665595ef84fe7ff45474bcb62c382b68d  TUT.EXE
  • Something that can run DOS games with CGA graphics! This could be a PC booted into DOS, a Windows machine, DosBOX…

Before you apply the patch, make backup copies of all your game binaries:

micah@carrot:~/robot$ mkdir original
micah@carrot:~/robot$ cp *.EXE original/
micah@carrot:~/robot$ ls original/

Now apply the patch to each binary. Each section of the game (Robotropolis, the Innovation Lab, and the Tutorials) have their own separate EXE file, each of which has a separate copy of the game engine. You can use the same or different settings for each.

For example, to patch all binaries with default frame rate, and with the keyboard patch enabled:

micah@carrot:~/robot$ python robot_odyssey_patcher.py original/GAME.EXE GAME.EXE -k
Found copy protection. Disabling...
Found blitter loop. Patching...
Found keyboard mapper. Patching...
Saving comment at 0x1a4d0
micah@carrot:~/robot$ python robot_odyssey_patcher.py original/TUT.EXE TUT.EXE -k
Copy protection not found.
Found blitter loop. Patching...
Found keyboard mapper. Patching...
Saving comment at 0x11380
micah@carrot:~/robot$ python robot_odyssey_patcher.py original/LAB.EXE LAB.EXE -k
Found copy protection. Disabling...
Found blitter loop. Patching...
Found keyboard mapper. Patching...
Saving comment at 0x152a0

Now run PLAY.EXE in Windows, DOSBox, etc. You should see the game running at a steady 8 FPS, and the non-numpad arrow keys should work.

Experiment with the options! The “-h” option gives you a full list of the available setitngs. For example, if you want the game to run a bit faster, you might add “-f -r 10“. This will run the game at 10 FPS, and speed it up when there’s keyboard input. Remember to add “-p” if you’re running in the Windows DOS emulation.

This patcher may also work for games other than Robot Odyssey, which were based on the same engine. For example, Gertrude’s Secrets and Rocky’s Boots. You may have to leave off the “-k” option, since these games don’t necessarily use the same keyboard mapping as Robot Odyssey.

Enjoy exploring Robotropolis!

Self-contained TED receiver

My previous entry introduced a homebrew receiver for the powerline-based data protocol used by The Energy Detective. I just designed a second revision of that receiver. This one is self-contained: It gets power and modulated data from a 9V AC wall-wart transformer, and decoded data leaves via an RS-232 serial port at 9600 baud. Best of all the circuit is very simple: Just an 8-pin microcontroller and a single op-amp.

Major changes in this version:

  • DC power for the circuit is now provided by the 9V AC input, instead of a separate power supply. Previously this would have caused unacceptable levels of harmonic distortion in the input signal. In the new design, this is mitigated by an inductor (which forms an LC low-pass filter), and by the lower power consumption of a single modern op-amp versus three ancient op-amps.
  • By using a simpler filter design and a modern op-amp with a gain-bandwidth product of at least 10 MHz, the bandpass filter and amplifier can be built using only a single op-amp.

Note that the MAX475 op-amp I’m using has been discontinued by the manufacturer, and it’s now hard to find. I just used it because I had one handy in my junk drawer. I’ll verify this design with other op-amps as soon as I can, but it should work with just about any op-amp which can operate on a single-ended 5V supply, and which has a high enough GBW.

Firmware and schematics (PNG and EAGLE formats) are in Subversion and more info on the theory of operation was presented in my previous blog entry.

Interfacing with The Energy Detective

I recently bought The Energy Detective (TED), a pretty inexpensive and friendly way to keep tabs on your whole house’s electricity usage. It’s a lot like having a more featureful version of your utility company’s power meter, sitting on your kitchen counter. It can estimate your utility bill, and tell you how much electricity and money you’re using in real-time. The resolution is pretty good- 10 watts, 1 second.

As a product, I’ve been pretty happy with it. It does what it claims to, and the measurements seem to be fast and accurate. Of course, being the crazy hacker I am, I wanted to interface TED with other things. I don’t have a home automation system as such, but I did want to get real-time graphs of my electricity usage over various time periods. I also wanted the possibility to use TED as an information feed for various other display devices around the house. (But that’s a topic for another blog post…)

The stock TED package consists of two pieces: A measurement/transmit unit (MTU) and receive/display unit (RDU). The MTU has a pair of current transformers, and it installs entirely in your house’s breaker panel. It takes the power readings every second, and transmits them over the power lines to the RDU, which is just a little LCD panel that plugs into the wall. The RDU has a USB port, which you can use with TED’s “Footprints” software.

So, hoping that the USB port would do what I want, I bought the Footprints software for $45. The TED system itself is, in my opinion, a really good value. The Footprints software is not. As a consumer, I was disappointed by two main things:First of all- the UI lacks any polish whatsoever. It looks like a bad web page from the 90’s. Second of all, the data collection is not done in hardware, it’s implemented by a Windows service. This means you can’t collect data to graph unless you have a Windows PC running. Not exactly a power-efficient way to get detailed power usage graphs.

As a hobbyist, a few more things frustrated me about Footprints. The implementation was pretty amateurish. Their Windows service runs a minimal HTTP server which serves up data from an sqlite database in XML. The front end is actually just a Flash applet masquerading as a full application. Energy Inc, the company behind TED, has an API for Footprints: but you have to sign a legal agreement to get access to it, and I wasn’t able to get any details on what the API does and doesn’t include without signing the agreement. So, I opted not to. It would be much more fun to do a little reverse engineering…

So, I did. The end result is that I now have two ways of getting data out of my TED system.

Using the USB port

The TED RDU’s USB port is actually just a common FTDI usb-to-serial adapter. The RDU sends a binary packet every second, which includes all of the data accessible from the Footprints UI. This includes current power usage, current AC line voltage, utility rates, month-to-date totals, and anything else you’ve programmed into your RDU.

There has been some prior work on reverse engineering this protocol. The Misterhouse open source home automation project has a Perl module which can decode the TED messages.

Unfortunately, the Perl module in Misterhouse won’t work with more recent versions of the RDU like mine. The recent RDUs have a different packet length, and they require a polling command to be sent before they’ll reply with any data.

I found the correct polling command by snooping on Footprints’ serial traffic with Portmon. I also noticed a packet framing/escaping scheme, which explains some of the length variations that would have broken the module in Misterhouse.

The result is a Python module for receiving data from a TED RDU. It isn’t terribly featureful, but it should be pretty robust.

Direct from the wall socket

Now for the more exciting method: What about reading data directly from the power line, without using the TED receive/display unit at all? This could provide some exciting opportunities to embed a small and cheap TED receiver inside of other devices, and it would provide some insight on what exactly is being transmitted by the box in my breaker panel.

The TED RDU is pretty simple internally: A Dallas real-time clock, PIC18 microcontroller, chip-on-glass LCD, some buttons, and the TDA5051A, a single-chip home automation modem from Philips. This chip can receive and transmit ASK modulated signals at 1200 baud, with a carrier frequency of around 132 kHz.

Digi-key carries the TDA5051A, but I figured it would be more educational (and more hobbyist-friendly) to try and build a simpler receiver from scratch using only commonly available parts. The result is the following design, with an 8-pin AVR microcontroller and three op-amps:

Update: There is now a Revision 2 of the schematic, which uses only a single power supply and one op-amp.

  1. The power line is sampled via a 9V AC wall wart. With this design, it needs to be a separate isolated power supply. So far, I haven’t had any luck with using this same transformer to power the circuit. Any rectifier introduces high-frequency harmonics which drastically degrade the signal-to-noise ratio.
  2. The first stage is a 10x amplifier and 138kHz band-pass filter. This is from Texas Instrument’s “Filter Design in Thirty Seconds” cheat-sheet.
  3. The next stage is an amplifier and high-pass filter, to remove the last remnants of 60 Hz ripple.
  4. The third stage is just for amplification. In a design which used higher quality op-amps, this stage may not be necessary.
  5. After the third op-amp, the signal passes through an RC network which AC couples it, filters out high frequency noise which could cause glitches in the microcontroller’s I/O pin, and limits the current into the micro’s clamping diodes.
  6. At this point, we have an amplified digital signal which is still ASK-modulated:

The rest of the work happens in software. The ATtiny85 keeps itself quite busy:

  1. The AVR first applies a narrow digital band-pass filter, to strip out any ringing or other noise that remains after the analog band-pass filter.
  2. Next, a digital low-pass filter. This passes the 1200 baud serial data, but rejects higher frequency glitches.
  3. A threshold is applied, with hysteresis, to convert this data into a stream of ones and zeros.
  4. Next is a relatively typical software serial decoder, with majority-detect. The ASK modulated data has 8 data bits, 1 start bit, and 2 stop bits. Polarity is slightly different than typical RS-232. “1” is the presence of an ASK pulse, “0” is the absence of a pulse. Start bits are “1”, stop bits are both “0”.
  5. Using this serial engine, we receive an 11-byte packet.
  6. The packet is converted from TED’s raw format into a more human-readable (but still machine-friendly) serial format.
  7. The reformatted data is finally output via a software serial port at 9600 baud.

The end result, as viewed from a PC connected to the serial output pin:

HC=245 KW=001.324 V=121.138 CNT=023
HC=245 KW=001.335 V=121.135 CNT=024
HC=245 KW=001.348 V=121.072 CNT=025
HC=245 KW=001.345 V=121.021 CNT=026
HC=245 KW=001.345 V=121.044 CNT=027
HC=245 KW=001.324 V=121.152 CNT=028
HC=245 KW=001.314 V=121.280 CNT=029
HC=245 KW=001.314 V=121.306 CNT=030
HC=245 KW=001.311 V=121.297 CNT=031
HC=245 KW=001.349 V=121.232 CNT=032
HC=245 KW=001.347 V=121.228 CNT=033

The “KW” and “V” columns are self-explanatory. “HC” is my TED’s house code, and “CNT” is a packet counter. Normally in increments by one. If it skips any numbers, we missed a packet.

But wait, what’s wrong with this picture? The power and voltage readings have too much precision. The standard TED display unit will give you a resolution of 10 watts for power, and 0.1 volt. As my data above shows, the TED measurement unit is actually collecting data with far more precision. I can only guess why TED doesn’t give users more precision normally. I suspect they removed it because the extra precision may imply extra accuracy that may not exist.

So, what is TED actually sending? Once a second, it broadcasts an 11-byte packet over your power lines at 1200 baud:

  • Byte 0: Header (always 0x55)
  • Byte 1: House code
  • Byte 2: Packet counter
  • Byte 3: Raw power, bits 7-0
  • Byte 4: Raw power, bits 15-8
  • Byte 5: Raw power, bits 23-16
  • Byte 6: Raw voltage, bits 7-0
  • Byte 7: Raw voltage, bits 15-8
  • Byte 8: Raw voltage, bits 23-16
  • Byte 9: Unknown (Flags?)
  • Byte 10: Checksum

Pretty straightforward. I don’t know the actual A/D converter precision in the measurement/transmit unit, but both the power and voltage readings are sent on the wire as 24-bit raw numbers. I’m still not sure what byte 9 is. In my measurements, it hovered around 250, sometimes jumping up or down by one. It may be some kind of flag byte, or maybe it measures power line frequency? The checksum is dead simple: Add every byte in the packet (modulo 256), and the checksum byte ensures the result is zero.

To figure out what to do with these raw measurements, I collected data simultaneously with my circuit and with the TED RDU’s USB interface. Both sets of results went into a spreadsheet. After removing outliers, I did a linear regression. The resulting linear function is what you’ll find in the current firmware for my homebrew receiver. The following plots from the spreadsheet are a pretty striking illustration of the additional precision available via my raw interface:

The top graph shows power line voltage as recorded by the TED RDU. The second graph shows the raw values I’m receiving from the TED MTU. The bottom graph shows the correlation between the two, in blue, and my linear regression, in red.

I plan to keep improving the circuit. Hopefully there’s a way to get both data and power from a single supply, without dealing with any annoying high-voltage circuitry. If there is any interest, I might make a kit available. If you’re interested, post a comment and let me know what features you’d like. USB port? Serial port? Display?

Using an AVR as an RFID tag

Experiments in RFID, continued…

Last time, I posted an ultra-simple “from scratch” RFID reader, which uses no application-specific components: just a Propeller microcontroller and a few passive components. This time, I tried the opposite: building an RFID tag using no application-specific parts.

Well, my solution is full of dirty tricks, but the results aren’t half bad. I used an Atmel AVR microcontroller (the ATtiny85) and a coil. That’s it. You can optionally add a couple of capacitors to improve performance with some types of coils, but with this method it’s possible to build a working RFID tag just by soldering a small inductor to an AVR chip:



The above prototype emulates an EM4102-style tag- a very popular style of low-frequency RFID tag which stores a 40-bit unique ID. I can read my bogus ID value (0x12345678AB) using Parallax’s RFID reader. Below is another prototype, with a larger coil and a couple of capacitors for added range and stability. It is programmed to emulate a HID prox card, a simple FSK-modulated tag with a 44-bit payload. I can read this card successfully with my garage door opener. This one is a little large to conveniently carry around, but a smaller AVR package should help.


AVRFID Tag Prototype

So, the shiny electrical tape is beautiful, but how does this thing even work? The power pins on the microcontroller aren’t even connected!

As I said, this makes use of several dirty tricks:

  • The coil actually powers the AVR through two of its I/O pins. Nearly every chip out there has clamping diodes on its I/O pins, which prevent voltages on that pin from rising above the chip’s supply voltage or sinking below ground. These diodes are useful for arresting static discharge.When you first hold the RFID tag up to a reader, the chip has no power- the supply voltage is zero. When the coil starts to pick up power from the RFID reader, these two I/O pins are presented with a sine wave, a few volts in amplitude. Anywhere that sine wave exceeds the supply voltage, some energy is diverted from the coil to the chip’s supply rails, via the clamping diode. The end result is that the chip is powered, and the coil’s sine wave is truncated. The top and bottom of the sine have been chopped off, and it looks a lot more like a square wave now.
  • Power filtering using the AVR’s die capacitance. In the smaller prototype, there is no power filtering capacitor at all. In fact, the power is filtered by the internal capacitance of the power planes in the AVR’s silicon die. This isn’t much, but it makes the power supply stable enough that we can execute code even though the supply is pulsing at 125 kHz.
  • Very low voltage operation. This particular ATtiny85 chip is specified for operation at voltages as low as 2.5v. The extended voltage range version (I didn’t have any of these handy) is specified down to 1.8v. But I’m running these AVRs at barely over 1 volt. At these voltages, the normal AVR clock oscillators don’t work- but I can get away with this because of the next hack…
  • The coil is the AVR’s clock source. The inductor isn’t just hooked up to any I/O pin: it’s actually connected to the AVR’s clock input. Remember the square-ish wave we’re left with after the clamping diodes suck away some power? That waveform is now our clock input. The microcontroller is executing code at 125 kHz, in lockstep with the RFID reader’s carrier wave.
  • Firmware? What firmware? At such low speeds, the chip’s firmware looks less like a program, and more like a sequence of I/O operations to perform in sync with each carrier clock cycle. There aren’t a lot of cycles to spare. In the EM4102 protocol, you could potentially do some useful work with the 32 clock cycles you have between each bit. With the HID protocol, though, you need to output an FSK edge as often as once every 4 clock cycles. As a result, the firmware on the RFID tag is extremely dumb. The “source code” is really just a set of fancy assembler macros which convert an RFID tag code into a long sequence of I/O instructions.

The fact that this thing works at all is quite a testament to the robust design of the AVR. The latest AVRFID source is in Subversion, as usual.