Another Reason to Beware Bargain Basement Bluetooth

I was debugging a Bluetooth-related problem on a Windows 7 machine recently, and I found another great example of why sometimes you get what you pay for, even when buying something as nominally standardized and homogeneous as a Bluetooth adapter. It so happens that this machine was using one of these $5 adapters with the fake antenna:

That was the first thing I noticed about these adapters. The second was that each and every one of them had the same address, 11:11:11:11:11:11. Someone didn’t feel like paying for a MAC OUI.. or an EEPROM.

Anyway, back to the present… I had been noticing that sometimes when sending a file to my cell phone from a Windows 7 machine, it would time out. It didn’t happen every time, but it seemed to time out more often than not. I dig a little deeper, and it looks like the file transfer application (fsquirt.exe) is never even getting a chance to open a connection to the phone! It is timing out far before that even happens. What’s the deal?

To make sense of this, it helps to understand a bit more about how the Bluetooth stack works in Windows 7. First a big caveat: I do not work at Microsoft, nor do I have any confidential information about the inner workings of Windows. All I have, really, are informed guesses based on my observations and perhaps a little reverse engineering to satisfy my curiosity.

The main actors here are:

  • fsquirt.exe — The GUI wizard which walks you through sending a file to a Bluetooth device. It lets you choose a device, then it sends the file via the standard OBEX protocol.
  • bthprops.cpl — Technically this is a control panel, but for historical reasons it also implements the public API for Microsoft’s Bluetooth stack. This includes entry points for enumerating attached radios, inquiring for nearby devices, and setting up service discovery records for the local machine. Applications can link to this as a DLL.
  • fdbth.dll — Microsoft’s applications, like fsquirt, actually don’t use the public APIs from bthprops.cpl to inquire for nearby Bluetooth devices. Those public APIs are synchronous: they don’t return until the Bluetooth inquiry is totally finished. It is of course much more user-friendly to display Bluetooth devices as they are discovered, and Microsoft’s apps do this by using an undocumented Function Discovery provider.
  • bthserv.dll — Some operations in bthprops.cpl go directly to the bthport driver using device ioctls, but there is a system-wide service which maintains a database with the state of all known Bluetooth devices. It’s also responsible for maintaining local SDP records. The APIs in bthprops.cpl communicate with bthserv.dll via local RPC calls.
  • bthport.sys — This is the kernel-mode driver which does most of the work. It receives remote name requests, inquiries, and many other operations in the form of ioctls from user-space code.

The important fact I learned while investigating this bug: Inside bthport.sys, there is a global FIFO queue of actions which contact remote devices. If anyone asks to read a device’s name, to perform an inquiry, or to establish an ACL connection with a remote device, it goes in this queue. From Microsoft’s perspective this probably simplifies handling unreliable Bluetooth radios quite a lot, since we never rely on the radio to handle more than one of these operations at a time. Additionally, Inquiry operations are rather heavy-weight, and depending on your link manager settings, you may not be able to do anything else while they’re in progress.

So, back to the hang. When I traced the communications between the USB dongle and the Windows 7 Bluetooth stack, I found this:

The setting: fsquirt has just finished the wizard pane where we’re looking for nearby devices. It notifies fdbth that it no longer needs to keep looking. But in its haste to prettify the UI as much as possible, it has already started asking for human-readable names for all of the Bluetooth devices within earshot. Those requests go into bthport’s queue. The fsquirt app wants to establish an ACL connection to my phone already, but it’s waiting in line behind one of these name requests.

In case you don’t have the Bluetooth spec memorized (I sure don’t), here’s a play-by-play of the USB log:

  • OUT 05 04… — The PC asks the Bluetooth dongle to get a human-readable device name for one of the nearby devices.
  • IN 0F… — Command is in-progress.
  • (Over 20 seconds elapse)
  • IN 03 0B 08… — Connection request finished, error code 08 (Timed out)

But wait! We never asked the device to create a connection! We asked it to read the other device’s name! This looks like a pretty grievous firmware bug in this decrepit little dongle. As we see below, Windows is pretty confused too. My “waiting” and “timed out” notes indicate the state of the fsquirt wizard. The timestamps aren’t saved in this log, but fsquirt is only willing to wait 10 seconds or so before giving up. It’s waiting on bthport, and bthport is waiting on any response, even a timeout, from the Bluetooth adapter.

And it keeps waiting, for 70 seconds. Why so long? It should never take that long for a Bluetooth device that’s actually present to respond. Well, actually it isn’t waiting on the remote device. It’s waiting on the radio itself. The radio is programmed to time out after a much shorter interval specified by Windows, in this case about 20 seconds. Windows is relying on the radio to implement this timeout. What we’re seeing here is an even longer failsafe timeout, to catch errors with the local radio. Like the one we just experienced.

When Windows finally gives up on waiting for our buggy radio to respond correctly to the remote name timeout, we can see it issue the next command. This time it’s the Create Connection command that fsquirt had been waiting on. But fsquirt is already long gone, and the user is already frustrated.

If we’re lucky, the user throws the buggy Bluetooth dongle out the window, not their PC.

Cube64 GameCube to N64 Adaptor

Enjoy retro N64 games, but can’t stand the controller? That’s the situation I found myself in about 7 years ago, back in 2004. So I built an adaptor, to use Game Cube controllers on the N64.

The adaptor hardware is very simple- all you need is a PIC microcontroller. I originally designed the project to work with the very popular at the time PIC16F84A, or some smaller 8-pin chips. It bit-bangs both protocols, so you don’t need any more hardware than a tiny ยตC, a couple resistors, and optionally either a voltage booster or battery pack to run the “rumble” vibration motor.

I had a lot of fun building it, since it was an opportunity to reverse engineer a protocol that, as far as I could tell, had never been documented publicly. There were many web sites explaining the basics of the N64 and Game Cube protocols- enough to talk to the controllers with your own hardware. But there was significantly more to learn about emulating an N64 controller, since there are many features that you don’t really need to use the controller, but which games will use. The N64 also has the added complexity of having a memory card slot. The controller implements a protocol that tunnels SRAM reads/writes over the controller wire. Peripherals like the Rumble Pak pretend to be SRAM, but are actually memory-mapped I/O devices.

This is a very old project, but I thought I’d do a quick post with an update on it. I never really released the project, I just quietly posted the source code and the CIA feed. Since then, many others have found the project and built their own.

I recently heard from Jacques Gagnon, who went a step farther. He was frustrated by a few lingering bugs, most notably lack of support for the WaveBird wireless controllers. So he pulled the project into Google Code and has been hacking away! WaveBird controllers work now, and he’s added several other new features, such as the ability to store multiple sets of button mapping layouts in memory.

If you’re still interested in this classic gaming platform and you’d love to have your own Cube64 adaptor, I highly encourage you to check out Jacques’ work. The adaptors are easy to build, especially if you already have some experience with microcontroller programming.

Temporal Hex Dump

After building some hardware to trace and inject data on the Nintendo DSi’s RAM bus, it became obvious pretty fast that there’s a lot of data there, and (as far as I know) no good tools for analyzing these sorts of logs.

The RAM tracer has already given us a lot of insight into how the DSi works by virtue of letting us inspect the boot process, the inter-processor communication, and most of the code that runs on the system. But all of that knowledge comes in an indirect way, from using the RAM tracer as a platform to run other experiments. I’ve been interested in figuring out whether there’s a way to use the RAM trace itself to help understand a system’s dynamic behaviour.

The RAM is on a packet-oriented bus, so it would make sense to have a tool that looks kind of like a packet-based protocol analyzer. Think Wireshark, but for memory.

But there are also a lot of complex patterns that show up over time. As the DS loads a file, or initializes itself, or renders frame after frame of a UI, there are obvious patterns that emerge. So it also might make sense to have a visual tool, like vusb-analyzer.

Unfortunately, both of these approaches ignore the spatial organization of memory. The bus is a stream of packets that say ‘read’ or ‘write’, but the contents of RAM as a whole is more like a file that’s changing over time. Like in a version control system.

So the tool I’ve been imagining is kind of a hybrid of these. It would have a graphical timeline that helps you visually navigate through large datasets and identify timing patterns. It would have a packet-by-packet listing of the reads and writes. And most importantly, it would be a hex dump tool. But instead of showing a hex dump of a static file, it would be a two-dimensional hex dump. The hex dump shows space, but you can also scrub forward or backward in time, and watch the hex dump change. The hex dump could be annotated with colors, to show which data is about to change, or which data recently changed. You could right click on a byte, and see hyperlinks to the memory transactions that are responsible for that byte’s previous and next values.

As far as I know, nobody’s written a tool like this. So I have no idea how useful it will actually be for reverse engineering or performance optimization, but it seems like a promising experiment at least. So far I’ve been working on an indexing and caching infrastructure to make it possible to interactively browse these huge memory dumps, and I’ve been working on the visual timeline widget. Here’s a quick screencast:

The top section shows read/write/zero activity binned by address, with each vertical pixel representing about 64 kB. The horizontal axis is time, with continuous zooming. The bottom section of the graph shows bandwidth, color-coded according to read/write/zero. Blue pixels are reads, reds are write, and orange is a write of a zero byte.

This log file is about a gigabyte of raw data, or about 2 minutes of wallclock time. It shows the Opera browser on the Nintendo DSi loading a very large web page, then crashing. You can see its heap growing, and you can watch the memory access patterns of code, data, and inter-processor communication.

There’s a lot of room for improvement, but I’m optimistic that this will be at least a useful tool for understanding the DSi, and maybe even a more generally applicable tool for reverse engineering and optimization.

As usual, the source is in svn if anyone’s interested. It’s implemented with C++, wxWidgets, sqlite3, and Boost. I’ve only tested it on Linux, but it “should” be portable.

DSi RAM tracing

It seems like a lot of people have been seeing my Flickr photostream and wondering what I must be up to- especially after my photos got linked on hackaday and reddit. Well, I’ve been meaning to write a detailed blog post explaining it all- but I keep running out of time. So I guess a “short” blog post is better than nothing.

The executive summary is that I’ve been working on making it possible to run homebrew games on the DSi. If you’ve used The Homebrew Channel on the Wii, you know what I’m talking about- we’d basically like to have the same thing for the DSi. And as it turns out, the DSi has a lot in common with the Wii.

I haven’t been working on this project alone. I’ve been collaborating with the excellent software and hardware engineers in Team Twiizers, the group responsible for The Homebrew Channel. Bushing has written a couple blog posts about our DSi efforts so far: one on the RAM tracing effort in general, and one with a sample trace.

My role in this project lately has been building hardware and software for a tool which is effectively a low-level CPU debugger for the DSi. It attaches to the RAM bus between the RAM and CPU, it’s capable of tracing all traffic in ‘real-time’ (after slowing down the system clock so the required bandwidth fits in USB 2.0), and just this week I got RAM injection working. The hardware interposes on one of the chip select signals between the CPU and RAM. If an address floats by that we want to patch, the hardware can quickly disable the real RAM chip and drive new data onto the bus as the CPU continues to read. It’s a very effective technique, and my hardware has a pretty capable system for storing and retrieving up to 16 kB of patch data now.

So why is all the complex hardware needed? Well, we hit two big dead-ends:

  • Several of us managed to run code from within a DSi-enhanced game using modified save files, but most of the interesting hardware on the DSi (like the SD card slot, and the internal flash memory) is disabled when these games are running. The earlier Team Twiizers Classic Word Games hack, WinterMute’s cookhack, and my own cookinject all suffer from this problem.
  • I’ve been able to read and write flash memory for a while. This isn’t as interesting as it may sound, since nearly all of the flash is encrypted. However, by running a lot of tests where we modified or compared the encrypted data, we were still able to gain a lot of knowledge about the layout of the flash memory, and some information about how the system boots.

What we’d really like to do is understand more about how the console’s main menu works, and how software is installed and loaded on the DSi. The best way to do this seemed to be RAM tracing. It required an FPGA and a lot of crazy soldering under a microscope, but now it’s working and we’re well on our way to unlocking the secrets of DSi homebrew.

If you’re interested in following our efforts to make homebrew possible on the DSi, definitely subscribe to Bushing’s blog at hackmii.com. He understands the Wii (and by extension, the DSi) better than I ever will, and he’s great at writing about it.

I have a Flickr set with lots more photos, but these are two of my favorite. The first is a photo of the entire setup as it’s currently configured. The second is a picture of the craziest of the microscopic solder joints in this project. This is where I separated the chip-select signal I mentioned above, giving the FPGA the ability to sit between the CPU and RAM. The trace I’m tapping into is buried under four other PCB layers. For a sense of scale, the orange enameled wire is 32 AWG, or about 0.2mm in diameter. Normal 30 AWG wire-wrapping wire is actually only the third smallest size of wire I’ve been using in this project ๐Ÿ™‚

Debugging setup with new FPGA

Via excavation closeup

If you’re interested in the C and Verilog source code, it’s in Subversion as usual.

DSi hacking, BGA rework

I got a Nintendo DSi earlier this month. It’s a really cute little console, and a nice revision to the DS Lite. Screens are slightly larger, CPU is about twice as fast, 4x the RAM. And of course it has these camera thingies. Other folks have teardowns online with internal photos aplenty ๐Ÿ™‚

So, I spent a week or so tooling around with the DSi doing normal consumer and/or homebrew things with it. Played some The World Ends With You, got an Acekard 2i and tried running some homebrew. Its DS compatibility mode seems to be pretty good. The only compatibility bug I bumped into was running my work-in-progress Robot Odyssey port. It mostly worked, but there was a little bit of crackling in the audio. Not surprising, I’m doing the audio in a somewhat unconventional way, and it doesn’t work properly in any DS emulator either ๐Ÿ™‚

The DSi is pretty cute, but it would be nice if we could use its new features for homebrew software. It would be especially cool if you could run homebrew directly from its built-in SD card reader, without the need for any special-purpose hardware. That would really open the door to some cool indie games that nearly anyone could play.

Need to satisfy my curiosity, then. Out come the jewler’s screwdrivers, antistatic mat, and SMD rework tools…

There are a few other people working on hacking the DSi from other avenues. Team Twiizers just released a video of a save game exploit that they’re using to run code in DSi mode. This is pretty cool, and as far as I know it’s the first time a hobbyist has been able to run code in DSi mode. I’m not sure how much if any of the new hardware they’ve managed to reverse engineer through this hack.

There are also a handful of Wii hackers that have noticed that the DSi has pretty much the same file format for its downloadable games, so they’re trying to come up with a way to install homebrew DSiWare apps. Unfortunately, they need the system’s private key- and brute-forcing it doesn’t seem too promising. One way to get this private key would be to extract it from the DSi’s firmware.

The thing that interests me the most right now is figuring out how the DSi boots. There are plenty of other layers of software that you can exploit from the outside- but the bootloader and built-in firmware seems like the best starting point for any serious effort to reverse engineer the DSi’s hardware.

The DSi has upgradeable firmware, and there’s only one visible place that it could be stored: a 256 MB NAND flash chip. This is actually an embedded MMC (eMMC) chip from Samsung. It’s electrically pretty much the same as an MMC card, but in a tiny BGA form factor. It is wired directly up to the CPU, which would indicate that there is some kind of hardware or software bootloader that knows how to pull code off of the eMMC chip during early boot. This probably means there’s a mask ROM in the same package as the CPU. It would also make sense that this ROM would know how to write an initial firmware image to the eMMC chip during manufacturing, since there is no other way to access the eMMC chip after it’s soldered to the DSi’s main board.

I was hoping that the MMC data lines would be accessible via some of the many test points. None of them were obviously related to the NAND, though. I tried probing the SPI bus, where the original DS kept its firmware, but the eMMC chip was definitely on a separate bus. It didn’t help at all that the DSi refuses to boot unless every single one of its peripherals are attached, so this makes it rather hard to poke around the board’s test points while it’s running.

At this point I assumed that the traces running from eMMC chip to CPU were completely buried. My last recourse: desolder the eMMC chip, and hook it up to some kind of test jig that would let me read back its contents. Then perhaps I could copy it to a normal MMC card, and wire up an MMC socket on the DSi.

After practicing my BGA rework technique a bit, I desoldered the eMMC chip. If you power the DSi on without the eMMC chip, it displays a nice hexadecimal error code. This would seem to indicate that there is indeed a mask ROM in the CPU package- a hardware bootloader wouldn’t be smart enough to display a “nice” error like this.

Under the NAND flash

There was a pleasant surprise underneath the eMMC chip: All of the data signals were in fact accessible on nearby components or vias. I didn’t find them earlier, since I couldn’t really poke around with the ‘scope and have the board access the eMMC at the same time. But, armed with this knowledge, I could switch to a simpler approach: Solder the eMMC chip back on, then build a passive sniffer that tells me exactly what reads and writes the DSi’s CPU makes.

By some miracle, I managed to get the eMMC chip soldered back on successfully. The DSi’s mainboard is a huge pain to do BGA rework on, at least with the tools I have. It has so many ground/power planes and fills that the board is really hard to keep hot with just a hot-air rework tool.

My poor DSi then is ‘mostly’ back in one piece. Now I’m working on the next step: Hooking up an FPGA that can dump all the memory reads/writes over USB ๐Ÿ™‚

More stuff: There are a few more pictures on flickr, and I started a thread with a few more technical details on the gbadev forums.

Robot Odyssey DS: First screenshots

This is nowhere near ready for prime-time, but:

img_0635.jpg

img_0639.jpg

Yep, it’s Robot Odyssey for the Nintendo DS. I literally just got this working yesterday, so please don’t ask for any precompiled binaries. If you don’t already know where the source code is, you really don’t want to see it ๐Ÿ™‚

Before you ask, this is not a general-purpose DOS emulator for the DS. It’s actually a static binary translator which does most of the work in porting a DOS game to the DS, but there’s still an awful lot of manual intervention required. I only really developed the translator with Robot Odyssey in mind, so there are sure to be features missing that you’d need in order to use it with other games.

My intent is to turn this into a real port of Robot Odyssey to the DS, not just an emulation. In particular, I’d like to re-do the load/save game UI to support timestamps and thumbnails, and I’d like a soldering mode that makes use of the DS’s touchscreen.

Robot Odyssey Chip Disassembler

I’ve been spending more time hacking on Robot Odyssesy lately. Most of it has had a specific purpose… I’ll write a separate blog post on that project once it’s a bit more fully baked. In the mean time, the reverse engineering has had some useful side-effects.

Chip Simulation

If you haven’t heard of Robot Odyssey, it’s a game where you build logic circuits that control robots in order to solve puzzles. You can build circuits directly inside the robots, but you can also program them into ‘chips’- small 8-pin integrated circuits. The game also comes with several chips: a clock generator, 4-bit counter, and the infamous “wallhugger”, a chip you can stick in your robot to have it follow the walls of a maze.

Once you ‘burn’ a circuit into a chip, you can load and save it but you can’t see inside. So this has made the built-in chips a bit of a mystery. There’s no way in-game to see how they work, and for all we know they could be ‘magic’ in some way. I had speculated a bit on how they store the compiled chips. Was it some kind of super-optimized machine code? Or maybe some kind of encoded Programmable Array Logic that was especially quick to simulate on an 8086 processor?

Well, it turns out that the code for simulating chips is clever in places, but the file format is quite straightforward. It’s sort of halfway between an electrical netlist and a bytecode language. On every in-game clock tick, every bytecode instruction in the chip executes in order. Instructions can represent logic gates (AND, OR, XOR, NOT, RS flip-flop), or they can enter/exit a nested chip. Bytecode parameters can be one of two data types: The state of a pin, or a list of pin addresses.

Here’s a really simple circuit inside the Robot Odyssey prototype chip:

And this is the top of the .CSV (Chip Save) file that the game produces:

00000000  00 00 00 00 00 00 00 00  01 00 00 00 01 ff 07 00  |................|
00000010  02 00 09 ff 00 03 00 0a  ff ff ff ff ff 00 00 00  |................|
  • First 8 bytes: Pin states. All pins are off.
  • First opcode (01). This is an AND gate.
    • Pin states for the AND’s inputs. Both are off. (00 00)
    • A list of 16-bit addresses that will receive this gate’s output.
      • Address of Pin 2 (00 01)
      • End of list (FF)
  • Second opcode (07). This exits a chip. Since this is the outermost chip, after this opcode is the end of the chip’s electrical data. The parameters for this opcode are a list of lists which describes ‘nodes’, or places where we need to copy a pin state directly from one place to another.
    • First list:
      • Source address (00 02). This is Pin 3.
      • First destination (00 09). This is the first pin of the AND gate.
      • End of list (FF)
    • Second list:
      • Source address (00 03). This is Pin 4.
      • First destination (00 0a). This is the second pin of the AND gate.
      • End of list (FF)
    • End of list (FF)
  • Afterwards is garbage. In this case, (FF FF FF). This is probably left over in memory at the time the chip was compiled, and doesn’t mean anything. The chip interpreter doesn’t read this data.

Chip Disassembler

So, I figured out (I think) the entire file format, and wrote a Python script to disassemble it into something a little more human-readable. The above example chip disassembles to:

Sample Chip Disassembly

Chip1 {
    HELP       'SPIFFY SAMPLE CHIP'
    HELP       '1'
    HELP       '2 OUTPUT'
    HELP       '3 INPUT 1'
    HELP       '4 INPUT 2'
    HELP       '5'
    HELP       '6'
    HELP       '7'
    HELP       '8'

    PIN        Chip1_pin2<0> 'out'
    PIN        Chip1_pin3<0> 'in'
    PIN        Chip1_pin4<0> 'in'

    AND        AND1_in0<0> AND1_in1<0> => [Chip1_pin2<0>]
    Node       Chip1_pin3<0> => [AND1_in0<0>]
    Node       Chip1_pin4<0> => [AND1_in1<0>]
}

Wallhugger Disassembly

For a more exciting example, now we can finally see inside the wallhugger chip! Converting this to a graphical schematic is left as an exercise to the reader ๐Ÿ™‚

Chip1 {
    HELP       'Wall Hugger'
    HELP       '1 Top thruster    ^ Hook up pins'
    HELP       '2 Left bumper     ^ as described.'
    HELP       '3 Left thruster   ^ This chip will'
    HELP       '4 Bottom bumper   ^ cause a robot'
    HELP       '5 Bottom thruster ^ to follow(hug)'
    HELP       '6 Right bumper    ^ the walls of a'
    HELP       '7 Right thruster  ^ room.'
    HELP       '8 Top bumper      ^'

    PIN        Chip1_pin1<0> 'out'
    PIN        Chip1_pin2<0> 'in'
    PIN        Chip1_pin3<0> 'out'
    PIN        Chip1_pin4<0> 'in'
    PIN        Chip1_pin8<1> 'in'
    PIN        Chip1_pin7<1> 'out'
    PIN        Chip1_pin6<0> 'in'
    PIN        Chip1_pin5<0> 'out'

    OR         OR1_in0<0> OR1_in1<0> => [Chip1_pin5<0>]
    OR         OR2_in0<0> OR2_in1<0> => [OR1_in1<0>]
    OR         OR3_in0<0> OR3_in1<0> => [Chip1_pin3<0>]
    OR         OR4_in0<0> OR4_in1<0> => [OR3_in1<0>]
    AND        AND1_in0<0> AND1_in1<1> => [OR2_in1<0>, OR4_in1<0>]
    OR         OR5_in0<1> OR5_in1<0> => [Chip1_pin7<1>]
    OR         OR6_in0<0> OR6_in1<0> => [OR5_in1<0>]
    AND        AND2_in0<0> AND2_in1<0> => [OR2_in0<0>, OR6_in1<0>]
    AND        AND3_in0<0> AND3_in1<0> => [OR6_in0<0>, OR8_in0<0>]
    OR         OR7_in0<0> OR7_in1<0> => [Chip1_pin1<0>]
    OR         OR8_in0<0> OR8_in1<0> => [OR7_in1<0>]
    AND        AND4_in0<0> AND4_in1<0> => [OR4_in0<0>, OR8_in1<0>]
    NOT        NOT1_in0<1> => [AND1_in0<0>, AND3_in0<0>, AND4_in1<0>, AND2_in0<0>]
    OR         OR9_in0<1> OR9_in1<0> => [NOT1_in0<1>]
    OR         OR10_in0<0> OR10_in1<0> => [OR9_in1<0>]
    OR         OR11_in0<0> OR11_in1<1> => [OR9_in0<1>]
    Chip2 {
        PIN        Chip2_pin1<1>
        PIN        Chip2_pin2<0>
        PIN        Chip2_pin3<0>
        PIN        Chip2_pin4<0>
        PIN        Chip2_pin8<1>
        PIN        Chip2_pin7<0>
        PIN        Chip2_pin6<0>
        PIN        Chip2_pin5<0>

        FF<01>     FF1_in0<0> FF1_in1<1> => [Chip2_pin5<0>] []
        OR         OR12_in0<0> OR12_in1<0> => [FF1_in0<0>]
        OR         OR13_in0<1> OR13_in1<0> => [FF1_in1<1>]
        OR         OR14_in0<0> OR14_in1<0> => [OR13_in1<0>]
        OR         OR15_in0<0> OR15_in1<0> => [OR12_in0<0>]
        OR         OR16_in0<0> OR16_in1<0> => [OR15_in1<0>]
        FF<01>     FF2_in0<0> FF2_in1<1> => [Chip2_pin6<0>] []
        OR         OR17_in0<0> OR17_in1<0> => [FF2_in0<0>]
        OR         OR18_in0<0> OR18_in1<1> => [FF2_in1<1>]
        OR         OR19_in0<0> OR19_in1<0> => [OR17_in0<0>]
        OR         OR20_in0<0> OR20_in1<0> => [OR19_in1<0>]
        OR         OR21_in0<0> OR21_in1<0> => [OR18_in0<0>]
        FF<01>     FF3_in0<0> FF3_in1<1> => [Chip2_pin7<0>] []
        OR         OR22_in0<0> OR22_in1<0> => [FF3_in0<0>]
        OR         OR23_in0<0> OR23_in1<1> => [FF3_in1<1>]
        OR         OR24_in0<0> OR24_in1<0> => [OR22_in0<0>]
        OR         OR25_in0<0> OR25_in1<0> => [OR24_in1<0>]
        OR         OR26_in0<0> OR26_in1<0> => [OR23_in0<0>]
        FF<10>     FF4_in0<1> FF4_in1<0> => [Chip2_pin8<1>] []
        OR         OR27_in0<1> OR27_in1<0> => [FF4_in0<1>]
        OR         OR28_in0<0> OR28_in1<0> => [FF4_in1<0>]
        OR         OR29_in0<0> OR29_in1<1> => [OR27_in0<1>]
        OR         OR30_in0<1> OR30_in1<0> => [OR29_in1<1>]
        OR         OR31_in0<0> OR31_in1<0> => [OR28_in0<0>]
        Node       Chip2_pin1<1> => [OR30_in0<1>, OR13_in0<1>, OR18_in1<1>, OR23_in1<1>]
        Node       Chip2_pin2<0> => [OR25_in0<0>, OR31_in1<0>, OR21_in0<0>, OR14_in1<0>]
        Node       Chip2_pin3<0> => [OR20_in0<0>, OR28_in1<0>, OR26_in0<0>, OR14_in0<0>]
        Node       Chip2_pin4<0> => [OR16_in0<0>, OR31_in0<0>, OR21_in1<0>, OR26_in1<0>]
    }
    Node       Chip1_pin2<0> => [OR7_in0<0>, OR10_in0<0>, Chip2_pin2<0>]
    Node       Chip1_pin4<0> => [OR3_in0<0>, OR10_in1<0>, Chip2_pin3<0>]
    Node       Chip1_pin8<1> => [OR5_in0<1>, OR11_in1<1>, Chip2_pin1<1>]
    Node       Chip1_pin6<0> => [OR1_in0<0>, OR11_in0<0>, Chip2_pin4<0>]
    Node       Chip2_pin8<1> => [AND1_in1<1>]
    Node       Chip2_pin7<0> => [AND2_in1<0>]
    Node       Chip2_pin6<0> => [AND3_in1<0>]
    Node       Chip2_pin5<0> => [AND4_in0<0>]
}

Download it

If you want to try it out yourself, or you want more detailed info about the file format, grab the latest source code here:

https://github.com/scanlime/navi-misc/blob/master/python/robot_odyssey_chip_disasm.py

Robot Odyssey Mouse Hack 1

Yesterday I spent some more time reverse engineering Robot Odyssey. This was a great game, and it’s kind of a nostalgic pleasure for me to read and figure out all of this old 16-bit assembly. So far I’ve reverse engineered nearly all of the drawing code, big chunks of the world file format, and most of the code that’s responsible for moving around objects on the screen.

So, I thought I’d try manipulating some of that data. I extended my existing binary patch to add even more code to the game’s per-frame blit function. It sets up the mouse using DOS Int 33h, and lets you manipulate the game’s table of sprite locations by dragging objects with the mouse.

I took a video to show the results. It’s still obviously a hack, but it may actually be kind of useful for assembling circuits more quickly.

The source code is available, but it’s a bit rough. This patch is getting kind of unwieldy in its current state, and I think my next step will be coming up with a better tool for doing these kinds of patches. I’m thinking either:

  • A pre/post-processor for NASM, which has binary patching oriented features.It would use signatures (byte mask + md5sum) to identify interesting code and data regions in the original binary, and allow you to replace or invoke snippets of code in the original binary using this mechanism. This should make it easier to do very invasive patches which can apply against slightly different versions of a game.
  • Or, scripting support for DOSBox.This could be a lot like the Lua support in FCEUX which made Super Mario Drag & Drop possible. As much as I like the idea of self-contained binary patches that operate in the same environment the game itself runs in, a Python plugin interface for DOSBox could be extremely powerful. Imagine a real-time level editor, or various kinds of real-time memory comparison tools…

A Binary Patch for Robot Odyssey

Robot Odyssey is one of the games that I have the fondest childhood memories of. It’s both a high-quality educational game, and a gentle (but very challenging) introduction to digital logic.

There’s a Wikipedia article on the game. There’s also DroidQuest which is a Java-based clone of Robot Odyssey. The DroidQuest site also contains some good info on Robot Odyssey itself, including the only walkthrough I’ve ever seen.

So, I recently got inspired to try playing through Robot Odyssey again. As a kid, I never managed to beat the game. For a long time, it was nearly impossible to run it on a modern machine. It required a 5.25″ disk drive due to the ancient copy protection, it has CGA graphics, assumes you’re using an IBM XT keyboard, and all of the timing is based on the 4.77 MHZ CPU frequency of the original XT.

Thankfully there’s DOSBox, a really high quality emulator that can run old games like this quite faithfully. I started trying to play Robot Odyssey on DOSBox, but there were still two big problems:

  • Copy Protection.Robot Odyssey checks to make sure you’re running from the original 5.25″ disks, which have a “flaky bit” on them. If the flaky bit isn’t detected, the game will still load but your soldering iron doesn’t work!
  • Inconsistent speed.DOSBox is really good at slowing down the CPU, but this isn’t an exact science. Some things that were really really slow on the original XT (like writing to the CGA card) are fairly fast on DOSBox, and other things are comparatively too slow. This means you’re constantly futzing with the speed of DOSBox’s CPU emulation, depending on what level you’re in, how many robots are on the screen, etc.

The Patch:

So, I decided to solve these problems (and a few others) by binary patching the game itself. Since there are a bunch of user-tweakable knobs, I figured the best way to distribute this patch was as a Python script which patches the original binaries. You can grab the script from:

If you’re interested in the technical details of how the patch works, the source is pretty well commented. I won’t bore you with that here. This is a list of the patcher’s features:

  • Disables copy protection.This is necessary to run the game on any modern machine, even assuming you have the original disks.
  • Installs a frame rate limiter.Instead of adjusting the CPU speed, this is a real and fairly accurate frame rate limiter. You can specify a desired frame rate on the command line when applying the patch. By default it runs at 8 FPS, which feels about right based on memory. (I don’t have an IBM XT handy for checking what the speed is supposed to be…)
  • Halts the CPU when idle.When the frame rate limiter is sleeping, it yields the CPU. This will help a lot if you’re running the game in a multitasking environment or a virtual machine.
  • Compatibility with Windows XP’s built-in DOS emulation.You need the “-p” flag for this, and the frame rate limiter won’t be as precise- but the game will be playable just by double-clicking it in Windows!
  • “Fast” mode.This is an optional feature, enabled with the “-f” flag, which speeds up the game when keyboard input is waiting. This makes it feel a lot more responsive, and makes it faster to navigate around the level. You can also hold down any repeating key as a very simple “turbo” button.
  • Keyboard compatibility patch.Normally, Robot Odyssey is totally unplayable on any computer without a numeric keypad, including laptops, due to a bug in its keyboard handler. If you enable the “-k” flag, the patcher will rewrite the game’s keyboard mapper to be fully compatible with AT keyboards. This also removes the need to play with Caps Lock on.

Usage:

To use the patch, you’ll need:

  • Python
  • NASM, a spiffy assembler
  • Original Robot Odyssey binaries.Make sure the binaries you have aren’t already patched or cracked in any way. I won’t distribute these myself (so don’t ask!) but there are numerous abandonware sites on the web which should have this game. I’m not sure if multiple revisions of this game were produced. This patcher tries to be pretty lenient, but I’ve only tested it with one version. For reference, these are the SHA-1 hashes from my copy of Robot Odyssey:
    756a92e6647a105695ac61e374fd2e9edbe8d935  GAME.EXE
    692a9bb5caca7827eb933cc3e88efef4812b30c5  LAB.EXE
    360e983c090c95c99e39a7ebdb9d6649b537d75f  MENU2.EXE
    a6293df401a3d4b8b516aa6a832b9dd07f782a39  MENU.EXE
    12df28e9c3998714feaa81b99542687fc36f792f  PLAY.EXE
    bb7b45761d84ddbf0a9e561c3c3603c7f65fd36d  SETUP.EXE
    e4a1e59665595ef84fe7ff45474bcb62c382b68d  TUT.EXE
  • Something that can run DOS games with CGA graphics! This could be a PC booted into DOS, a Windows machine, DosBOX…

Before you apply the patch, make backup copies of all your game binaries:

micah@carrot:~/robot$ mkdir original
micah@carrot:~/robot$ cp *.EXE original/
micah@carrot:~/robot$ ls original/
GAME.EXE  LAB.EXE  MENU2.EXE  MENU.EXE  PLAY.EXE  SETUP.EXE  TUT.EXE

Now apply the patch to each binary. Each section of the game (Robotropolis, the Innovation Lab, and the Tutorials) have their own separate EXE file, each of which has a separate copy of the game engine. You can use the same or different settings for each.

For example, to patch all binaries with default frame rate, and with the keyboard patch enabled:

micah@carrot:~/robot$ python robot_odyssey_patcher.py original/GAME.EXE GAME.EXE -k
Found copy protection. Disabling...
Found blitter loop. Patching...
Found keyboard mapper. Patching...
Saving comment at 0x1a4d0
micah@carrot:~/robot$ python robot_odyssey_patcher.py original/TUT.EXE TUT.EXE -k
Copy protection not found.
Found blitter loop. Patching...
Found keyboard mapper. Patching...
Saving comment at 0x11380
micah@carrot:~/robot$ python robot_odyssey_patcher.py original/LAB.EXE LAB.EXE -k
Found copy protection. Disabling...
Found blitter loop. Patching...
Found keyboard mapper. Patching...
Saving comment at 0x152a0

Now run PLAY.EXE in Windows, DOSBox, etc. You should see the game running at a steady 8 FPS, and the non-numpad arrow keys should work.

Experiment with the options! The “-h” option gives you a full list of the available setitngs. For example, if you want the game to run a bit faster, you might add “-f -r 10“. This will run the game at 10 FPS, and speed it up when there’s keyboard input. Remember to add “-p” if you’re running in the Windows DOS emulation.

This patcher may also work for games other than Robot Odyssey, which were based on the same engine. For example, Gertrude’s Secrets and Rocky’s Boots. You may have to leave off the “-k” option, since these games don’t necessarily use the same keyboard mapping as Robot Odyssey.

Enjoy exploring Robotropolis!

Self-contained TED receiver

My previous entry introduced a homebrew receiver for the powerline-based data protocol used by The Energy Detective. I just designed a second revision of that receiver. This one is self-contained: It gets power and modulated data from a 9V AC wall-wart transformer, and decoded data leaves via an RS-232 serial port at 9600 baud. Best of all the circuit is very simple: Just an 8-pin microcontroller and a single op-amp.

Major changes in this version:

  • DC power for the circuit is now provided by the 9V AC input, instead of a separate power supply. Previously this would have caused unacceptable levels of harmonic distortion in the input signal. In the new design, this is mitigated by an inductor (which forms an LC low-pass filter), and by the lower power consumption of a single modern op-amp versus three ancient op-amps.
  • By using a simpler filter design and a modern op-amp with a gain-bandwidth product of at least 10 MHz, the bandpass filter and amplifier can be built using only a single op-amp.

Note that the MAX475 op-amp I’m using has been discontinued by the manufacturer, and it’s now hard to find. I just used it because I had one handy in my junk drawer. I’ll verify this design with other op-amps as soon as I can, but it should work with just about any op-amp which can operate on a single-ended 5V supply, and which has a high enough GBW.

Firmware and schematics (PNG and EAGLE formats) are in Subversion and more info on the theory of operation was presented in my previous blog entry.