Category Archives: CIA

cia-4x3 service is down indefinitely

For those who haven’t seen it: CIA is a web service I started about 8 years ago, as a way for open source projects to track code changes in real time over the web and IRC.

This is probably going to come as a surprise to most of you reading this, but today I’ve taken the service down indefinitely.

Update 2011-05-11: Wow, that was quick. Merely hours after this announcement, CIA is already in the process of being transferred to new ownership. The team at have volunteered to take on maintenance, hosting, and future development, and the future of CIA is actually looking pretty bright right now! I’ll have more updates later.

Update 2011-05-12: As of this morning the service is back up and running, now in the hands of the team. They’ll be ironing out wrinkles for a while now I’m sure, but the service is up and your existing accounts and bots should resume functioning. Congrats on the fast work, guys!

CIA had a good run, and I’ve really enjoyed the opportunity to create a project that the open source community as a whole received some benefit from. When I created the first CIA bot in May of 2003, it was just a quick hack for a single project I worked on, and I had no idea it would take off so quickly. Since then, it’s been rewritten a few times, grown massively, and become almost iconic in some circles.

However, a lot has changed in my life in the meantime. When I started the project, I was a college student with plenty of free time. Since then, I’ve had amazing amounts of Real Life occur, and my time for ongoing maintenance has dwindled. I am in debt to Karsten Behrmann for taking over a baseline level of maintenance for the last couple years, but unfortunately the project has stagnated. I frankly never wanted to manage a large web service, and as soon as CIA reached the scale it’s at now, I was hoping it would be taken over by a community of volunteers, or by a corporation. We’ve had some help from volunteers, and has always been interested in the project. But there’s never been a critical mass of involvement. When I was working on CIA, I was pulling most of the weight, and since then it’s really just been on life support.

And as much as I loved the project when it was new and fresh, I hate to see it limp along like this. I’ve asked many people if they would want to help improve the service, and it’s really hard to find motivated individuals when the current service is good enough for most people’s needs.

So, I’ve decided to take the server down. There are many other projects which overlap somewhat with CIA’s feature set, so it’s possible that CIA’s users will find these projects sufficient and that will be the end. Or perhaps a new project will spring up to fill the vacuum. Or perhaps one of you will want to continue where I left off— improve the current service’s codebase, and build a real community around it.

Unfortunately, I really can’t be a part of this process. I have too much else going on in my life, and my interest in the project has long since been drained.

In the event that one of you is indeed serious about re-starting CIA with an active community, I’m preserving the database from the current server. I will be willing to transfer that, as well as the rights to the domain name, to a new owner only in the event that they’ve created a community of administrators and developers who can act as sustainable caretakers for the project. Additionally, I have a friend who’s volunteered hardware and bandwidth to host such a site.

But I’m really hoping the community will re-form to fit the current state of the Internet. A lot has changed since 2003, especially in the area of web services for open source development. We have Ohloh, Github, and Twitter. It’s a new world, and we need new tools.


Map of CIA’s Architecture:

Thank you.

New Blog for

Well, I have finally joined the ranks of those who have multiple blogs.

All CIA-related posts will be made on my new CIA Blog.

As a side note, I’ve been quite happy with Django for the new CIA web site. It makes the easy things easy and still manages to get out of my way when I need it to.

I was a little disappointed not to find a decent ready-made blogging app I could drop into CIA, but I’m pretty sure this is because creating a blog in Django is almost entirely an exercise in writing HTML templates and integrating it with the rest of your site. CIA’s blog turned out to be about 400 lines of Python. Most of this is for integrating the blog with both reStructuredText and CIA’s existing AJAX image uploading system.

CIA sure has grown

The latest stats from CIA’s bot server:

50 total IRC bots across 33 networks, inhabiting 370 channels with a total of 5782 users.

And amazingly enough, all the large Freenode bots have uptimes of 2 weeks or so. That fancy flood-kick-prevention code sure did the trick.

January 2006 Update


It’s easy for me to see Fyre as a ‘dead’ project these days- the existing codebase was declared ‘done’, and we moved on to a new architecture dubbed Fyre 2.0. In some ways this rewrite has suffered from second system effect, and the code hasn’t been touched in a while. But, I think we all agreed this was necessary. Fyre 1.0 really was a great release, and it really did everything we felt it should do considering the project’s scope.

Anyway, it’s really heartwarming when I come across an enthusiastic user of Fyre on the interweb, who’s posted some renderings of their own. It’s even cooler to find someone who’s implemented their own Distributed Chaotic Image Renderer after being inspired by Fyre.

Now that Firefox 1.5 is commonplace, I might as well mention my old Javascript De Jong renderer. It’s a great example of why the <canvas> element is largely orthogonal to SVG.

Home Theater

My traditional method of playing DVDs at home would be to boot up my HTPC and play them via Freevo. In concept this was great, but the picture quality left much to be desired- lots of horizontal ‘wiggle’ due to noisy sync signals from my video card’s component out, and the video was getting scaled at least twice.

Yesterday, while wandering the shopping megaplex that is Santana Row, I finally picked up a hardware DVD player: An LG LDA-511, on sale for $110 after rebate. I’m just blown away by how much hardware you get for that price these days: slot loading drive, HDMI output, upscaling to any HDTV resolution, DivX/MPEG4/JPEG/MP3/WMA playback, memory card slots… The picture quality is great, and it will even play a lot of music and video directly off of my existing DVD-R discs. My one complaint so far is that the memory card reader is excruciatingly slow. It’s a good thing I doubt I’ll ever have a use for that feature.


I came across the Django project recently. This is an incredibly spiffy mod_python-friendly web framework, which includes a template system and a rather cute object-relational mapper. Since I’ve been wanting to get CIA’s web frontend away from Twisted and onto mod_python for a while now, this looks like a great opportunity to simultaneously do away with Nouvelle.

So, I’ve started a branch for what’s sure to be a heavy-handed hack and slash partial rewrite. The biggest issue with splitting the HTTP frontend into multiple mod_python processes, however, is the ruleset storage. Currently, CIA just loads all rulesets from the database on startup, compiles them a bit, then scans them linearly every time a message is delivered. This is both RAM-hungry (about 8MB) and very CPU inefficient. My current strategy is to actually compile the rulesets into an SQL table such that the slowest and least scalable aspects of message filtering can all be done efficiently within the SQL server.
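As a rough sketch of what "compiling rulesets into an SQL table" could look like, here is a toy version using Python's built-in sqlite3 module instead of CIA's real MySQL setup. The table name `ruleset_filters`, its columns, and the simplification that each ruleset matches on an exact project name are all assumptions made for illustration; real CIA rulesets are richer than this.

```python
import sqlite3

def build_filter_table(conn, rulesets):
    """Compile (ruleset_uri, project) pairs into an indexed lookup table.

    Hypothetical schema: one row per ruleset filter, indexed by project,
    so matching becomes an index lookup instead of a linear scan.
    """
    conn.execute(
        "CREATE TABLE ruleset_filters (uri TEXT NOT NULL, project TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_filters_project ON ruleset_filters (project)")
    conn.executemany("INSERT INTO ruleset_filters VALUES (?, ?)", rulesets)
    conn.commit()

def matching_rulesets(conn, project):
    """Return the URIs of every ruleset whose filter matches this project."""
    rows = conn.execute(
        "SELECT uri FROM ruleset_filters WHERE project = ? ORDER BY uri",
        (project,),
    )
    return [uri for (uri,) in rows]

# Example data; the URIs are made up.
conn = sqlite3.connect(":memory:")
build_filter_table(conn, [
    ("irc://irc.example.net/fyre", "fyre"),
    ("irc://irc.example.net/commits", "fyre"),
    ("irc://irc.example.net/other", "otherproject"),
])
print(matching_rulesets(conn, "fyre"))
```

Because the table lives in the database rather than in each process's memory, several frontend processes could share it without each loading every ruleset on startup.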

Upgrading Zero and Flapjack

For a while now, I’ve had two virtual servers hosted by Linode: Zero and Flapjack. Zero is a smaller server that’s become a bit of a dumping ground for personal web sites, source code, and photos. Most of the content you see on the domain comes from Zero. Flapjack is a larger and somewhat more restricted server that almost exclusively exists to run CIA. It just happens to have some extra disk space that we use for the occasional static file.

Linode is very good at what it is designed to do. It provides a friendly and very easy to configure Linux server for a reasonable price. It was never intended for performance or for capacity, however. Zero has been low on hard disk space almost since day one, and Flapjack has been nothing but performance bottlenecks. CIA could always use more RAM and CPU, but the biggest source of frustration has been Linode’s policy of throttling disk I/O. Their uptime also leaves much to be desired.

I’ve been long overdue for a dedicated server. After spending another evening shopping around, I finally settled on a 2.8 GHz P4 with 80GB of disk and 1GB of RAM from ServerMatrix. They seem to have a great reputation in the webmaster community, and their prices are good. The AUP isn’t perfect, but it’s livable.

I’m tempted to try out a little of my employer’s snake oil, and use VMware GSX to divvy up this server. I’d like to keep Zero and Flapjack separate for fault isolation reasons, and it’d be great to have spare server images around in order to make upgrades and backups that much smoother. I’d love to be able to clone a production server in order to test experimental changes quickly.

Anyway, this is the first step toward a lot more headroom and performance for CIA, and for everything else on Hopefully I won’t go completely mad from the associated migration headaches- we’ll have to wait and see.

Memes to spread


Some more crazy ideas for CIA’s future…

  • “Sparklines.” A pretty novel (but not new) way to present data history inline with text. I’m already building up some graphing infrastructure for CIA around my ‘fidtool’ library. Just imagine how cool it would be to, at a glance, see the activity history for all the projects on the screen.
  • Wiki integration. The whole “documentation” section of the site is really just a read-only wiki at this point. It generates those pages from reStructuredText documents stored with CIA’s source code. If the documentation browser were promoted to a full wiki, users could maintain their own client scripts, installation instructions, and such without having to bother me 😉
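To give a feel for the sparkline idea, here is a tiny text-only approximation using Unicode block characters. CIA itself would more likely render small images; the `sparkline` function and the sample data below are hypothetical and have nothing to do with fidtool's actual output.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest

def sparkline(values):
    """Map a sequence of numbers onto a one-line string of bar characters."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat history: draw every point at the lowest bar height.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    # Scale each value into 0..top and pick the matching bar.
    return "".join(BARS[round((v - lo) / span * top)] for v in values)

# Commits per day for one hypothetical project over two weeks.
print("fyre", sparkline([0, 2, 5, 1, 0, 0, 9, 14, 3, 2, 0, 1, 7, 4]))
```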

But actually, those are the most tame ideas that have been circulating. Some even more outlandish ones:

  • Why not build the entire CIA site on top of an existing wiki engine? It would be great to have a wiki-like way for anyone to edit project/author metadata, but with some form of version control to allow rollback in the case of abuse. Really, I could take nearly any wiki engine out there and give it a special namespace for stats targets.
  • I’ve been searching for a new way to organize the real meat of CIA’s site: the stats browser. Each stats target (one project/author/host/vcs/etc.) can have several types of content attached to it. It can have user-assigned metadata, automatically generated ‘related’ links, the recent commits, and a list of children. The big problem with this is that the larger pages (project, author, gnome) just have too much content to display all at once. For any target, large or small, I really want to see the most recent information first. This suggests somehow merging everything into a single chronological list. I’m not entirely sure how this will work yet, but I’d like to focus on the commit list but attach information about related/child items as appropriate. An important part of this would also be allowing the user to choose where to display additional information. This will probably mean “More…” links at the bottom of the page, plus some way of interactively expanding the inline information attached to each commit.


Amanda found a really spiffy web service: It’s a streaming music server with an intriguing non-genre-based method of categorizing music. You put in a handful of artists or songs you like, and it tries to stream similar music for you. It actually seems to be working pretty well so far.

Turkey Day

Thanksgiving this year was great- I spent most of my time back in Colorado with my family, whom I’ve spent far too little time with recently. I baked an apple pie with my brother’s help, ate far too many of my grandmother’s homemade rolls, and really just got some much needed time to catch up. Much of the Boulder crowd was busy with other things, but I was thankful for the time I was able to spend with David and Jen.

I’m not sure when I’ll be coming back next. I’m trying to conserve both my money and my vacation time at this point. I’d like to do something for New Year’s Eve, but any plans for that are still in their early stages.

CIA Updates, Cappuccino in The City

Database v7

Well, it’s been a busy past couple days, but I have some great news to report with regards to CIA. The new database backend I’ve been hacking at for the past couple weeks is now in place, and the performance is great.

So, to back up a bit… CIA uses MySQL. It has for almost 2 years now. There are many things CIA does that SQL works great for: storing the smaller bits of metadata on each project/author, tracking the association graphs, and quickly pulling up overview statistics for the site’s front page. SQL does not, however, work well for storing the actual commit messages. I’ll save you the boring details of why, but suffice it to say that this has been bogging CIA down in nearly every way possible. The stats_messages table was eating about 2.5 GB of disk space, the indices were causing mysqld to use most of my RAM, it was using unacceptable amounts of my server’s limited I/O bandwidth, and there was no efficient way to clean out old messages. CIA needed a new storage option.

The new v7 database schema CIA is on now has completely abandoned the stats_messages table. I developed a custom file format for the messages. Each stats target has a file on disk with two ring buffers. A fixed size (currently 4 kB) index buffer provides IDs for recent messages. These IDs are actually offsets into a second larger ring buffer. This second buffer starts out empty, but can grow up to a fixed maximum, currently 256 kB. At the seek offset corresponding to each valid message ID, the content buffer contains a frame marker that lets it detect invalid or expired IDs. The messages themselves, rather than being stored as normal XML, are encoded as a binary representation of SAX events. This makes them quick to load, and lets the format maintain commonly used strings in a dictionary. This provides about the same compression ratio as gzip’ed XML, but it doesn’t have the backup problems that an already-compressed database would present.
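The two-buffer layout can be sketched in memory like this. It is only an illustration of the idea, not CIA's actual file format: the marker value, the header layout (marker, length, CRC32), the tiny buffer sizes, and the `MessageRing` class are all assumptions, the content buffer here is a fixed size rather than growing up to a maximum, and payloads are plain bytes rather than encoded SAX events.

```python
import struct
import zlib
from collections import deque

MAGIC = 0xC1A7                   # hypothetical frame marker value
HEADER = struct.Struct(">HHI")   # marker, payload length, CRC32 of payload

class MessageRing:
    def __init__(self, content_size=256, index_slots=8):
        self.buf = bytearray(content_size)       # content ring buffer
        self.head = 0                            # next write offset
        self.index = deque(maxlen=index_slots)   # index ring: recent IDs

    def _write(self, offset, data):
        # Byte-wise copy so frames can wrap around the end of the buffer.
        for i, byte in enumerate(data):
            self.buf[(offset + i) % len(self.buf)] = byte

    def _read(self, offset, count):
        return bytes(self.buf[(offset + i) % len(self.buf)] for i in range(count))

    def push(self, payload):
        """Store a message; its ID is simply its offset in the content buffer."""
        frame = HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload
        if len(frame) > len(self.buf):
            raise ValueError("message larger than the content buffer")
        msg_id = self.head
        self._write(msg_id, frame)
        self.head = (msg_id + len(frame)) % len(self.buf)
        self.index.append(msg_id)
        return msg_id

    def get(self, msg_id):
        """Return the payload, or None if the ID is invalid or expired."""
        if not 0 <= msg_id < len(self.buf):
            return None
        marker, length, crc = HEADER.unpack(self._read(msg_id, HEADER.size))
        if marker != MAGIC or HEADER.size + length > len(self.buf):
            return None
        payload = self._read(msg_id + HEADER.size, length)
        return payload if zlib.crc32(payload) == crc else None

    def recent(self):
        """Newest-first payloads for the indexed IDs that are still valid."""
        found = (self.get(i) for i in reversed(self.index))
        return [p for p in found if p is not None]
```

Old messages are never deleted explicitly. They are overwritten as the write offset wraps around, and a stale ID is caught because the bytes at its offset no longer form a frame with a valid marker and checksum. The checksum is an addition in this sketch to make that check more robust; the post only mentions a frame marker.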

I spent Saturday night (in a hack session with Christian and Paul) finalizing the database backend itself and writing a converter tool. Sunday, the converter ran.. and it ran.. and it eventually finished Monday morning after converting over 1.5 million commit messages! After doing a dump/load to shrink the InnoDB tablespace, I had successfully replaced a 2.5 GB impossibly slow SQL table with a zippy collection of small ringbuffer files, totalling under 500 MB. CPU usage with the new backend is about the same, but I/O load is much lower.

Google AdSense

If you look way at the bottom-left of any CIA page, you’ll now see a familiar block of text advertisements. Consider this a trial run. CIA’s web layout is currently such that there really isn’t a great place for advertisements, so they’re really kind of in the middle of nowhere at the moment.

Yes, this marks CIA’s descent into the depths of capitalistic hell, I’m sure. The reality is that CIA costs a lot to run. I’ve been paying $40/month for its current hosting. We’ve had a lot of great donations, which I am really grateful for. Now that I’m out of college and have a real job, I can pay for the server without begging for $10 here and $5 there.

But yet, the ads are an experiment. CIA receives a huge amount of web traffic. Granted most of this is generated by RSS aggregators and web spiders, but even the comparatively small human-generated slice of the pie is pretty big. If it’s possible to support CIA’s hosting just by adding some unobtrusive text ads, I think this would be beneficial for everybody involved.

Ideally, once I have time to tweak CIA’s layout, there will be a better place for the ads. If they do well enough, I might even be able to get a real dedicated server rather than this Linode box. At the rate CIA is growing, I’m going to need one sooner or later.

I want to emphasize that I will only be placing text ads on the site, and only where they won’t interfere with the display of the site’s real content. Site usability is still my #1 concern, but I need to plan for a future that includes steadily rising server requirements.

Future Work

So, what’s next on the plate for CIA?

Well, database v7 is actually just a small piece of a larger database overhaul I have planned for CIA. I’d like to eventually get rid of MySQL. As nice as MySQL is, I’m really not benefitting from the use of a client/server database. My utopian database v8 format would look something like this:

  • Messages themselves stored as an indexed ringbuffer of SAX events, same as v7
  • Smaller metadata items stored either in SQLite or a very simple special-purpose format
  • Larger metadata items (like images) stored in individual files
  • Counters replaced with fidtool. Fast Interval Databases (FIDs) are another backend format I’ve been working on. They store event timestamps, and efficiently perform interval queries against them. This would let me finally replace the “today”, “yesterday”, “this week”, and such with sliding intervals. For example, with fidtool CIA could keep a running total of commits received in the last 24 hours. I also have some cool plans for activity graphs with google-maps-like AJAX scrolling and zooming.
  • Catalog data cached in SQLite. This should eliminate the uber-queries that are currently necessary to index all the children of a particular stats target. This should finally let me re-enable the project and author index pages.
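The sliding-interval counters mentioned in the fidtool item above boil down to counting timestamps inside a range. Here is a minimal in-memory sketch of that query using Python's bisect module on a sorted list. It says nothing about fidtool's real on-disk format, and the `IntervalCounter` class and its method names are made up for the example.

```python
import bisect

class IntervalCounter:
    """Keeps event timestamps sorted so interval counts are binary searches."""

    def __init__(self):
        self._stamps = []

    def add(self, timestamp):
        # insort keeps the list sorted even if events arrive out of order.
        bisect.insort(self._stamps, timestamp)

    def count(self, start, end):
        """Number of events with start <= timestamp < end."""
        lo = bisect.bisect_left(self._stamps, start)
        hi = bisect.bisect_left(self._stamps, end)
        return hi - lo

    def count_last(self, now, seconds):
        """Sliding window: events with now - seconds < timestamp <= now."""
        lo = bisect.bisect_right(self._stamps, now - seconds)
        hi = bisect.bisect_right(self._stamps, now)
        return hi - lo

DAY = 24 * 60 * 60

commits = IntervalCounter()
for stamp in (100, 50_000, 90_000, 100_000):
    commits.add(stamp)

# Running total of commits in the 24 hours before t=100000.
print(commits.count_last(100_000, DAY))
```

With a query like this, "today" and "yesterday" counters stop being special cases: any window is just a different pair of endpoints.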

Well, except for a few new capabilities database v8 will offer, it’s mostly optimization work. Since database v7 seems to have the server running smoothly again, there are a lot of other projects I’d like to poke at:

  • A web frontend for maintaining rulesets and requesting capabilities (finally!)
  • Replacing metadata keys, which exist only for historical reasons really, with a traditional username/password login
  • Further splitting the monolithic server process. I’d like to eventually be able to run the web frontend and XML-RPC server as separate processes, and even to run them directly from apache using mod_python. This should improve server response times, and potentially make maintenance easier.
  • Web site layout tweaks. CIA has grown up a lot since the current layout was designed, and it would be nice to revisit it to improve usability. Particularly, I’d like to make it scale better to large numbers of projects, authors, and stored messages.
  • I’ve toyed with the idea of replacing CIA’s little documentation system with an external wiki. As it is, the doc browser is really a lot like a read-only wiki in the way it formats documents. With a real wiki, users could more easily contribute their own client scripts, tips and tricks, suggestions, complaints, and documentation.
  • On a similar note, would it be possible to ditch the majority of CIA’s web frontend and somehow embed it into a wiki or some other content framework?
  • Web 2.0 tags, ‘cuz they’re trendy. To really make CIA web-2.0-compliant, I need a page that lists all projects where font size is proportional to activity.

I’m sure I’ve forgotten a few things.. I’ve had a lot of crazy ideas swimming around in my head recently. If I’m not careful, a few of them might get implemented.


Thursday night, David flew into town for a job interview with VMware. We don’t know yet whether he got the job, but we’re quite hopeful. Either way, we managed to finish off his day-long interview with the traditional video games and dinner. More video games at my apartment later that night, then Saturday morning we wandered San Francisco a bit with Scott and Carrie. We tried a couple coffee shops in North Beach, and I ended up with some muchly yummy cappuccino and biscotti.

His flight back home from SJC was at 5:00 Saturday, so we didn’t have a huge amount of time to spend here. Still, it was a lot more exploration than I got to do when I flew in for my interview earlier this year. It was goodbye for only a week or so. Wednesday I’m coming back to Colorado for Thanksgiving, and I plan on spending the weekend in Boulder. Yay!