Sunday, August 26, 2012

Updated Backend

Over the last few days, I've massively restructured the way that tchow.com is served. On the surface, you shouldn't notice any big differences, but under the hood things have gotten dramatically simpler.

How it was

From sometime around 2007 until just a few days ago, tchow.com was hosted on a VPS partition from RapidVPS. The partition was set up running a pretty standard stack: Gentoo GNU/Linux, the Apache web server, and PHP5.
I designed my content serving system to be future-proof, simple to use, and to contain minimal redundancy. This meant that I wanted visitors to see clean URLs (no file extensions; file types determined by the Content-Type header), and those who looked at the markup to see nice XHTML with a clear separation of content and chrome. Internally, I wanted to be able to serve the web page from a filesystem, provide content with my favorite text editor (and/or scp), and do so in a not-too-idiosyncratic format.
In order to make this work, the following happened on each page load: first, mod_rewrite rules would take the user's URL and turn it into a query string for a dispatch.php script; this script, in turn, would crawl the backend directory structure to find the proper page (a file containing an HTML fragment), then paste in appropriate templates for navigation and analytics.
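To make that per-request flow concrete, here is a rough sketch of the same idea in Python (the original was PHP, and this is not the actual dispatch.php). The function names, the ".html" and "index.html" lookup rules, and the header/footer arguments are all assumptions for illustration.

```python
from pathlib import Path
from typing import Optional


def find_fragment(content_root: Path, url_path: str) -> Optional[Path]:
    """Map a clean URL path to a fragment file, or None if nothing matches."""
    # Split the URL into segments, ignoring leading/trailing/duplicate slashes.
    parts = [p for p in url_path.split("/") if p]
    # Refuse anything that tries to step outside the content root.
    if any(p in (".", "..") for p in parts):
        return None
    base = content_root.joinpath(*parts)
    # Hypothetical lookup order: "<path>.html" first, then "<path>/index.html".
    candidates = [base / "index.html"]
    if parts:
        candidates.insert(0, base.parent / (base.name + ".html"))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def render_page(content_root: Path, url_path: str,
                header: str, footer: str) -> Optional[str]:
    """Wrap the matching fragment in the shared templates (the 'chrome')."""
    fragment = find_fragment(content_root, url_path)
    if fragment is None:
        return None  # the caller would answer with a 404
    return header + fragment.read_text(encoding="utf-8") + footer
```

Every request repeats the filesystem lookup and the template pasting, even though the result only changes when the content does.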

What I didn't like about it

This old setup was nice, but it was also overkill -- PHP was re-generating the same (static) content over and over again, opening and reading through tens of files on each page view. This seems really inefficient (and, honestly, the "standard" answer of wrapping this in a caching web server seems even sillier).
Additionally, external URLs were clean-ish, but I never really resolved where a trailing slash was appropriate, which led to almost everything having absolute links all the time.
Also, I ended up storing my page content in an svn repository (to move edits between my staging and live pages), but since svn has no way of updating all of htdocs at once, this created potential race conditions in page viewing during updates. Besides, I've been using git for a number of years, and it feels so much snappier than svn that it was getting to be a drag to do page updates.
Added to these design concerns, I'd noticed that Digital Ocean was offering a VPS partition of similar size to my RapidVPS partition (but with unlimited bandwidth) for about half the price. So it was time to change.

How it works now

First off, I ditched Apache -- which provided way more than I need -- and switched to Lighttpd; this also provides way more than I need, but at root it's a lot more comprehensible. Indeed, I get nice clean URLs using just two modules: mod_rewrite for hostname correction (e.g. so that www.tchow.net redirects here properly) and mod_magnet for URL dispatch. I no longer have PHP re-generating static content over and over again; instead, I have a Python script that I call on the (git) content repository which regenerates just those pages that need it.
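One simple way to decide which pages "need it" is to compare modification times, make-style. The sketch below does that. It is a stand-in technique, not the actual script (which works from the git repository), and the directory layout and function names are assumptions.

```python
from pathlib import Path
from typing import List


def needs_rebuild(source: Path, output: Path) -> bool:
    """A page is stale if it was never built or its source is newer."""
    if not output.exists():
        return True
    return source.stat().st_mtime > output.stat().st_mtime


def stale_sources(src_dir: Path, out_dir: Path) -> List[Path]:
    """List source fragments whose generated page is missing or out of date."""
    stale = []
    # Assumption: sources are *.html fragments mirrored one-to-one in out_dir.
    for source in sorted(src_dir.rglob("*.html")):
        output = out_dir / source.relative_to(src_dir)
        if needs_rebuild(source, output):
            stale.append(source)
    return stale
```

A check like this misses pages that depend on a shared template; a real script would also have to rebuild everything when the navigation or analytics templates change.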
But mod_magnet is really the key to the whole setup because of the way it works: you provide Lighttpd with the path to a Lua script, and it runs that script to determine which physical path corresponds to a given request URL (the script can also do things like trigger redirects and add headers). The cool thing here is that now every page view goes through dispatch.lua (in my case, nothing more than a url -> file mapping table), and it's one file, which means that if I atomically update it then I've atomically updated the web page. So this setup gives me the satisfaction of being able to update the content on tchow all at once, and without race conditions (as long as my scripts are careful to not overwrite any content referenced by the active dispatch.lua).
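The atomic update is the part worth spelling out. A common way to get it is to write the new file next to the old one and rename it into place, since a rename within one filesystem replaces the target in a single step. Here is a minimal sketch in Python; the table layout, the function names, and the 0644 permission choice are assumptions, and the Lua code that would look up a URL in the table is left out.

```python
import os
import tempfile
from typing import Dict


def lua_quote(s: str) -> str:
    """Quote a string as a Lua literal (assumes no control characters)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_dispatch_table(mapping: Dict[str, str]) -> str:
    """Render a url -> file mapping as a Lua table (hypothetical format)."""
    lines = ["local pages = {"]
    for url, path in sorted(mapping.items()):
        lines.append("  [%s] = %s," % (lua_quote(url), lua_quote(path)))
    lines.append("}")
    lines.append("return pages")
    return "\n".join(lines) + "\n"


def write_atomically(target: str, text: str) -> None:
    """Replace target in one step so readers see the old or new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(target))
    # The temp file must live on the same filesystem as the target for the
    # rename to be atomic, so create it in the target's own directory.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".dispatch-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk first
        # mkstemp creates the file owner-only; widen it so a separate web
        # server user can read it (assumption about the deployment).
        os.chmod(tmp, 0o644)
        os.replace(tmp, target)  # atomic swap of old for new
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

For this to be race-free, the files the new table points at have to be fully written before the swap, and files the old table points at must stay in place until after it.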
I also re-worked a few things around the URLs so that the semantics are much clearer, changed the styling a smidge, drew some new calendar icons, and did various other tidying. But mostly, I'm satisfied that I've gotten rid of a lot of systems that tchow.com didn't need, and pared things down to a really slick-n-slim software stack that still satisfyingly serves what it should.