Managing sites with Git and intelligent post-update hooks

I’ve recently begun drinking the koolaid of Git, and damn it’s tasty! The things I can do with git that I couldn’t have done before (or would have been difficult to do) makes me excited about it. In fact, the one feature that I thought was a drawback — the no-one-true-server nature of it — is actually its strongest selling point.

See, the way I’ve taken to doing my development now is I create two “remote” repositories. First is “origin” which points to a repository managed by Gitosis. Second is a “live” repository that points at a working directory on my production server. That working directory is where my live site actually runs in.

On its own this is handy. As I develop new features, I push my changes to origin. Once my code is ready, and after I run “make test” to verify my site passes all its unit tests, I push to my live repo. At this point I ssh to the server, restart my server processes, and in theory all should be well.

The need for automation

I discovered quickly that in practice this was fraught with error. After a fairly large refactor, I found that the code that worked perfectly well on my development laptop fell over on production. Old or missing libraries, dependancy problems, ownership permissions on files, you name it. My site was down for 2 hours while I tried to resolve these issues. I added more unit tests to my code, but still this wouldn’t have caught these problems.

At this point I decided that a more automated approach was needed. I used a friend’s post-update hook as a template, which simply merged in changes and restarted nginx following a push to live. To this I added a long one-liner, and added the relevant commands to /etc/sudoers with the “nopasswd” option. In the end, the function looks like this:

Essentially, after my new changes are merged in, it:

  1. Creates and runs my Makefile
  2. Runs all my unit tests
  3. Tests my new nginx server configuration
  4. Reloads my new nginx configuration
  5. Restarts my FastCGI web application

If any of those steps fails, the full process halts. All the >&2 arguments ensure that the output of these commands are echoed to my local console from the remote server. So when I type “git push live”, all the test output is displayed to me inline. If an error occurs, I can immediately fix it in my local environment and push out a new change without having to log in to my remote server once.

My web application is written in Catalyst, and I use Test::WWW::Mechanize::Catalyst extensively, so I not only unit test my back-end classes, but I test the URLs users actually interact with. It even goes so far as creating and destroying test accounts within my database, so every feature of the website is tested, right down to validating the contents of my robots.txt and sitemap.xml files.

Next Steps

There are still some holes that need to be filled here.

  • If running my tests fail for some reason, I would like to roll-back the remote working directory to the previous version and restart my services, that way the site continues to function under a known-good state.
  • I would like to use WWW::Google::SiteMap::Ping to notify Google, and perhaps other search engines that support XML Sitemaps, that the contents of my site have changed and a reindexing is needed.
  • My site is localized, so I would like to regenerate my PO translation files, and if any strings have changed or are out of date, automatically send an email to my translators with the new POT template file attached.
  • Run my HTML through a spelling checker, to verify I don’t put any typos in any of my pages.
  • Since I try to use caching as much as possible, when a web page’s content has changed, I would like to automatically connect to my Memcached servers to purge the relevant pages from its cache so new versions are immediately available.

A fringe benefit is makes it fun and exciting to write unit tests! Whenever I find an area that isn’t covered, I sit down and crank out more tests to validate new features. I never want to be caught with my pants down on a deployment. Each time it catches something I missed, it makes the whole thing worthwhile.

Does anyone have any thoughts on how I can improve this? Is anyone doing something similar that I can learn from? I’m very excited about my new deployment system, and wish I’d had this ages ago. If there’s anything you want to know more about, please leave a comment.

Reblog this post [with Zemanta]

Tags:

About Michael Nachbaur

iOS app developer, livin' the dream. Working from wherever I find myself; Hawaii, Santa Monica, Vancouver, and elsewhere.

One Response to “Managing sites with Git and intelligent post-update hooks”

  1. Patrick Chkoreff December 25, 2009 7:20 am
    #

    Awesome article Mike! I love your attitude toward automation and testing, and this is just the sort of thing I strive for. I just recently started using “git” to manage my code, and I’ve been thinking about the best way to integrate it into my live server update process.

    Formerly for version control I was using good old “cp -pr”, “diff -r”, “diff -qr”, with “rsync” for publication. :) I was very disciplined with it and it served me well. But I agree that “git” represents a complete change of paradigm and I’m eager to use it more.

    I’ll let you know how it goes and give you feedback. (By the way my “Loom” project is at http://github.com/chkoreff/Loom — it’s a self-contained digital cash server written in Perl.)

    P.S. Thanks to duckduckgo.com for leading me to your article.