Craig Box's journeys, stories and notes...

Archive for September, 2010

Clouded House: How we moved our service to Amazon EC2 without it even knowing, and how you can too

Wednesday, September 29th, 2010

At the Symbian Foundation, we run a number of open-source applications and utilities written both in-house and by our member companies to support the Foundation's goals of building and promoting the Symbian platform. We have servers running LAMP1 applications such as Symfony, Drupal, MediaWiki, Bugzilla and Mercurial.  We also run some Tomcat applications such as OpenGrok and Bugzilla Metrics.

When the Foundation was set up we contracted a company to provide hosting services for us, which were provided on VMware ESX, on dedicated servers in a London datacenter. For various reasons, we decided some months ago it was time to part ways with our current hosting provider. This gave us the chance to make some moves we've been wanting to make for some time: a break from the now-unsupported PHP 5.2 to PHP 5.3 for our LAMP applications, a move to Ubuntu for the operating system, introducing some great configuration management and automation tools (more on those later), and removing some closed source components. It also had the side effect of allowing us to repurpose a rather large sum of money!

The goal in moving our infrastructure was to make the move seamless to the end user; I suspect that if you are reading this and you're an active user of Symbian's web applications, you couldn't even tell me what day this move was made. This project was undertaken with only minimal involvement from our web team, so there was no rewriting of the applications to use file storage in S3; we had to provide a system that worked the same as the previous hosting. This was my challenge, and it was a lot of fun.

What is AWS?

In updating the About page for the UK AWS users group, I wrote:

Amazon Web Services (AWS) is a cloud-based infrastructure platform, offering services such as computing capacity, database and storage, available on demand, priced as a utility, and controlled by web services.

People who like hard-to-pronounce acronyms2 have invented the new category of infrastructure as a service (IaaS) in order to separate AWS from other "cloud computing" offerings such as PaaS (think Google App Engine) or SaaS. Through a web service call, you say "I want you to give me a VM"; it returns you the ID of your VM, and starts charging you for it.  The AWS offerings range from the low-level (VMs, disk, static IPs) to the high-level (message queues, MySQL databases).
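As a rough illustration of the "give me a VM" call, here is a minimal Python sketch. It assumes a client object shaped like the one in boto3, Amazon's current Python SDK, which is not the tooling we used for this migration; the helper name `launch_vm`, the default instance type and the image ID placeholder are all hypothetical.

```python
def launch_vm(ec2_client, image_id, instance_type="t2.micro"):
    """Ask EC2 for exactly one VM and return the ID it hands back.

    ec2_client is assumed to behave like boto3.client("ec2");
    billing starts as soon as the instance is launched.
    """
    response = ec2_client.run_instances(
        ImageId=image_id,            # the machine image to boot from
        InstanceType=instance_type,  # the size of VM to rent
        MinCount=1,                  # launch exactly one instance
        MaxCount=1,
    )
    # The response lists every instance launched; we asked for one.
    return response["Instances"][0]["InstanceId"]


# Usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   client = boto3.client("ec2", region_name="us-east-1")
#   print(launch_vm(client, "ami-EXAMPLE"))  # "ami-EXAMPLE" is a placeholder
```

Passing the client in, rather than creating it inside the function, keeps the sketch easy to try against a stub before pointing it at a real account.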

Amazon invented this category and are the market leader; there are competitors, such as Rackspace Cloud, but after some initial experimentation, the decision to go with Amazon was easy to make.

Lock-in isn't too much of a concern when you're just renting infrastructure - you have root access, and you put it all there yourself, so you know you could pick it up and re-implement it anywhere else if you chose.  (If we were using S3, it might be a concern.) However, you know you're the winner when you have an open source implementation of your API; the Eucalyptus project implements the AWS APIs and lets you run your own "private cloud" on your own hardware. Eucalyptus powers Canonical's Ubuntu Enterprise Cloud, which is a great way to get started without needing a credit card.

Does my application need the cloud?

If you are building an application from scratch, it's hard to say no to what cloud computing offers. You can scale from one machine all the way up to Farmville. Netcraft, while they're not busy confirming the death of things, claim that 365,000 web sites are hosted on EC2 as of May this year. It's also much better for your cashflow.

If you're migrating an existing application, your application doesn't run in the cloud today, so it obviously doesn't need what the cloud offers. However, rebuilding an application for a cloud environment (as opposed to just picking the disks up and putting them on virtualised hardware) makes you rebuild things in a way that should, if you do it right, make scaling easy.

Even if you don't need the scalability, think of the potential upsides; you can turn a machine on just to do some processing, and then turn it off again when you're finished.

Because we migrated an existing infrastructure, we aren't really using EC2 as anything more than a VPS provider, albeit a super-VPS provider where you can turn instances on and off yourself, clone and snapshot them, and scale them up and down at will. Now that we have moved into the cloud, our developers can, should they wish, use whatever other services take their fancy.


If you're running an application that runs 24x7, economically, cloud computing isn't necessarily going to be better, as Google's Vijay Gill commented recently. If your business relies on having thousands of servers, then you can probably benefit from the economies of scale of having your own setup, and if it's your core competency, you probably shouldn't outsource that. However, his figures don't consider the cost of reserved instances, which generally drop the per-unit cost of EC2 instances by at least half.
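To make the reserved-versus-on-demand trade-off concrete, here is a small Python sketch of the arithmetic. All the prices in the usage note are made up for illustration and are not Amazon's actual rates; the function names are hypothetical too. It assumes a reserved instance is paid for with an upfront fee spread over the term plus a lower hourly rate.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_on_demand(hourly_rate, hours=HOURS_PER_MONTH):
    """Monthly cost when paying the plain hourly rate."""
    return hourly_rate * hours


def monthly_reserved(upfront_fee, term_months, reserved_rate,
                     hours=HOURS_PER_MONTH):
    """Monthly cost of a reserved instance: the upfront fee spread
    evenly over the term, plus the discounted hourly rate."""
    return upfront_fee / term_months + reserved_rate * hours


def break_even_hours(upfront_fee, term_months, on_demand_rate, reserved_rate):
    """Hours of use per month above which reserving is the cheaper option."""
    return (upfront_fee / term_months) / (on_demand_rate - reserved_rate)
```

With hypothetical figures of $0.10/hour on demand against a $120 one-year reservation at $0.03/hour, `monthly_on_demand(0.10)` and `monthly_reserved(120, 12, 0.03)` compare a machine that runs all month, while `break_even_hours(120, 12, 0.10, 0.03)` shows how many hours a month a machine has to run before reserving pays off. A box you only turn on for occasional processing may well sit below that line.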

(I suspect Gill would suggest people with smaller needs use Google's PaaS offering, App Engine, which requires rewriting your application, and using either Python or Java. This series suggests that you don't have the ability or remit to do this.)

When I first drafted this article, I wrote "EC2 can't scale as low as Linode's $20/month plan": since then however, Amazon have announced "micro" instances, which cost next-to-nothing (when reserved, about $14/mo). The support is not exactly Fanatical - like many open source things, your support options are best-effort in the forums, or purchasing Premium Support, at 10/20% of your monthly spend.

However, if you believe, as we do, that cloud computing is the future, and you're running a few dozen servers, then even at non-reserved rates, EC2 probably works out cheaper than traditional commercial virtualised hosting, and offers you benefits such as being able to use different, worldwide datacenters, at no extra cost.3

That there is no capital expenditure is a bonus for your finance department, but they will probably balk at having to pay by credit card. If your spend is enough, Amazon will put you on invoice billing; until then, fly under the radar, or convince your finance director that the savings are worth an exception to the policy.

A small note on open source vs proprietary software in the cloud

If your product costs more to use the more servers you run it on, cloud computing may not be for you. This reminds me of a post from Jeff Atwood; scaling out costs a lot more when you have to pay for your operating system or software licenses.

(You can scale up instances - and if you have your root on EBS, you can do it with an existing machine - but the nearer you get to using a full unshared server, the closer Amazon have to charge you to the cost of a full unshared server, and so you may near the point where building your own server is a better deal.)

Stay tuned

Over the next few weeks I will post a series of articles talking about the problems we faced, the challenges we overcame, and how you can take your application or servers and move them into the EC2 cloud.  I will also touch on the things that are not-so-great about each component of the Amazon stack, and make suggestions as to how you can work around them, or how Amazon could improve their service.  You'll be the first to hear about the brand new file system we've developed for quickly moving files around EC2 instances. In return, I'd love to hear from anyone who has suggestions on how they think we could have done better.

To make sure you get each post as it comes out, subscribe to this blog by RSS, or sign up for updates by e-mail.4 You should follow me on Twitter here, but only if you're prepared for short bursts of awesome.

  1. using the extended definition to include PHP, Perl and Python 
  2. technically, an initialism, unless you want to get quite rude 
  3. Things cost slightly more - around 1c/hour - outside of their largest centre, Washington DC. 
  4. If you want to make ultra-sure you only read posts relating to our Amazon migration, and don't accidentally read a band review or something, there's a category for that

Review: Barenaked Ladies at Hammersmith Apollo, 15 September 2010

Thursday, September 16th, 2010

For a large period of my life - pretty much from the moment I first saw them live in Auckland in '991 - Barenaked Ladies were my favourite band. I can't put my finger on when or why they fell from that position – possibly in the quiet period between the release of their somewhat lacklustre Everything to Everyone and the double-CD-in-two-parts that was Barenaked Ladies are Me/Men in 2006/2007 – or even who replaced them; they just wandered out of my playlist, in the way that bands sometimes do. However, they are still undisputedly the best live band I've ever seen (only Green Day have ever come close), and I will jump at the chance to see them play.

The elephant in the room2 is the lack of singer/songwriter Steven Page, who departed the band 18 months ago. Last time we saw the band was in their home town of Toronto in December 2008, two months before the split; I'm glad I got to that show, even if it was a bit Christmas-carol heavy. Since then, after a long break, BNL have returned with All In Good Time, a record firmly at the grown-up end of the spectrum (as you might expect from a band whose last studio output was a children's album called Snacktime).

Last night they brought their All In Good Time tour to the Hammersmith Apollo in London, officially making Barenaked Ladies the third band I have now seen in three different countries.3

While as expected there were a number of cuts from both their new album and their hit album Stunt (you know, the one with the "Chikkity China" song on it), the set had a number of tracks from its follow up Maroon. Older songs were fewer - Old Apartment was the first song where Ed Robertson took Steve's lead vocal, and it's worth noting that drummer Tyler Stewart makes a surprisingly good Ed to Ed's Steve.

Strangely the song where I missed Steve the most was one he didn't even sing - the first single of the new album, You Run Away, which is ostensibly about the circumstances surrounding Steve leaving the band he'd been in for 20 years. The end chorus on the record relies on Ed's double-tracked vocals, and the live version was just crying out for a proper duet.

However, the new arrangement worked brilliantly in other places. I'm the first to admit that the voice of Kevin Hearn (a taller, thinner Turtle from Entourage) isn't generally to my taste. Sound Of Your Voice from Barenaked Ladies Are Me is my favourite BNL song since the Maroon days; Kevin wrote this song, but Steve sings it on the album. Since Steve left, Kevin has taken lead vocal again, and even if Kevin were a more powerful vocalist, Page's are very big shoes to fill, especially on a belter such as this. The current arrangement features Kevin playing acoustic guitar, and the other three clustered around a microphone singing the backing vocals in a "doo-wop" style, complete with clicking fingers and synchronized swaying. It was suitably different and it brought new life back to a great song.

Many of the rest of the new songs were treated as bathroom breakers; by comparison, It's All Been Done almost got some of the audience jumping up and down where the seats should be. The loudest cheers of the evening were reserved for mentions of all things Canadian, no doubt by the gentlemen (and ladies!) all in hockey garb.

A trademark of the BNL experience was the live improvised song/rap, which tonight was about English accents and a kid telling Ed not to steal his bike. (Trust me, they're better in person.) The improvised story in the middle of If I Had A Million Dollars told us that in lieu of him being able to find a park to run around, he was doing laps of the Shepherd's Bush Green, and the descriptive story of his hotel (with "snow room" - more Canadian cheering!) no doubt led some of the more intent fans to camp out in front of it the next morning!

More BNL trademarks include the throwing of underwear (placed on guitars) and Kraft Dinner (thrown from the balcony, and hitting everyone in our vicinity!).  Did the "those in the know don't throw" message not get through to those who were super-fan enough to still go through with this?

One of my favourite parts of the old BNL show was the post-Million Dollars medley, a carefully crafted pastiche of current pop songs. I fondly remember the Auckland show, seeing Tyler run up and down the stage singing "Near" and "Far" in the style of Sesame Street, then Steve breaking into "Near, far, wherever you are" from My Heart Will Go On (this was 1999!). From bootlegs I've heard, this was dropped from the set around 2000, but seems to have come back with a vengeance; Kevin's slow start on Oh It's Magic was soon joined by a beatboxing Ed, turning into I Gotta Feeling by the Black Eyed Peas, Baby by Justin Bieber and California Gurls by Katy Perry, and ending in a triumphant Tyler-carrying-Jim moment.

The encore started great, with Tyler-the-drummer down the front and Ed on drums; it not being Christmas, his regular Feliz Navidad seemed a bit unlikely, and instead we were treated to a madcap version of Alcohol.4 It all got a little muted from that point on; the new Kevin-led ballad Watching the Northern Lights was treated as another bathroom break, and the finisher Tonight Is The Night I Fell Asleep At The Wheel is a great song, but one I figured Steve would get in the divorce. The new lineup seems freed from the obligation to finish with Brian Wilson every night, although examinations of other setlists suggest they do still play it. (This is not a game I should play, because examinations of other setlists simply make me jealous that they played songs I like on other nights!)

While it's not the same band any more, it's still fun, and I disagree with comments that suggest it's butchering the memory. No more so than having BNL featuring Thin Steve with stylish glasses.

Update: Check out some professional photos of the evening at IES Photography.

  1. I think it was '99; strangely enough I can't find confirmation on the Internet at all! 
  2. On the subject of elephants in rooms, I gained another chin eating a diet of hot dogs and poutine in my two years in Canada; I don't mean to generalise, but the room at BNL, on average, appeared to weigh a lot more than other, primarily-English audiences I've been in there. Just sayin'. Eat healthy. 
  3. Behind R.E.M. and Crowded House; technically I could include Neil and Tim Finn solo, but I won't. 
  4. Bass player Jim Creegan sings lead vocals on one or two songs an album nowadays, and takes live lead on some Steve songs, but didn't get one tonight. 

OpenID and MindTouch

Wednesday, September 15th, 2010

The below is all outdated - links updated for historical interest.

When I was working at Coreworx in Canada I introduced an internal knowledge base in the form of MindTouch (then DekiWiki).  I got involved in the community around the project, and ended up meeting the developers behind it in San Diego.

I evaluated it for a project at Symbian, and in order to try and make it suit us better I wrote a module for using OpenID.

Today, MindTouch have published a couple of posts I wrote on the subject, which I am happy to share with you here:

In other news, over the course of the next few weeks, I'll be running a series on the migration of Symbian's LAMP/LTMJ (unfortunately Linux/Tomcat/MySQL/Java doesn't have a vowel in it) hosting servers to Amazon EC2. Stay tuned!