At the Symbian Foundation, we run a number of open-source applications and utilities written both in-house and by our member companies to support the Foundation's goals of building and promoting the Symbian platform. We have servers running LAMP[1] applications such as Symfony, Drupal, MediaWiki, Bugzilla and Mercurial. We also run some Tomcat applications such as OpenGrok and Bugzilla Metrics.
When the Foundation was set up, we contracted a company to provide hosting services for us, on VMware ESX running on dedicated servers in a London datacenter. For various reasons, we decided some months ago that it was time to part ways with that hosting provider. This gave us the chance to make some moves we had been wanting to make for some time: a break from the now-unsupported PHP 5.2 to PHP 5.3 for our LAMP applications, a move to Ubuntu for the operating system, the introduction of some great configuration management and automation tools (more on those later), and the removal of some closed source components. It also had the side effect of allowing us to repurpose a rather large sum of money!
The goal in moving our infrastructure was to make the move seamless to the end user; I suspect that if you are reading this and you're an active user of Symbian's web applications, you couldn't even tell me what day this move was made. This project was undertaken with only minimal involvement from our web team, so there was no rewriting of the applications to use file storage in S3; we had to provide a system that worked the same as the previous hosting. This was my challenge, and it was a lot of fun.
What is AWS?
Amazon Web Services (AWS) is a cloud-based infrastructure platform, offering services such as computing capacity, databases and storage, available on demand, priced as a utility, and controlled by web services.
People who like hard-to-pronounce acronyms[2] have invented the new category of infrastructure as a service (IaaS) in order to separate AWS from other "cloud computing" offerings such as PaaS (think Google App Engine) or SaaS (Salesforce.com). Through a web service call, you say "I want you to give me a VM"; it returns the ID of your VM, and starts charging you for it. The AWS offerings range from the low-level (VMs, disk, static IPs) to the high-level (message queues, MySQL databases).
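The "give me a VM" exchange described above can be sketched in a few lines of Python. This is a minimal sketch, not how we did it at the time: it assumes a client object shaped like the one today's boto3 library returns from `boto3.client("ec2")`, and the helper names `launch_vm` and `terminate_vm` are made up for illustration.

```python
from typing import Any


def launch_vm(ec2_client: Any, image_id: str, instance_type: str) -> str:
    """Ask EC2 for exactly one VM and return the instance ID it hands back.

    ec2_client is assumed to behave like boto3.client("ec2"); image_id and
    instance_type are whatever machine image and size you have chosen.
    """
    response = ec2_client.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,  # request one instance, no more and no fewer
        MaxCount=1,
    )
    # The response lists every instance started by this call; we asked for one.
    return response["Instances"][0]["InstanceId"]


def terminate_vm(ec2_client: Any, instance_id: str) -> None:
    """Hand the VM back so that Amazon stops charging for it."""
    ec2_client.terminate_instances(InstanceIds=[instance_id])
```

With boto3 installed and credentials configured, you would pass in `boto3.client("ec2")` together with an image ID and instance type of your choosing. Taking the client as a parameter also means the functions can be exercised against a stub, without an AWS account.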
Amazon invented this category and are the market leader; there are competitors, such as Rackspace Cloud and GoGrid.com, but after some initial experimentation, the decision to go with Amazon was easy to make.
Lock-in isn't too much of a concern when you're just renting infrastructure - you have root access, and you put it all there yourself, so you know you could pick it up and re-implement it anywhere else if you chose. (If we were using S3, it might be a concern.) However, you know you're the winner when you have an open source implementation of your API; the Eucalyptus project implements the AWS APIs and lets you run your own "private cloud" on your own hardware. Eucalyptus powers Canonical's Ubuntu Enterprise Cloud, which is a great way to get started without needing a credit card.
Does my application need the cloud?
If you are building an application from scratch, it's hard to say no to what cloud computing offers. You can scale from one machine all the way up to Farmville. Netcraft, while they're not busy confirming the death of things, claim that 365,000 web sites are hosted on EC2 as of May this year. It's also much better for your cashflow.
If you're migrating an existing application, your application doesn't run in the cloud today, so it obviously doesn't need what the cloud offers. However, rebuilding an application for a cloud environment (as opposed to just picking the disks up and putting them on virtualised hardware) makes you rebuild things in a way that should, if you do it right, make scaling easy.
Even if you don't need the scalability, think of the potential upsides; you can turn a machine on just to do some processing, and then turn it off again when you're finished.
Because we migrated an existing infrastructure, we aren't really using EC2 as anything more than a VPS provider, albeit a super-VPS provider where you can turn instances on and off yourself, clone and snapshot them, and scale them up and down at will. Now that we have moved into the cloud, our developers can, should they wish, use whatever other services take their fancy.
If you're running an application that runs 24x7, economically, cloud computing isn't necessarily going to be better, as Google's Vijay Gill commented recently. If your business relies on having thousands of servers, then you can probably benefit from the economies of scale of having your own setup, and if it's your core competency, you probably shouldn't outsource that. However, his figures don't consider the cost of reserved instances, which generally drop the per-unit cost of EC2 instances by at least half.
(I suspect Gill would suggest people with smaller needs use Google's PaaS offering, App Engine, which requires rewriting your application, and using either Python or Java. This series assumes that you don't have the ability or remit to do this.)
When I first drafted this article, I wrote "EC2 can't scale as low as Linode's $20/month plan". Since then, however, Amazon have announced "micro" instances, which cost next-to-nothing (when reserved, about $14/mo). The support is not exactly Fanatical - like many open source things, your support options are best-effort in the forums, or purchasing Premium Support, at 10/20% of your monthly spend.
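To make the reserved-versus-on-demand comparison concrete, here is a rough sketch of the arithmetic. It assumes the simple model of a reserved instance as a one-off upfront fee plus a reduced hourly rate; every price in the usage note that follows is invented for illustration and is not an Amazon list price.

```python
HOURS_PER_MONTH = 730  # average month: 8760 hours in a year / 12


def on_demand_monthly(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Monthly cost when paying the plain hourly rate."""
    return hourly_rate * hours


def reserved_monthly(
    upfront_fee: float,
    term_months: int,
    hourly_rate: float,
    hours: float = HOURS_PER_MONTH,
) -> float:
    """Monthly cost of a reserved instance: the upfront fee spread evenly
    over the term, plus the discounted hourly charge."""
    return upfront_fee / term_months + hourly_rate * hours


def break_even_hours(
    upfront_fee: float,
    term_months: int,
    on_demand_rate: float,
    reserved_rate: float,
) -> float:
    """Hours of use per month above which reserving is the cheaper option."""
    if reserved_rate >= on_demand_rate:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return (upfront_fee / term_months) / (on_demand_rate - reserved_rate)
```

With made-up figures of $0.10/hour on demand against a $240 one-year reservation at $0.04/hour, a machine that runs all month costs roughly $73 on demand and roughly $49 reserved, and the reservation starts paying for itself at around 333 hours of use a month. A server that runs 24x7 is well past that point; one you only turn on for occasional processing is not.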
However, if you believe, as we do, that cloud computing is the future, and you're running a few dozen servers, then even at non-reserved rates, EC2 probably works out cheaper than traditional commercial virtualised hosting, and offers benefits such as being able to use different, worldwide datacenters, at no extra cost.[3]
That there is no capital expenditure is a bonus for your finance department, but they will probably balk at having to pay by credit card. If your spend is enough, Amazon will put you on invoice billing; until then, fly under the radar, or convince your finance director that the savings are worth an exception to the policy.
A small note on open source vs proprietary software in the cloud
If your product costs more to use the more servers you run it on, cloud computing may not be for you. This reminds me of a post from Jeff Atwood: scaling out costs a lot more when you have to pay for your operating system or software licenses.
(You can scale up instances - and if you have your root on EBS, you can do it with an existing machine - but the nearer you get to using a full unshared server, the closer Amazon's price has to come to the cost of a full unshared server, and so you may reach the point where building your own server is a better deal.)
Over the next few weeks I will post a series of articles talking about the problems we faced, the challenges we overcame, and how you can take your application or servers and move them into the EC2 cloud. I will also touch on the things that are not-so-great about each component of the Amazon stack, and make suggestions as to how you can work around them, or how Amazon could improve their service. You'll be the first to hear about the brand new file system we've developed for quickly moving files around EC2 instances. In return, I'd love to hear from anyone who has suggestions on how we could have done better.
To make sure you get each post as it comes out, subscribe to this blog by RSS, or sign up for updates by e-mail.[4] You should follow me on Twitter here, but only if you're prepared for short bursts of awesome.
[1] using the extended definition to include PHP, Perl and Python
[2] technically, an initialism, unless you want to get quite rude
[3] Things cost slightly more - around 1c/hour - outside of their largest centre, Washington DC.
[4] If you want to make ultra-sure you only read posts relating to our Amazon migration, and don't accidentally read a band review or something, there's a category for that.