Off the record - Craig Box

Clustering an Amazon Elastic IP address

October 27th, 2010

If you have a problem that Amazon's Elastic Load Balancing can't solve, you might want to build an old-fashioned "two machine IP failover" cluster.

Amazon instances only have one internal, and one external, IP address at a time. Consider this:

Instance 1: 256.256.256.4 [Elastic IP]
Instance 2: 257.257.257.8

If you claim the elastic IP on instance 2, then a new IP will be allocated to instance 1:

Instance 1: ¿?
Instance 2: 256.256.256.4 [Elastic IP]

You won't know what it is unless you query the web services, or look at the console, for instance 1. Be sure you are aware of the implications of this before proceeding.

I found a forum post from Alex Polvi which, with some tidying, does the job nicely. When the slave node realises that its master mate has gone offline, it will claim the IP address; when the master returns, you can have the master claim it back, or you can have the slave just become the new master.

Claiming the shared/elastic IP

Your script needs a command that the master machine can call to claim the elastic IP address. Alex's example uses Tim Kay's 'aws' script, which doesn't require Java like the official Amazon ec2-utils.

You need /root/.awssecret to contain the Access Key ID on the first line and the Secret Access Key on the second line:

AK47QWERTY7890ASDFG0H
01mM4Rkl4RmArkLArmaRK14rM4rkL4MarKLar

You can now test this:

$ export AWS_PARAMS="--region=eu-west-1"
$ export ELASTIC_IP=256.256.256.4
$ export MY_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
$ aws $AWS_PARAMS associate-address "$ELASTIC_IP" -i "$MY_ID"

The MY_ID command uses the instance data service to get the instance ID for the machine you're running on, so you can use this script, unedited, on both machines.

This should claim the IP 256.256.256.4 for the instance on which the script is run.

In order for Heartbeat to be able to use this script, we need a simple init script. When run with 'start' it should claim the IP, and when run with 'stop' it should relinquish it. You will need to edit the parameters at the top (or better yet, put them in /etc/default/elastic-ip and source that in your file). Remember to ensure this script is executable.

/etc/init.d/elastic-ip

#!/bin/bash
DESC="elastic-ip remapper"
MY_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
ELASTIC_IP="256.256.256.4"
AWS_PARAMS="--region=eu-west-1"

if ! [ -f ~/.awssecret ] && ! [ -f /root/.awssecret ]; then
    echo "$DESC: cannot find ~/.awssecret or /root/.awssecret"
    exit 1
fi

case $1 in
    start)
        aws $AWS_PARAMS associate-address "$ELASTIC_IP" -i "$MY_ID" > /dev/null
        [ $? -eq 0 ] && echo $DESC: IP $ELASTIC_IP associated with $MY_ID || echo $DESC: Could not map IP $ELASTIC_IP to $MY_ID
        ;;
    stop)
        aws $AWS_PARAMS disassociate-address "$ELASTIC_IP" > /dev/null
        [ $? -eq 0 ] && echo $DESC: IP $ELASTIC_IP disowned || echo $DESC: Could not disown $ELASTIC_IP
        ;;
    status)
        aws $AWS_PARAMS describe-addresses | grep "$ELASTIC_IP" | grep "$MY_ID" > /dev/null
        # grep will return true if this ip is mapped to this instance
        [ $? -eq 0 ] && echo $DESC: I have $ELASTIC_IP || echo $DESC: I do not have $ELASTIC_IP
        ;;
esac
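
The suggestion of keeping the parameters in /etc/default/elastic-ip could look something like the sketch below. The load_defaults helper name and its fallback values are assumptions for illustration; only the file path comes from the text above.

```shell
#!/bin/bash
# Sketch: load ELASTIC_IP and AWS_PARAMS from a defaults file, if one exists.
# load_defaults is a hypothetical helper, not part of Alex's original script.

load_defaults() {
    _defaults_file="${1:-/etc/default/elastic-ip}"
    # Fallbacks, used only when the defaults file does not override them
    ELASTIC_IP="256.256.256.4"
    AWS_PARAMS="--region=eu-west-1"
    if [ -r "$_defaults_file" ]; then
        # The file is plain shell, e.g. a line such as ELASTIC_IP="..."
        . "$_defaults_file"
    fi
}
```

Calling load_defaults near the top of the init script, in place of the hard-coded assignments, would let both machines share an unedited script and keep the site-specific values in one file.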

Heartbeat

Each server needs the heartbeat package installed:

$ apt-get install heartbeat

Allow heartbeat traffic between your instances:

$ ec2-authorize $group -P udp -p 694 -u $YOURUSERID -o $group # heartbeat

Heartbeat is configured by three files, all in /etc/ha.d, and in our case, all identical on both servers:

authkeys

auth 1
1 sha1 foobarbaz

The authkeys page on the heartbeat wiki offers a script to help generate a key.

ha.cf

# Log to syslog as facility "daemon"
logfacility daemon 

# List of cluster members by short hostname (uname -n)
node server1 server2

# Send one heartbeat each second
keepalive 1 

# Declare nodes dead after 10 seconds
deadtime 10 

# Internal IPs of both nodes (heartbeat ignores the entry for the local machine)
ucast eth0 10.256.256.4
ucast eth0 10.257.257.8

# Fail back, so we're normally running on the primary server
auto_failback on

All pretty self-explanatory: set your own 'node' and 'ucast' entries with your hostnames and internal IP addresses. Even when the external IPs are bouncing around, the internal IPs should stay the same. auto_failback is optional, as mentioned above. Read the docs for more options.

haresources

server1 elastic-ip

Here, we set up a link between the primary server (server1) and the script we want to run (elastic-ip). The wiki shows you what else you can do.

Putting it all together

Start heartbeat on both nodes, and server1 should claim the IP address. If you stop heartbeat on server1 (or server1 crashes), server2 will notice after 10 seconds and claim the IP address. As soon as server1 is back up, it should claim it back too. You can run /etc/init.d/elastic-ip status to prove this:

server1:~$ sudo /etc/init.d/elastic-ip status
elastic-ip remapper: I have 256.256.256.4
server2:~$ sudo /etc/init.d/elastic-ip status
elastic-ip remapper: I do not have 256.256.256.4

Whatever happens, your elastic IP will always point to a good instance!

Postscript: what Heartbeat will not do

Heartbeat will notice if a server goes away, and claim the IP. However, it will not notice if a service stops running but the machine stays alive. Your good work may all be for nothing!

To solve this, I suggest monit, or if you're a Ruby fan, bluepill. These will monitor a service, and restart it if it is not responding.

Posted in Amazon Migration, Technical, Work | 1 Comment »

Migrating your servers to Amazon EC2: Load balancing

October 25th, 2010

When you run a large web site, you probably have a number of machines, across a number of different availability zones, but you need to present a single URL to the user. You distribute the load between your machines with (a redundant pair of) load balancers, and point your DNS to the floating IP of the balancers.

A number of similar options exist for Amazon EC2 users: as a good balance between convenience and performance, we chose to use Amazon's Elastic Load Balancing (ELB) service offering, with a caveat listed below. While a good default position, this may not be for you; check the bottom of this article for some resources to help you choose.

ELB has some great features. As well as the regular load balancer feature of tracking which backend instances are up, it proactively adds extra capacity (which I term 'nodes', so as not to get confused with backend instances) in the event of increasing load. You can also set ELB up to spin up more backend instances in the case of there not being enough to serve your requests. All this for a small per-hour and per-GB cost.

Side note: You may be thinking "Why not use round robin DNS, and put the IPs of more than one server?" This is a trap for young players; you actually make things worse, because any one of N machines failing means there's a 1/N chance a request goes to a broken instance. There's a good writeup on Server Fault if you want more information.

Then and now

In the old world, our site sat behind a hardware load balancer appliance. Because we were using a shared device at a co-location provider, I never saw it, and thus can't give you the exact details; but the important part of this story is that when traffic got to our instance, its source IP was still set to the IP of the sender, or at least the last proxy server it went through on its travels. This matters to us, because, just like Phil Zimmermann's brain, some of Symbian's code is export controlled, due to containing cryptographic awesomesauce. We need to know the source IP of all requests, in case they are requesting our restricted areas.

When you're in EC2, you're operating under their network rules which "will not permit an instance to send traffic with a source IP or MAC address other than its own". This also applies to the instances that run the ELB service. If you set up an ELB, your backend servers will see all their traffic coming from the IP addresses of your ELB nodes, telling them nothing about where it came from before that.

The story that is falling into place largely revolves around the X-Forwarded-For header, which is added to HTTP transactions by proxy servers. Our back-end servers are told the packet arrived from the load balancer, but if you tell ELB that it's using the HTTP protocol on this port, it adds the X-F-F header automatically: the backends can then look at the most recently added entry to the X-F-F and learn the source IP as the ELB knew it.¹
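
As a rough illustration of "the most recently added entry": the header value is a comma-separated list, and each proxy appends the address it saw, so the ELB's entry is the right-most one. The sketch below assumes that layout; last_xff_entry is a hypothetical helper name, and the addresses in the usage comment are documentation placeholders.

```shell
# last_xff_entry HEADER_VALUE
# Print the right-most (most recently added) address from an
# X-Forwarded-For value, with surrounding whitespace trimmed.
last_xff_entry() {
    printf '%s\n' "$1" | awk -F',' '{
        gsub(/^[ \t]+|[ \t]+$/, "", $NF)   # trim the last field
        print $NF
    }'
}

# Usage: last_xff_entry "203.0.113.7, 198.51.100.23"
```

Only that last entry was written by the load balancer; anything to its left was supplied by the client or an upstream proxy, so it should not be trusted for access decisions.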

Because the load balancer sits between the client and the server, which are at either end of an encrypted transaction, it can't rip open an HTTPS packet and add an arbitrary header. So, we had a Heisenproblem: it was not possible to know where something came from, and have that same something happen securely. And stuff you are only giving to certain allowed people is exactly the sort of stuff you probably want to distribute over SSL.

There were two possible solutions to this:

1. Send secure traffic directly to a backend instance
2. Wait for Amazon to implement SSL termination on ELB

In order to go live, we did #1. It came with a bunch of downsides, such as having to instruct our cache to redirect requests for certain paths to a different URL, such that if you requested site.example.org/restricted, you were taken to https://site-secure.example.org/restricted. "But what happens when that server goes down?", you say! When I planned this article, it was going to include a nice little description of how we got Heartbeat sharing an elastic IP address, so that we always had our "secure" IP pointing to whichever of our pair of servers was up. It's a useful trick, so I'll come back to it later.

However, I'm pleased to announce that since then, Amazon have introduced #2: support for SSL termination, so you can upload your certificate to your load balancer, and then it can add the X-F-F header to your secure packets, and you don't need to worry about it any more.²

I was similarly going to have to worry about how to handle database failover in EC2, but they introduced that between me looking and go-live. I surmise that if you wait long enough, Amazon will do everything for you, and now delay introducing anything! 🙂

Now we know all that, let's dig a little deeper into how ELB works.

A Little Deeper

Amazon is all about the short-TTL DNS. If they want to scale something, they do so, and change what their DNS server returns when you query it.

When you register an ELB, you get given a DNS name such as lb-name-1234567890.eu-west-1.elb.amazonaws.com. You're explicitly warned to set your chosen site name as a CNAME to this; and indeed if you use the IP as it stands now, one day your site will break (for reasons you will learn below.)

First oddity with this setup: you can't CNAME the root of a domain, so you have to make example.org a redirect to www.example.org, preferably one hosted somewhere outside the cloud, as example.org needs to be an A record to an IP address. Some DNS providers have a facility for doing redirects using their own servers, which is an option here.

If you were to query that DNS record you would find that it has a 60 second TTL; thus if you query it twice, 2 mins apart, and you have more than one ELB node³ you may, at the discretion of Amazon's algorithms, get different results. Try this:

$ dig lb-name-1234567890.eu-west-1.elb.amazonaws.com
lb-name-1234567890.eu-west-1.elb.amazonaws.com. 60 IN A 256.256.256.4
$ dig lb-name-1234567890.eu-west-1.elb.amazonaws.com @8.8.8.8
lb-name-1234567890.eu-west-1.elb.amazonaws.com. 60 IN A 257.257.257.8
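
If you want to watch the answers change over time, a small filter over dig's answer section is enough. This is a sketch only: answer_records is a hypothetical helper name, and it assumes the "name TTL IN A address" line layout shown above.

```shell
# answer_records
# Read dig answer lines on stdin and print "TTL address" for each A record.
answer_records() {
    awk '$3 == "IN" && $4 == "A" { print $2, $5 }'
}

# Usage (needs network access, so it is left as a comment):
#   dig +noall +answer lb-name-1234567890.eu-west-1.elb.amazonaws.com | answer_records
```

Note that a caching resolver reports the time remaining on its cached copy, so the TTL column may show any value from 60 downwards.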

Dude, where's my balancing?

When you register an ELB, you tell it the availability zones it should operate in. Each AZ has at least one ELB node, and that node will route you to instances in its own AZ, unless there are none available. That, along with the fact you are pseudo-randomly given an IP (with a minimum 60 second TTL), leads to a non-obvious conclusion. This actually happened to us - our policy is that odd numbered servers are in -1a, and even numbered servers are in -1b.

external:~$ ab -n 10 http://lb-name-123.eu-west-1.elb.amazonaws.com/test.txt
web1:~$ wc -l /var/log/apache2/access.log
10 /var/log/apache2/access.log
web2:~$ wc -l /var/log/apache2/access.log
0 /var/log/apache2/access.log

That is to say: if your servers are in multiple availability zones⁴, a single user doing requests in quick succession isn't load-balanced across your backend instances, so ELB doesn't appear to be working at all. Thankfully, it is, you just can't see it, because you're not looking from enough places at once. ELB is designed to work for a widely distributed client base, and in that case, you should expect about half the traffic on one instance, and half on the other. If you ran this test from a different location, you might see all 10 requests go to web2.

If you ask Amazon⁵, they can change the DNS for an ELB so that it presents all the IP addresses associated, not just one of them. This means your client has the choice to pick the IP each time it connects, and depending on how your application works, may be better for test servers.

OBEY THE TTL

The prime reason to use an ELB is that Amazon can transparently add more computing power to support your load if needed. The converse of that is that when it is no longer needed, it will be removed. It bears mention that if they take an IP address out of the DNS, it will last at least 60 minutes before being taken out of service. Not everyone obeys a TTL on a DNS zone!

To reiterate: don't ever take what the name currently resolves to, and use that IP. It's not yours and one day it will break.

Migrating your servers to Amazon EC2: Instance sizing

October 11th, 2010

One of the central tenets of cloud computing is that it's a cheap way to run large-scale compute jobs. If you're more concerned about starting small, and want to tackle the problem of growing big when you get to it¹, then there's still a solution for you, though it might not be quite like the one you're used to.

If you're currently running on a hosted, virtualized platform, you are probably in one of two situations:

Your hosting provider buys servers for you, and runs something like VMware ESX on them
You're dealing with a VPS provider

If you're in the former bucket, as we were, you have a pretty fine-grained control over your instance (virtual server) scaling. You can add more CPU power (or weight one instance to be allowed to burst at the expense of others), and control, sometimes to the megabyte, how much memory is available to your application.

When you're in the latter bucket, you tend to get a number of discrete plans (such as the ones Linode offer), but your provider has a human element, and if you ask nicely, you can probably get a plan with very low disk but very high memory, by paying a little bit extra (RimuHosting tends towards the confusing with the amount of choice they offer!)

Amazon EC2, being an entirely automated provider, doesn't give you the option to customize your plans. They offer a selection of instance sizes, at various prices. Those are the choices, take or leave them.² Because of the ease of creating and using multiple machines, and the relatively low extra cost,³ you have to consider if the cost of scaling up is best for you, compared to the cost of scaling out.

Our applications ran almost exclusively on 32-bit machines. There are a number of reasons, in both theory and practice, why 64-bit may not be for you: lack of vendor support, having to maintain software packages for both 32- and 64-bit architectures, slower performance/more memory use for simple tasks, etc. I prefer to stay with 32-bit across the board, which also suggests horizontal scaling. If your application benefits from 64-bit computing, then you have a slightly different problem to the one I had, and your mileage will vary.

Some figures

Consider, for example, the 'default' instance for new users, the m1.small:

1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)

This instance costs 8.5c/hour to run.

Side note: With the launch of Canonical's new Ubuntu Server 10.10, they're announcing a "Try Ubuntu Server on our dime" promotion. It's worth noting that they get 1.5c change for that dime. 🙂

The next option up gives you about four times the performance, for about four times the cost. However, you don't get too much insight into what four times "Low" IO performance is, vs "High", and you don't get any redundancy. We decided that we'd rather have two small instances in one AZ, and two in another, to build resilience into our infrastructure for free.

It soon dawned on us that 1 "EC2 Compute Unit", which they claim is currently roughly equivalent to a "1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor", is roughly equivalent to "not good enough for PHP".

The stolen generation

When you use VMware, you get given a virtual CPU, with a speedo that goes from 0 to 100. With Xen (which is the hypervisor used by Amazon EC2), you can be given a certain percentage of the cycles on the parent CPU, but the gauge you see goes up to the number of cycles you are allowed on the parent CPU, not a percentage of a virtual CPU.

The practical upshot of this is that you end up seeing your CPU maxing out at a certain value (for us, around 40%) - but the other 60% of the cycles are stolen from you to feed other, more deserving, instances. This blog post from Axibase neatly sums up the issues, with graphs. You will see stolen CPU cycles in a new column in 'top':

Cpu(s):  1.1%us,  0.3%sy,  0.0%ni, 96.1%id,  0.1%wa,  0.0%hi,  0.0%si,  2.4%st

Not all tools are aware of steal time: you will see stolen ticks in vmstat -s, but not in the tabular vmstat output. You must have Xen-aware tools in order to get this information; Ubuntu provides them out of the box.
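
If you'd rather read the figure directly, the kernel exposes the same counters in /proc/stat, where steal is the eighth value on the aggregate "cpu" line. The sketch below assumes that standard Linux layout; steal_percent is a hypothetical helper name.

```shell
# steal_percent
# Read /proc/stat-style input on stdin and print the share of CPU time
# recorded as steal on the aggregate "cpu" line, as a percentage.
steal_percent() {
    awk '$1 == "cpu" {
        total = 0
        # Fields 2..9: user nice system idle iowait irq softirq steal
        for (i = 2; i <= 9; i++) total += $i
        if (total > 0) printf "%.1f\n", 100 * $9 / total
    }'
}

# Usage: steal_percent < /proc/stat
```

These counters are cumulative since boot, so this gives a long-run average rather than the per-interval figure that top shows; sampling twice and working from the differences would give a current value.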

Thankfully, there happens to be a suitable instance for us. Doubling the price from 8.5c to 17c/hour, we get the c1.medium instance:

1.7 GB memory
5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)

This one is twice the price of the m1.small, but has 5 times the CPU. A worthwhile upgrade, and all of a sudden our Apache benchmarks are back up where we expect them.
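
The arithmetic behind that comparison, using only the prices and compute unit counts quoted above (cents_per_ecu is a hypothetical helper name):

```shell
# cents_per_ecu HOURLY_CENTS COMPUTE_UNITS
# Print the hourly cost of one EC2 Compute Unit, in cents.
cents_per_ecu() {
    awk -v cents="$1" -v ecu="$2" 'BEGIN { printf "%.1f\n", cents / ecu }'
}

cents_per_ecu 8.5 1   # m1.small:  prints 8.5
cents_per_ecu 17 5    # c1.medium: prints 3.4
```

Per unit of CPU, the c1.medium works out at well under half the price of the m1.small, for the same 1.7 GB of memory.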

You might have noticed that both the previous instances have a relatively small 1.7 GB of memory. Want more? You're moving up to 7GB, minimum. If you want to stay with a small application, on a 32-bit platform, the c1.medium instance is about where the line ends. We would love an instance type with 4GB of RAM; if you agree, please make yourself known to Amazon. The more customer demand they get, the more likely they are to implement it.

If we get to the point where it suits us, for example, to run two Size 4 machines, rather than eight Size 1 machines, we may consider moving to larger instances; we would save a little on EBS disks and inter-AZ transfer costs, but then a failure on one machine will mean we lose half of our hosting potential, rather than one eighth.

Planning for growth

You don't need to know all this up-front. If an instance is lacking in resource, upgrade it for a bigger/better one. Right?

Earlier in the history of EC2, you couldn't upgrade an instance, because root disks were on the instance-only, or ephemeral, store. If you step back and think of EC2 as actually being a room full of servers, each machine has some (presumably) local hard disk space. That is the ephemeral ("short-lived"; from the Greek for "one day") store. It's big, and it's free. However, when you turn your instance off, it's wiped.

In contrast, EBS is permanent, paid, network-attached storage (think iSCSI).

Before late 2009, your only option was to turn off your current instance, and then spin up a new one, from your template image (AMI). Then, AWS announced an upgrade, which allows you to boot an instance from an EBS disk. This means you can turn your instance off, and the root file system stays there waiting. You can use the web services to instruct that instance to be bigger, or smaller, when it returns. Because of the obvious usefulness of this, and the relatively low cost of a 10GB root disk, we're running all our instances on an EBS root.

When you upgrade your EBS-root instances, you are causing them to change class, which generally means bringing them up on a new physical host machine.

This means one obvious thing:

Your ephemeral disk (/mnt) is wiped

And two "less obvious" things:

Your internal IP address will change
Your internal IP address will change

Technically speaking that's only one "less obvious" thing, but it's such a big one that I thought it was worth mentioning twice.

If you have an elastic IP address attached to that instance, your external IP address will remain the same. However, your instance is now on a different physical host, with a different physical NIC in its host, so it will get a new internal IP address. For someone who is running a traditional application without cloud-awareness, changing IP can be something which requires thought. For example, if you are upgrading your private DNS server, you will have a problem here. You can't know what the IP address will be before you upgrade, so make very sure you have moved all the services off this machine before you upgrade it. Then, get the new connection details from the console, and reconnect.

As every machine needs an internal IP address, and they are not scarce (Amazon provides them from 10.0.0.0/8, meaning there should be no problem up to about 16 million instances), something that is really missing from EC2 for the "always on" use case we run is static internal IP addresses. Fire up your request e-mails.⁴
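
The "about 16 million" figure is just the size of a /8: 32 address bits minus 8 prefix bits leaves 24 bits for hosts. As a quick check (addresses_in_prefix is a hypothetical helper name):

```shell
# addresses_in_prefix PREFIX_LENGTH
# Print the number of IPv4 addresses in a block with that prefix length.
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}

addresses_in_prefix 8   # prints 16777216
```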

¹ I think I pretty much wrote Rework there!
² Amazon do often add new instance sizes, so the information in this article may one day be superseded.
³ In the case of non-EBS instances in the same AZ, there should be no extra cost.
⁴ I even offer a suggestion on how to implement them: 5 for free per customer, and one more for every reserved instance you buy. Then, they're issued by DHCP to any instance ID they are registered to.


Migrating your servers to Amazon EC2: Initial design considerations

October 1st, 2010

Cloud architecture!

Even without making major changes to your application, you can make Amazon EC2 work for you.

Here are some things that I considered when designing our new setup:

No single point of failure

Any one machine should be able to go down - as Amazon CTO Werner Vogels says, "everything fails, all the time". Guaranteed failure makes you think. The parts of the site that are identified as being most important should be able to run even if an entire datacentre fails.

Thankfully, EC2 makes this simple. Availability Zones (AZs) have been described to me as far enough apart that a disaster at one will not affect the other, but close enough that an engineer can drive between them in a reasonable time.

In my experience, the difference in ping times between our eu-west-1a instances and our eu-west-1b instances is less than 1ms. You do pay a "regional data transfer" rate of $0.01/GB for transfer between instances in different AZs in the same region. At that price, it is cost-effective for us to run the system across two AZs. Our load balancing doesn't care which zone the machines are in, so even if one zone fails, then the site is still reachable.

No wasted cycles

You can turn on a machine and turn it off as you see fit; assuming you have an EBS-root instance (and you should), you only pay for the disk while the machine is off. You can also attach that disk to a more powerful instance, should you have a need for a short-term boost of computing power!

Further to that, if we have a second machine running for failover purposes, it should be serving traffic, so that when we're in our good state, we have twice the performance available to us.

No private networking

Amazon network access is controlled by security groups. Instances are assigned to a security group at startup. You can then do things like say "proxy servers may access web servers on port 80", "the public may access proxy servers on port 443", "my office may access everything on port 22".

While Amazon instances know about security groups, your applications don't. You can't allow access to something from the public Internet, and allow more access to it from a nominated network range, on the same port. I'll touch on this more when talking about security and mail servers later in this series.

Amazon offers a Virtual Private Cloud, which allows you to put more machines behind your firewall via an IPsec VPN. It comes with an important proviso that many first-time readers miss: you can't access a VPC instance directly from the Internet. There's no way to use VPC as a management VPN, but have the instances on the public Internet - unless you want to accept traffic for those instances on your own servers, in which case you should have more redundant network connectivity than Amazon has, and you now pay for traffic in two places.

You can, of course, run a VPN server on your EC2 instances, or you can require your users have a VPN connection to your office, in order to get trusted access to your EC2 servers.

Size your instances as necessary

We started trying to run as many of our instances as we could on the smallest type (the m1.small), and quickly hit its limitations. However, remember that resizing instances isn't difficult. I'll touch on this later as well.

Use the right levels of redundancy

You can get a lot of benefits if you rethink your application and build it with the cloud in mind, but you can still get a great cost saving and a faster application just by treating EC2 as a big VM farm. For example, we're not using S3 at all, and barely using EBS.

Our root disks are on EBS, but our data is mostly replicated across multiple nodes, so using the ephemeral store - which is otherwise wasted - was perfect for us. Why pay extra to store a Mercurial repository, which has to be in sync across four machines, when each other machine already has a consistent copy by default?

Automate everything

You can register your own disk image (AMI) which you can create instances of. By using a combination of configuration management and locally-developed deployment scripts, we haven't yet had the need to do this.

For us, firing up a new instance involves running a script with a wanted hostname and the instance ID we're given when we create it. This will add the machine to the DNS, SSH to it, install Puppet, register with our puppetmaster and install the machine to the current spec. Our machines auto-register with our monitoring servers.

Once something is totally automated, it can be done automatically, as a result of an external stimulus. For example, when your ELB detects a spike of traffic to the site, you can have it auto-scale and create new instances in response. Even if you don't think you need this now, if you design your system right from the beginning, you're well placed to introduce it later.

Employ the principles of structured system management and your EC2 environment will pass the Joel Test for System Administrators.


Clouded House: How we moved our service to Amazon EC2 without it even knowing, and how you can too

September 29th, 2010

At the Symbian Foundation, we run a number of open-source applications and utilities written both in-house and by our member companies to support the Foundation's goals of building and promoting the Symbian platform. We have servers running LAMP¹ applications such as Symfony, Drupal, MediaWiki, Bugzilla and Mercurial. We also run some Tomcat applications such as OpenGrok and Bugzilla Metrics.

When the Foundation was set up we contracted a company to provide hosting services for us, which were provided on VMware ESX, on dedicated servers in a London datacenter. For various reasons, we decided some months ago it was time to part ways with our current hosting provider. This gave us the chance to make some moves we've been wanting to make for some time; a break from the now-unsupported PHP 5.2 to PHP 5.3 for our LAMP applications, a move to Ubuntu for the operating system, introducing some great configuration management and automation tools (more on those later), and removing some closed source components. It also had the side effect of allowing us to repurpose a rather large sum of money!

The goal in moving our infrastructure was to make the move seamless to the end user; I suspect that if you are reading this and you're an active user of Symbian's web applications, you couldn't even tell me what day this move was made. This project was undertaken with only minimal involvement from our web team, so there was no rewriting of the applications to use file storage in S3; we had to provide a system that worked the same as the previous hosting. This was my challenge, and it was a lot of fun.

What is AWS?

In updating the About page for the UK AWS users group, I wrote:

Amazon Web Services (AWS) is a cloud-based infrastructure platform, offering services such as computing capacity, database and storage, available on demand, priced as a utility, and controlled by web services.

People who like hard-to-pronounce acronyms² have invented the new category of infrastructure as a service (IaaS) in order to separate AWS from other "cloud computing" offerings such as PaaS (think Google App Engine) or SaaS (Salesforce.com). Through a web service call, you say "I want you to give me a VM"; it returns you the ID of your VM, and starts charging you for it. The AWS offerings range from the low-level (VMs, disk, static IPs) to the high-level (message queues, MySQL databases).

Amazon invented this category and are the market leader; there are competitors, such as Rackspace Cloud and GoGrid.com, but after some initial experimentation, the decision to go with Amazon was easy to make.

Lock-in isn't too much of a concern when you're just renting infrastructure - you have root access, and you put it all there yourself, so you know you could pick it up and re-implement it anywhere else if you chose. (If we were using S3, it might be a concern.) However, you know you're the winner when you have an open source implementation of your API; the Eucalyptus project implements the AWS APIs and lets you run your own "private cloud" on your own hardware. Eucalyptus powers Canonical's Ubuntu Enterprise Cloud, which is a great way to get started without needing a credit card.

Does my application need the cloud?

If you are building an application from scratch, it's hard to say no to what cloud computing offers. You can scale from one machine all the way up to Farmville. Netcraft, while they're not busy confirming the death of things, claim that 365,000 web sites are hosted on EC2 as of May this year. It's also much better for your cashflow.

If you're migrating an existing application, your application doesn't run in the cloud today, so it obviously doesn't need what the cloud offers. However, rebuilding an application for a cloud environment (as opposed to just picking the disks up and putting them on virtualised hardware) makes you rebuild things in a way that should, if you do it right, make scaling easy.

Even if you don't need the scalability, think of the potential upsides; you can turn a machine on just to do some processing, and then turn it off again when you're finished.

Because we migrated an existing infrastructure, we aren't really using EC2 as anything more than a VPS provider. Albeit, a super-VPS provider where you can turn instances on and off yourself, clone and snapshot them, and scale them up and down at will. Now we have moved into the cloud, should our developers wish, they can use whatever other services take their fancy.

Economics

If you're running an application that runs 24x7, economically, cloud computing isn't necessarily going to be better, as Google's Vijay Gill commented recently. If your business relies on having thousands of servers, then you can probably benefit from the economies of scale of having your own setup, and if it's your core competency, you probably shouldn't outsource that. However, his figures don't consider the cost of reserved instances, which generally drop the per-unit cost of EC2 instances by at least half.

(I suspect Gill would suggest people with smaller needs use Google's PaaS offering, App Engine, which requires rewriting your application, and using either Python or Java. This series suggests that you don't have the ability or remit to do this.)

When I first drafted this article, I wrote "EC2 can't scale as low as Linode's $20/month plan": since then however, Amazon have announced "micro" instances, which cost next-to-nothing (when reserved, about $14/mo). The support is not exactly Fanatical - like many open source things, your support options are best-effort in the forums, or purchasing Premium Support, at 10/20% of your monthly spend.

However, if you believe, as we do, that cloud computing is the future, and you're running a few dozen servers, then even at non-reserved rates, EC2 probably works out cheaper than traditional commercial virtualised hosting, and offers you benefits such as being able to use different, worldwide datacenters, at no extra cost.³

That there is no capital expenditure is a bonus for your finance department, but they will probably balk at having to pay by credit card. If your spend is enough, Amazon will put you on invoice billing; until then, fly under the radar, or convince your finance director that the savings are worth an exception to the policy.

A small note on open source vs proprietary software in the cloud

If your product costs more to use the more servers you run it on, cloud computing may not be for you. This reminds me of a post from Jeff Atwood; scaling out costs a lot more when you have to pay for your operating system or software licenses.

(You can scale up instances - and if you have your root on EBS, you can do it with an existing machine - but the nearer you get to using a full unshared server, the closer Amazon have to charge you to the cost of a full unshared server, and so you may near the point where building your own server is a better deal.)

Stay tuned

Over the next few weeks I will post a series of articles talking about the problems we faced, the challenges we overcame, and how you can take your application or servers and move them into the EC2 cloud. I will also touch on the things that are not-so-great about each component of the Amazon stack, and make suggestions as to how you can work around them, or how Amazon could improve their service. You'll be the first to hear about the brand new file system we've developed for quickly moving files around EC2 instances. In return, I'd love to hear from anyone who has suggestions on how they think we could have done better.

To make sure you get each post as it comes out, subscribe to this blog by RSS, or sign up for updates by e-mail.⁴ You should follow me on Twitter here, but only if you're prepared for short bursts of awesome.

using the extended definition to include PHP, Perl and Python ↩
technically, an initialism, unless you want to get quite rude ↩
Things cost slightly more - around 1c/hour - outside of their largest centre, Washington DC. ↩
If you want to make ultra-sure you only read posts relating to our Amazon migration, and don't accidentally read a band review or something, there's a category for that. ↩

Posted in Amazon Migration, Technical, Work | 4 Comments »

Review: Barenaked Ladies at Hammersmith Apollo, 15 September 2010

September 16th, 2010

For a large period of my life - pretty much from the moment I first saw them live in Auckland in '99¹ - Barenaked Ladies were my favourite band. I can't put my finger on when or why they fell from that position – possibly in the quiet period between the release of their somewhat lacklustre Everything to Everyone and the double-CD-in-two-parts that was Barenaked Ladies are Me/Men in 2006/2007 – or even who replaced them; they just wandered out of my playlist, in the way that bands sometimes do. However, they are still undisputedly the best live band I've ever seen (only Green Day have ever come close), and I will jump at the chance to see them play.

The elephant in the room² is the lack of singer/songwriter Steven Page, who departed the band 18 months ago. Last time we saw the band was in their home town of Toronto in December 2008, two months before the split; I'm glad I got to that show, even if it was a bit Christmas-carol heavy. Since then, after a long break, BNL have returned with All In Good Time, a record firmly at the grown-up end of the spectrum (as you might expect from a band whose last studio output was a children's album called Snacktime).

Last night they brought their All In Good Time tour to the Hammersmith Apollo in London, officially making Barenaked Ladies the third band I have now seen in three different countries.³

While, as expected, there were a number of cuts from both their new album and their hit album Stunt (you know, the one with the "Chikkity China" song on it), the set had a number of tracks from its follow-up Maroon. Older songs were fewer - Old Apartment was the first song where Ed Robertson took Steve's lead vocal, and it's worth noting that drummer Tyler Stewart makes a surprisingly good Ed to Ed's Steve.

Strangely the song where I missed Steve the most was one he didn't even sing - the first single of the new album, You Run Away, which is ostensibly about the circumstances surrounding Steve leaving the band he'd been in for 20 years. The end chorus on the record relies on Ed's double-tracked vocals, and the live version was just crying out for a proper duet.

However, the new arrangement worked brilliantly in other places. I'm the first to admit that the voice of Kevin Hearn (a taller, thinner Turtle from Entourage) isn't generally to my taste. Sound Of Your Voice from Barenaked Ladies Are Me is my favourite BNL song since the Maroon days; Kevin wrote this song, but Steve sings it on the album. Since Steve left, Kevin has taken lead vocal again, and even if Kevin were a more powerful vocalist, Page's are very big shoes to fill, especially on a belter such as this. The current arrangement features Kevin playing acoustic guitar, and the other three clustered around a microphone singing the backing vocals in a "doo-wop" style, complete with clicking fingers and synchronized swaying. It was suitably different and it brought new life back to a great song.

Many of the rest of the new songs were treated as bathroom breakers; by comparison, It's All Been Done almost got some of the audience jumping up and down where the seats should be. The loudest cheers of the evening were reserved for mentions of all things Canadian, no doubt by the gentlemen (and ladies!) all in hockey garb.

A trademark of the BNL experience was the live improvised song/rap, which tonight was about English accents and a kid telling Ed not to steal his bike. (Trust me, they're better in person.) The improvised story in the middle of If I Had A Million Dollars told us that in lieu of him being able to find a park to run around, he was doing laps of the Shepherd's Bush Green, and the descriptive story of his hotel (with "snow room" - more Canadian cheering!) no doubt led some of the more intent fans to camp out in front of it the next morning!

More BNL trademarks include the throwing of underwear (placed on guitars) and Kraft Dinner (thrown from the balcony, and hitting everyone in our vicinity!). Did the "those in the know don't throw" message not get through to those who were super-fan enough to still go through with this?

One of my favourite parts of the old BNL show was the post-Million Dollars medley, a carefully crafted pastiche of current pop songs. I fondly remember the Auckland show, seeing Tyler run up and down the stage singing "Near" and "Far" in the style of Sesame Street, then Steve breaking into "Near, far, wherever you are" from My Heart Will Go On (this was 1999!). From bootlegs I've heard, this was dropped from the set around 2000, but seems to have come back with a vengeance; Kevin's slow start on Oh It's Magic was soon joined by a beatboxing Ed, turning into I Gotta Feeling by the Black Eyed Peas, Baby by Justin Bieber and California Gurls by Katy Perry, and ending in a triumphant Tyler-carrying-Jim moment.

The encore started great, with Tyler-the-drummer down the front and Ed on drums; it not being Christmas, his regular Feliz Navidad seemed a bit unlikely: instead, we were treated to a madcap version of Alcohol.⁴ It all got a little muted from that point on; the new Kevin-led ballad Watching the Northern Lights was treated as another bathroom break, and the finisher Tonight Is The Night I Fell Asleep At The Wheel is a great song, but one I figured Steve would get in the divorce. The new lineup seems freed from the obligation to finish with Brian Wilson every night, although examinations of other setlists suggest they do still play it. (This is not a game I should play, because examinations of other setlists simply make me jealous that they played songs I like on other nights!)

While it's not the same band any more, it's still fun, and I disagree with comments that suggest it's butchering the memory. No more so than having BNL featuring Thin Steve with stylish glasses.

Update: Check out some professional photos of the evening at IES Photography.

I think it was '99; strangely enough I can't find confirmation on the Internet at all! ↩
On the subject of elephants in rooms, I gained another chin eating a diet of hot dogs and poutine in my two years in Canada; I don't mean to generalise, but the room at BNL, on average, appeared to weigh a lot more than other, primarily-English audiences I've been in there. Just sayin'. Eat healthy. ↩
Behind R.E.M. and Crowded House; technically I could include Neil and Tim Finn solo, but I won't. ↩
Bass player Jim Creegan sings lead vocals on one or two songs an album nowadays, and takes live lead on some Steve songs, but didn't get one tonight. ↩

Tags: bnl, music, review
Posted in Personal | No Comments »

OpenID and MindTouch

September 15th, 2010

The below is all outdated - links updated to archive.org for historical interest.

When I was working at Coreworx in Canada I introduced an internal knowledge base in the form of MindTouch (then DekiWiki). I got involved in the community around the project, and ended up meeting the developers behind it in San Diego.

I evaluated it for a project at Symbian, and in order to try and make it suit us better I wrote a module for using OpenID.

Today, MindTouch have published a couple of posts I wrote on the subject, which I am happy to share with you here:

In other news, over the course of the next few weeks, I'll be running a series on the migration of Symbian's LAMP/LTMJ (unfortunately Linux/Tomcat/MySQL/Java doesn't have a vowel in it) hosting servers to Amazon EC2. Stay tuned!

Tags: mindtouch, openid, opensource, programming
Posted in Technical | 2 Comments »

Hat-in-rubber-gloved-hand time

November 20th, 2009

Point, chuckle, and then donate. (Hey, I've got 11 days to go.)

This is my every-three-yearly charity drive. Last time I tried it, my boss made a sizeable donation on the condition I never do it again. (Sorry, Andrew.) Two countries and three years later, I'm at it again - and you should make a donation to prostate cancer research in my honour, because otherwise we just go on not talking about it until Rubber Glove time.

Posted in Personal | No Comments »

How do I change the DHCP subnet for NAT on VMware Fusion 3.0?

November 10th, 2009

There are a couple of helpful blog posts (Nilesh Kapadia and Max Newell deserve a shout-out here) which help you with changing the DHCP settings given to your NAT or host networks on VMware Fusion. However, it all changes in 3.0.

The file you now need to edit is /Library/Application Support/VMware Fusion/networking. In there, you will find these lines:

answer VNET_8_HOSTONLY_SUBNET 192.168.93.0
answer VNET_8_VIRTUAL_ADAPTER_ADDR 192.168.93.1

I believe the third octet (the 93 part) is selected randomly when you install; in any case, I wanted to give out addresses on 192.168.227.0/24, so I changed the configuration like so:

answer VNET_8_HOSTONLY_SUBNET 192.168.227.0
answer VNET_8_VIRTUAL_ADAPTER_ADDR 192.168.227.1

and restarted the network interfaces:

sudo "/Library/Application Support/VMware Fusion/boot.sh" --restart
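If you'd rather not edit the file by hand, the change above can be sketched as a small shell function. This is only a sketch under a couple of assumptions: set_vmnet8_subnet is a name I've made up for illustration, and it assumes the two VNET_8 lines look exactly like the ones shown above. It keeps a .bak copy beside the original and avoids sed -i, whose syntax differs between the BSD sed on a Mac and GNU sed.

```shell
# Hypothetical helper (not part of VMware Fusion): point the vmnet8 NAT
# network at a new /24.
# Usage: set_vmnet8_subnet FILE PREFIX
#   FILE   - path to the networking file
#   PREFIX - first three octets of the new subnet, e.g. 192.168.227
set_vmnet8_subnet() {
    file="$1"
    prefix="$2"

    # Keep a backup next to the original before touching anything.
    cp "$file" "$file.bak" || return 1

    # Read from the backup and write over the original in place, so the
    # file keeps its owner and permissions. Only the two VNET_8 lines
    # shown in the post are rewritten; every other line passes through.
    sed -e "s/^\(answer VNET_8_HOSTONLY_SUBNET\) .*/\1 $prefix.0/" \
        -e "s/^\(answer VNET_8_VIRTUAL_ADAPTER_ADDR\) .*/\1 $prefix.1/" \
        "$file.bak" > "$file"
}
```

Try it against a copy of the file first. The real file lives under /Library, so you will most likely need to run it as root, and you still need the boot.sh restart shown above afterwards.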

Now, make a note of the MAC address of your virtual network adapter in your guest OS, and you can assign an entry in the dhcpd.conf file (/Library/Application Support/VMware Fusion/vmnet8/dhcpd.conf). Make sure you do it outside of the area that is marked "this will be overwritten"!

host developer-vm {
    hardware ethernet 00:0c:29:cb:dd:72;
    fixed-address 192.168.227.128;
}

and another service restart.
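
If you have more than one guest to pin, typing stanzas by hand gets old. Here is a rough sketch of a generator; dhcp_host_entry is a hypothetical name of my own, and the only checking it does is on the shape of the MAC address, so it is still up to you to pick an address inside your subnet.

```shell
# Hypothetical helper: print a dhcpd.conf host stanza for one guest.
# Usage: dhcp_host_entry NAME MAC IP
dhcp_host_entry() {
    name="$1"
    mac="$2"
    ip="$3"

    # Reject anything that is not six colon-separated hex pairs.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "invalid MAC address: $mac" >&2
        return 1
    fi

    # Same layout as the hand-written entry above.
    printf 'host %s {\n    hardware ethernet %s;\n    fixed-address %s;\n}\n' \
        "$name" "$mac" "$ip"
}
```

Running dhcp_host_entry developer-vm 00:0c:29:cb:dd:72 192.168.227.128 prints the same stanza as the example above. Paste the output into dhcpd.conf, outside the area marked as overwritten, rather than appending blindly.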

Tags: dhcp, nat, vmware, vmware fusion
Posted in Technical, Work | 4 Comments »

T minus 5 days

October 7th, 2009

Five days from now we'll be in the air, on the way to Chicago (which is actually in exactly the wrong direction), and then on to London.

(Did I tell you we were moving to London? Oh well, now you know!)

Before then, we have to:

Sell our remaining stuff
Have people who have bought stuff and not collected it, collect it
Put boxed stuff on a boat
Visit our "Canadian family" for Thanksgiving
Fill in the holes in the walls, even though we didn't put them there
Lots of cleaning.

Last Saturday we saw Russell Peters - perhaps not the "world's greatest comedian", as the intro announcer suggested, but definitely a very funny guy. Half his act is race jokes (Peters is Indian, which pretty much lets him riff on whatever racial group or stereotype he wants), and the other half is embarrassing the front row, especially couples of mixed ethnicity.

Sunday we had a lovely meal with Fern's old workmates from Manulife Financial, including no less than three different types of dessert. I baked brownies. By which, I mean "Cindy gave us brownie mix a year and a half ago; we bought a brownie pan six months later, and since we're leaving in a week, we should use both". There is only so much you can get wrong in "mix water, oil, an egg, and this packet of powder"; I think it would have been nicer with standard vegetable oil instead of extra-virgin olive oil.

Talking of stand-up comedy, "extra-virgin" is something George Carlin would have had at:

That's another complaint of mine - too much use of this prefix "pre". It's all over the language now — "pre"-this, "pre"-that, place the turkey in a "pre-heated" oven. It's ridiculous! There are only two states an oven can possibly exist in: Heated or unheated! "Pre-heated" is a meaningless fucking term! It's like "pre-recorded" — "This program was pre-recorded." Well, of course it was pre-recorded! When else are you gonna record it, afterwards?

Tags: travel
Posted in Personal | No Comments »