Clustering an Amazon Elastic IP address

If you have a problem that Amazon's Elastic Load Balancing can't solve, you might want to do the old fashioned "two machine IP failover" cluster.

Amazon instances only have one internal, and one external, IP address at a time. Consider this:

Instance 1: 256.256.256.4 [Elastic IP]
Instance 2: 257.257.257.8

If you claim the elastic IP on instance 2, then a new IP will be allocated to instance 1:

Instance 1: ¿?
Instance 2: 256.256.256.4 [Elastic IP]

You won't know what it is unless you query the web services, or look at the console, for instance 1. Be sure you are aware of the implications of this before proceeding.

I found a forum post from Alex Polvi which, with some tidying, does the job nicely. When the slave node realises that its master mate has gone offline, it will claim the IP address; when the master returns, you can have the master claim it back, or you can have the slave just become the new master.

Claiming the shared/elastic IP

Your script needs a command that the master machine can call to claim the elastic IP address. Alex's example uses Tim Kay's 'aws' script, which doesn't require Java like the official Amazon ec2-utils.

You need /root/.awssecret to contain the Access Key ID on the first line and the Secret Access Key on the second line:

AK47QWERTY7890ASDFG0H
01mM4Rkl4RmArkLArmaRK14rM4rkL4MarKLar

You can now test this:

$ export AWS_PARAMS="--region=eu-west-1"
$ export ELASTIC_IP=256.256.256.4
$ export MY_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
$ aws $AWS_PARAMS associate-address "$ELASTIC_IP" -i "$MY_ID"

The MY_ID command uses the instance data service to get the instance ID for the machine you're running on, so you can use this script, unedited, on both machines.

This should claim the IP 256.256.256.4 for the instance on which the script is run.

In order for Heartbeat to be able to use this script, we need a simple init script. When run with 'start' it should claim the IP, and when run with 'stop' it should relinquish it. You will need to edit the parameters at the top (or better yet, put them in /etc/default/elastic-ip and source that in your file). Remember to ensure this script is executable.

/etc/init.d/elastic-ip

#!/bin/bash
DESC="elastic-ip remapper"
MY_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
ELASTIC_IP="256.256.256.4"
AWS_PARAMS="--region=eu-west-1"

if ! [ -f ~/.awssecret ] && ! [ -f /root/.awssecret ]; then
    echo "$DESC: cannot find ~/.awssecret or /root/.awssecret"
    exit 1
fi

case $1 in
    start)
        aws $AWS_PARAMS associate-address "$ELASTIC_IP" -i "$MY_ID" > /dev/null
        [ $? -eq 0 ] && echo $DESC: IP $ELASTIC_IP associated with $MY_ID || echo $DESC: Could not map IP $ELASTIC_IP to $MY_ID
        ;;
    stop)
        aws $AWS_PARMAS disassociate-address "$ELASTIC_IP" > /dev/null
        [ $? -eq 0 ] && echo $DESC: IP $ELASTIC_IP disowned || echo $DESC: Could not disown $ELASTIC_IP
        ;;
    status)
        aws $AWS_PARAMS describe-addresses | grep "$ELASTIC_IP" | grep "$MY_ID" > /dev/null
        # grep will return true if this ip is mapped to this instance
        [ $? -eq 0 ] && echo $DESC: I have $ELASTIC_IP || echo $DESC: I do not have $ELASTIC_IP
        ;;
esac

Heartbeat

Each server needs the heartbeat package installed:

$ apt-get install heartbeat

Allow heartbeat traffic between your instances:

$ ec2-authorize $group -P udp -p 694 -u $YOURUSERID -o $group # heartbeat

Heartbeat is configured by three files, all in /etc/ha.d, and in our case, all identical on both servers:

authkeys

auth 1
1 sha1 foobarbaz

The authkeys page on the heartbeat wiki offers a script to help generate a key.

ha.cf

# Log to syslog as facility "daemon"
logfacility daemon 

# List of cluster members by short hostname (uname -n)
node server1 server2

# Send one heartbeat each second
keepalive 1 

# Declare nodes dead after 10 seconds
deadtime 10 

# internal IP of the peer
ucast eth0 10.256.256.4
ucast eth0 10.257.257.8

# Fail back, so we're normally running on the primary server
auto_failback on

All pretty self-explanatory: set your own 'node' and 'ucast' entries with your hostnames and internal IP addresses. Even when the external IPs are bouncing around, the internal IPs should stay the same. auto_failback is optional, as mentioned above. Read the docs for more options.

haresources

server1 elastic-ip

Here, we set up a link between the primary server (server1) and the script we want to run (elastic-ip). The wiki shows you what else you can do.

Putting it all together

Start heartbeat on both nodes, and server1 should claim the IP address. Stop heartbeat on server1 (or if server1 crashes), and server2 will notice after 10 seconds and claim the IP address. As soon as server1 is back up, it should claim it back too. You can run /etc/init.d/elastic-ip status to prove this:

server1:~$ sudo /etc/init.d/elastic-ip status
elastic-ip remapper: I have 256.256.256.4
server2:~$ sudo /etc/init.d/elastic-ip status
elastic-ip remapper: I do not have 256.256.256.4

Whatever happens, your elastic IP will always point to a good instance!

Postscript: what Heartbeat will not do

Heartbeat will notice if a server goes away, and claim the IP. However, it will not notice if a service stops running but the machine stays alive. Your good work may all be for nothing!

To solve this, I suggest monit, or if you're a ruby fan, bluepill. These will monitor a service, and restart it if it is not responding.

This entry was posted on Wednesday, October 27th, 2010 at 1:35 pm and is filed under Amazon Migration, Technical, Work. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

One Response to “Clustering an Amazon Elastic IP address”

R.I.Pienaar says:

November 20, 2010 at 11:22 am

If you use the newer heartbeats you can get service monitoring as well.

Reply

Off the record - Craig Box