The problem

A while back I scored some very cheap “lifetime” VPSs from a cloud VPS provider. Pay once, keep it forever. Sounds too good to be true. The specs were roughly equivalent to a Raspberry Pi and the cost was a one-time fee of $14 after a promotional coupon. I’ve wasted money on much worse things, so I picked up a few to play with; I actually picked up one of these for $7.50 on a super discount day.

For a while these boxes were used solely as a lab and testbed. That was until I decided that I simply couldn’t stand to pay for a dedicated host anymore. Upon trying to move my site over to one of these boxes, I quickly noticed how unreliable the VPSs were. I was having to blow away and re-deploy the VPS (and my site) fairly often. This wasn’t a problem when it was all a testbed. The workflow was something along the lines of: create VPS, set up something, test/break/fix it, destroy VPS, then wash, rinse, and repeat. Now it’s back to the age-old problem of fighting entropy to make a site HA.

The solution

Before I get to the solution I decided on, I do want to acknowledge that there are quite a few ways to solve this problem depending on what kind of site you have. AWS Lambda + S3, Dropbox, or HAProxy are all things that can be used. Results could be rock solid because of AWS' SLAs. I can’t really speak to costs, but there are free tiers, and if your site stays within them you essentially run for free. So why proceed? Mostly for fun, as an experiment. Other reasons include moving my domain away from a provider that supported SOPA and not wanting to make my site fully static.

So what did I choose? In keeping with the spirit of cheap, I settled on using two VPSs along with AWS::Route 53 weighted resource records and associated health checks.

The basic concept is now:

  • 2 VPSs serving exactly the same copy of my site
  • An HTTP health check for each public IP of the VPSs pointing to the homepage of my site
  • Weighted A-records from ryanmiguel.com and www.ryanmiguel.com to the public IPs of each VPS
    • I have 4 here. 1 record for each DNS name/public IP combination
    • These are attached to their respective health checks and given even weights
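
The record layout above can be sketched as a Route 53 change batch. This is only a rough sketch under assumptions, not the exact setup running here: the IP addresses are documentation-range placeholders, the health check IDs are hypothetical, and the 60-second TTL and weight of 10 are made-up values. The function just builds a dictionary; with boto3 it would be handed to change_resource_record_sets as the ChangeBatch argument.

```python
from itertools import product


def build_weighted_records(names, nodes, ttl=60, weight=10):
    """Build a Route 53 ChangeBatch with one weighted A record per name/IP pair.

    nodes maps a node label to a (public_ip, health_check_id) tuple.
    """
    changes = []
    for name, (label, (ip, health_check_id)) in product(names, nodes.items()):
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": name,
                "Type": "A",
                # Must be unique among records sharing the same name and type.
                "SetIdentifier": f"{label}-{name}",
                "Weight": weight,  # even weights: same value on every record
                "TTL": ttl,
                "ResourceRecords": [{"Value": ip}],
                # Ties the record to its node's health check.
                "HealthCheckId": health_check_id,
            },
        })
    return {"Comment": "weighted A records for two VPSs", "Changes": changes}


# Hypothetical values: placeholder IPs and health check IDs.
NODES = {
    "vps1": ("203.0.113.10", "hc-id-for-vps1"),
    "vps2": ("198.51.100.20", "hc-id-for-vps2"),
}

batch = build_weighted_records(["ryanmiguel.com.", "www.ryanmiguel.com."], NODES)
print(len(batch["Changes"]))  # 4 records: 2 names x 2 public IPs
```

Keeping the weights identical is what makes the split even; when a health check fails, Route 53 stops returning the records attached to it.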

Cost Analysis

Total cost is about $3.40 monthly if you include the domain registration, or about $4.00 if you also amortize the cost of the two VPSs over 4 years.

I also want to note that it is possible to get the VPSs on super deals for less than $10.

| Item | Qty | Cost type | Unit cost |
|------|-----|-----------|-----------|
| VPS | 2 | One time! | $14 |
| Domain registration | 1 | Yearly | $12 |
| Amazon Route 53 Health-Check-Non-AWS | 2 | Monthly | $0.75 |
| Amazon Route 53 HostedZone | 1 | Monthly | $0.50 |
| Amazon Route 53 DNS-Queries | 1 | Monthly | $0.40 per 1,000,000 queries for the first 1 billion queries |
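
For anyone who wants to check the math, here is a quick sketch of how the monthly totals fall out of the table. It assumes the listed prices are per unit and that DNS traffic stays within a single 1,000,000-query block per month.

```python
# Unit prices taken from the table; quantities as listed there.
HEALTH_CHECK_MONTHLY = 0.75   # per non-AWS health check
HEALTH_CHECK_COUNT = 2
HOSTED_ZONE_MONTHLY = 0.50
DNS_QUERIES_MONTHLY = 0.40    # assumes at most 1,000,000 queries a month
DOMAIN_YEARLY = 12.00
VPS_ONE_TIME = 14.00
VPS_COUNT = 2
AMORTIZE_MONTHS = 4 * 12      # spread the one-time VPS cost over 4 years

route53_monthly = (
    HEALTH_CHECK_COUNT * HEALTH_CHECK_MONTHLY
    + HOSTED_ZONE_MONTHLY
    + DNS_QUERIES_MONTHLY
)
with_domain = route53_monthly + DOMAIN_YEARLY / 12
with_vps = with_domain + VPS_COUNT * VPS_ONE_TIME / AMORTIZE_MONTHS

print(f"Route 53 only:        ${route53_monthly:.2f}")  # $2.40
print(f"Plus domain:          ${with_domain:.2f}")      # $3.40
print(f"Plus amortized VPSs:  ${with_vps:.2f}")         # $3.98
```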

HA Analysis

In order to figure out the site’s availability, I decided to use an external monitor. CloudWatch would have done nicely, but I don’t feel like paying extra monthly fees for metrics and dashboards. I also want per-minute stats as opposed to 5-minute intervals, which prevents me from using the free tier of services like UptimeRobot. I settled on using Icinga on a RasPi since I already had that running.
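
To give an idea of what a per-minute HTTP check boils down to, here is a minimal sketch of a check script in the style of a Nagios/Icinga plugin (exit code 0 for OK, 1 for WARNING, 2 for CRITICAL). It is not the check I actually run; the warning and critical thresholds are assumed values, and the URL is whatever gets passed on the command line.

```python
import sys
import time
import urllib.error
import urllib.request

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify(status, elapsed, warn_after=1.0, crit_after=5.0):
    """Map an HTTP status and response time (seconds) to a plugin state.

    status is None when no HTTP response was received at all.
    The thresholds are assumptions, not recommended values.
    """
    if status is None or status >= 400 or elapsed >= crit_after:
        return CRITICAL
    if elapsed >= warn_after:
        return WARNING
    return OK


def check_http(url, timeout=10.0):
    """Fetch url once and return (status_or_None, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code      # server answered, but with an error status
    except OSError:
        status = None          # DNS failure, refused connection, timeout, ...
    return status, time.monotonic() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    status, elapsed = check_http(sys.argv[1])
    state = classify(status, elapsed)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[state]
    # Text after the pipe is performance data in the plugin output format.
    print(f"HTTP {label} - status={status} | time={elapsed:.3f}s")
    sys.exit(state)
```

Scheduling something like this once a minute from the monitoring host is what yields the per-minute response time data.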

So far, since I put this into effect, I’ve had only one node go “down” (unresponsive), which resulted in all traffic being directed to the healthy node. You can see the last week’s worth of HTTP response times up top, but please check back later to see my final uptime reports after the year is up.

Pitfalls

This solution is far from perfect. In fact, that is why I say semi-highly available rather than just highly-available. Issues include:

  • Reliance on DNS changes/updates. If a node fails, there will inevitably be a time period where it is still returned in a DNS query even though it is unhealthy. Low TTLs help reduce the exposure here, but there is no visibility into how many clients were affected during the cutover time.
  • Lack of proper load balancing. While you can implement a weighted balance, this fails to account for any real load metric such as the number of clients connected.
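
To put a rough number on the DNS exposure, here is a back-of-the-envelope sketch. The inputs are assumptions rather than my actual settings: a 30-second health check interval, a failure threshold of 3 consecutive failed checks, and a 60-second TTL. It also assumes resolvers honor the TTL, which not all of them do.

```python
def stale_window_seconds(request_interval, failure_threshold, ttl):
    """Rough estimate of how long a dead node can still be handed out.

    Time for the health check to declare the node unhealthy, plus one full
    TTL for answers that resolvers cached just before that point.
    """
    return request_interval * failure_threshold + ttl


def share_of_answers_for_node(weights, node):
    """Fraction of DNS answers that point at node while it is still in rotation."""
    return weights[node] / sum(weights.values())


# Assumed values, not measured ones.
window = stale_window_seconds(request_interval=30, failure_threshold=3, ttl=60)
share = share_of_answers_for_node({"vps1": 10, "vps2": 10}, "vps1")

print(window)  # 150 seconds
print(share)   # 0.5, i.e. about half of lookups during the detection period
```

The weights only control how answers are split between records; nothing in this calculation knows how busy either node actually is, which is the second pitfall above.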