customink / is-it-up

A Ruby gem for adding a simple endpoint to see if your application "is up".

Checking for a Maintenance File

clumpidy opened this issue · comments

We frequently need to take a specific server out of production to patch, reboot, or do other maintenance. At the moment we accomplish this in different ways depending on the type of load balancer we're using. It would be awesome if is_it_up would check for the existence of some sort of maintenance file and return a 503 if the file exists. Then regardless of what type of load balancer we were using we could take a server out of production just by creating the file.

I wonder if something like this could be useful: https://github.com/biola/turnout

That looks like it puts an application into maintenance mode. I want the server to continue to process requests, but for the load balancer to stop sending traffic there. That way there isn't a disruption. Existing requests finish, but no new ones come in (except for monitoring and testing).

The maintenance file seems like a decent approach.

  1. Where would the maintenance file live? Maybe a few places could be checked: /tmp/, the application's root directory (e.g. Rack::Directory.new('').root), and the tmp/ directory under the app root?
  2. What should the file be named? Maybe the name is arbitrary while the file extension matters?
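To make the two questions above concrete, here is a rough sketch of a lookup that walks a list of candidate directories and matches on file extension rather than file name. Everything in it is an assumption for illustration: the method name `find_maintenance_file`, the `.maintenance` extension, and the directory list are hypothetical and not part of is_it_up.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: returns the path of the first file with the given
# extension found in any candidate directory, or nil if there is none.
def find_maintenance_file(directories, extension: ".maintenance")
  directories.each do |dir|
    next unless File.directory?(dir)

    # The file name is arbitrary; only the extension is matched.
    match = Dir.glob(File.join(dir, "*#{extension}")).sort.first
    return match if match
  end
  nil
end

# Demo using throwaway directories that stand in for the candidate locations.
Dir.mktmpdir do |root|
  candidates = [File.join(root, "shared"), root, File.join(root, "tmp")]
  FileUtils.mkdir_p(candidates)

  puts find_maintenance_file(candidates).inspect # nothing created yet

  FileUtils.touch(File.join(root, "tmp", "patching.maintenance"))
  puts find_maintenance_file(candidates)
end
```

Checking directories in a fixed order means the first match wins, so the order of the list doubles as a priority order.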

I won’t be available to work on this until after 8/11, but PRs are welcome.

We had a quick discussion around this. I don't think we want it in /tmp because that gets cleared on reboot, so a server would end up back in production after a reboot.

Putting it in the application's tmp directory would be a problem for servers running more than one app.

So we could have Chef create a /usr/local/maintenance directory and then we'll drop a maintenance.html page in there if the server should be in maintenance.

If Chef is going to be managing the fact that a server is not to be routed to, would an ENV variable work?

Wouldn't the app have to be restarted if the ENV variable changed?

Yes, I suppose you are right.

Since Rails and Passenger (I think) have sort of a maintenance page support already and you aren't putting the application into maintenance, I'm hesitant to implement this using that pattern. That is why I'm fishing for ideas. It might get confusing to step on that standard practice.

I also think that this functionality might be overstepping the bounds of what this Gem was intended to do: Report that the app is "up".

It also feels weird to me for an individual app node to be responsible for its own inclusion in the load balancer. Is there not a better way to handle this at the ELB level? Should an app really care about whether it's getting load balanced or not?

We actually use a number of different load balancers (ELBs, Varnish, and ldirectord). One thing they have in common is that they monitor a given URL for a 200 response code to tell if a server is healthy and should receive traffic. So by switching that response code we can disable traffic to a server regardless of which load balancer we're using.
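As a rough illustration of switching the response code, here is a minimal Rack-style object that answers 200 normally and 503 while a maintenance file exists. It is a sketch under assumptions, not how is_it_up is implemented: the class name `MaintenanceAwareCheck`, the response bodies, and the file name used in the demo are made up.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical Rack-style app: 200 normally, 503 while the maintenance file exists.
class MaintenanceAwareCheck
  def initialize(maintenance_file)
    @maintenance_file = maintenance_file
  end

  # The file is checked on every request, so creating or deleting it takes
  # effect immediately, without restarting the application.
  def call(_env)
    if File.exist?(@maintenance_file)
      [503, { "content-type" => "text/plain" }, ["MAINTENANCE"]]
    else
      [200, { "content-type" => "text/plain" }, ["OK"]]
    end
  end
end

# Demo: call the app the way a load balancer probe would hit it.
Dir.mktmpdir do |dir|
  flag = File.join(dir, "maintenance.html")
  app = MaintenanceAwareCheck.new(flag)

  puts app.call({}).first # prints 200, server stays in rotation

  FileUtils.touch(flag)
  puts app.call({}).first # prints 503, the load balancer would stop routing here
end
```

Because only the monitored URL changes its answer, the rest of the application keeps serving in-flight and test requests while the node drains.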

I actually think this aligns with the purpose of this gem, which was to give a URL that load balancers could monitor to decide whether traffic should be sent to a server.

503 Service Unavailable
The server is currently unavailable (because it is overloaded or down for maintenance). Generally, this is a temporary state.

(Wikipedia)

So, I agree that responding with a 503 aligns with the purpose of this gem, and sounds like it’ll solve a problem WebOps has. I would like the maintenance file path and file name to be configurable in some fashion, though, instead of being hardcoded. That way anyone using this public gem can decide where the file should reside.
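One possible shape for that configurability is a small settings object with overridable defaults. This is purely a sketch: the `HealthCheckSketch` module, its `configure` block, and the attribute names are hypothetical and do not exist in is_it_up; the default directory and file name are simply the ones floated earlier in this thread.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical configuration sketch; none of these names exist in is_it_up.
module HealthCheckSketch
  Config = Struct.new(:maintenance_dir, :maintenance_filename) do
    def maintenance_path
      File.join(maintenance_dir, maintenance_filename)
    end
  end

  # Defaults taken from the directory and file name suggested in this thread.
  def self.config
    @config ||= Config.new("/usr/local/maintenance", "maintenance.html")
  end

  def self.configure
    yield config
  end

  # Status code a health endpoint could return for the configured path.
  def self.status
    File.exist?(config.maintenance_path) ? 503 : 200
  end
end

# Demo: point the check at a throwaway directory and a custom file name.
Dir.mktmpdir do |dir|
  HealthCheckSketch.configure do |c|
    c.maintenance_dir = dir
    c.maintenance_filename = "down.txt"
  end

  puts HealthCheckSketch.status # prints 200
  FileUtils.touch(HealthCheckSketch.config.maintenance_path)
  puts HealthCheckSketch.status # prints 503
end
```

Keeping directory and file name as separate settings lets one server-wide directory be shared while each consumer of the gem picks its own file name.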

Also, PRs are welcome if anyone has the time to spare.