A failover system is one in which the service "fails over" to a backup when a problem is detected. It does not balance anything, least of all load.
I think the point Sean was trying to make was that you may need to arrange for failover for the load balancer itself. At least you would if you think it's important to provide redundancy in whatever load balancing setup you implement.
I'm highly interested in any real-world, in-production load-balanced or failover systems for Zope (esp. using Open Source software).
I have moderate experience with LVS + Mon in combination with Squid and Zope. A single Squid handles HTTP requests from clients on the Internet. It talks to a TCP port on the load balancer. LVS, as the load balancer, provides the balancing service between a number of Zopes. Mon provides the ability to remove failed Zopes from the load-balancing rotation by polling, often against a method that returns a known response.

An extension of this configuration which I've not implemented yet (but will need to very soon) will be to put a separate load balancer in front of a number of ICP-connected Squids, each configured with the virtual IP on the LVS box as its http_accel port. I need to do this to be able to scale cache services and provide redundancy in cache services, instead of having a single Squid frontending the whole shooting match.
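For concreteness, the LVS side of a setup like that might look roughly like the following. This is a sketch only: the VIP (10.0.0.1) and the Zope real-server addresses (10.0.0.11/12) are made up, and the mon alert script names are hypothetical placeholders for whatever site-specific scripts you wire up.

```shell
# Create a virtual service on the VIP that Squid talks to, using
# round-robin scheduling, then register two Zope real servers behind
# it via NAT/masquerading.
ipvsadm -A -t 10.0.0.1:8080 -s rr
ipvsadm -a -t 10.0.0.1:8080 -r 10.0.0.11:8080 -m
ipvsadm -a -t 10.0.0.1:8080 -r 10.0.0.12:8080 -m

# Mon's job is then to poll each Zope (e.g. with its http.monitor
# against a method returning a known response) and, on failure, run an
# alert script that pulls the dead server out of rotation:
#
#   ipvsadm -d -t 10.0.0.1:8080 -r 10.0.0.11:8080
#
# with a matching "upalert" script that re-adds it when it recovers.
```

These commands need root and the IPVS kernel support on the director box, so treat them as a configuration fragment rather than something to paste blindly.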
Is there a way to use URL rewriting rules in Apache (with mod_rewrite) to test whether a particular box is alive, and only if so, direct traffic there? Maybe have it check whether a particular file exists (or some such)?
If you find out, please let me know, that sounds very useful!
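For what it's worth, mod_rewrite's RewriteCond can test a literal path with the -f (file exists) check, so something along these lines might do what's being asked. An untested sketch: the flag-file path and backend hostnames are invented, some external monitoring script would have to create and remove the flag file, and mod_proxy must be loaded for the [P] flag to work.

```
# If the "alive" flag file for backend1 exists, proxy requests to it.
RewriteEngine On
RewriteCond /var/run/backend1.alive -f
RewriteRule ^/(.*)$ http://backend1.example.com/$1 [P,L]
# Otherwise fall through to a backup host.
RewriteRule ^/(.*)$ http://backend2.example.com/$1 [P,L]
```

The weak point is that Apache itself never probes the backend; it only trusts the flag file, so the health check is only as fresh as whatever updates that file.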
(Also note that Apache will do much of what Squid will do using mod_proxy.)
mod_proxy isn't very well documented and doesn't do ICP (which is pretty handy for scaling the cache). But it's fine for small setups. I imagine you could even share its cache directory over NFS if you wanted some failover capability without losing all that was cached to disk. - C