Ok, I've got some thoughts. Mainly, I think there are times when a cluster tier should manage its own availability, and times when it should be managed by a forward tier...
Derek Writes:>>>>> Is there a way to use URL Rewriting rules in Apache (with mod_rewrite) to test if a particular box was alive, and only if so, direct traffic there? Maybe have it look if a particular file exists (or some such)? (Also note that Apache will do much of what Squid will do using mod_proxy.) <<<<<
Squid is more flexible here, since it uses an external redirector script that can be written in any language you want; a Squid redirector can be as little as 4-5 lines of Python code, or an elaborate C program. To do this in Apache, I believe you need to write a Perl module to control mod_rewrite (a Squid redirector plugin is easier).
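As a rough illustration of how small such a redirector can be, here is a sketch in Python. It assumes the classic Squid redirector protocol (one request per line on stdin with the URL as the first field; the helper answers with a rewritten URL, or a blank line to leave the URL alone). Newer Squid releases may expect a different reply format, so check the documentation for your version. The backend host names and flag-file paths are made up, and the "is this box alive?" test is simply Derek's "does a particular file exist" idea, assuming some monitor keeps that file in place while the box is healthy.

```python
import os
from urllib.parse import urlsplit, urlunsplit

# Hypothetical backends: (host:port, flag file that a monitor is assumed
# to keep in place for as long as that box is healthy).
BACKENDS = [
    ("zope1.example.com:8080", "/var/run/cluster/zope1.alive"),
    ("zope2.example.com:8080", "/var/run/cluster/zope2.alive"),
]


def first_alive(backends, is_alive=os.path.exists):
    """Return the first backend whose flag file passes the liveness test."""
    for host, flag_file in backends:
        if is_alive(flag_file):
            return host
    return None


def rewrite_line(line, backends, is_alive=os.path.exists):
    """Turn one redirector request line into one reply line."""
    fields = line.split()
    if not fields:
        return ""
    host = first_alive(backends, is_alive)
    if host is None:
        return ""  # blank reply: leave the URL unchanged
    parts = urlsplit(fields[0])
    # Keep scheme, path and query; swap in the live backend's host:port.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def serve(stdin, stdout, backends=BACKENDS):
    """Answer requests line by line until Squid closes the pipe."""
    for line in stdin:
        stdout.write(rewrite_line(line, backends) + "\n")
        stdout.flush()  # Squid waits for each answer, so never buffer
```

To run it as an actual helper you would call `serve(sys.stdin, sys.stdout)` at the bottom of the script and point Squid's redirector setting at it. The liveness test is passed in as a function so that a TCP connect or an HTTP probe could be swapped in for the flag-file check.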
Derek Writes:>>>>> I.P. address take-overs are dangerous. What if the Zope processes die, but the O.S. is okay? You'll have an I.P. address conflict unless you can run a script on the primary box that tells it to shut down its network interface. So what if the hardware locks up/loses all resources/gets into a loop of some kind? The NIC will still respond to its I.P. address, but you can't run the script to disable it. Bad situation--pray you have a watchdog card for those Zope processes. <<<<<
No, you won't have an IP address conflict. IP address takeover via gratuitous/unsolicited ARP prevents an IP conflict, and it's a standard documented in RFCs that switch vendors are supposed to obey. If a node running monitoring software is acting as a backup for a failed node and sees that Zope has died on its peer, it will initiate a takeover with the clustering software. Once this takeover happens, the NIC will NOT respond to its own IP address, because the switch will NOT be sending Ethernet frames to it in the first place, because switches and hosts keep ARP tables. I promise you, this stuff works.
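For anyone curious what the takeover announcement looks like on the wire, here is a sketch that builds a gratuitous ARP frame (request form) in Python. It only constructs the bytes; actually sending them needs a raw socket and root privileges, and in practice the clustering software does this for you. The MAC and IP addresses in the usage line are made-up examples.

```python
import socket
import struct


def gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    The sender announces "ip is at mac" to everyone on the segment:
    sender IP and target IP are both the address being taken over.
    """
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    ether = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), 6-byte MAC, 4-byte IP, request (1).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw + addr          # sender hardware / protocol address
    arp += b"\x00" * 6 + addr  # target hardware unknown, target IP = our IP

    return ether + arp


frame = gratuitous_arp("00:11:22:33:44:55", "192.168.0.10")
```

Machines that receive this frame and already hold an ARP entry for the address replace the old MAC with the new one, which is why traffic stops reaching the failed node's NIC.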
Derek Writes:>>>>> MAC address takeovers are somewhat dangerous, because the switch that you are connected into (such as at a Data Center) may not recognize the MAC address takeover if the NIC on the primary box is still responding (as above). <<<<<
Some switches behave better than others in this regard. IP address takeover is easier to deal with than MAC address takeover, and isn't so picky about hardware. Fortunately, most open source clustering software uses IP takeover, not MAC takeover.
Derek Writes:>>>>> I prefer solutions that keep all nodes (primary, backup, or any peer nodes) behind a NAT. Each node gets its own 192.168.0.x I.P. address, and the NAT box does all failover. You've now moved the I.P. takeover problem to the NAT box (with its backup), but since NAT is in the kernel (under Linux, at least) you'd be hard-pressed to find a NAT box that could respond to an ICMP or serial-port ping but not do NAT. If the kernel is running, it's running, and if it's not, it's not. <<<<<
NAT isn't as flexible as proxying, though I imagine it might be faster in some cases; with either one you can bridge to a private network. Perhaps the problem of reliability of many web nodes in a cluster is best dealt with at the NAT/proxy/L7 switch level, but there is also validity in IP address takeover. The one disadvantage of IP address takeover, though, is that a backup server in a load-balanced arrangement will take on twice the load. Where IP takeover mechanisms might be more appropriate is in 2-box clusters for things like db/file/proxy servers.

Toby's ICP patch looks __really, really cool__. My next setup is likely to use IP takeover clustering on a pair of Squid proxy servers, which themselves load-balance ZEO client nodes using ICP, with the backend storage (file/ODB/RDB) tier as a pair of storage servers also using IP takeover. The reliability of the web server nodes would be dealt with by Squid, thanks to ICP, which would remove the need for a complex IP takeover arrangement for all my web servers; each node would only need monitoring for a half-dead Zope server.

I guess what I'm saying is there are places where IP-takeover based clustering is appropriate, sometimes even in conjunction with forward traffic direction.

Sean
-> I guess what I'm saying is there are places where IP-takeover based
-> clustering is appropriate, sometimes even in conjunction with forward
-> traffic direction.

Sean: Thanks for the I.P. Takeover info!

While we're on the topic, I have a quick 'opinion' question about clusters. This applies directly to a Zope cluster I'll be building soon.

A fully redundant, yet not load-balanced, H.A. system requires *almost* all the same hardware and software as a 2-node load-balanced system. That is, you need to detect a service failure and, if found, make sure traffic goes to the backup system, not the primary system. (You'd also want to send an alert, etc., and if you only have dual redundancy, you'll also want to monitor the backup to make sure it'll be there when the failover is needed.)

In a load-balanced system (with homogeneous nodes), you need to watch all nodes for failure and, if found, fail out that particular node. But instead of the backup hardware going "wasted", just waiting for a failover, you've halved the hardware workload by distributing the work to both machines. This may result in a faster response to end users.

My question: Does it ever make sense to set up a redundant system without load balancing? After all, plopping in new nodes on an as-needed basis is a very handy feature.

The only thing I can think of is this: Imagine a site that must serve 1 zillion requests per day (a zillion being a Very Big Number). If you use a simple failover system, then you buy two boxes, both capable of handling 1 zillion requests. If the site grows more popular so you must handle 2 zillion requests/day, you just upgrade both servers to handle 2 zillion requests/day. (This is a thought experiment, ignore the fact that you should have planned for the growth in the first place :)

Now imagine those same two (1 zillion/day capable) boxes have been configured for load balancing. Immediately, each server is only serving .5 zillion requests/day.
As the site grows to its new 2 zillion/day load, both servers being in use means no hardware upgrade is needed. BUT --and this is a big but-- you no longer have an H.A. system. You've lost your redundancy. If one of the servers goes down, about half of your customers will get an HTTP 503 "Service Unavailable" error message.

So to keep full redundancy, you actually need THREE nodes. In fact, for however many nodes you want your cluster to be, you need to add one extra "redundant" node that would handle the traffic for any failed node (just until that failed node is repaired).

So I guess I answered my own question: in a two-node load-balanced system, the second node would really be nothing more than a backup node (even though it's handling traffic), and thus you'd need to upgrade your hardware (or rather, just add more nodes) as soon as your traffic exceeded the limit of

    (nodecount - 1) * traffic_per_node

And the "traffic_per_node" you'd have to assume would be peak usage traffic, i.e.

    ( total_peak_traffic / (nodecount - 1) )

In the real world of public websites, however, I think a load-balanced system may actually offer extra redundancy. Because no site will get its peak load on a 24/7 basis, the load-balanced system can use any extra resources (which are free because the cluster is not at 100% capacity) to fill in for the failed node -- and this would be *in addition to* your extra failover node. In a simple failover system, you don't get this.

Any additional comments would be greatly appreciated.

--Derek
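The arithmetic in Derek's message can be written out as a small Python sketch. The function names and the `spares` parameter are introduced here for illustration only; traffic is in whatever unit you like (zillions of requests per day in the thought experiment), and all nodes are assumed to be identical.

```python
import math


def max_redundant_traffic(nodecount, traffic_per_node):
    """Most traffic the cluster can carry and still survive one node failing."""
    return (nodecount - 1) * traffic_per_node


def is_redundant(nodecount, traffic_per_node, total_peak_traffic):
    """True if the peak load still fits after losing any single node."""
    return total_peak_traffic <= max_redundant_traffic(nodecount, traffic_per_node)


def nodes_needed(total_peak_traffic, traffic_per_node, spares=1):
    """Nodes required to carry the peak load with `spares` nodes down."""
    return math.ceil(total_peak_traffic / traffic_per_node) + spares


# The thought experiment: boxes rated for 1 zillion/day each.
print(is_redundant(2, 1, 1))  # two boxes, 1 zillion/day peak
print(is_redundant(2, 1, 2))  # same two boxes after growth to 2 zillion/day
print(nodes_needed(2, 1))     # boxes needed to stay redundant at 2 zillion/day
```

With two boxes the cluster is redundant at 1 zillion/day but not at 2 zillion/day, and staying redundant at 2 zillion/day takes three boxes, which matches the "THREE nodes" conclusion above.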
participants (2)
- Derek Simkowiak
- sean.upton@uniontrib.com