[Zope] Advice needed: load balancing with ZEO and Apache on Solaris.

Steve Spicklemire steve@spvi.com
Sun, 2 Sep 2001 15:41:46 -0500


Hi Tony,

	I've achieved reasonable (though somewhat naive) ZEO load 
distribution (I won't call it balancing) this way:

In Apache:

<VirtualHost 192.xxx.yyy.zzz:80>
    ServerAdmin steve@spvi.com
    ServerName test_balance.spvi.net
    ErrorLog /var/log/spvi.net-error_log
    CustomLog /var/log/spvi.net-access_log common

    RewriteEngine on
    #RewriteLog    /var/log/rewrite.log
    #RewriteLogLevel  10
    RewriteMap    balance_load_ext      prg:/usr/local/share/apache/conf/balance_load_ext.py
    RewriteRule   ^/(.*)$ ${balance_load_ext:$1}           [P,L]

</VirtualHost>

where balance_load_ext.py is:

#!/usr/bin/env python

import sys

count = 0

def translate(data):
    # Cycle through the backends www1, www2, www0 in turn.
    global count
    count = (count + 1) % 3
    return "http://www%i.spvi.net:14080/VirtualHostBase/http/test_balance.spvi.net:80/%s" % (count, data)


if __name__ == '__main__':
    while 1:
        line = sys.stdin.readline()
        if not line:    # EOF: Apache has closed the pipe
            break
        # Strip only after the EOF check, so that an empty key
        # (a request for "/") does not end the loop.
        print(translate(line.strip()))
        sys.stdout.flush()


This distributes load among three machines, using Apache only. You 
could make the Python script smarter to achieve something closer to real 
load balancing with a little effort.
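
As one illustration of "smarter", here is a minimal sketch of a weighted
variant of translate(), assuming a reasonably modern Python. The WEIGHTS
table and the build_schedule() helper are hypothetical and not part of the
script above; the weights are made-up numbers that send twice as many
requests to www2 as to the other two machines. It is still only load
distribution, since it knows nothing about how busy each machine is.

```python
import itertools

# Hypothetical weights (assumption): backend number -> relative share.
WEIGHTS = {0: 1, 1: 1, 2: 2}

URL = ("http://www%i.spvi.net:14080/VirtualHostBase/http/"
       "test_balance.spvi.net:80/%s")


def build_schedule(weights):
    """Expand the weights into one pass of backend numbers."""
    schedule = []
    for backend, weight in sorted(weights.items()):
        schedule.extend([backend] * weight)
    return schedule


# Endless iterator over the schedule: 0, 1, 2, 2, 0, 1, 2, 2, ...
_backends = itertools.cycle(build_schedule(WEIGHTS))


def translate(data):
    """Return the proxy URL for the next backend in the schedule."""
    return URL % (next(_backends), data)
```

The read loop of the original script can stay as it is; only translate()
changes.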

-steve


On Sunday, September 2, 2001, at 01:03 PM, Tony McDonald wrote:

> On 25/8/01 8:55 pm, "sean.upton@uniontrib.com" <sean.upton@uniontrib.com>
> wrote:
>
>> Sounds like you need something to distribute and/or balance the load; 3
>> suggestions:
>>
>> 1 - HTTP Proxy Load Distribution - Squid does this, round-robin, put squid
>> in front of your ZEO farm (this is what I do).  I don't think that mod_proxy
>> is capable of doing this.
>
> Sean, Phil,
> Many thanks for the help. I couldn't reply earlier due to family matters.
>
> I downloaded squid and spent quite some time with it. Unfortunately, it
> seems way too complex for me. However, as it has the http-accelerator and
> other facilities, I'll probably look at it again once I get some free time.
>
>> 2 - Layer 4 switch (Cisco LocalDirector, Intel Netstructure, etc) - not
>> cheap, but offers some features like a choice between true balancing, outage
>> detection, and funky routing techniques like out-of-path return for quicker
>> network performance.
>
> At the moment we're strapped for cash, so I can't use this option - although
> as we're going to be putting some firewalls in front of our boxes, I *may*
> be able to do this in the future.
>
>> 3 - LVS - Linux Virtual Server Project, attempts to do in software what #2
>> does.
>>
>
> Had a look at this, but we're a Solaris shop and there seemed too many
> kernel patches needed to do this.
>
>> Personally, I'm in favor of squid because it is cheap and easy, if all your
>> machines are of equal weight in performance terms.  My company uses squid in
>> front of multiple ZEO clients, bypassing Apache altogether for access to
>> Zope; we use pyredir as a redirector to rewrite URLs like you would with
>> mod_rewrite; Squid is very nice as a Zope virtual host front-end, and
>> provides ACLs for security purposes, like blocking out the ZMI from public
>> access.
>
> This is the main reason I spent that time looking at squid - the security
> aspect. But, we've built up quite a bit of experience with Apache's
> RewriteRules and have a requirement for PHP and (cough) Perl scripts to run
> alongside our Zope sites as well.
>
>> Getting load balancing working in squid is simply a matter of
>> compiling it with a flag to specify you want external DNS resolution support
>> and using squid's built-in name resolving program dnsserver, which will look
>> at /etc/hosts and round-robin among multiple IPs that have the same
>> hostname.  One caveat: if you put Squid (or any high-volume TCP app) on
>> Solaris, make sure to tune your TCP connection buffers, so Solaris doesn't
>> choke; this is one (albeit trivial) reason we use Linux 2.4 for our proxies
>> instead of Solaris.
>>
>
> Ah - thanks for that - I'll do the /etc/system thing with the TCP/IP
> parameters.
>
>> Since squid round-robins, it doesn't deal with node reliability problems,
>> which means you will have to rely upon another HA mechanism on the ZEO
>> client nodes themselves, like heartbeat from the Linux-HA project, which at
>> some point will be ported to Solaris, from what I hear.
>>
>
> We're going to be getting some high availability software that does
> heartbeat monitoring so I think I'm ok there (but see *** below).
>
>> If you have a decent budget, but not a lot of time, look at a L4 switch (we
>> use, and like, the Intel Netstructure 7140 for another application);
>> otherwise, if you have budget constraints consider LVS or Squid, with Squid
>> being the likely easiest path in terms of setup time.
>>
>> Sean
>>
>
> Thanks very much for the info Sean. Although I'll probably not be using
> squid at the moment - all this is very helpful. In the end I used the method
> that Phil mentioned;
>
>> Anyway on with the show:
>>
>> Take a look at this document,
>> http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html.  Search for
>> 'Randomised Plain Text' and there's your recipe.
>>
>> Basically, you need to create a file that has these lines in, call it
>> map.txt (the name doesn't really matter though):
>>
>> localhost    port1|port2|port3
>>
>> replace port1 .. port 3 with the exact ports you are using:
>>
>> localhost    8080|8081|8082
>>
>> You can put as many options on this line as you need.
>>
>> Then change your final rewrite rule to be like this:
>>
>>   RewriteMap servers rnd:/path/to/file/map.txt
>>   RewriteRule ^/(.*) http://localhost:${servers:localhost}/VirtualHostBase/http/myserver.ncl.ac.uk:80/VirtualHostRoot/$1 [P]
>>
>> This will put a random port number into the line thereby giving you pseudo
>> 'round-robin' functionality.  The ${servers:localhost} is the clever part,
>> 'servers' is the name of the map, and localhost is a parameter telling the
>> map which line of the map.txt file to choose from.
>
> as I have quite some experience with Apache and RewriteRules.
>
> *** We're only going to have two machines in our 'cluster', but they are
> multi processor machines. The reason I'm having to go through all these
> hoops is basically down to the poor performance of Sparc chips. My pystone
> ratings on our new server are about 4500, whilst my own TiPB is 6500 and a
> PIII 700MHz is about 10,500. Therefore I'm trying to squeeze more
> performance out of our boxes.
>
> Once again guys, thanks for the help.
>
> Now all I need to do is figure out how to do Core Session Tracking with ZEO
> (I know there are HowTos - but they're not that transparent to my poor head!
> :)
>
> Cheers,
> Tone.
> --
> Dr Tony McDonald,  Assistant Director, FMCC, http://www.fmcc.org.uk/
> The Medical School, Newcastle University Tel: +44 191 243 6140
> A Zope list for UK HE/FE  http://www.fmcc.org.uk/mailman/listinfo/zope