[Zope] system requirements

sean.upton@uniontrib.com sean.upton@uniontrib.com
Tue, 28 May 2002 11:35:15 -0700


I'm deploying a new (eventually to be very heavily loaded) ZEO storage
server on a DP Xeon/Prestonia box (running Linux 2.4.18 compiled from source
from a RedHat errata kernel source RPM with low-latency scheduling enabled);
the box has hardware multi-threading capabilities.  ZEO will be serving what
I anticipate will be a FileStorage of several gigabytes sitting on an
8-spindle RAID 10.  All ZEO traffic will be coming from networked ZEO
clients on a copper GB network.

One thought I had was that it seems at least theoretically possible that,
if the box is heavily loaded, HMT would help by using spare CPU cycles to
handle interrupts from the GB NIC and RAID controller, but I worry that
enabling HMT will compromise Python performance.  Any thoughts on whether
HMT/HyperThreading will help or hinder ZEO performance given Python's GIL?
Could I get an accurate picture of any degradation simply by running pystone
tests with HyperThreading disabled and enabled?
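
A minimal sketch of that comparison, assuming only Python's bundled
test.pystone module (run it once with HyperThreading enabled in the BIOS and
once with it disabled, and compare the pystones/second figures):

    # pystone_check.py -- rough single-thread throughput probe (Python 2.x).
    # Run once with HyperThreading enabled and once with it disabled,
    # then compare the reported pystones/second figures.
    from test import pystone

    LOOPS = 100000  # number of pystone iterations; raise for a steadier run

    benchtime, stones = pystone.pystones(LOOPS)
    print "%d loops in %.2f seconds -> %.0f pystones/second" % (
        LOOPS, benchtime, stones)

Note that pystone is single-threaded, so this would show only the per-thread
cost of HMT, not any benefit from offloading interrupt handling; a load test
against ZEO itself would still be needed for the full picture.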

In comparing kernel source, it looks as though, unlike the stock kernel,
RedHat's 2.4.18-4 kernel source has a reworked kernel/sched.c with a
set_cpus_allowed() call, which should, in theory, make it possible to keep
a task from migrating off a CPU.  Has anyone tried this with Python?

Sean

-----Original Message-----
From: Matthew T. Kromer [mailto:matt@zope.com]
Sent: Tuesday, May 28, 2002 9:16 AM
To: Dennis Allison
Cc: Tom Nixon; zope@zope.org
Subject: Re: [Zope] system requirements


Dennis Allison wrote:

>I'm about to put together a production machine as well and so have been
>thinking about the issues and cost/performance trade-offs.  Comments on 
>hidden gotchas would be particularly welcome.  Have there been problems 
>with dual processor Linux systems and ZServer?
>
>Software:  Zope 2.5.X and CMF 1.3, ZServer + Squid.  No Apache -- this
>will be a dedicated Zope system.  RH 7.2 with lots of cruft removed.
>MySQL.
>
>Hardware:   Dual AMD Athlon processors (2+ GHz)
>            Tyan motherboard with 760 chipset?
>            1GB DDR memory with ECC (2GB if budget allows)
>            10/100 Ethernet
>            CDROM (for loading...)
>            Floppy (for bootloading/rescue)
>            40 GB+ disk (probably IDE)
>            SCSI
>            Display controller (for configuration)
>            2U rackmount box with big fans
>
>On Tue, 28 May 2002, Matthew T. Kromer wrote:
>  
>

I do *not* recommend running Zope on multiprocessor machines without an 
ability to restrict Zope to execution on a single CPU.

The reason for this is that the Python Global Interpreter Lock (GIL) is
shared inside a Zope process, while threads in Python are backed by
underlying OS threads.  Zope will create multiple threads, and the OS
scheduler is likely to assign each thread to a different CPU.  But only one
of those CPUs can hold the GIL at a time; the others spend their time
waiting to acquire it, and that contention introduces significant latency
into Python and thus into Zope.
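
A minimal sketch of the effect, using nothing beyond the standard library:
two CPU-bound threads take roughly as long as (and on SMP often longer than)
doing the same work twice in one thread, because only the thread holding the
GIL can execute bytecode:

    # gil_demo.py -- CPU-bound work gains nothing from extra threads (Python 2.x).
    import threading, time

    def burn(n):
        total = 0
        for i in xrange(n):
            total += i * i
        return total

    N = 5 * 10**6

    # Serial: the same work done twice in one thread.
    start = time.time()
    burn(N); burn(N)
    print "serial:   %.2fs" % (time.time() - start)

    # Threaded: two threads, but only one can hold the GIL at a time.
    start = time.time()
    threads = [threading.Thread(target=burn, args=(N,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print "threaded: %.2fs" % (time.time() - start)

On a multiprocessor box the threaded run can actually come out slower than
the serial one, because the GIL bounces between CPUs; that bouncing is the
latency described above.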

Linux has no native user-level mechanism for processor binding.  The kernel
does keep a CPU dispatch mask for each process, but there is no facility to
set that mask that I know of.  Solaris can use a system command like 'pbind'
to bind a process to a particular CPU.
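
For illustration only, here is a minimal sketch of what such binding looks
like where an affinity API is available.  Note that os.sched_setaffinity()
exists only in much later Python/Linux releases than the 2.4-era systems
discussed here; it is shown purely to make the idea concrete:

    # pin_to_cpu.py -- restrict the current process (e.g. a Zope/ZEO process)
    # to a single CPU, so its threads contend for the GIL on one processor.
    # NOTE: os.sched_setaffinity() is a much later API than the kernels
    # discussed in this thread; this is an illustration, not period advice.
    import os

    CPU = 0  # hypothetical choice: bind everything to the first CPU

    os.sched_setaffinity(0, {CPU})   # pid 0 means "the calling process"
    print("process %d now restricted to CPUs %s"
          % (os.getpid(), sorted(os.sched_getaffinity(0))))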

If you *do* go with a multiprocessor, be sure to tune the
sys.setcheckinterval() call made by z2.py.  I think it should be roughly
your pystone number divided by 50.  This sets the coarse scheduling
granularity of Python so that it doesn't attempt to hand off the lock as
often, which lets Zope do as much work as it can without injecting latency
by shuffling the GIL between threads on other CPUs.

In fact, setting sys.setcheckinterval() to pystones/50 is probably a good
idea regardless of what platform you are on.  The current Zope default is
120, which is better than the Python default of 10 -- but that still means
there is a lot of unnecessary lock shuffling on fast CPUs (the number 120
was reasonable 2 years ago, but isn't now -- an Athlon XP 2000+ should
probably have a number of about 500).  The basic idea is to let some
particular time quantum expire before moving the lock -- but the counter is
expressed in Python bytecode ops, so as CPUs get faster the quantum shrinks
rather than remaining constant.  In other words, overhead grows linearly
with CPU speed improvements unless you make this tuning change.
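
A minimal sketch of that tuning, measuring pystones on the box and applying
the pystones/50 rule of thumb suggested above (the value would normally be
set where z2.py already calls setcheckinterval):

    # check_interval.py -- derive the GIL check interval from measured
    # pystones, following the pystones/50 rule of thumb (Python 2.x APIs).
    import sys
    from test import pystone

    benchtime, stones = pystone.pystones()  # measure this box's pystone rating
    interval = int(stones / 50)             # e.g. ~25000 pystones -> ~500

    # Check for a thread switch only every `interval` bytecode instructions.
    sys.setcheckinterval(interval)
    print "pystones: %.0f  ->  sys.setcheckinterval(%d)" % (stones, interval)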

-- 
Matt Kromer
Zope Corporation  http://www.zope.com/ 





_______________________________________________
Zope maillist  -  Zope@zope.org
http://lists.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )