Copying Zope @ Zope.org since this is useful information.  My numbers
below are approximations, not hard figures.


It's derived from experimental observation.  A Python bytecode, on
average, executes about 50 machine instructions.  You probably want to
let a whole CPU quantum expire before voluntarily switching threads.
Generally a CPU quantum will be about 5 milliseconds.  A 1GHz Pentium
will execute about 1,000,000 instructions / millisecond, or about
100,000 Python bytecodes / quantum.  The typical Zope publishing path
is about 1,000,000 bytecodes or more -- so letting that path be
interrupted 10 times or more is overkill (for Zope).  Using my numbers
you could argue for a much higher ratio.  (Ie, if you believe me, Zope
"wants" a sys.setcheckinterval(100000) on a 1GHz machine.)
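The arithmetic above is easy to sanity-check yourself.  A minimal sketch,
using the assumed figures from this mail (1 GHz clock, ~50 machine
instructions per bytecode, 5 ms quantum) rather than anything measured:

```python
# Back-of-the-envelope arithmetic from the paragraph above.
# All three constants are assumptions taken from the text.
INSTRUCTIONS_PER_MS = 1000000     # ~1 GHz CPU
INSTRUCTIONS_PER_BYTECODE = 50    # rough average
QUANTUM_MS = 5                    # typical scheduler quantum

bytecodes_per_quantum = (INSTRUCTIONS_PER_MS * QUANTUM_MS) // INSTRUCTIONS_PER_BYTECODE
print(bytecodes_per_quantum)  # 100000 -> the suggested setcheckinterval
```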


From experimental observation I have detected a levelling off in
benefit at about pystones/50.  This becomes very noticeable on a
multiprocessor machine.  I believe the levelling-off effect comes from
other normal 'blocking' operations inside Zope which cause one thread
to suspend.  Hence the factor of 500 discrepancy :)
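As a sketch of that rule of thumb, assuming you already have a pystone
figure for your machine (e.g. from the old stdlib test.pystone benchmark;
the 11000 figure below is taken from the quoted mail at the bottom):

```python
def suggested_checkinterval(pystones):
    """Rule of thumb from the text: pystones / 50."""
    return int(pystones / 50)

# The 11000-pystone machine from the quoted mail:
print(suggested_checkinterval(11000))  # 220, close to the "-i 200" chosen below
```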


The rationale is due to overhead in thread switching, and "throughput"
optimization.  Consider the following example:


Two threads wish to count from 1 to 10.  After each thread counts a
single digit, they switch.  A system clock is incremented after each
count:


Sys     Thr1    Thr2
 1        1
 2                 1
 3        2
 4                 2
...
19       10
20                10


The average time for each thread to complete is (19 + 20) / 2, or 19.5.
Now consider the example where thread 1 is allowed to run to
completion before thread 2:


Sys     Thr1    Thr2
 1        1
 2        2
...
10       10
11                 1
...
20                10


Here, the average time for each thread to complete is (10 + 20) / 2, or
15.  So it costs 30% more, in average completion time, to let each
thread run "concurrently" -- without factoring in any overhead from the
actual act of task switching, which in my example was zero, but can
never actually be zero.
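The two schedules above can be simulated in a few lines.  This is a
toy model of the example only (two "threads", one unit of work per
clock tick, zero switching overhead), not of a real scheduler:

```python
def avg_completion(counts, switch_every):
    """Round-robin two 'threads', each needing `counts` units of work,
    switching after `switch_every` units; return the average clock
    tick at which a thread finishes."""
    remaining = [counts, counts]
    finish = [0, 0]
    clock = 0
    cur = 0
    while any(remaining):
        budget = switch_every
        while budget and remaining[cur]:
            clock += 1
            remaining[cur] -= 1
            budget -= 1
            if remaining[cur] == 0:
                finish[cur] = clock
        cur = 1 - cur  # yield to the other thread
    return sum(finish) / len(finish)

print(avg_completion(10, 1))   # switch every count -> 19.5
print(avg_completion(10, 10))  # run to completion  -> 15.0
```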


By increasing sys.setcheckinterval (the default Python value is 10!)
we allow more work to be done by each thread before it yields control
to another thread.  The astute observer will also note that the total
system work for CPU-bound processes can never exceed the speed of
serial processing.  Because Zope is primarily CPU bound, fewer threads
tend to be better.
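For reference, a minimal sketch of actually applying the setting.
Note that sys.setcheckinterval is the Python 2-era knob; interpreters
from 3.2 on replaced it with sys.setswitchinterval, which takes a time
in seconds rather than a bytecode count:

```python
import sys

if hasattr(sys, "setswitchinterval"):
    # Modern interpreters: express the quantum directly in seconds.
    sys.setswitchinterval(0.005)   # ~ one 5 ms quantum
else:
    # Python 2-era interpreters: bytecodes between check points.
    sys.setcheckinterval(100000)
```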


I believe that a corollary to this is the effect people observe when
Zope undergoes "superlinear" degradation -- ie, too many things get
caught up in Zope (because too many threads are started).  I am sure
this isn't the *only* reason that happens (I don't have a good
observation suite to analyze it).  However, once internal queues for
work build up in Zope, they are very difficult to dissipate -- you
have to have a substantial lessening in the work arrival rate.


N.B. If you use my figure of 1,000,000 bytecodes as a predictor of the
Zope publishing path, you'll realize that this is about 10 CPU quanta
(again using a quantum of 5ms, 100,000 bytecodes each) on a 1GHz
machine, which is a Zope publishing rate of about 20 pages/sec.  For
some applications this is an optimistic value.  For others, Zope can
publish at a faster rate.  This is not intended to cover ALL
applications, just a 'good guess' at one.  I suggest running 'ab' or
similar against a representative sample of YOUR application's pages to
convert pages/sec into a guesstimate of the "cost" of your
application.
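That conversion is just the arithmetic from the top of this mail run
backwards.  A sketch, again assuming the 1 GHz / 50-instructions-per-
bytecode figures rather than anything you measured:

```python
INSTRUCTIONS_PER_SEC = 1000000000   # assumed 1 GHz CPU
INSTRUCTIONS_PER_BYTECODE = 50      # rough average from above

def bytecodes_per_page(pages_per_sec):
    """Convert an 'ab' pages/sec measurement into an estimated
    bytecode cost per page, assuming one CPU-bound request at a time."""
    return int(INSTRUCTIONS_PER_SEC / (pages_per_sec * INSTRUCTIONS_PER_BYTECODE))

print(bytecodes_per_page(20))  # 1000000 -> matches the 1M-bytecode estimate
```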

  

On Monday, June 17, 2002, at 10:05 AM, oliver.erlewein@sqs.de wrote:


> Hi
>
> I've set my new interval from "-i 32" to "-i 200" as my Pystones is
> about 11000. I'll check what changes I will see. Where did you get
> that ratio from or why is it so?