Copying Zope @ Zope.org since this is useful information. My numbers below are approximations, not hard figures; they're derived from experimental observation.

A Python bytecode, on average, executes about 50 machine instructions. You probably want to let a whole CPU quantum expire before voluntarily switching threads. A CPU quantum is generally about 5 milliseconds. A 1 GHz Pentium will execute about 1,000,000 instructions per millisecond, or about 100,000 Python bytecodes per quantum. The typical Zope publishing path is about 1,000,000 bytecodes or more -- so letting that path be interrupted 10 times or more is overkill (for Zope). Using my numbers you could argue for a much higher ratio (i.e., if you believe me, Zope "wants" sys.setcheckinterval(100000) on a 1 GHz machine). From experimental observation I have detected a levelling off in benefit at about pystones/50; this becomes very noticeable on a multiprocessor machine. I believe the levelling-off effect comes from other normal 'blocking' operations inside Zope which cause one thread to suspend. Hence the factor-of-500 discrepancy :)

The rationale is the overhead of thread switching, plus throughput optimization. Consider the following example: two threads each wish to count from 1 to 10. After each thread counts a single digit, they switch. A system clock is incremented after each count:

  Sys  Thr1  Thr2
   1     1
   2           1
   3     2
   4           2
  ...
  19    10
  20          10

The average time for each thread to complete is (19 + 20) / 2, or 19.5. Now consider the case where thread 1 is allowed to run to completion before thread 2 starts:

  Sys  Thr1  Thr2
   1     1
   2     2
  ...
  10    10
  11           1
  ...
  20          10

Here, the average time for each thread to complete is (10 + 20) / 2, or 15. So each thread takes about 30% longer, on average, to finish when the threads run "concurrently" -- and that's without factoring in any overhead from the actual act of task switching, which in my example was zero, but in practice can never be zero. By increasing sys.setcheckinterval (the default Python value is 10!)
we allow more work to be done by each thread before it yields control to another thread.

The astute observer will also note that the total system work for CPU-BOUND processes can never exceed the speed of serial processing. Because Zope is primarily CPU bound, fewer threads tend to be better. I believe a corollary to this is the effect people observe when Zope undergoes "superlinear" degradation -- i.e., too many things get caught up in Zope (because too many threads are started). I am sure this isn't the *only* reason that happens (I don't have a good observation suite to analyze it). However, once internal work queues build up in Zope, they are very difficult to dissipate -- you need a substantial lessening in the work arrival rate.

N.B. If you use my figure of 1,000,000 bytecodes as a predictor of the Zope publishing path, you'll realize that this is about 10 CPU quanta (again using a 5 ms quantum) on a 1 GHz machine -- roughly 50 ms per request, or a Zope publishing rate of about 20 pages/sec. For some applications this is an optimistic value; for others, Zope can publish at a faster rate. This is not intended to cover ALL applications, just a 'good guess' at one. I suggest running 'ab' or similar against a representative sample of YOUR application's pages to convert pages/sec into a guesstimate of the "cost" of your application.

On Monday, June 17, 2002, at 10:05 AM, oliver.erlewein@sqs.de wrote:
> Hi,
>
> I've set my new interval from "-i 32" to "-i 200" as my Pystones is
> about 11000. I'll check what changes I see. Where did you get that
> ratio from, or why is it so?
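To make the arithmetic and the scheduling argument above concrete, here is a small Python sketch. The constants are the post's approximations (50 machine instructions per bytecode, a 5 ms quantum, a 1 GHz machine, a 1,000,000-bytecode publishing path) rather than measured values, and the toy scheduler only models the counting example -- it is not how the interpreter actually switches threads.

```python
# Back-of-envelope arithmetic behind the figures in the post.
# All constants are the post's approximations, not measurements.
INSTR_PER_BYTECODE = 50       # avg machine instructions per Python bytecode
INSTR_PER_MS = 1000000        # ~1 GHz Pentium
QUANTUM_MS = 5                # typical scheduler quantum
REQUEST_BYTECODES = 1000000   # rough cost of one Zope publishing request

# Bytecodes executed in one quantum -- the "ideal" check interval:
bytecodes_per_quantum = INSTR_PER_MS * QUANTUM_MS // INSTR_PER_BYTECODE

# One request spans this many milliseconds, giving pages/sec:
ms_per_request = REQUEST_BYTECODES // bytecodes_per_quantum * QUANTUM_MS
pages_per_sec = 1000 // ms_per_request

# The empirical pystones/50 rule, applied to the 11000 pystones quoted above:
check_interval = 11000 // 50

print(bytecodes_per_quantum, ms_per_request, pages_per_sec, check_interval)
# -> 100000 50 20 220

# Toy model of the counting example: two CPU-bound "threads" that each
# need 10 units of work, under round-robin scheduling with a given slice.
def completion_times(work, slice_size):
    """Return the clock value at which each thread finishes."""
    remaining = list(work)
    finished = [0] * len(work)
    clock = 0
    while any(remaining):
        for i, r in enumerate(remaining):
            if r == 0:
                continue
            step = min(slice_size, r)
            clock += step
            remaining[i] -= step
            if remaining[i] == 0:
                finished[i] = clock
    return finished

avg = lambda xs: sum(xs) / len(xs)
fine = completion_times([10, 10], 1)     # switch after every count
coarse = completion_times([10, 10], 10)  # run each thread to completion
print(fine, avg(fine))      # -> [19, 20] 19.5
print(coarse, avg(coarse))  # -> [10, 20] 15.0
```

The 19.5 vs. 15.0 averages reproduce the roughly 30% latency penalty of fine-grained switching described above, with zero modeled switching overhead.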
participants (1)
-
Matthew T. Kromer