I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.

I have it basically working, but have noticed a couple of odd corners of the test runner that I'd like to clean up. They may be controversial, so I'll ask about them here first.

I'd like to 1) remove the layer tear-down mechanism entirely, and 2) make (almost) all layers run in a subprocess.

I want to do #1 because it would simplify the test runner code and no one seems to be using the functionality anyway. It also appears from reading the code that any tests run in a subprocess (and most are) will never exercise the tear-down mechanism anyway.

#2 will add some process start-up overhead, but it'll only be one more process than is already started (and any reasonably-sized test corpus already starts several processes for each test run). The one exception is for running with -D or with a pdb.set_trace() embedded in the code under test. For that case we need a switch to say "don't start any subprocesses at all"; I suspect that will be spelled -j0.

For motivation, some speed comparisons: running a particular test suite with 3876 tests (mostly doctests, and mostly functional) without the patch takes 6 minutes, 42 seconds; my branch runs the same tests in 3 minutes and 22 seconds (give or take) on a dual-core box with 3 simultaneous subprocesses.

--
Benji York
Senior Software Engineer
Zope Corporation
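For concreteness, the proposed -j semantics (-j N for up to N subprocesses, -j0 for none at all) might be parsed roughly like this. This is a hypothetical sketch: the option names match the proposal, but the actual zope.testing option machinery may differ.

```python
# Hypothetical sketch of -j option handling; optparse matches the
# Python of the era, and the mode strings are invented for illustration.
import optparse

def parse_jobs(argv):
    parser = optparse.OptionParser()
    parser.add_option("-j", type="int", dest="jobs", default=1,
                      help="max number of test subprocesses (0 = none)")
    options, _ = parser.parse_args(argv)
    if options.jobs == 0:
        return "in-process"          # -j0: no subprocesses, pdb-friendly
    return "parallel:%d" % options.jobs

print(parse_jobs(["-j", "3"]))       # parallel:3
print(parse_jobs(["-j0"]))           # in-process
```

The -j0 branch is the escape hatch for -D and embedded pdb.set_trace() calls, where stdin/stdout must stay attached to the developer's terminal.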
On Thu, 2008-07-03 at 17:22 -0400, Benji York wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.
I have it basically working, but have noticed a couple odd corners of the test runner that I'd like to clean up. They may be controversial, so I'll ask about them here first.
I'd like to 1) remove the layer tear-down mechanism entirely, and 2) make (almost) all layers run in a subprocess.
I want to do #1 because it would simplify the test runner code and no one seems to be using the functionality anyway. It also appears from reading the code that any tests run in a subprocess (and most are) will never exercise the tear-down mechanism anyway.
+1 in general but -1 on removing the tear down functionality. We use it to destroy external databases that were generated for setup.
#2 will add some process start-up overhead, but it'll only be one more process than is already started (and any reasonably-sized test corpus already starts several processes for each test run). The one exception is for running with -D or with a pdb.set_trace() embedded in the code under test. For that case we need a switch to say "don't start any subprocesses at all", I suspect that will be spelled -j0.
+1 as well. I'm actually wondering whether we might be able to control the pdb through a sub-process.
For motivation, some speed comparisons: running a particular test suite with 3876 tests (mostly doctests, and mostly functional) without the patch takes 6 minutes, 42 seconds; my branch runs the same tests in 3 minutes and 22 seconds (give or take) on a dual-core box with 3 simultaneous subprocesses.
Yay!

--
Christian Theune · ct@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 7 · fax +49 345 1229889 1
Zope and Plone consulting and development
On Thu, Jul 3, 2008 at 5:37 PM, Christian Theune <ct@gocept.com> wrote:
On Thu, 2008-07-03 at 17:22 -0400, Benji York wrote:
I'd like to 1) remove the layer tear-down mechanism entirely, and 2) make (almost) all layers run in a subprocess.
I want to do #1 because it would simplify the test runner code and no one seems to be using the functionality anyway. It also appears from reading the code that any tests run in a subprocess (and most are) will never exercise the tear-down mechanism anyway.
+1 in general but -1 on removing the tear down functionality. We use it to destroy external databases that were generated for setup.
Ah! Good point.
#2 will add some process start-up overhead, but it'll only be one more process than is already started (and any reasonably-sized test corpus already starts several processes for each test run). The one exception is for running with -D or with a pdb.set_trace() embedded in the code under test. For that case we need a switch to say "don't start any subprocesses at all", I suspect that will be spelled -j0.
+1 as well. I'm actually wondering whether we might be able to control the pdb through a sub-process.
I don't think it'd be that hard, in general, but the current design of using stdout and stderr as IPC channels is a hindrance.
For motivation, some speed comparisons: running a particular test suite with 3876 tests (mostly doctests, and mostly functional) without the patch takes 6 minutes, 42 seconds; my branch runs the same tests in 3 minutes and 22 seconds (give or take) on a dual-core box with 3 simultaneous subprocesses.
Yay!
I have an 8 core machine that I can't wait to try it on. ;)

--
Benji York
Senior Software Engineer
Zope Corporation
Benji York wrote at 2008-7-3 17:44 -0400:
On Thu, Jul 3, 2008 at 5:37 PM, Christian Theune <ct@gocept.com> wrote:
On Thu, 2008-07-03 at 17:22 -0400, Benji York wrote:
I'd like to 1) remove the layer tear-down mechanism entirely, and 2) make (almost) all layers run in a subprocess.
You are aware that layers can be nested? The implication of this is that a sublayer (run in a subprocess) either must start from scratch and reconstruct the fixture built in the superlayer (potentially expensive) or must access the resources inherited from the forking process. The latter (accessing resources inherited from the forking process) is very brittle. I had to give it up in a different context.

--
Dieter
Hi Benji
Subject: [Zope-dev] Test runner: layers, subprocesses, and tear down
[... ]
#2 will add some process start-up overhead, but it'll only be one more process than is already started (and any reasonably-sized test corpus already starts several processes for each test run). The one exception is for running with -D or with a pdb.set_trace() embedded in the code under test. For that case we need a switch to say "don't start any subprocesses at all", I suspect that will be spelled -j0.
That's a very important point. I often use pdb if I write tests.
For motivation, some speed comparisons: running a particular test suite with 3876 tests (mostly doctests, and mostly functional) without the patch takes 6 minutes, 42 seconds; my branch runs the same tests in 3 minutes and 22 seconds (give or take) on a dual-core box with 3 simultaneous subprocesses.
Yeah, great! Regards Roger Ineichen
On Thu, Jul 03, 2008 at 05:22:11PM -0400, Benji York wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.
That's wonderful news!
I have it basically working, but have noticed a couple odd corners of the test runner that I'd like to clean up. They may be controversial, so I'll ask about them here first.
I'd like to 1) remove the layer tear-down mechanism entirely, and 2) make (almost) all layers run in a subprocess.
-1 in general. +1 if you do that only for -j N where N > 1. Running all the tests in a single process has the following benefits:

* test coverage analysis produces results that are correct (well, correct often enough -- but it has no chance at all when the test runner forks a subprocess)
* import pdb; pdb.set_trace() works
I want to do #1 because it would simplify the test runner code and no one seems to be using the functionality anyway.
That's news to me. A while ago I went through Zope 3 trunk (it was pre-eggsplosion IIRC) and made sure all test layers defined in it supported teardown.

Granted, FunctionalTestLayer() has allow_teardown=False as the default, for two reasons:

* backwards compatibility: in olden days functional test layers didn't support teardown
* paranoia: it is in general impossible to determine whether calling CleanUp().cleanUp() will correctly clear all the global state (someone could easily write a custom ZCML directive that changed a global variable and forget to register a CleanUp hook), so disallowing teardowns was the conservative safe choice.

It is entirely my fault that I haven't evangelized the allow_teardown=True option for creating new test layers.
It also appears from reading the code that any tests run in a subprocess (and most are) will never exercise the tear-down mechanism anyway.
I guess that's fine for process state, but not so fine for external state (temporary files etc.). Hey, this might explain why SchoolTool's tests tend to fill up my buildbot's /tmp without cleaning up after themselves! I'll have to investigate some day.
#2 will add some process start-up overhead, but it'll only be one more process than is already started (and any reasonably-sized test corpus already starts several processes for each test run). The one exception is for running with -D or with a pdb.set_trace() embedded in the code under test. For that case we need a switch to say "don't start any subprocesses at all", I suspect that will be spelled -j0.
If that case needs to be supported anyway, what's the advantage of spawning exactly one subprocess when you run it with -j 1? I would also question whether a pdb-unfriendly, non-performance-enhancing option should be the default.
For motivation, some speed comparisons: running a particular test suite with 3876 tests (mostly doctests, and mostly functional) without the patch takes 6 minutes, 42 seconds; my branch runs the same tests in 3 minutes and 22 seconds (give or take) on a dual-core box with 3 simultaneous subprocesses.
I know; for large test suites (by "large" I mean 40 minutes) I've been using an ugly hack (--odd/--even test filtering) that lets me use both CPUs if I manually run two instances of the test runner in two xterms in parallel.

Regards,
Marius Gedminas

--
"Wipe Info uses hexadecimal values to wipe files. This provides more security than wiping with decimal values."
        -- Norton SystemWorks 2002 Manual
Benji York wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses.
I have some recent experience parallelising (and distributing across machines) test runs. This was in Java, with TestNG and Selenium, but we learned some interesting things.

We basically cut a 45 minute test run to 10 minutes by distributing the tests across three machines, each running a full stack (Oracle, JBoss, Firefox) and Selenium Grid. I realise you're not trying to do anything quite as complex as that, but a parallel test runner ought to be extensible to support distribution across nodes in a grid. The main challenge there is to distribute deployment of the code to run, and to sync test setup so that all environments are identical. I suspect you'll find this out of scope to begin with, but I'd keep it in the back of your mind.

You will likely need some way of declaring tests that have to run in series. Sometimes that's just for sanity's sake, other times it's a requirement due to shared resources. A nice way to do this is to make it possible to annotate tests to group them, and then to be able to declaratively configure some groups as serial. Any functional test that uses a shared external resource will require this.

TestNG supports (as far as I recall):

- Run all tests (methods) randomly and parallelise
- Run groups of tests (classes or declaratively specified named groups) in parallel, but run tests within the groups sequentially
- Run all tests in series (i.e. single-threaded)

We should probably use test layers as the main grouping mechanism here. If you could declare a layer as "can be run in parallel with other layers" or "tests in this layer run in series", that'd be pretty powerful. I'm not 100% sure how this works with layers that derive from one another, and where you'd have two layers with a shared base class, though.

Parallelisation can offer huge (!) speed increases, but it can also be hard to debug tests.
I'd be tempted to make single-threaded the default, safe choice, and let people opt into parallelisation only when they know what they are doing. Most test runs are quite quick anyway.

Test result reporting can be difficult. You'll probably need to collect all failures with tracebacks and report at the end. For long running test suites, this may not be ideal, since it's helpful to get early warning, so if you can find a way to get test output to be atomically output, that'd be nice.

Debugging stuff that happens in parallel with pdb is also tricky. It must be easy to turn off parallel running and to run individual tests in a single process for easier debugging.

To make this work with Selenium Grid, we ended up building some infrastructure to manage environments (i.e. an allocation of database, web server and so on), and locks on those environments. We'd spawn one thread for each environment and feed tests to those threads as fast as they could run them. Each test run then grabbed an environment on setup, executed, and then released the lock for another test.

Oh, and please don't get rid of any tear-down. You'll definitely need it one day. Letting environments go dirty is generally troublesome, and gets only more difficult when you may have multiple threads trying to use those environments at once. I don't know how you've structured this, but I'd consider whether one layer could be shared across multiple threads/subprocesses, or if it's always a one-to-one thing.

I realise this is somewhat rambling, but I hope it's useful in any case. :)

Cheers,
Martin

--
Author of `Professional Plone Development`, a book for developers who want to work with Plone. See http://martinaspeli.net/plone-book
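The layer-annotation idea floated above could be sketched roughly as follows. The parallel_safe flag is a hypothetical name invented for this sketch, not an existing zope.testing feature.

```python
# Hedged sketch: mark each layer as parallel-safe or serial, then
# partition the layers so a runner can dispatch the parallel-safe ones
# to worker processes and run the rest one at a time.
class Layer:
    def __init__(self, name, parallel_safe=True):
        self.name = name
        self.parallel_safe = parallel_safe  # hypothetical annotation

def partition(layers):
    """Split layers into those that may run concurrently and those that
    must run alone (e.g. because they share an external resource)."""
    parallel = [l for l in layers if l.parallel_safe]
    serial = [l for l in layers if not l.parallel_safe]
    return parallel, serial

layers = [Layer("unit"), Layer("functional", parallel_safe=False)]
parallel, serial = partition(layers)
print([l.name for l in parallel], [l.name for l in serial])
```

How the flag should propagate through nested or derived layers is exactly the open question Martin raises.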
Martin Aspeli wrote at 2008-7-4 00:56 +0100:
....
Benji York wrote:
....
Parallelisation can offer huge (!) speed increases, but it can also be hard to debug tests. I'd be tempted to make single-threaded the default, safe choice, and let people opt into parallelisation only when they know what they are doing. Most test runs are quite quick anyway.
+1
... Oh, and please don't get rid of any tear-down.
+1 -- Dieter
Hello Benji,

+1 for keeping the default as no subprocess and keeping the teardown. The others already said the reasons.

--
Best regards,
Adam GROSZER
mailto:agroszer@gmail.com

--
Quote of the day:
It is a great mistake to suppose that God is only, or even chiefly, concerned with religion.
- William Temple, Archbishop of Canterbury
On Thu, Jul 3, 2008 at 5:22 PM, Benji York <benji@zope.com> wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.
The branch (svn+ssh://svn.zope.org/repos/main/zope.testing/branches/benji-parallelize-subprocesses) is feature complete.

I basically did a very simple transformation that resulted in the runner spawning subprocesses in threads, several at a time, instead of spawning them serially. The patch is less than 250 lines. Any critiques of the changes would be appreciated.

No changes to the tear-down mechanism were made. All existing tests pass without modification. There aren't yet any tests for the new functionality. It may be tricky to test, so I have to think about that bit.

I found that to eliminate nearly all CPU idle time, I had to use -j4 on my two core laptop. For a particular test corpus on a 4 core machine:

- -j1 (the default) takes about 7 minutes
- -j6 takes about 2 minutes 20 seconds

If you use zc.buildout, then you can try the branch by checking it out, adding a "develop" entry into your buildout config referencing it, and updating any version spec for zope.testing to "3.6dev". I'd really like third-party confirmation of the total test time reductions I've seen.

--
Benji York
Senior Software Engineer
Zope Corporation
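The transformation described above might look roughly like this. This is a simplified sketch, not the branch's actual code: the real runner's command lines, --resume-layer handling, and result parsing are more involved, and the layer names below are illustrative.

```python
# Sketch: spawn each layer's subprocess from a worker thread, with a
# semaphore capping how many subprocesses are alive at once (the -j N
# limit).  Results are collected per layer for reporting at the end.
import subprocess
import threading

def run_layers_in_parallel(commands, max_procs):
    """commands maps layer name -> argv list for its subprocess."""
    sem = threading.BoundedSemaphore(max_procs)
    results = {}

    def worker(name, argv):
        with sem:  # at most max_procs subprocesses running at a time
            proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
            out, _ = proc.communicate()
            results[name] = out

    threads = [threading.Thread(target=worker, args=item)
               for item in commands.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# e.g. run_layers_in_parallel(
#     {"functional": ["python", "test.py", "--resume-layer", "functional"]}, 3)
```

The point of the "very simple transformation" is visible here: the serial version is the same loop with max_procs effectively 1.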
Hi. Benji York wrote:
If you use zc.buildout, then you can try the branch by checking it out, adding a "develop" entry into your buildout config referencing it, and updating any version spec for zope.testing to "3.6dev". I'd really like third-party confirmation of the total test time reductions I've seen.
As this sounds very cool, I couldn't help but try it. I do get an error, though:

Ran 80 tests with 2 failures and 0 errors in 37.463 seconds.

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/opt/tmp/zope.testing/src/zope/testing/testrunner/runner.py", line 419, in spawn_layer_in_subprocess
    raise SubprocessError(
SubprocessError: No subprocess summary found: Error: option --resume-layer not recognized

This is inside a test collection in the Plone land, so there might be something on the various other layers involved here that causes this. But maybe you have an idea what this is about?

Hanno
On Fri, Jul 04, 2008 at 11:50:34AM -0400, Benji York wrote:
On Thu, Jul 3, 2008 at 5:22 PM, Benji York <benji@zope.com> wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.
The branch (svn+ssh://svn.zope.org/repos/main/zope.testing/branches/benji-parallelize-subprocesses) is feature complete. I basically did a very simple transformation that resulted in the runner spawning subprocesses in threads, several at a time, instead of spawning them serially. The patch is less than 250 lines. Any critiques of the changes would be appreciated.
I'll try to take a look.
I found that to eliminate nearly all CPU idle time, I had to use -j4 on my two core laptop.
For a particular test corpus on a 4 core machine -j1 (the default) takes about 7 minutes -j6 takes about 2 minutes 20 seconds.
I tried this in a Zope 3.4 checkout I had handy on a Core 2 Duo machine (1.8 GHz, running 64-bit Ubuntu Hardy). One test module could not be loaded, which explains the slightly lower number of tests reported:

Test-module import failures:

Module: zope.app.twisted.ftp.tests.test_zopetrial

Traceback (most recent call last):
  File "/home/mg/src/ivija-zope-3.4/Zope3.4/src/zope/app/twisted/ftp/tests/test_zopetrial.py", line 37, in ?
    orig_configure_logging = zope.testing.testrunner.configure_logging
AttributeError: 'module' object has no attribute 'configure_logging'

Here are the results:

                         time                                # tests
                         real        user        system      reported
  old test runner        3m16.033s   2m44.670s   0m2.832s    6895
  zope.testing trunk     2m27.816s   1m58.971s   0m2.196s    6890
  new test runner -j0    2m37.322s   2m5.808s    0m2.944s    6890
  new test runner -j1    2m32.249s   1m58.847s   0m2.652s    6890
  new test runner -j2    2m22.287s   3m51.214s   0m13.457s    584
  new test runner -j3    2m20.560s   3m46.990s   0m12.613s    584
  new test runner -j4    2m30.026s   3m43.198s   0m13.241s    584

At the end of the experiment I discovered that I have CPU frequency scaling enabled. It only scales down to 1.6 GHz and quickly jumps back up to 1.87 GHz.

I find the speedup by switching to a modern test runner somewhat unexpected. Can those 5 missing tests really account for 45 seconds? Zope 3 appears to be composed of a multitude of small tests.

If my numbers are correct, the advantage of using both CPU cores is almost completely negated by the extra bookkeeping that the test runner has to do.

Visual ogling of my CPU usage applet shows that -j0/1 use only one CPU, while -j2 and above use only one CPU for the first test layer (zope.app.apidoc.testing.APIDocLayer) and then use both CPUs for the rest. Bug?

The total number of tests is misreported when you have -jN with N > 1.

"Test-module import failures" is printed several times. test -j4 printed that message 37 times! test -j1 only did it once. -j2 and -j3 also did that a bit often (once per layer?)
As far as I can understand, the granularity of the test distribution to CPUs is a test layer? If so, that's rather unfortunate for my application, which has only two layers (unit and functional). Especially given the quirk that the first test layer is run on one CPU while the other idles.

Marius Gedminas

P.S. Zope 3 is such a sweet little thing! All the tests finish in 3 minutes! Heaven.

--
The planning fallacy is that people think they can plan, ha ha.
        -- Eliezer Yudkowsky, http://www.overcomingbias.com/2007/09/planning-fallac.html
On Fri, Jul 4, 2008 at 4:49 PM, Marius Gedminas <marius@gedmin.as> wrote:
I tried this in a Zope 3.4 checkout I had handy on a Core 2 Duo machine (1.8 GHz, running 64-bit Ubuntu Hardy). One test module could not be loaded, which explains the slightly lower number of tests reported:
Here are the results:
                         time                                # tests
                         real        user        system      reported
  old test runner        3m16.033s   2m44.670s   0m2.832s    6895
  zope.testing trunk     2m27.816s   1m58.971s   0m2.196s    6890
  new test runner -j0    2m37.322s   2m5.808s    0m2.944s    6890
  new test runner -j1    2m32.249s   1m58.847s   0m2.652s    6890
  new test runner -j2    2m22.287s   3m51.214s   0m13.457s    584
  new test runner -j3    2m20.560s   3m46.990s   0m12.613s    584
  new test runner -j4    2m30.026s   3m43.198s   0m13.241s    584
I'm really curious why you didn't see more improvement.
At the end of the experiment I discovered that I have CPU frequency scaling enabled. It only scales down to 1.6 GHz and quickly jumps back up to 1.87 GHz.
I find the speedup by switching to a modern test runner somewhat unexpected. Can those 5 missing tests really account for 45 seconds?
Zope 3 appears to be composed of a multitude of small tests. If my numbers are correct, the advantage of using both CPU cores is almost completely negated by the extra bookkeeping that the test runner has to do.
There's no appreciable bookkeeping for the parallelization, so I don't know where the CPU time is going.
Visual ogling of my CPU usage applet shows that -j0/1 use only one CPU, while -j2 and above use only one CPU for the first test layer (zope.app.apidoc.testing.APIDocLayer) and then use both CPUs for the rest. Bug?
Long story short: it made the changes to the code much less invasive to do that way.
The total number of tests is misreported when you have -jN with N > 1.
I haven't seen that symptom, but I'll try to reproduce it by running the 3.4 tests.
"Test-module import failures" is printed several times. test -j4 printed that message 37 times! test -j1 only did it once. -j2 and -j3 also did that a bit often (once per layer?)
Interesting. I'll investigate.
As far as I can understand, the granularity of the test distribution to CPUs is a test layer?
Right.
If so, that's rather unfortunate for my application, which has only two layers (unit and functional). Especially given the quirk that the first test layer is run on one CPU while the other idles.
If the need is great enough, you can always introduce an arbitrary number of layers. Also, once this code is working properly, I (or someone else) might look into changing the granularity to the level of individual tests.

--
Benji York
Senior Software Engineer
Zope Corporation
On Fri, Jul 04, 2008 at 05:44:12PM -0400, Benji York wrote:
On Fri, Jul 4, 2008 at 4:49 PM, Marius Gedminas <marius@gedmin.as> wrote:
I tried this in a Zope 3.4 checkout I had handy on a Core 2 Duo machine (1.8 GHz, running 64-bit Ubuntu Hardy). One test module could not be loaded, which explains the slightly lower number of tests reported:
Here are the results:
                         time                                # tests
                         real        user        system      reported
  old test runner        3m16.033s   2m44.670s   0m2.832s    6895
  zope.testing trunk     2m27.816s   1m58.971s   0m2.196s    6890
  new test runner -j0    2m37.322s   2m5.808s    0m2.944s    6890
  new test runner -j1    2m32.249s   1m58.847s   0m2.652s    6890
  new test runner -j2    2m22.287s   3m51.214s   0m13.457s    584
  new test runner -j3    2m20.560s   3m46.990s   0m12.613s    584
  new test runner -j4    2m30.026s   3m43.198s   0m13.241s    584
I'm really curious why you didn't see more improvement.
I wish one of the system-wide profilers (oprofile, sysprof) had support for extracting Python tracebacks out of C-level stack frames...
Zope 3 appears to be composed of a multitude of small tests. If my numbers are correct, the advantage of using both CPU cores is almost completely negated by the extra bookkeeping that the test runner has to do.
There's no appreciable bookkeeping for the parallelization, so I don't know where the CPU time is going.
Every layer is spawned in a separate subprocess, right? That means 36 new Python processes with the associated startup cost, plus the module import cost, plus some test result marshalling through plain-text Unix pipes. Two seconds of startup cost per subprocess would nicely account for the one extra minute of user time if there are over 30 subprocesses. My crude measurements (time ./test.py --list-tests > /dev/null) indicate the time needed to import everything is closer to 4 seconds, but that's importing everything -- importing just the things needed for a single layer may reduce that to two seconds on average.
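The estimate above works out as plain arithmetic (both inputs are the rough guesses from the paragraph, not measurements):

```python
# Back-of-the-envelope check: ~36 layer subprocesses, each paying
# roughly 2 seconds of interpreter start-up plus import cost, accounts
# for about the extra minute of user time seen at -j2 and above.
subprocesses = 36
startup_cost = 2.0                 # seconds per process (rough guess)
extra_user_time = subprocesses * startup_cost
print(extra_user_time)             # 72.0 seconds, i.e. a bit over a minute
```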
"Test-module import failures" is printed several times. test -j4 printed that message 37 times! test -j1 only did it once. -j2 and -j3 also did that a bit often (once per layer?)
Interesting. I'll investigate.
It corroborates my theory that each subprocess imports all the test modules.

Marius Gedminas

--
H.323 has much in common with other ITU-T standards - it features a complex binary wire protocol, a nightmarish implementation, and a bulk that can be used to fell medium-to-large predatory animals.
        -- Anthony Baxter
On Fri, Jul 4, 2008 at 6:43 PM, Marius Gedminas <marius@gedmin.as> wrote:
On Fri, Jul 04, 2008 at 05:44:12PM -0400, Benji York wrote:
There's no appreciable bookkeeping for the parallelization, so I don't know where the CPU time is going.
Every layer is spawned in a separate subprocess, right? That means 36 new Python processes with the associated startup cost, plus the module import cost, plus some test result marshalling through plain-text Unix pipes. Two seconds of startup cost per subprocess would nicely account for the one extra minute of user time if there are over 30 subprocesses.
The number of subprocesses is the same as for the trunk; the only change is that they can be spawned in parallel.

Wait! The Zope 3 test layers can all be torn down! Therefore there aren't *any* subprocesses spawned normally. Ok, that makes more sense.

(time passes)

OK, I did a checkout of the Zope 3 trunk and was able to duplicate your results. (And wow, the trunk seems to be in bad shape -- lots of tests failing. I guess it's fallen into disrepair since being broken out into subprojects.)
My crude measurements (time ./test.py --list-tests > /dev/null) indicate the time needed to import everything is closer to 4 seconds, but that's importing everything -- importing just the things needed for a single layer may reduce that to two seconds on average.
A possible enhancement would be to reuse subprocesses if they are asked to run layers that can be torn down. That way, for a very tear-down-friendly set of tests like Zope 3's, the minimum number of processes would be started.

We could also use fork to eliminate some of the start-up cost, but that's not real attractive, being un-Windows-friendly.

--
Benji York
Senior Software Engineer
Zope Corporation
On Fri, Jul 4, 2008 at 4:49 PM, Marius Gedminas <marius@gedmin.as> wrote:
Here are the results:
                         time                                # tests
                         real        user        system      reported
  old test runner        3m16.033s   2m44.670s   0m2.832s    6895
  zope.testing trunk     2m27.816s   1m58.971s   0m2.196s    6890
  new test runner -j0    2m37.322s   2m5.808s    0m2.944s    6890
  new test runner -j1    2m32.249s   1m58.847s   0m2.652s    6890
  new test runner -j2    2m22.287s   3m51.214s   0m13.457s    584
  new test runner -j3    2m20.560s   3m46.990s   0m12.613s    584
  new test runner -j4    2m30.026s   3m43.198s   0m13.241s    584
I figured out why the total test count went down so much for -j2: the branch inherited a bug from the trunk that skips the unit test "layer" if it is run in a subprocess. I'll take a crack at fixing that. -- Benji York Senior Software Engineer Zope Corporation
Hi Benji, I've read the whole thread to date but thought I'd reply here... Benji York wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.
Cool :-) But please default to 1 for backwards compatibility...
I'd like to 1) remove the layer tear-down mechanism entirely, and 2) make (almost) all layers run in a subprocess.
-lots to both of these, I'm afraid. I've used tear-downs extensively for everything from shutting down database connections to aborting transactions to dumping DemoStorages. I'm sure I'm not the only one...

As for all layers in a sub-process, I worry that this would break existing tests in some kind of horrible nasty way...
I want to do #1 because it would simplify the test runner code and no one seems to be using the functionality anyway. It also appears from reading the code that any tests run in a subprocess (and most are) will never exercise the tear-down mechanism anyway.
So I guess we're not currently running tests in a sub-process? My take on the pre-refactor testrunner was that a sub-process was only used when the testrunner was testing itself?
#2 will add some process start-up overhead, but it'll only be one more process than is already started (and any reasonably-sized test corpus already starts several processes for each test run). The one exception is for running with -D or with a pdb.set_trace() embedded in the code under test. For that case we need a switch to say "don't start any subprocesses at all", I suspect that will be spelled -j0.
Right, I use this a lot. I guess -j0 should be the default for backwards compatibility?

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
http://www.simplistix.co.uk
Hi, On Thu, 2008-07-03 at 17:22 -0400, Benji York wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.
Getting back to the idea about parallelizing on a per-test basis and not per-layer:

The ZODB currently runs only unit tests (which became a true layer in zope.testing/trunk) but takes about XX minutes on one of my machines (4 core XEON, 3.2 GHz).

I'd suggest that the general principle of splitting up the runs over multiple parallel processes should be: if you have X total tests and N parallel processes, each process should run roughly X/N tests. We could use layers as a hint to create subprocesses, but should split up layers if they are too large to fit the X/N rule (maybe with a margin of a few percent to avoid splits for single or few tests).

Christian

--
Christian Theune · ct@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 7 · fax +49 345 1229889 1
Zope and Plone consulting and development
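One minimal way to realize the X/N rule is plain round-robin chunking. This sketch ignores the few-percent margin mentioned above and any per-layer set-up cost; it only shows the even split.

```python
# Sketch of the X/N rule: distribute a flat list of tests into n
# batches whose sizes differ by at most one, splitting large layers
# rather than treating layer boundaries as sacred.
def split_into_batches(tests, n):
    """Return n batches of roughly equal size (round-robin)."""
    batches = [[] for _ in range(n)]
    for i, test in enumerate(tests):
        batches[i % n].append(test)
    return batches

batches = split_into_batches(list(range(10)), 3)
print([len(b) for b in batches])  # [4, 3, 3]
```

A real implementation would also have to keep each batch's tests inside a layer they can actually run in, which is where the "layers as a hint" part comes in.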
On Sat, 2008-07-05 at 09:18 +0200, Christian Theune wrote:
Hi,
On Thu, 2008-07-03 at 17:22 -0400, Benji York wrote:
I'm working on making the zope.testing test runner run tests in parallelized subprocesses. The option will likely be spelled -j N, where N is the maximum number of processes.
Getting back to the idea about parallelizing on a per-test base and not per-layer:
The ZODB currently runs only unit tests (which became a true layer in zope.testing/trunk) but takes about XX minutes on one of my machines (4 core XEON, 3.2 GHz).
The actual numbers are:

Ran 2816 tests with 0 failures and 0 errors in 14 minutes 34.292 seconds.

real    14m36.099s
user    3m44.740s
sys     0m43.170s

--
Christian Theune · ct@gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 7 · fax +49 345 1229889 1
Zope and Plone consulting and development
On Sat, Jul 5, 2008 at 3:18 AM, Christian Theune <ct@gocept.com> wrote:
We could use layers as a hint to create subprocesses, but should split up layers if they are too large to fit the X/N rule (maybe with a margin of a few percent to avoid splits for single or few tests).
It probably wouldn't be too hard to automatically break "large" layers into several small layers. It's also not hard for people with very large layers that care about parallel execution time to break them up themselves. I'm not opposed to automatic layer segmentation (as long as it's implemented well), but also don't think it's all that important. -- Benji York Senior Software Engineer Zope Corporation
participants (9)

- Adam GROSZER
- Benji York
- Chris Withers
- Christian Theune
- Dieter Maurer
- Hanno Schlichting
- Marius Gedminas
- Martin Aspeli
- Roger Ineichen