coverage.py, profile and hotshot support in Zope's testrunner
I have added support for coverage analysis using coverage.py from Gareth Rees and Ned Batchelder, as well as support for profiling using either the profile or hotshot modules from the Python standard library, to Zope 2's test runner test.py.

Justification:

- coverage.py will let you interactively focus on the modules you want to check for coverage.

- profile is slow to collect data but quick for analysis.

- hotshot is fast when collecting but very slow for analysis. On the other hand, one can feed the great KCacheGrind tool with hotshot data after some transformation, and this is the best way I have found so far for interactively exploring profile data.

Since I'm not a regular Zope committer, I'd like to ask for comments and permission first before committing.

The test.py file is here:
http://blogs.nuxeo.com/sections/blogs/fermigier/2005_08_21_coverage-py-profi...

BTW: the patch is absolutely trivial (see attached file).

S.

--
Stéfane Fermigier, Tel: +33 (0)6 63 04 12 77 (mobile).
Nuxeo Collaborative Portal Server: http://www.nuxeo.com/cps
Web content management / collaborative portal / groupware / open source!

1c1
< #!/home/fermigier/bin/python
---
> #!/usr/bin/env python2.3
124,143d123
< --coverage
< Use the coverage.py module from Gareth Rees
< (http://www.nedbatchelder.com/code/modules/coverage.html) to collect data
< for code coverage. This will output trace data in a file called
< '.coverage'. You will need to call coverage.py after the test run to
< analyse the data (-r for a line count report, -a for annotated file).
<
< --profile
< Use the profile module from the standard library to collect profiling data.
< This will output data in a file called '.profile'. You will need to use the
< pstats module from the standard library after the test run to analyse the
< data.
<
< --hotshot
< Use the hotshot module from the standard library to collect profiling data.
< This will output data in a file called '.hotshot'. You will need to use the
< hotshot.stats module from the standard library after the test run to
< analyse the data. You may also use the hotshot2cg script from the
< KCacheGrind sources to create data suitable for analysis by KCacheGrind.
<
772,774d751
< COVERAGE = False
< PROFILE = False
< HOTSHOT = False
797,798c774
<                                 "config-file=", "import-testing",
<                                 "coverage", "profile", "hotshot"])
---
>                                 "config-file=", "import-testing"])
850,855d825
<         elif k == "--coverage":
<             COVERAGE = True
<         elif k == "--profile":
<             PROFILE = True
<         elif k == "--hotshot":
<             HOTSHOT = True
936,958d905
<
<     elif COVERAGE:
<         try:
<             from coverage import the_coverage
<         except:
<             print "You need to install coverage.py from "
<             print "http://www.nedbatchelder.com/code/modules/coverage.html"
<             sys.exit()
<         the_coverage.start()
<         main(module_filter, test_filter, libdir)
<
<     elif PROFILE:
<         import profile
<         profile.runctx("main(module_filter, test_filter, libdir)",
<                        globals=globals(), locals=vars(),
<                        filename=".profile")
<
<     elif HOTSHOT:
<         import hotshot
<         profile = hotshot.Profile(".hotshot")
<         profile.runctx("main(module_filter, test_filter, libdir)",
<                        globals=globals(), locals=vars())
<
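For reference, a sketch of how the data files produced by --profile and --hotshot might be inspected after a run, assuming the default file names used by the patch above (the sort keys are only illustrative):

# Post-run analysis of '.profile' and '.hotshot' in the current directory.
import pstats
stats = pstats.Stats('.profile')            # data written by the profile module
stats.sort_stats('cumulative').print_stats(20)

import hotshot.stats
hstats = hotshot.stats.load('.hotshot')     # loading hotshot data is the slow part
hstats.sort_stats('time', 'calls').print_stats(20)

The '.coverage' data is read back with coverage.py itself, e.g. "python coverage.py -r" for a line-count report or "-a" for annotated source, as described in the option help above.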
--On 21. August 2005 10:20:28 +0200 Stefane Fermigier <sf@nuxeo.com> wrote:
Since I'm not a regular Zope committer, I'd like to ask for comments and permission first before committing.
The test.py file is here: http://blogs.nuxeo.com/sections/blogs/fermigier/2005_08_21_coverage-py-profile
Looks fine to me. At least put it on the trunk. If others find it useful as well, put it on the 2.8 branch. Andreas
On Sun, Aug 21, 2005 at 10:20:28AM +0200, Stefane Fermigier wrote:
| I have added support for coverage analysis using coverage.py from Gareth
| Rees and Ned Batchelder, as well as support for profiling using either the
| profile or hotshot modules from the Python standard library, to Zope 2's
| test runner test.py.
|
| Justification:
|
| - coverage.py will let you interactively focus on the modules you want to
| check for coverage.
|
| - profile is slow to collect data but quick for analysis.
|
| - hotshot is fast when collecting but very slow for analysis. On the other
| hand, one can feed the great KCacheGrind tool with hotshot data after some
| transformation and this is the best way I have found so far for
| interactively exploring profile data.

Eerm, the test runner in Zope 2 is the same as the one in Zope 3. The version in Zope 3 X3.0 has profile support using hotshot and coverage support using the trace.py module, but the version in Zope 2 is earlier than that.

How does that differ from what you're proposing?

--
Sidnei da Silva
Enfold Systems, LLC.
http://enfoldsystems.com
Sidnei da Silva wrote:
On Sun, Aug 21, 2005 at 10:20:28AM +0200, Stefane Fermigier wrote:
| I have added support for coverage analysis using coverage.py from Gareth
| Rees and Ned Batchelder, as well as support for profiling using either the
| profile or hotshot modules from the Python standard library, to Zope 2's
| test runner test.py.
|
| Justification:
|
| - coverage.py will let you interactively focus on the modules you want to
| check for coverage.
|
| - profile is slow to collect data but quick for analysis.
|
| - hotshot is fast when collecting but very slow for analysis. On the other
| hand, one can feed the great KCacheGrind tool with hotshot data after some
| transformation and this is the best way I have found so far for
| interactively exploring profile data.
Eerm, the test runner in Zope 2 is the same as the one in Zope 3. The version in Zope 3 X3.0 has profile support using hotshot and coverage support using the trace.py module, but the version in Zope 2 is earlier than that.
How does that differ from what you're proposing?
1. I haven't found hotshot support in either Zope 2.8.1 or the trunk.

2. As I wrote, coverage.py can be used interactively (at least, from the command line) to focus on whichever package you are working on at the moment. You can also collect data from several runs, which is useful for us because we have to test each CPS package in a different run.

3. I understand that trace.py and coverage.py have some overlap, and should probably be merged into one great and up-to-date coverage tool. But until that is done, I find it useful to have both tools at our disposal.

4. Same for hotshot and profile. Both are useful.

S.

--
Stéfane Fermigier, Tel: +33 (0)6 63 04 12 77 (mobile).
Nuxeo Collaborative Portal Server: http://www.nuxeo.com/cps
Web content management / collaborative portal / groupware / open source!
Stefane Fermigier wrote:
Sidnei da Silva wrote:
On Sun, Aug 21, 2005 at 10:20:28AM +0200, Stefane Fermigier wrote:
| I have added support for coverage analysis using coverage.py from Gareth
| Rees and Ned Batchelder, as well as support for profiling using either the
| profile or hotshot modules from the Python standard library, to Zope 2's
| test runner test.py.
|
| Justification:
|
| - coverage.py will let you interactively focus on the modules you want to
| check for coverage.
|
| - profile is slow to collect data but quick for analysis.
|
| - hotshot is fast when collecting but very slow for analysis. On the other
| hand, one can feed the great KCacheGrind tool with hotshot data after some
| transformation and this is the best way I have found so far for
| interactively exploring profile data.
Eerm, the test runner in Zope 2 is the same as the one in Zope 3. The version in Zope 3 X3.0 has profile support using hotshot and coverage support using the trace.py module, but the version in Zope 2 is earlier than that.
How does that differ from what you're proposing?
1. I haven't found hotshot support in either Zope 2.8.1 or the trunk.
2. As I wrote, coverage.py can be used interactively (at least, from the command line) to focus on whichever package you are working on at the moment. You can also collect data from several runs, which is useful for us because we have to test each CPS package in a different run.
3. I understand that trace.py and coverage.py have some overlap, and should probably be merged into one great and up-to-date coverage tool. But until that is done, I find it useful to have both tools at our disposal.
4. Same for hotshot and profile. Both are useful.
I don't think Sidnei was questioning the value of these, but just pointing out that another version of the test runner already had some of this.

I'll note that I'm working on a newer test runner that I hope to use in Zope 2.9 and 3.2. The new test runner is a nearly complete rewrite to provide:

- A more flexible test runner that can be used for a variety of projects. The current test runner has been forked for ZODB, Zope 3, and Zope 2. That's why the Zope 3 version has features that are lacking in the Zope 2 version.

- Support for "layers" of tests, so that it can handle unit tests and functional tests.

- A slightly better UI.

- Tests (of the test runner itself :)

See:

http://svn.zope.org/zope.testing/trunk/src/zope/testing/testrunner.txt?view=...
http://svn.zope.org/zope.testing/trunk/src/zope/testing/testrunner.py?view=l...

I'd like to include coverage and profiling support. (There is partial support now, but untested.) Wanna help?

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
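To make the "layers" idea concrete, here is a rough sketch of what a layer and a test using it might look like; the names are invented for illustration, and testrunner.txt above documents the actual conventions:

import unittest

class DatabaseLayer:
    # Invented example: an expensive fixture shared by every test that
    # declares this layer. The zope.testing runner (not plain unittest)
    # is what calls setUp/tearDown, once per layer rather than per test.
    connection = None

    @classmethod
    def setUp(cls):
        cls.connection = {'status': 'open'}   # stand-in for a real database

    @classmethod
    def tearDown(cls):
        cls.connection = None

class AccountTest(unittest.TestCase):
    layer = DatabaseLayer   # tests are grouped by layer at run time

    def test_connection_is_shared(self):
        self.assertEqual(DatabaseLayer.connection['status'], 'open')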
Jim Fulton wrote:
I'll note that I'm working on a newer test runner that I hope to use in Zope 2.9 and 3.2. The new test runner is a nearly complete rewrite to provide:
- A more flexible test runner that can be used for a variety of projects. The current test runner has been forked for ZODB, Zope 3, and Zope 2. That's why the Zope 3 version has features that are lacking in the Zope 2 version.
- Support for "layers" of tests, so that it can handle unit tests and functional tests.
- A slightly better UI.
- Tests (of the test runner itself :)
See:
http://svn.zope.org/zope.testing/trunk/src/zope/testing/testrunner.txt?view=...
http://svn.zope.org/zope.testing/trunk/src/zope/testing/testrunner.py?view=l...
Hi Jim.

I've been looking over this - fixing tests seems to take up a significant amount of our time, so I might have some interesting use cases.

A large proportion of our tests use a relational database. Some of them want an empty database, some of them want just the schema created but no data, some of them want the schema created and the data. Some of them need the component architecture, and some of them don't. Some of them need one or more twisted servers running, some of them don't.

Note that we mix and match. We have 4 different types of database fixture (none, empty, schema, populated), 2 different types of database connection mechanisms (psycopgda, psycopg), 2 types of CA fixture (none, loaded), and (currently) 4 states of external daemons needed. If we were to arrange this in layers, it would take 56 different layers, and this will double every time we add a new daemon, or add more database templates (e.g. fat for lots of sample data to go with the existing thin).

As a way of supporting this better, instead of specifying a layer a test could specify the list of resources it needs:

import testresources as r

class FooTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian, r.Component]
    [...]

class BarTest(unittest.TestCase):
    resources = [r.EmptyDb]

class BazTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian]

The resources are pretty much identical to the current layers, in that (after the test runner does some sorting fu), the run order can be optimized to avoid setting up and tearing down resources unnecessarily.

This would be a big win for us - currently, we specify 'resources' by simply calling various setup and teardown methods in the test case:

class FooTest(unittest.TestCase):
    def setUp(self):
        LaunchpadTestSetup().setUp()
        LibrarianTestSetup().setUp()
        FunctionalTestSetup().setUp()

    def tearDown(self):
        FunctionalTestSetup().tearDown()
        LibrarianTestSetup().tearDown()
        LaunchpadTestSetup().tearDown()

Some other nice things could be done with the resources:

- If the setUp raises NotImplementedError (or whatever), tests using this resource are skipped (and reported as skipped). This nicely handles tests that should only be run in particular environments (Win32, Internet connection, python.net installed etc.)

- If the setUp raises another exception, all tests using this resource fail. The common case we see is 'database in use', where PostgreSQL does not let us destroy or use as a template a database that has open connections to it. Also useful for general sanity checking of the environment - no point running the tests if we know they are going to fail or have skewed results.

- A resource should have pretest and posttest hooks. pretest is used for lightweight resource-specific initialization (e.g. setUp creates a fresh database from a dump and pretest initializes the connection pool). posttest can be used to ensure tests cleaned up properly or for other housekeeping (e.g. issuing a rollback). This could also apply to layers in the current environment. This eliminates tedious boilerplate from test cases.

- A resource could provide useful data to the test runner. For example, if a resource says it doesn't use or lock any shared system resources, the test runner could decide to run tests in parallel. Although a less blue-sky use would be specifying a dependency on another resource.

On another note, enforcing isolation of tests has been a continuous problem for us.
For example, a developer registering a utility or otherwise mucking around with the global environment and forgetting to reset things in tearDown. This goes unnoticed for a while, and other tests get written that actually depend on this corruption. But at some point, the order the tests are run changes for some reason and suddenly test 500 starts failing. It turns out the global state has been screwed, and you have the fun task of tracking down which of the preceding 499 tests screwed it. I think this is a use case for some sort of global posttest hook.

Perhaps this would be best done by allowing people to write wrappers around the one-true-testrunner? This seems to be the simplest way of allowing customization of the test runner:

def pretest(...):
    [...]

def posttest(...):
    [...]

if __name__ == '__main__':
    zope.test.testrunner.main(pretest=pretest, posttest=posttest)

Other policy could also be configured - e.g. 'Run these tests or tests using this resource first. If any failures, don't bother running any more'. Or 'Stop running tests after 1 failure'.

These sorts of policies are important for us as we run our tests in an automated environment (we can't commit to our trunk. Instead, we send a request to a daemon which runs the test suite and commits on our behalf if the tests all pass).

Our full test suite currently takes 45 minutes to run and it is becoming an issue. We need to speed them up, determine slow tests in need of pruning or optimization, short circuit test runs and reduce test suite maintenance. So I should be able to get time to help (although I need to look closer at the SchoolTool and py.test runners to see if they are closer to what we need).

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/
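A minimal sketch of the kind of post-test isolation check described above, assuming a hypothetical global registry that well-behaved tests must leave untouched (the pretest/posttest names follow the proposal and are not an existing API):

GLOBAL_UTILITIES = {}   # hypothetical global state shared by tests

_baseline = None

def pretest(test):
    # Snapshot the global state before each test runs.
    global _baseline
    _baseline = dict(GLOBAL_UTILITIES)

def posttest(test):
    # The first test for which this fails is the one that leaked state.
    if GLOBAL_UTILITIES != _baseline:
        raise AssertionError('%s left the global registry dirty: %r'
                             % (test, GLOBAL_UTILITIES))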
Stuart Bishop wrote:
Jim Fulton wrote:
I'll note that I'm working on a newer test runner that I hope to use in Zope 2.9 and 3.2. The new test runner is a nearly complete rewrite to provide:
- A more flexible test runner that can be used for a variety of projects. The current test runner has been forked for ZODB, Zope 3, and Zope 2. That's why the Zope 3 version has features that are lacking in the Zope 2 version.
- Support for "layers" of tests, so that it can handle unit tests and functional tests.
- A slightly better UI.
- Tests (of the test runner itself :)
See:
http://svn.zope.org/zope.testing/trunk/src/zope/testing/testrunner.txt?view=...
http://svn.zope.org/zope.testing/trunk/src/zope/testing/testrunner.py?view=l...
Hi Jim.
I've been looking over this - fixing tests seems to take up a significant amount of our time, so I might have some interesting use cases.
A large proportion of our tests use a relational database. Some of them want an empty database, some of them want just the schema created but no data, some of them want the schema created and the data. Some of them need the component architecture, and some of them don't. Some of them need one or more twisted servers running, some of them don't.
Note that we mix and match. We have 4 different types of database fixture (none, empty, schema, populated), 2 different types of database connection mechanisms (psycopgda, psycopg), 2 types of CA fixture (none, loaded), and (currently) 4 states of external daemons needed. If we were to arrange this in layers, it would take 56 different layers, and this will double every time we add a new daemon, or add more database templates (e.g. fat for lots of sample data to go with the existing thin).
As a way of supporting this better, instead of specifying a layer a test could specify the list of resources it needs:
import testresources as r
class FooTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian, r.Component]
    [...]

class BarTest(unittest.TestCase):
    resources = [r.EmptyDb]

class BazTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian]
This is pretty much how layers work. Layers can be arranged in a DAG (much like a traditional multiple-inheritance class graph). So, you can model each resource as a layer and specific combinations of resources as layers. The test runner will attempt to run the layers in an order that minimizes set-up and tear-down of layers. ...
Some other nice things could be done with the resources:
- If the setUp raises NotImplementedError (or whatever), tests using this resource are skipped (and reported as skipped). This nicely handles tests that should only be run in particular environments (Win32, Internet connection, python.net installed etc.)
That's a good idea.
- If the setUp raises another exception, all tests using this resource fail. The common case we see is 'database in use', where PostgreSQL does not let us destroy or use as a template a database that has open connections to it. Also useful for general sanity checking of the environment - no point running the tests if we know they are going to fail or have skewed results.
Good.
- A resource should have pretest and posttest hooks. pretest is used for lightweight resource-specific initialization (e.g. setUp creates a fresh database from a dump and pretest initializes the connection pool). posttest can be used to ensure tests cleaned up properly or for other housekeeping (e.g. issuing a rollback). This could also apply to layers in the current environment. This eliminates tedious boilerplate from test cases.
Ah, so the layer specifies per-test setUp and tearDown that are used in addition to the test's own setUp and tearDown. This sounds reasonable.
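A sketch of what such a layer might look like, combining one-time set-up with per-test hooks; the hook names testSetUp/testTearDown follow the suggestion later in the thread and are only an assumption here:

class DatabaseLayer:
    # One-time, expensive work: e.g. restore a database from a dump.
    @classmethod
    def setUp(cls):
        cls.database = {'schema': 'loaded'}

    @classmethod
    def tearDown(cls):
        cls.database = None

    # Cheap per-test work, run in addition to each test's own setUp/tearDown.
    @classmethod
    def testSetUp(cls):
        cls.connection = dict(cls.database)   # e.g. open a fresh connection

    @classmethod
    def testTearDown(cls):
        cls.connection = None                 # e.g. roll back and close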
- A resource could provide useful data to the test runner. For example, if a resource says it doesn't use or lock any shared system resources, the test runner could decide to run tests in parallel.
Although a less blue-sky use would be specifying a dependency on another resource.
This is handled by layers now. Layers have __bases__ -- layers are built on other layers. That's why they are called layers. :)
On another note, enforcing isolation of tests has been a continuous problem for us. For example, a developer registering a utility or otherwise mucking around with the global environment and forgetting to reset things in tearDown. This goes unnoticed for a while, and other tests get written that actually depend on this corruption. But at some point, the order the tests are run changes for some reason and suddenly test 500 starts failing. It turns out the global state has been screwed, and you have the fun task of tracking down which of the preceding 499 tests screwed it. I think this is a use case for some sort of global posttest hook.
How so?
Perhaps this would be best done by allowing people to write wrappers around the one-true-testrunner?
Or we could simply provide such a hook, if it's needed. I think this sort of thing is better handled with layers.
This seems to be the simplest way of allowing customization of the test runner:
def pretest(...):
    [...]

def posttest(...):
    [...]

if __name__ == '__main__':
    zope.test.testrunner.main(pretest=pretest, posttest=posttest)
Other policy could also be configured - e.g. 'Run these tests or tests using this resource first. If any failures, don't bother running any more'. Or 'Stop running tests after 1 failure'.
Without a lot more thought and detail, it's not at all clear that the hooks you've specified would provide the simplest way to do this.
These sorts of policies are important for us as we run our tests in an automated environment (we can't commit to our trunk. Instead, we send a request to a daemon which runs the test suite and commits on our behalf if the tests all pass).
Hm, seems rather restrictive... I guess making the new test runner class-based would more easily allow this sort of customization.
Our full test suite currently takes 45 minutes to run and it is becoming an issue.
Hm, I can see why you'd like to parallelize things. Of course, this only helps you if you have enough hardware to benefit from the parallelization. The test runner is already prepared to run layers in separate processes if they can't be torn down. I don't think it would take much to have an option to run the layers in separate processes, or to arrange the layers into sets run as separate processes. Of course, because the new test runner makes it easy to select test subsets in various ways, you could probably arrange the parallelization with a controller script.
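A rough sketch of the kind of controller script this suggests: run independent test subsets in separate processes and fail if any subset fails. The subset names and the '--module' selection option are placeholders, not the runner's actual interface:

import subprocess
import sys

SUBSETS = ['zope.app', 'zope.component', 'zope.publisher']   # hypothetical split

def main():
    # Launch one runner per subset and wait for them all.
    procs = [subprocess.Popen([sys.executable, 'test.py', '--module', subset])
             for subset in SUBSETS]
    failed = [p for p in procs if p.wait() != 0]
    sys.exit(1 if failed else 0)

if __name__ == '__main__':
    main()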
We need to speed them up, determine slow tests in need of pruning or optimization, short circuit test runs and reduce test suite maintenance. So I should be able to get time to help (although I need to look closer at the SchoolTool and py.test runners to see if they are closer to what we need).
k

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
Jim Fulton wrote:
A large proportion of our tests use a relational database. Some of them want an empty database, some of them want just the schema created but no data, some of them want the schema created and the data. Some of them need the component architecture, and some of them don't. Some of them need one or more twisted servers running, some of them don't.
Note that we mix and match. We have 4 different types of database fixture (none, empty, schema, populated), 2 different types of database connection mechanisms (psycopgda, psycopg), 2 types of CA fixture (none, loaded), and (currently) 4 states of external daemons needed. If we were to arrange this in layers, it would take 56 different layers, and this will double every time we add a new daemon, or add more database templates (e.g. fat for lots of sample data to go with the existing thin).
As a way of supporting this better, instead of specifying a layer a test could specify the list of resources it needs:
import testresources as r
class FooTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian, r.Component]
    [...]

class BarTest(unittest.TestCase):
    resources = [r.EmptyDb]

class BazTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian]
This is pretty much how layers work. Layers can be arranged in a DAG (much like a traditional multiple-inheritance class graph). So, you can model each resource as a layer and specific combinations of resources as layers. The test runner will attempt to run the layers in an order that minimizes set-up and tear-down of layers.
So my example could be modeled using layers like:

import layers as l

class FooLayer(l.LaunchpadDb, l.Librarian, l.Component):
    pass

class FooTest(unittest.TestCase):
    layer = 'FooLayer'
    [...]

class BarLayer(l.LaunchpadDb, l.Librarian, l.Component):
    pass

class BarTest(unittest.TestCase):
    layer = 'BarLayer'
    [...]

class BazLayer(l.LaunchpadDb, l.Librarian):
    pass

class BazTest(unittest.TestCase):
    layer = 'BazLayer'
    [...]

In general I would need to define a layer for each test case (because the number of combinations makes it impractical to explode all the possible combinations into a tree of layers, if for no other reason than naming them).

If I tell the test runner to run all the tests, will the LaunchpadDb, Librarian and Component layers each be initialized just once? If I tell the test runner to run the Librarian layer tests, will all three tests be run?

What happens if I go and define a new test:

class LibTest(unittest.TestCase):
    layer = 'l.Librarian'
    [...]

If I run all the tests, will the Librarian setup/teardown be run once (by running the tests in the order LibTest, BazTest, FooTest, BarTest and initializing the Librarian layer before the LaunchpadDb layer)? I expect not, as 'layer' indicates a hierarchy which isn't as useful to me as a set of resources.

If layers don't work this way, it might be possible to emulate resources somehow:

class ResourceTest(unittest.TestCase):
    @property
    def layer(self):
        return type(optimize_order(self.resources))

However, optimize_order would need to know about all the other tests, so it would really be the responsibility of the test runner (so it would need to be customized/overridden), and the test runner would need to support the layer attribute possibly being a class rather than a string.
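A toy sketch of what an optimize_order-style grouping could do: order test cases so that identical resource sets run back to back, minimizing set-up/tear-down churn (purely illustrative; the real runner's ordering is its own business):

def group_by_resources(test_cases):
    # Sort key: the declared resource names, so equal sets become adjacent.
    def key(case):
        return tuple(sorted(r.__name__ for r in getattr(case, 'resources', ())))
    return sorted(test_cases, key=key)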
Ah, so the layer specifies per-test setUp and tearDown that are used in addition to the test's own setUp and tearDown. This sounds reasonable.
But what to call them? setUpPerTest? The pretest and posttest names I used are a bit sucky.
On another note, enforcing isolation of tests has been a continuous problem for us. For example, a developer registering a utility or otherwise mucking around with the global environment and forgetting to reset things in tearDown. This goes unnoticed for a while, and other tests get written that actually depend on this corruption. But at some point, the order the tests are run changes for some reason and suddenly test 500 starts failing. It turns out the global state has been screwed, and you have the fun task of tracking down which of the preceding 499 tests screwed it. I think this is a use case for some sort of global posttest hook.
How so?
In order to diagnose the problem I describe (which has happened far too often!), you would add a posttest check that is run after each test. The first test that fails due to this check is the culprit.

I see now though that this could be easily modeled by having a 'global' or 'base' layer in your test suite, and mandating its use by all tests in your application. Or the check could go in a more specific layer if appropriate.
These sorts of policies are important
for us as we run our tests in an automated environment (we can't commit to our trunk. Instead, we send a request to a daemon which runs the test suite and commits on our behalf if the tests all pass).
Hm, seems rather restrictive...
We like it ;) Our RCS (Bazaar) allows us to trivially merge branches into other branches, so we can avoid fallout from any delays in landing stuff to the trunk. And it is guaranteed that any changes landing in the trunk run, and more importantly, run with the current versions of all the dependent libraries and tools. So for example, if we add some sanity checks to SQLObject stopping certain dangerous operations, nobody can accidentally commit code that breaks under the new version. It means that every day the trunk is rolled out to a staging server automatically and actually runs, and production rollouts can be done with confidence by simply picking an arbitrary revision on the trunk, tagging it and pushing it out.
I guess making the new test runner class-based would more easily allow this sort of customization.
Our full test suite currently takes 45 minutes to run and it is becoming an issue.
Hm, I can see why you'd like to parallelize things. Of course, this only helps you if you have enough hardware to benefit from the parallelization.
I think the parallelization might take some work (perhaps not in the test runner, but in making our existing test suite work with it ;) ). I think there is lower-hanging fruit to reach for first ;)

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/
Stuart Bishop wrote:
Jim Fulton wrote:
A large proportion of our tests use a relational database. Some of them want an empty database, some of them want just the schema created but no data, some of them want the schema created and the data. Some of them need the component architecture, and some of them don't. Some of them need one or more twisted servers running, some of them don't.
Note that we mix and match. We have 4 different types of database fixture (none, empty, schema, populated), 2 different types of database connection mechanisms (psycopgda, psycopg), 2 types of CA fixture (none, loaded), and (currently) 4 states of external daemons needed. If we were to arrange this in layers, it would take 56 different layers, and this will double every time we add a new daemon, or add more database templates (e.g. fat for lots of sample data to go with the existing thin).
As a way of supporting this better, instead of specifying a layer a test could specify the list of resources it needs:
import testresources as r
class FooTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian, r.Component]
    [...]

class BarTest(unittest.TestCase):
    resources = [r.EmptyDb]

class BazTest(unittest.TestCase):
    resources = [r.LaunchpadDb, r.Librarian]
This is pretty much how layers work. Layers can be arranged in a DAG (much like a traditional multiple-inheritance class graph). So, you can model each resource as a layer and specific combinations of resources as layers. The test runner will attempt to run the layers in an order that minimizes set-up and tear-down of layers.
So my example could be modeled using layers like:
import layers as l
class FooLayer(l.LaunchpadDb, l.Librarian, l.Component):
    pass

class FooTest(unittest.TestCase):
    layer = 'FooLayer'
    [...]

class BarLayer(l.LaunchpadDb, l.Librarian, l.Component):
    pass

class BarTest(unittest.TestCase):
    layer = 'BarLayer'
    [...]

class BazLayer(l.LaunchpadDb, l.Librarian):
    pass

class BazTest(unittest.TestCase):
    layer = 'BazLayer'
    [...]
In general I would need to define a layer for each test case (because the number of combinations makes it impractical to explode all the possible combinations into a tree of layers, if for no other reason than naming them).
That's too bad. Perhaps layers don't fit your need then.
If I tell the test runner to run all the tests, will the LaunchpadDb, Librarian and Component layers each be initialized just once?
If all of the tests means these 3, then yes.
If I tell the test runner to run the Librarian layer tests, will all three tests be run?
No, no tests will be run. None of the tests are in the librarian layer. They are in layers built on the librarian layer.
What happens if I go and define a new test:
class LibTest(unittest.TestCase):
    layer = 'l.Librarian'
    [...]
If I run all the tests, will the Librarian setup/teardown be run once (by running the tests in the order LibTest, BazTest, FooTest, BarTest and initializing the Librarian layer before the LaunchpadDb layer)?
Yes
I expect not, as 'layer' indicates a hierarchy which isn't as useful to me as a set of resources.
I don't follow this.
If layers don't work this way, it might be possible to emulate resources somehow:
If each test *really* has a unique set of resources, then perhaps layers don't fit.
class ResourceTest(unittest.TestCase):
    @property
    def layer(self):
        return type(optimize_order(self.resources))
However, optimize_order would need to know about all the other tests, so it would really be the responsibility of the test runner (so it would need to be customized/overridden), and the test runner would need to support the layer attribute possibly being a class rather than a string.
Layers can be classes. In fact, I typically use classes with class methods for setUp and tearDown.
Ah, so the layer specifies per-test setUp and tearDown that are used in addition to the test's own setUp and tearDown. This sounds reasonable.
But what to call them? setUpPerTest? The pretest and posttest names I used are a bit sucky.
<shrug> testSetUp?
On another note, enforcing isolation of tests has been a continuous problem for us. For example, a developer registering a utility or otherwise mucking around with the global environment and forgetting to reset things in tearDown. This goes unnoticed for a while, and other tests get written that actually depend on this corruption. But at some point, the order the tests are run changes for some reason and suddenly test 500 starts failing. It turns out the global state has been screwed, and you have the fun task of tracking down which of the preceding 499 tests screwed it. I think this is a use case for some sort of global posttest hook.
How so?
In order to diagnose the problem I describe (which has happened far too often!), you would add a posttest check that is run after each test. The first test that fails due to this check is the culprit.
So your post-test thing would check to make sure there weren't any left-over bits from tests. This makes sense.
I see now though that this could be easily modeled by having a 'global' or 'base' layer in your test suite, and mandating its use by all tests in your application. Or the check could go in a more specific layer if appropriate.
I think it makes more sense to be able to provide a hook. I would be inclined to make this a command-line option (two, actually).

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
participants (5)
- Andreas Jung
- Jim Fulton
- Sidnei da Silva
- Stefane Fermigier
- Stuart Bishop