Strategies for testing generated sql?
This is a very general question: I'm looking for ideas on how to effectively test dynamically generated sql queries. Both unit and functional test ideas welcome. We have a bunch of views (in the zope 3 / Five sense, but that's not really relevant) which typically construct a Shared.DC.ZRDB.DA.nvSQL instance, passing it a dictionary to use as the "namespace" mapping (I'm simplifying, there's more indirection than that). We have a bunch of tests for this stuff, but typically they just assert that the generated sql query exactly matches some expected string. This strikes me as only a minimal useful regression test when it passes, but it suffers from a number of drawbacks: * When a test fails, it can be very difficult to spot what changed. I'm getting really tired of staring at stuff like (contrived example, but some of my test failures actually look pretty close to this): AssertionError: "select aaaa, aabb, aacc, aadd, aaee, aaff, aagg, bbaa, bbbb, bbcc, bbdd, bbee, bbff, bbgg, ccaa, ccbb, cccc, ccdd, ccee, ccff, ccgg, ddaa, ddbb, ddcc, dddd, ddee, ddff, ddgg, eeaa, eebb, eecc, eedd, eeee, eeff, eegg, ffaa, ffbb, ffcc, ffdd, ffee, ffff, ffgg, ggaa, ggbb, ggcc, ggdd, ggee, ggff, gggg, case when bbcc != '' then bbcc when aaddd != '' then aaddd end as bbcc from blarftable b where (b.aacc = fred and b.aaee != joe) order by aabb" != "select aaaa, aabb, aacc, aadd, aaee, aaff, aagg, bbaa, bbbb, bbcc, bbdd, bbee, bbff, bbgg, ccaa, ccdd, cccc, ccbb, ccee, ccff, ccgg, ddaa, ddbb, ddcc, dddd, ddee, ddff, ddgg, eeaa, eebb, eecc, eedd, eeee, eeff, eegg, ffaa, ffbb, ffcc, ffdd, ffee, ffff, ffgg, ggaa, ggbb, ggcc, ggdd, ggee, ggff, gggg, case when bbcc != '' then bbcc when aaddd != '' then aaddd end as bbcc from blarftable b where (b.aacc = fred and b.aaee != joe) order by aabb" Gee thanks! * I think it would be a lot more useful to be able to reliably parse out and make assertions about some of the sub-clauses. So I could write tests that make assertions like: - this query is selecting columns a, b, c, d in that order. - the "where" clause should compare column X against value Y etc. I could of course do some ad-hoc stuff with string methods and maybe regexes, but I have to wonder if there's some sql-specific library that I could leverage. Googling suggests that there's a parser in gadfly (of course), and an example sql parser in pyparsing. Worth the effort or should I stick with simple string stuff? Functionally, we are missing some important testables: * we have no way to verify that these queries are syntactically correct sql. * we have no way to verify that these queries behave as expected against a sample data set. How do people test this sort of thing? Do you go whole-hog and fire up MySQL or whatever? Or use gadfly? sqlite? or what? Are there other tools I should know about? -- Paul Winkler http://www.slinkp.com
The problem is what you want to test. I'm not familiar with Zope 3, therefore I'm talk on the base of ZSQLMethod.SQL.SQL from Zope 2. If I understand correctly, you've some high level construct which creates an nvSQL instance. Now, if you put yourself at the end of the line, what you need to do is to parse the output of nvSQL, in order to see if the generated SQL is correct. What you're doing therefore is to test whether nvSQL is able to generate correctly the SQL. But I think this is not what you want to test. You could safely assume that, given the proper input, nvSQL will generate the proper output. Therefore, what you should test is that the generated nvSQL has the appropriate characteristics. I will try to post a sample for Zope2, just to give an idea of what could be done. Regards Marco On 4/12/06, Paul Winkler <pw_lists@slinkp.com> wrote:
This is a very general question: I'm looking for ideas on how to effectively test dynamically generated sql queries. Both unit and functional test ideas welcome.
We have a bunch of views (in the zope 3 / Five sense, but that's not really relevant) which typically construct a Shared.DC.ZRDB.DA.nvSQL instance, passing it a dictionary to use as the "namespace" mapping (I'm simplifying, there's more indirection than that).
We have a bunch of tests for this stuff, but typically they just assert that the generated sql query exactly matches some expected string. This strikes me as only a minimal useful regression test when it passes, but it suffers from a number of drawbacks:
* When a test fails, it can be very difficult to spot what changed. I'm getting really tired of staring at stuff like (contrived example, but some of my test failures actually look pretty close to this):
AssertionError: "select aaaa, aabb, aacc, aadd, aaee, aaff, aagg, bbaa, bbbb, bbcc, bbdd, bbee, bbff, bbgg, ccaa, ccbb, cccc, ccdd, ccee, ccff, ccgg, ddaa, ddbb, ddcc, dddd, ddee, ddff, ddgg, eeaa, eebb, eecc, eedd, eeee, eeff, eegg, ffaa, ffbb, ffcc, ffdd, ffee, ffff, ffgg, ggaa, ggbb, ggcc, ggdd, ggee, ggff, gggg, case when bbcc != '' then bbcc when aaddd != '' then aaddd end as bbcc from blarftable b where (b.aacc = fred and b.aaee != joe) order by aabb" != "select aaaa, aabb, aacc, aadd, aaee, aaff, aagg, bbaa, bbbb, bbcc, bbdd, bbee, bbff, bbgg, ccaa, ccdd, cccc, ccbb, ccee, ccff, ccgg, ddaa, ddbb, ddcc, dddd, ddee, ddff, ddgg, eeaa, eebb, eecc, eedd, eeee, eeff, eegg, ffaa, ffbb, ffcc, ffdd, ffee, ffff, ffgg, ggaa, ggbb, ggcc, ggdd, ggee, ggff, gggg, case when bbcc != '' then bbcc when aaddd != '' then aaddd end as bbcc from blarftable b where (b.aacc = fred and b.aaee != joe) order by aabb"
Gee thanks!
* I think it would be a lot more useful to be able to reliably parse out and make assertions about some of the sub-clauses. So I could write tests that make assertions like:
- this query is selecting columns a, b, c, d in that order.
- the "where" clause should compare column X against value Y
etc.
I could of course do some ad-hoc stuff with string methods and maybe regexes, but I have to wonder if there's some sql-specific library that I could leverage. Googling suggests that there's a parser in gadfly (of course), and an example sql parser in pyparsing. Worth the effort or should I stick with simple string stuff?
Functionally, we are missing some important testables:
* we have no way to verify that these queries are syntactically correct sql.
* we have no way to verify that these queries behave as expected against a sample data set.
How do people test this sort of thing? Do you go whole-hog and fire up MySQL or whatever? Or use gadfly? sqlite? or what?
Are there other tools I should know about?
--
Paul Winkler http://www.slinkp.com _______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
-- Icube Srl http://www.icube.it/
On Wed, Apr 12, 2006 at 06:51:18PM +0200, Marco Bizzarri wrote:
The problem is what you want to test. I'm not familiar with Zope 3, therefore I'm talk on the base of ZSQLMethod.SQL.SQL from Zope 2.
If I understand correctly, you've some high level construct which creates an nvSQL instance.
Correct.
Now, if you put yourself at the end of the line, what you need to do is to parse the output of nvSQL, in order to see if the generated SQL is correct.
What you're doing therefore is to test whether nvSQL is able to generate correctly the SQL. But I think this is not what you want to test. You could safely assume that, given the proper input, nvSQL will generate the proper output.
Therefore, what you should test is that the generated nvSQL has the appropriate characteristics.
Exactly! I am also open to suggestions on what kind of characteristics I should be interested in. I have never written tests of generated SQL before.
I will try to post a sample for Zope2, just to give an idea of what could be done.
That would be very helpful, thank you. -- Paul Winkler http://www.slinkp.com
Ok, here it is. It is a demo product, which has one class, GenerateInsert, which will generate an insert based on your parameters. I've also written a small test, which tries to show what I would do: test that given the proper parameters, it will generate the proper string. The case I'm showing is very simple, but it could be extended to other cases as well... Also, if you want to take a look at some tests in database environment, you could take a look at paflow (www.paflow.it) where there is quite a lot of tests... Regards Marco On 4/12/06, Paul Winkler <pw_lists@slinkp.com> wrote:
On Wed, Apr 12, 2006 at 06:51:18PM +0200, Marco Bizzarri wrote:
The problem is what you want to test. I'm not familiar with Zope 3, therefore I'm talk on the base of ZSQLMethod.SQL.SQL from Zope 2.
If I understand correctly, you've some high level construct which creates an nvSQL instance.
Correct.
Now, if you put yourself at the end of the line, what you need to do is to parse the output of nvSQL, in order to see if the generated SQL is correct.
What you're doing therefore is to test whether nvSQL is able to generate correctly the SQL. But I think this is not what you want to test. You could safely assume that, given the proper input, nvSQL will generate the proper output.
Therefore, what you should test is that the generated nvSQL has the appropriate characteristics.
Exactly!
I am also open to suggestions on what kind of characteristics I should be interested in. I have never written tests of generated SQL before.
I will try to post a sample for Zope2, just to give an idea of what could be done.
That would be very helpful, thank you.
--
Paul Winkler http://www.slinkp.com _______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
-- Icube Srl http://www.icube.it/
--On 12. April 2006 12:32:29 -0400 Paul Winkler <pw_lists@slinkp.com> wrote:
How do people test this sort of thing? Do you go whole-hog and fire up MySQL or whatever? Or use gadfly? sqlite? or what?
I've had a project some yrs ago where we had to parse some kind of query language and translate it into Oracle SQL including Oracle Intermedia statements. At that time I've implemented something unittest-like to check the correctness of the generated SQL both for small queries and combinations of queries. However I did not check the queries in real. The correctness of the parser and the SQL generator was sufficient at that time. Btw. Oracle raised more problems with generated SQL than we would have had time to care about the SQL correctness :-) -aj -- ZOPYX Ltd. & Co. KG - Charlottenstr. 37/1 - 72070 Tübingen - Germany Web: www.zopyx.com - Email: info@zopyx.com - Phone +49 - 7071 - 793376 E-Publishing, Python, Zope & Plone development and consulting
Paul Winkler wrote at 2006-4-12 12:32 -0400:
... Functionally, we are missing some important testables: ... * we have no way to verify that these queries behave as expected against a sample data set.
If you want to test this, then do it. It will also cover all other test requirements (e.g. that the SQL is well formed). Of course, you will need an RDBMS to perform the tests against.
How do people test this sort of thing? Do you go whole-hog and fire up MySQL or whatever? Or use gadfly? sqlite? or what?
We do something like this for "Postgres". The test setup assumes that a Postgres installation is running and available for the test. The test is executed against a special test database, copied on test start from a template. After the test, the test database is deleted again. Drawbacks: * there can not be more than a single test running in parallel. Parallel tests would cause Postgres to complain about already existing test databases * deleting the test database was a nightmare. Ever and ever again, Postgres complained about a deletion trial while there were still some uses of the database: sometimes, some transactions have not been committed/aborted; sometimes Postgres connections were still open; sometimes, Postgres did not behave synchronously (the connection was closed, but some part of Postgres was not yet informed about the fact). -- Dieter
On 4/12/06, Paul Winkler <pw_lists@slinkp.com> wrote:
Functionally, we are missing some important testables:
* we have no way to verify that these queries are syntactically correct sql.
* we have no way to verify that these queries behave as expected against a sample data set.
How do people test this sort of thing? Do you go whole-hog and fire up MySQL or whatever? Or use gadfly? sqlite? or what?
Are there other tools I should know about?
Sorry, It is taking a little more than I expected to prepare the sample. However, I can provide some idea on this topic. We do this using ZopeTestCase. Basically, we have a test database (postgresql, but it can be used for whatever transactional database). Each user has its own database, based on its username. The test does actual queries on the test database. At the end of each test (single test, not test suite), there is a rollback (if you're using a Z Database Adapter, this is done automatically by the transaction machinery). So, each single test runs with a "fresh" copy of the database. Regards Marco -- Icube Srl http://www.icube.it/
participants (4)
-
Andreas Jung -
Dieter Maurer -
Marco Bizzarri -
Paul Winkler