[Zope] OT: Upgrading postgresql
Jim Penny
jpenny@universal-fasteners.com
Thu, 8 Feb 2001 11:42:47 -0500
On Wed, Feb 07, 2001 at 10:13:03PM -0500, Jens Vagelpohl wrote:
> here's an interesting tidbit i found when comparing ZPyGreSQLDA and ZPoPyDA
> on a test system (RH 6.2, PostGreSQL 7.0.2):
>
> i wrote a standalone script (run independent of Zope from the command line)
> that uses python's threading facilities to create five threads. all threads
> immediately call a DTML method in Zope which iterates over the result of a
> large ZSQL query and renders it.
>
> my testing had quite unexpected results. in a nutshell, in my setup the
> alleged thread-safety of ZPoPyDA did not give me any advantages. i found
> ZPyGreSQLDA to be more predictable and faster.
>
> ZPyGreSQLDA serializes all queries. the five threads and their requests were
> executed one after another, every single request took about 15 seconds. the
> first thread finished at 15 seconds, the second at 30 seconds, and so forth.
>
> ZPoPyDA on the other hand had me wait a full 50 seconds before i got the
> first result set back. granted, after i had the first set the others came
> back in about 4-5 second intervals, but the fact that i had to wait that
> long for the first thread to return swayed my decision in favor of
> ZPyGreSQLDA.
>
> i guess what i am trying to say is that "thread safety" does not
> automatically mean "better" or "faster". you have to take your usage
> patterns into account, especially when your site does much more reads from
> the database.
>
> jens
Yes, but this is hardly surprising. You set off 5 large queries
simultaneously. The only conceivable reason that the threaded version
could take less time than the non-threaded version is if the threads were
blocking in such a way that total disk head motion was reduced. In fact,
your own numbers show almost no difference in total elapsed time:
5 * 15 = 75 seconds serialized vs. 50 + 4 * 5 = 70 seconds threaded,
so threaded is slightly better.
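
To make the arithmetic concrete, here is a small sketch that tabulates the
completion times reported in the quoted message. It assumes the threaded
results arrived exactly 5 seconds apart (the report says "about 4-5"), and
the helper name summarize is only illustrative.

```python
# Completion times in seconds, taken from the figures in the quoted message.
# Serialized: each request takes about 15 s and they run one after another.
serialized = [15 * (i + 1) for i in range(5)]   # 15, 30, 45, 60, 75

# Threaded: first result at 50 s, then one more every 5 s (assumed upper
# bound of the reported "4-5 second intervals").
threaded = [50 + 5 * i for i in range(5)]       # 50, 55, 60, 65, 70


def summarize(times):
    """Return first, last, and mean completion time for a list of times."""
    return {
        "first": min(times),
        "last": max(times),
        "mean": sum(times) / len(times),
    }


print("serialized:", summarize(serialized))
print("threaded:  ", summarize(threaded))
```

The last request finishes a little sooner in the threaded case (70 vs. 75),
but the mean completion time is worse (60 vs. 45), which is what you would
expect when five equally large jobs share one resource.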
But, I suspect that the usual access pattern is lots of small requests
with an occasional big request. Here threading is a win. The small
requests can get in and out, allowing normal user life to go on, while
the pig is somewhat penalized in terms of total time.
Think in terms of batch versus timesharing. In most circumstances,
batch is more efficient, especially for large jobs. But most people
find timesharing more comfortable, preferring that the large jobs be
penalized in order that they may continue to get reasonable response
for the ordinary smallish jobs that dominate most people's lives.
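
The batch versus timesharing tradeoff can be sketched with a toy scheduler.
This is only an illustration under made-up assumptions: one big 15-second
job queued ahead of four 1-second jobs, all arriving at once, a 1-second
time slice, and no switching overhead. The function names fifo and
round_robin are hypothetical and have nothing to do with either database
adapter.

```python
from collections import deque


def fifo(jobs):
    """Batch: run each job to completion in arrival order."""
    clock, done = 0, {}
    for name, work in jobs:
        clock += work
        done[name] = clock
    return done


def round_robin(jobs, quantum=1):
    """Timesharing: give each unfinished job one time slice in turn."""
    clock, done = 0, {}
    queue = deque(jobs)
    while queue:
        name, left = queue.popleft()
        step = min(quantum, left)
        clock += step
        if left > step:
            # Not finished yet: go to the back of the queue.
            queue.append((name, left - step))
        else:
            done[name] = clock
    return done


# Hypothetical workload: the "pig" arrives first, then four small requests.
jobs = [("pig", 15)] + [("small%d" % i, 1) for i in range(4)]

print("batch:      ", fifo(jobs))
print("timesharing:", round_robin(jobs))
```

In this toy case the small jobs finish at 16 through 19 seconds under batch
but at 2 through 5 seconds under timesharing, while the pig slips from 15
to 19 seconds. Total work is the same either way; only who waits changes.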
Jim Penny