[ZODB-Dev] [ZEO] rather long periods of unresponsiveness
Dieter Maurer
dieter at handshake.de
Sat Dec 18 12:27:54 EST 2004
In our installation, ZEO (from ZODB 3.2) runs on a high availability cluster.
The cluster periodically probes ZEO for responsiveness and
restarts it when it becomes unresponsive.
When a larger transaction is committed (I checked with a
transaction of size 35 MB affecting 250,000 objects),
ZEO is unresponsive for about a minute.
The reason for this long period of unresponsiveness lies in a special (fascinating)
implementation of the two-phase commit by ZEO:
In the first phase, ZEO does not store the changed object data in the
storage directly but puts it into a "CommitLog" (essentially
a temporary file).
Only in "vote" (the end of the first commit phase) does
ZEO call the storage's "tpc_begin" (thereby acquiring
the storage's commit lock) and then transfer the changed
data from the "CommitLog" to the storage.
Depending on the size of the transaction and the number
of affected objects, this can take a long time and
ZEO is unresponsive during this time.
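
To make the sequence concrete, here is a minimal toy model of the behaviour described above. It is not ZEO's code: "ToyCommitLog", "ToyStorage" and the free function "vote" are hypothetical names made up for illustration, and only "tpc_begin" and the idea of a temporary-file "CommitLog" come from the description. It shows that stores are only buffered during the first phase, and that the whole copy into the storage happens in one go at vote time while the commit lock is held.

```python
import pickle
import tempfile
import threading


class ToyCommitLog:
    """Hypothetical stand-in for the "CommitLog": buffers records in a temp file."""

    def __init__(self):
        self.file = tempfile.TemporaryFile()
        self.count = 0

    def store(self, oid, data):
        # First phase: append to the temporary file, do not touch the storage.
        self.file.seek(0, 2)
        pickle.dump((oid, data), self.file)
        self.count += 1

    def records(self):
        # Replay every buffered record from the start of the file.
        self.file.seek(0)
        for _ in range(self.count):
            yield pickle.load(self.file)


class ToyStorage:
    """Hypothetical in-memory storage with a commit lock."""

    def __init__(self):
        self.commit_lock = threading.Lock()
        self.data = {}

    def tpc_begin(self):
        # Acquiring the commit lock blocks every other committer.
        self.commit_lock.acquire()

    def store(self, oid, data):
        self.data[oid] = data

    def tpc_finish(self):
        self.commit_lock.release()


def vote(log, storage):
    # Only now is the lock taken and the buffered data copied over.
    # The cost of this loop grows with the number of records.
    storage.tpc_begin()
    for oid, data in log.records():
        storage.store(oid, data)
```

In this sketch the time spent inside "vote" is proportional to the number of buffered records, which is the step that blocks a single-threaded server.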
What do you think about executing "vote" (more precisely
"_vote") in a separate thread when the transaction is sufficiently
large (with respect to size or number of modified objects)?
This would allow the main ZEO thread to
process other unrelated requests in parallel.
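
A rough sketch of what I have in mind, again as a toy model rather than a patch against ZEO: "copy_to_storage", "vote_maybe_threaded" and the "threshold" parameter are hypothetical names, the storage is a plain dict, and the lock is held only for the copy (a simplification, since the real commit lock would be held until the transaction finishes or aborts). Small transactions are handled inline as before; large ones are handed to a worker thread so that the caller returns immediately.

```python
import threading


def copy_to_storage(records, storage, commit_lock):
    # Stand-in for the expensive part of "_vote": copy all buffered
    # records into the storage while holding the commit lock.
    with commit_lock:
        for oid, data in records:
            storage[oid] = data


def vote_maybe_threaded(records, storage, commit_lock, threshold=1000):
    """Run the copy inline for small transactions, in a thread for large ones.

    Returns the worker thread for a large transaction, or None when the
    copy was performed inline. "threshold" is an assumed tuning knob
    (number of modified objects), not an existing ZEO option.
    """
    if len(records) < threshold:
        copy_to_storage(records, storage, commit_lock)
        return None
    worker = threading.Thread(
        target=copy_to_storage, args=(records, storage, commit_lock)
    )
    worker.start()
    return worker
```

The caller would still have to delay the reply to the committing client until the worker has finished, and other committers would still wait for the commit lock. Only requests unrelated to the commit would benefit.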
--
Dieter