[ZODB-Dev] RelStorage 1.5.0b1 dry-run > two phase pack, better pack lock behaviour
Martijn Pieters
mj at zopatista.com
Tue Feb 22 11:25:16 EST 2011
Hi,
I was already investigating the possibility of splitting the RelStorage
packing process up into smaller chunks.
Due to the expected load on the Oracle cluster during a pack, we'll
have to run the pack at night and want to be absolutely certain that
the database is ready for normal site operations again the next day.
With a 40+GB database (hasn't been packed for its entire run, more
than 2 years now) we are not confident packing will be done in one
night.
To at least get a handle on how much work the packing is going to be,
and to have a nice stopping point, I looked at splitting pre-pack and
pack operations out into two separate steps. To my delight I saw that
the 1.5.0 beta already implements basically running only the pre-pack
phase (the --dry-run option). From there I created the attached patch,
one that renames the dry-run op into a 'prepack only' option, and adds
another option to skip the pre-pack and just use whatever is present
in the pack tables.
I haven't yet actually run this code, but the change isn't big. I
didn't find any relevant tests to update. Anyone want to venture some
feedback?
Helge Tesdal and I also looked into the pack operation itself, and how
it uses a duty cycle to give other transactions a chance to commit
during pack. We think there might be a better pattern to handle the
locking.
Currently, with the default values, the pack operation will hold the
commit lock for 5 seconds, pack, then release the lock for 5 more
seconds, repeating until done. With various options you can alter
these timings, but the basic principle is the same. For Oracle, where
the commit lock has a time-out, this means that packing can fail
because the commit lock times out. For all backends, Oracle or
otherwise, commits elsewhere on a site cluster will have to wait long
periods of time before they can proceed, leading to severe delays on a
heavily trafficked website.
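As a rough illustration of the duty cycle described above, here is a
minimal Python sketch. It is not RelStorage code: the function name
duty_cycle_pack, its parameters, and the use of a threading.Lock as a
stand-in for the database commit lock are all assumptions made for the
example.

```python
import threading
import time


def duty_cycle_pack(batches, commit_lock, hold_seconds=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Hypothetical sketch of the current pattern; not RelStorage code.

    `batches` is an iterable of zero-argument callables, each standing in
    for one batch of pack work. Returns the number of batches run.
    """
    pending = iter(batches)
    batch = next(pending, None)
    done = 0
    while batch is not None:
        commit_lock.acquire()              # every other committer now waits
        try:
            deadline = clock() + hold_seconds
            # Keep packing until the hold period is used up.
            while batch is not None and clock() < deadline:
                batch()
                done += 1
                batch = next(pending, None)
        finally:
            commit_lock.release()
        if batch is not None:
            sleep(hold_seconds)            # window for other transactions
    return done


if __name__ == "__main__":
    # Tiny hold period so the demo finishes quickly.
    work = [lambda: time.sleep(0.005) for _ in range(4)]
    print(duty_cycle_pack(work, threading.Lock(), hold_seconds=0.01))
```

The point to notice is that the lock is taken unconditionally and held
for the whole hold period, so a commit arriving just after the acquire
waits for up to hold_seconds.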
With the variable time-out for requesting a commit lock on Oracle,
however, there is a different option. I do not know whether MySQL and
Postgres can support this too, as I haven't looked into their lock
acquisition options, but the following relies on lock acquisition
timeouts.
Consider the following packing algorithm:

* Use a short timeout (say 1 second) to request the commit lock.
* If it doesn't time out:
    * run one batch update cycle (up to 100 transactions processed).
    * optionally clean out associated blobs
    * unlock
    * loop back up
* If it does time out:
    * commit lock is busy, so back off by sleeping a bit
    * loop back up

By timing out the lock request quickly, you give commits from
non-packing zope transactions right of way. Packing truly becomes a
non-intrusive background operation. Is this a viable scenario?
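
The proposed loop could be sketched in Python roughly as follows. This
is only an illustration under stated assumptions: polite_pack and its
parameters are hypothetical names, a threading.Lock stands in for the
database commit lock, and each callable in `batches` stands in for one
batch update cycle plus the optional blob cleanup.

```python
import threading
import time


def polite_pack(batches, commit_lock, lock_timeout=1.0, backoff=0.5,
                sleep=time.sleep):
    """Hypothetical sketch of the proposed loop; not RelStorage code.

    Returns a tuple (batches_run, times_backed_off).
    """
    pending = iter(batches)
    batch = next(pending, None)
    ran = 0
    backed_off = 0
    while batch is not None:
        # Short timeout: give up quickly if a committer holds the lock.
        if commit_lock.acquire(timeout=lock_timeout):
            try:
                batch()                    # one batch update cycle
                ran += 1
            finally:
                commit_lock.release()      # unlock after every batch
            batch = next(pending, None)
        else:
            backed_off += 1
            sleep(backoff)                 # lock is busy, so back off
    return ran, backed_off


if __name__ == "__main__":
    work = [lambda: None for _ in range(3)]
    print(polite_pack(work, threading.Lock(), lock_timeout=0.1))
```

Because the lock is released after every batch and never waited on for
longer than lock_timeout, a committer is delayed by at most one batch,
while the packer retries the same batch after each back-off.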
--
Martijn Pieters
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twophasepack.patch
Type: application/octet-stream
Size: 7158 bytes
Desc: not available
Url : http://mail.zope.org/pipermail/zodb-dev/attachments/20110222/e015e1aa/attachment.obj