[ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
Tim Peters
tim at zope.com
Thu Apr 14 15:23:28 EDT 2005
[Chris Withers]
>> Out of interest, why are you using DirectoryStorage?
[Dario Lopez-Kästen]
> I chose it for several reasons:
I don't want to talk you out of it, but since this is a general list I feel
compelled <wink> to respond to these points wrt current FileStorage. You're
using a by-now very old Zope (2.6.2), and may not be aware of the info at:
http://zope.org/Wikis/ZODB/FileStorageBackup
> 1) we are storing large amounts of binary files (PDF, Word, Matlab, Zip,
> tar-balls, etc) in this particular application (it's a student portal,
> course admin portal and an LMS). While we are not yet in the
> multigigabyte realm, we are storing archive copies of all the previous
> year's materials, which will eventually grow to be a lot of stuff.
If I understand correctly, DirectoryStorage and FileStorage both store this
stuff in giant pickles -- so I know of no reason for a "large" difference in
total size. The storage comparison matrix at
http://cvs.zope.org/ZODB3/Doc/storages.html?rev=1
says DirectoryStorage requires "Roughly 30% more [disk] space than Data.fs",
not less disk space. Indeed, it's hard to imagine any non-compressing
scheme that could require less total disk space than FileStorage.
> 2) There is the issue of huge Data.fs files and making daily backups. We
> need to have incremental backups.
See the link above: repozo.py supports incremental Data.fs backup, taking
(using -Q) time roughly proportional to the increase in Data.fs size since
the most recent backup. It goes fast!
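
To illustrate why an incremental backup can be that cheap: Data.fs only grows
at the end between packs, so a backup only has to copy the new tail. The
following is a toy sketch of that idea, not repozo's actual code; the function
names and the "chunk-NNNNNN" file naming are made up for illustration, and it
assumes the file has not been packed (rewritten) since the previous backup, in
which case a full copy would be needed instead.

```python
import os


def incremental_backup(src_path, repo_dir):
    """Copy only the bytes appended to src_path since the previous backup.

    Hypothetical helper: assumes src_path is append-only between calls.
    Returns the number of bytes copied.
    """
    os.makedirs(repo_dir, exist_ok=True)
    chunks = sorted(os.listdir(repo_dir))
    # Bytes already backed up = total size of the chunks saved so far.
    saved = sum(os.path.getsize(os.path.join(repo_dir, c)) for c in chunks)
    with open(src_path, "rb") as src:
        src.seek(saved)
        tail = src.read()
    if tail:
        name = "chunk-%06d" % len(chunks)
        with open(os.path.join(repo_dir, name), "wb") as out:
            out.write(tail)
    return len(tail)


def restore(repo_dir, dest_path):
    """Rebuild the file by concatenating the saved chunks in order."""
    with open(dest_path, "wb") as out:
        for name in sorted(os.listdir(repo_dir)):
            with open(os.path.join(repo_dir, name), "rb") as chunk:
                out.write(chunk.read())
```

The work done per backup depends on how much was appended, not on the total
size of the file.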
> 3) HA - while DirStor is not a HA-tool per se, it provides the necessary
> tools for building something that provides some aspects of HA, i.e. the
> replication features, etc.
Unsure what "HA" means to you. "High availability", perhaps? ZRS is
available for FileStorage, but it's admittedly not free:
http://www.zope.com/Products/ZRS.html
> 4) Maintenance. While I have not yet dared to pack the DB, the mere size
> of the database will make packing a non-trivial operation memory-wise in
> FileStorage. DirStor does not have the same memory requirements when
> packing.
The size of the objects in the database has little to do with memory
consumed by a FileStorage pack; it's more the number of distinct object
revisions at work, since an in-memory object reachability graph is
constructed. I'm not sure how DirectoryStorage could perform packing
without constructing a similar reachability graph (Toby?).
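
As a rough illustration of the point about memory, here is a toy reachability
traversal. It is only a sketch and not FileStorage's pack code: the `refs`
mapping and the integer oids are assumptions made for the example. What it
shows is that the in-memory state holds one entry per object id, whatever the
size of the pickles those ids refer to.

```python
def reachable(root, refs):
    """Return the set of oids reachable from root.

    refs maps an oid to the oids its record references (hypothetical
    in-memory form; a real storage would read these from the records).
    """
    seen = {root}
    stack = [root]
    while stack:
        oid = stack.pop()
        for child in refs.get(oid, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Object 3 refers to the root but nothing refers to it, so it is garbage.
refs = {0: [1, 2], 1: [2], 3: [0]}
garbage = set(refs) - reachable(0, refs)
```

A 1KB object and a 100MB object each cost one entry in `seen`.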
The last time Jeremy and I watched a pack work on a 20GB Data.fs, on a very
slow Solaris box, we noticed that it was only taking 10-20% of the RAM, and
regretted the then-last round of packing changes, which favored reducing RAM
usage at the cost of increasing runtime. That appears to have been a wrong
tradeoff for most modern boxes.
Then again, data storages are growing ever bigger too. It's very nice that
DirectoryStorage's direct RAM consumption is independent of the number of
objects.
> 5) POSKeyErrors. We were getting quite a few of those, and that scared
> me. With DirStor, I do not see them as much as before.
Do you see _any_?
FWIW, several nasty causes (bugs in ZODB and Zope) of POSKeyErrors have
been fixed since Zope 2.6.2, and reports of POSKeyErrors from current
Zope/ZODB installations are conspicuous by their absence.
Toby, I know (or think I know <wink>) that DirectoryStorage won't commit a
transaction containing dangling references. I think that's great, and I'd
like (if possible) to introduce such a check at a higher level, so that all
storages would benefit. Does DirectoryStorage do something beyond that
check specifically aimed at preventing POSKeyErrors?
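
To make the kind of check I mean concrete, here is a minimal sketch. It does
not reflect how DirectoryStorage implements it, and the data shapes (a set of
stored oids, and a dict mapping each oid in the transaction to the oids it
references) are assumptions for illustration only.

```python
def dangling_references(stored_oids, txn):
    """Report references in txn that point at no known object.

    stored_oids: oids already committed to the storage (hypothetical form).
    txn: maps each oid written by the transaction to the oids it references.
    Returns {oid: sorted list of missing targets}; empty means safe to commit.
    """
    # A target is fine if it already exists or is created by this transaction.
    known = set(stored_oids) | set(txn)
    problems = {}
    for oid, targets in txn.items():
        missing = set(targets) - known
        if missing:
            problems[oid] = sorted(missing)
    return problems
```

A commit would be refused whenever the returned dict is non-empty.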
...