[ZODB-Dev] Major refactoring of the ZEO ClientStorage Blob Cache
Christian Theune
ct at gocept.com
Wed Dec 3 01:50:21 EST 2008
Hi,
On Tue, 2008-12-02 at 12:03 -0500, Jim Fulton wrote:
> ZEO has two modes for dealing with client blob data: shared and non-
> shared. In shared mode, a distributed file system is used to share a
> blob directory with a ZEO server. This requires management of a
> distributed file system, in addition to the ZEO protocol. Any caching
> is provided by the distributed file system.
>
> In non-shared mode, blob data are downloaded to the ZEO client using
> the ZEO protocol. No distributed file-system is needed and blob files
> are cached locally. Unfortunately, the current implementation provides
> no facilities for managing the client cache. There are no provisions
> in the ZEO client software for removing unused blob files and the blob
> implementation makes almost no provision for blob file removal.
>
> I'm working on refactoring ClientStorage's handling of non-shared blob
> data. I'm implementing a mechanism for periodically cleaning out
> files that haven't been accessed in a while. As part of this, I'm
> going to radically change the layout of the ClientStorage's non-shared
> blob directory.
>
> Currently, the bushy layout, with deeply nested directories, is used.
> While I think this layout makes some sense on the server, I don't
> think it makes much sense on the client. Cleaning up unused blob
> files is complicated by the need to clean up directories too. I'm
> going to go for a fairly flat layout. There will be a small number
> (997) of directories and blob files will reside directly in these
> directories. (The directory will be chosen by taking the remainder of
> dividing an oid by 997.)
Any particular reason for this specific number?
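Just to check that I read the proposal correctly, the mapping would be
roughly as follows (only a sketch; the helper name is mine, only the 997
buckets and the oid-modulo rule come from your description):

    from ZODB.utils import u64

    def cache_bucket_for_oid(oid, buckets=997):
        # oid is the standard 8-byte ZODB oid; u64 turns it into an
        # integer. The blob files for this oid would then live directly
        # in the returned directory, with no further nesting.
        return str(u64(oid) % buckets)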
> It appears that modern operating systems can
> handle large directories just fine. I've created directories with 1
> million files on Linux/Ext, Mac OS X/HFS+, and Windows XP/NTFS and saw
> no degradation in performance as the number of files in a directory
> increased.
FTR: The bushy layout was introduced because of limits on the number of
subdirectories a single directory can contain, which appear to be
separate from the limit on the number of files. At least on ext3 I
can't create more than 65k subdirectories in one directory, while I can
still create far more files in the same directory. Wikipedia has a
generally good overview and comparison of file systems, but it doesn't
cover the maximum number of subdirectories per directory.
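For the record, a throwaway probe along these lines reproduces it (just
a sketch, nothing ZODB-specific; the directory naming is arbitrary):

    import errno
    import os

    def probe_subdir_limit(parent):
        # Create subdirectories until the file system refuses; on ext3
        # mkdir fails with EMLINK once the parent's link count limit is
        # reached, while plain files keep working well beyond that.
        os.mkdir(parent)
        for count in range(1000000):
            try:
                os.mkdir(os.path.join(parent, 'd%07d' % count))
            except OSError as e:
                if e.errno == errno.EMLINK:
                    return count
                raise
        return None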
> I plan to have ClientStorage use the file layout mentioned above. The
> ClientStorage constructor will fail if an older layout is found. An
> alternative is to just log a warning and ignore the existing
> directories, as the new directories will have non-overlapping names.
>
> I mention this both as a heads up and to see if anyone can point out a
> problem with my approach. I have a feeling that no one is using non-
> shared client blob directories for anything important yet, so I assume
> the change won't have much effect.
I am. I'd prefer that you fail on an existing old directory structure
instead of mixing it with the new approach.
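Roughly what I have in mind is a check along these lines at
ClientStorage startup (only a sketch; the marker-file convention and the
exact error are assumptions on my part, not the actual implementation):

    import os

    LAYOUT_MARKER = '.layout'  # assumed marker file naming the layout

    def check_cache_layout(blob_dir, expected_layout):
        # Refuse to start if the existing cache directory was created
        # with a different layout, instead of mixing old and new trees.
        marker = os.path.join(blob_dir, LAYOUT_MARKER)
        if os.path.exists(marker):
            found = open(marker).read().strip()
            if found != expected_layout:
                raise ValueError(
                    "Blob cache directory %r uses layout %r, expected "
                    "%r; refusing to mix layouts." % (
                        blob_dir, found, expected_layout))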
Christian
--
Christian Theune · ct at gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 7 · fax +49 345 1229889 1
Zope and Plone consulting and development