[Zope] Zope Performance and Large Databases

Michel Pelletier michel@digicool.com
Tue, 7 Sep 1999 16:48:20 -0400


> -----Original Message-----
> From: davidbro@namshub.org [mailto:davidbro@namshub.org]
> Sent: Tuesday, September 07, 1999 3:24 PM
> To: zope@zope.org
> Subject: [Zope] Zope Performance and Large Databases
> 
> 
> How much stuff can Zope (or more properly ZODB3) handle in a folder
> before things get hairy and/or slow and buggy?

Folders are not meant to hold thousands and thousands of objects *in one
folder*.  For that, you should store your data by more efficient means,
like a BTree, multiple hierarchical folders, or a relational database.
The same thing happens with, for example, Linux directories: they are not
meant to hold on the order of thousands of files efficiently.
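
For example, here is a rough sketch of what "use a BTree instead" looks
like, written with current Python/ZODB spellings (the exact module names
have moved around between releases, so treat the import as illustrative):

    from BTrees.OOBTree import OOBTree   # object-key -> object-value BTree

    songs = OOBTree()

    # Inserts and lookups stay cheap even with tens of thousands of
    # entries, because each operation only touches O(log n) buckets.
    songs['beatles-abbey_road-01'] = {'artist': 'The Beatles',
                                      'album':  'Abbey Road',
                                      'title':  'Come Together'}

    print(songs['beatles-abbey_road-01']['title'])

    # Keys are kept sorted, so range scans come for free:
    for key in songs.keys('beatles-', 'beatles-zzzz'):
        print(key)

The same idea applies to hierarchical folders: split the objects into many
small containers instead of piling them into one big one.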

If you do pack a folder full of objects, things will not get 'hairy' or
buggy, just slow.  Note that *traversal* doesn't get noticeably slower
(meaning, the 'fetching' of /X/Y/Z when 'Y' contains thousands of
objects); it's the *listing* and management of folders that gets very slow.

> I want to use Zope as the center of my Home Automation intranet.

Cool.

> One of the applications of that HA intranet would be to control the
> MP3's playing throughout the house.  Now, it seems that the best place
> to store the information about all the MP3's (artist, album, genre,
> actual location, etc) would be in a relational database that Zope
> hits, but I also wondered if I could use ZClasses and ZCatalog to
> manage the information.

Sure.  Either solution would give reasonable performance at a small to
medium scale.  I would personally use ZClasses with ZCatalog.  We have a
customer who keeps almost 10000 objects in ZCatalog with no perceptible
delay.
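
If you go the ZCatalog route, the query side is just a searchResults()
call.  Here is a rough sketch as it might appear in a Script (Python)
inside Zope, assuming a catalog named 'mp3_catalog' with 'artist' and
'genre' indexes and an 'album' metadata column (all of those names are
made up for illustration):

    # Query the catalog; this never wakes up the MP3 objects themselves.
    results = context.mp3_catalog.searchResults(artist='Miles Davis',
                                                genre='Jazz')
    for brain in results:
        # Each hit is a lightweight 'brain' record holding just the
        # catalog metadata, which is why listings stay fast.
        print(brain.album, brain.getURL())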

Because ZCatalog uses BTrees, finding one object out of a million
typically requires fewer than 20 'comparison' operations.  I think that
'theoretically' ZCatalog can scale to the order of hundreds of thousands
of objects, but that is just my personal opinion (cuz I wrote it), not any
kind of official guarantee.
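
That figure is just the logarithm at work.  Treating a lookup as roughly a
binary search over sorted keys, a quick back-of-the-envelope check (plain
Python, nothing Zope-specific):

    import math

    # Comparison cost grows with log2(n); a million keys is just under
    # 20 comparisons, which is where the number above comes from.
    for n in (10000, 100000, 1000000):
        print(n, math.ceil(math.log2(n)))
    # 10000 -> 14, 100000 -> 17, 1000000 -> 20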
 
> But being that I have several gigs of MP3's (don't start about
> copyright stuff, these are ripped from my CD collection), we're
> talking about a large number of ZClass instances (one per song) stored
> in the ODB.

Be aware that this will make your Zope object database massive and
increase your memory requirements.  Perhaps it would be simpler to just
serve the actual MP3s out of a static Apache and let Zope handle only the
'meta' information?  If you're running Linux, don't forget that ext2fs
has a 2GB file size limit.
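
Something like the following is all Zope would actually need to store per
song.  This is only a hypothetical sketch (the class and field names are
invented, and the Persistent import is the current ZODB spelling):

    from persistent import Persistent

    class MP3Track(Persistent):
        # Metadata-only record: a few hundred bytes per song live in
        # Zope, while the multi-megabyte file stays on disk and is
        # served by Apache.
        def __init__(self, artist, album, title, genre, url):
            self.artist = artist
            self.album = album
            self.title = title
            self.genre = genre
            # URL (or path) of the file on the static server, e.g.
            # http://static.example.com/mp3/abbey_road/01.mp3
            self.url = url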

> How robust is the ODB in these sorts of situations?  Has there been
> any testing of Zope with large (I had a boss that didn't think a
> database was interesting until it had at least a gig of information in
> it) amounts of data?  I know that current file system limits in Linux
> put an artificial cap on DB size, but you can still get really big
> before hitting the limitation.

One thing I think you are a bit confused about is the difference between
storage (ZODB) and indexing that storage (ZCatalog).  Their respective
features and performance issues should be looked at separately.
 
ZODB is very robust and has no problem scaling into gigs of data.  Of
course, many other databases are just as robust.
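
To make that distinction concrete, here is a minimal sketch of ZODB used
on its own, with current ZODB spellings assumed (FileStorage, the
transaction module).  Nothing in it knows about searching; that is the
part ZCatalog adds on top:

    import transaction
    from ZODB import DB, FileStorage

    # ZODB by itself is just transactional object storage: open a
    # FileStorage, hang objects off the root, and commit.
    storage = FileStorage.FileStorage('Data.fs')
    db = DB(storage)
    conn = db.open()
    root = conn.root()

    root['tracks'] = {'demo': {'artist': 'Someone', 'title': 'Something'}}
    transaction.commit()

    db.close()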

-Michel