[Zope] Size of Data.fs: how big is too big?
Richard Barrett
R.Barrett@ftel.co.uk
Fri, 30 Nov 2001 11:23:50 +0000
At 14:09 29/11/2001 -0700, Paul Horbal wrote:
>Hi everyone,
>
>My Zope site has been growing fairly large lately and I'm beginning to
>wonder at what point I should consider moving files out of the Zope filing
>system and onto a static filesystem.
>
>At this point, I haven't noticed any performance issues at
>all. Currently, Data.fs is about 300 MB in size. The site is running on
>a Sun Netra X1 (400 MHz UltraSparc IIe) with 512 MB of RAM.
>
>Should I be worried about adding more large files into Data.fs?
>
>thanks,
>Paul.
Just to relate my decisions on this topic. One of our Zope sites had a
similar size of Data.fs to yours (and growing fairly rapidly). When I
looked at the contents it was clear that a large part of the size was from
big blob objects such as PDFs, GIFs and JPEGs, and various Microsoft
product data files (Word and PowerPoint being favorites).
I wrote an external method which selectively decants the content of
qualifying objects into ExtFile and ExtImage objects. The big blobs end up
in the UNIX file system and the stub objects left in Data.fs are much
smaller; in my case the Data.fs shrank from over 350 MB to less than 30 MB.
I now run the decanting function regularly as well as advising content
providers to use ExtFile/ExtImage for PDFs and such.
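A minimal sketch of the decanting idea, not the actual external method: it models stored objects as plain Python dicts rather than using the Zope or ExtFile APIs. The names `BLOB_TYPES`, `MIN_SIZE`, `qualifies` and `decant`, the size threshold, and the dict layout are all assumptions made for illustration.

```python
import os

# Hypothetical selection criteria; the real criteria are not stated above.
BLOB_TYPES = {
    "application/pdf",
    "image/gif",
    "image/jpeg",
    "application/msword",
    "application/vnd.ms-powerpoint",
}
MIN_SIZE = 64 * 1024  # bytes; an assumed threshold


def qualifies(content_type, size):
    """Return True for big, opaque blobs worth moving out of the database."""
    return content_type in BLOB_TYPES and size >= MIN_SIZE


def decant(objects, repository):
    """Move qualifying blob data into files under `repository`.

    `objects` maps an object id to a dict with "content_type" and "data"
    (bytes). Each qualifying entry is replaced by a small stub that records
    only the content type and the path of the external file.
    Returns the ids of the objects that were moved.
    """
    moved = []
    for oid, obj in list(objects.items()):
        data = obj.get("data")
        if data is None or not qualifies(obj["content_type"], len(data)):
            continue
        # Assumes ids are safe to use as file names (no path separators).
        path = os.path.join(repository, oid)
        with open(path, "wb") as fh:
            fh.write(data)
        objects[oid] = {
            "content_type": obj["content_type"],
            "data": None,
            "path": path,
        }
        moved.append(oid)
    return moved
```

Because entries that are already stubs (`data` is `None`) are skipped, a function like this can be run repeatedly without touching objects it has already moved.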
My rationale for this approach was twofold:
1. I couldn't see any real benefit of inflating Zope's object database with
big blobs of fairly static, opaque data. Indeed, my guess was that it was
more likely to damage the performance of Zope and inflate its process size,
although I never tried to prove whether this was the case.
2. Big, fairly static blobs of opaque data adversely affect incremental
backup performance for the file system containing the Data.fs. Because
Data.fs is a single file, a one byte change in any object makes the whole
file, including all the big, unchanged blobs, a candidate for being backed
up yet again. With the big blobs out in the file system, incremental
backups go a lot quicker.
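The backup point in item 2 can be illustrated with a simplified model of a file-level incremental backup, assuming the backup tool selects whole files by modification time (real tools vary). The function name `backup_candidates` and its parameters are hypothetical.

```python
import os


def backup_candidates(root, last_backup_time):
    """List (path, size) for every file under `root` modified after
    `last_backup_time` (seconds since the epoch).

    Selection is per file: a file that changed by one byte is a candidate
    in its entirety, however large it is.
    """
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            if st.st_mtime > last_backup_time:
                candidates.append((path, st.st_size))
    return sorted(candidates)
```

Under this model a Data.fs holding all the blobs is re-copied in full after any change, whereas blobs stored as separate files are only picked up when they themselves change.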
This approach won't reduce the total file space occupied - it wasn't intended
to - but I figure that it plays to the strengths and away from the
weaknesses of both the Zope object database and the UNIX file system.