From: "Ausum Studio" <ausum_studio@hotmail.com>
We need to put more text information into Zope's database without exceeding its file size. I know there's a product CompressedStorage out there, but it's kind of old. Hence I'd like to know whether there are successful experiences with using it and/or with compressing stuff inside the ZODB.
What is the best approach for a task like this?
We used gzip (via an External Method) to compress data before storing it in a metadata table of a ZCatalog, then decompressed the metadata on the fly at retrieval time. We found this faster than accessing the object itself at retrieval time to get the data field (ZCatalog metadata is available without having to load the indexed object). We did, however, encounter a couple of difficulties with this approach:

(1) It is only scalable to a point. Once the size of the ZCatalog (indexes + metadata) exceeds available RAM, swapping eliminates the speed advantage of storing compressed metadata (we went back to accessing the object itself in order to support a larger ZCatalog).

(2) Zope seems to lose chr(13) characters in certain compressed data sequences. To fix this we modified the compression routine to replace chr(13) characters with '*fixme*', then compressed/stored the data. On the decompression side, we decompressed the data and then replaced the '*fixme*' occurrences with chr(13) characters, and all was well (if you don't do this, the decompression process produces corrupted data).

Our current ZODB size is about 7 GB and we are still investigating various methods for improving update speed and retrieval speed (we're not Google, so throwing 100,000 CPUs at the problem is not an option!).

How much data are you trying to store that would cause you to want to get into compression?

HTH
Jonathan
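The compress/decompress round trip described above could be sketched roughly as follows. This is not the author's actual External Method, just a minimal illustration of the technique: the chr(13) characters are swapped for the '*fixme*' sentinel before compression and restored after decompression (it assumes the sentinel string never occurs naturally in the data, and uses zlib, which provides the same deflate compression as gzip without file framing):

```python
import zlib

CR = "\r"             # chr(13)
SENTINEL = "*fixme*"  # placeholder from the post; must not occur in the raw data

def compress_field(text: str) -> bytes:
    """Replace chr(13) with the sentinel, then deflate-compress for storage."""
    return zlib.compress(text.replace(CR, SENTINEL).encode("utf-8"))

def decompress_field(blob: bytes) -> str:
    """Inflate the stored blob, then restore the chr(13) characters."""
    return zlib.decompress(blob).decode("utf-8").replace(SENTINEL, CR)

# Round-trip check: the restored text matches the original exactly.
original = "line one\r\nline two\r\n"
stored = compress_field(original)
assert decompress_field(stored) == original
```

In a Zope setup these two functions would live in an External Method module, with the compressed bytes stored as a ZCatalog metadata column.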