At 04:08 PM 12/10/01 +0000, Tony McDonald wrote:
On 10/12/01 2:54 pm, "Phillip J. Eby" <pje@telecommunity.com> wrote:
I'm not sure if this is taken into consideration in your work so far/future plans... but just in case you were unaware, it is not necessary for you to persistently store objects in the ZODB that you intend to index in a ZCatalog. All that is required is that the object to be cataloged is accessible via a URL path. ZSQL methods can be set up to be URL-traversable, and to wrap a class around the returned row. To load the items into the catalog, you can use a PythonScript or similar to loop over a multi-row query, passing the objects directly to the catalog along with a path that matches the one they'll be retrievable from. This approach would eliminate the need for BTreeFolder altogether, although of course it requires access to the RDBMS for retrievals. This should reduce the number of writes and allow for bigger subtransactions in a given quantity of memory.
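As a rough illustration of the loading loop described above (a sketch only, not code from this thread): `SketchCatalog` is a hypothetical stand-in that only mimics the `catalog_object(obj, uid)` call shape of a ZCatalog, `Document` is a made-up wrapper class for a result row, and the rows and the `/docs/by_id` base path are invented sample data. In a real setup the rows would come from a ZSQL method and the catalog would be an actual ZCatalog instance.

```python
class SketchCatalog:
    """Hypothetical stand-in for a ZCatalog (assumption, not the real class).

    It keeps only the indexed attribute values keyed by uid; the object
    itself is never stored, which is the point being illustrated.
    """

    def __init__(self, indexed_attrs):
        self.indexed_attrs = indexed_attrs
        self.records = {}

    def catalog_object(self, obj, uid):
        # Copy out the attribute values to index, then drop the object.
        self.records[uid] = {
            name: getattr(obj, name, None) for name in self.indexed_attrs
        }


class Document:
    """Hypothetical class wrapped around each row returned by the query."""

    def __init__(self, row):
        self.id = row["id"]
        self.title = row["title"]
        self.body = row["body"]


def load_catalog(catalog, rows, base_path):
    """Catalog every row under the path it would later be traversable at."""
    for row in rows:
        doc = Document(row)
        uid = "%s/%s" % (base_path, doc.id)
        catalog.catalog_object(doc, uid)
    return catalog


# Invented sample rows standing in for a multi-row query result.
rows = [
    {"id": "1", "title": "First", "body": "alpha beta"},
    {"id": "2", "title": "Second", "body": "gamma delta"},
]

catalog = load_catalog(SketchCatalog(["title", "body"]), rows, "/docs/by_id")
```

The uid passed to the catalog is what ties each record back to the traversable path; nothing from the loop needs to be written to the ZODB apart from the catalog's own index data.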
Gad! - are you saying you don't need to store a 1MB .doc file in the ZODB, but can still index the thing, store the index information in the ZCatalog (presumably a lot smaller than 1MB) and have the actual file accessible from a file system URL? If so, that's really neat!
Yep. By "URL path", though, I meant a *Zope* path. However, it would be straightforward to create a Zope object that represents a filesystem path and does traversal/retrieval, assuming that one of the 'FS'-products out there doesn't already do this for you. Chris Withers has pointed out that technically you don't even need the path string to be valid; it just has to be unique. However, the standard tools and the method for getting the "real object" referred to by the catalog record do expect it to be a valid path, IIRC. I personally find it most convenient, therefore, to use a real Zope path.
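For what it's worth, an object that represents a filesystem path and handles traversal might look roughly like the sketch below. This is an assumption-laden illustration, not an existing product: `FSDirectory`, `FSFile` and `resolve` are hypothetical names, and `resolve` is only a toy stand-in for what the publisher does when it walks a path. The one real piece is the `__bobo_traverse__(self, REQUEST, name)` hook, which ZPublisher calls on an object to look up the next path segment.

```python
import os
import tempfile


class FSFile:
    """Hypothetical wrapper for a single file on disk."""

    def __init__(self, path):
        self.path = path

    def read(self):
        with open(self.path, "rb") as handle:
            return handle.read()


class FSDirectory:
    """Hypothetical object mapping path segments onto a directory tree."""

    def __init__(self, root):
        self.root = os.path.realpath(root)

    def __bobo_traverse__(self, REQUEST, name):
        target = os.path.realpath(os.path.join(self.root, name))
        # Refuse segments (such as "..") that would leave this directory.
        if os.path.commonpath([self.root, target]) != self.root:
            raise KeyError(name)
        if os.path.isdir(target):
            return FSDirectory(target)
        if os.path.isfile(target):
            return FSFile(target)
        raise KeyError(name)


def resolve(root_obj, path):
    """Toy traversal loop: one hook call per path segment."""
    obj = root_obj
    for name in path.strip("/").split("/"):
        obj = obj.__bobo_traverse__(None, name)
    return obj


# Demo against a throwaway directory with invented contents.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "reports"))
with open(os.path.join(demo_root, "reports", "q4.doc"), "wb") as handle:
    handle.write(b"binary payload")

found = resolve(FSDirectory(demo_root), "/reports/q4.doc")
```

With something along these lines in place, the path stored in the catalog record and the path used to fetch the file are the same string, so the standard "get the real object" machinery has a valid path to resolve.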