[Zope-dev] Re: Caching ZCatalog results
Roché Compaan
roche at upfrontsystems.co.za
Sun Feb 25 04:48:15 EST 2007
On Sat, 2007-02-24 at 09:48 +0100, Dieter Maurer wrote:
> Roché Compaan wrote at 2007-2-23 22:00 +0200:
> > ...
> >Thanks for that pointer. It's good that way, it should make invalidation
> >easier. It could be as simple as invalidating any cached result that
> >contains the documentId being indexed. Do you see any problem with the
> >following invalidation strategy:
> >
> >If the 'documentId' exists (cataloging existing object), invalidate all
> >cached result sets that contain the documentId.
> >
> >If the 'documentId' doesn't exist (cataloging new object), invalidate
> >all result sets where the ids of indexes applied, are contained in the
> >cache key for that result set.
>
> I see several problems:
>
> * the RAMCacheManager does not provide an API to implement
> this policy
>
> * a cache manager would need a special data structure
> to efficiently implement the policy (given a documentId,
> find all cached results containing the documentId).
Can you elaborate? Would an IISet be efficient?
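To make the data structure question concrete, here is a minimal sketch of a reverse mapping from documentId to the cache keys whose result contains it. The class and method names are hypothetical, and plain Python sets stand in for an IISet so the example runs without Zope.

```python
from collections import defaultdict


class DocIdReverseIndex:
    """Hypothetical helper: documentId -> cache keys whose result contains it."""

    def __init__(self):
        self._results = {}                       # cache key -> frozenset of documentIds
        self._keys_by_docid = defaultdict(set)   # documentId -> set of cache keys

    def store(self, key, docids):
        self.discard(key)  # replace any older entry stored under the same key
        docids = frozenset(docids)
        self._results[key] = docids
        for docid in docids:
            self._keys_by_docid[docid].add(key)

    def discard(self, key):
        # Remove one cached result and clean up its reverse entries.
        for docid in self._results.pop(key, frozenset()):
            keys = self._keys_by_docid.get(docid)
            if keys is not None:
                keys.discard(key)
                if not keys:
                    del self._keys_by_docid[docid]

    def get(self, key):
        return self._results.get(key)

    def invalidate_document(self, docid):
        """Drop every cached result containing docid; return the dropped keys."""
        keys = list(self._keys_by_docid.get(docid, ()))
        for key in keys:
            self.discard(key)
        return keys
```

With this layout, finding the affected results costs one dictionary lookup rather than a scan over every cached result set, at the price of extra memory for the reverse mapping.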
> * Apparently, your cache key contains the indexes involved
> in producing the result.
This is coincidental. I'm building a cache key from all arguments passed
in as keyword arguments and on the REQUEST.
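Roughly, the key construction looks like the sketch below. The function name is hypothetical, the REQUEST is simplified to a plain mapping, and letting keyword arguments override REQUEST values is an assumption rather than something the catalog guarantees.

```python
def make_cache_key(request_form, **kw):
    """Hypothetical: build a hashable, order-independent key from query arguments."""
    merged = dict(request_form)
    merged.update(kw)  # assumption: keyword arguments win over REQUEST values

    def freeze(value):
        # Turn nested query specs into hashable tuples.
        if isinstance(value, dict):
            return tuple(sorted((k, freeze(v)) for k, v in value.items()))
        if isinstance(value, (list, tuple)):
            return tuple(freeze(v) for v in value)  # keep order, e.g. for ranges
        return value

    return tuple(sorted((name, freeze(value)) for name, value in merged.items()))
```

Because the key is just the sorted query arguments, index names end up in it only when they happen to be the argument names.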
> The problem with this is that these indexes are known
> only after the query has been performed:
>
> The catalog API allows indexes to respond to subqueries,
> that do not contain their own name.
>
> I use this feature to allow a "Managable RangeIndex"
> to transparently replace "effective, expires" queries.
>
> But otherwise, the feature is probably not used
> intensively.
If these parameters are on the request or in keywords they will form
part of the cache key.
> Of course, you can add the information *after*
> the query has been performed and use it for invalidation -- in
> a specialized cache manager.
>
>
> On the other hand, new objects are usually indexed with
> all available (and not only a few) indexes.
>
> While some of the indexes may not be able to determine
> a senseful value for the document, the standard indexes
> have problems to handle this properly ("ManagableIndex"es can)
> and the API does not propagate the information.
I think it will not be trivial to implement invalidation that doesn't
bite you. I thought of checking for document ids because invalidating
results whenever a whole index changes might cause too many invalidations.
For example, querying for the same UID of an object should yield a
cached result most of the time. Indexing a new object's UID shouldn't
invalidate the cached results for existing UID queries.
Let's assume we have a specialised cache manager and a cache that copes
with the subtleties of subqueries. Do you think that invalidating the
cache according to the logic I suggested would work? Can you think of
cases where it can lead to stale results that one should guard against?
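To pin down the logic I mean, here is a small sketch of the two rules as a plain function. The function name and the cache layout (key mapped to a pair of index names and document ids) are assumptions for illustration, and reading "contained in the cache key" as "any overlap" is my own interpretation.

```python
def keys_to_invalidate(cache, doc_id, indexed_names, is_new):
    """Return the cache keys to drop under the two proposed rules.

    cache maps key -> (index_names, doc_ids); this layout is hypothetical.
    """
    applied = set(indexed_names)
    stale = []
    for key, (index_names, doc_ids) in cache.items():
        if is_new:
            # New document: drop results whose key mentions an applied index.
            if applied & set(index_names):
                stale.append(key)
        elif doc_id in doc_ids:
            # Existing document: drop results that contain it.
            stale.append(key)
    return stale
```

Running this against a handful of cached queries, for example a UID query next to a portal_type query, is one way to probe which entries each rule drops and which it leaves alone.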
--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za