[Zope-dev] How To Improve Cache Coherency for RAM/Disk Cache Manager...?

Craeg K Strong cstrong@arielpartners.com
Mon, 03 Mar 2003 16:32:04 -0500


Hello:

I am getting ready to release the next version of XMLTransform, and in
revisiting the caching strategy for the product, I realized there are larger
issues that probably deserve a discussion here.

The bottom line is that transforming XML to something else via
XSLT is a potentially expensive operation, so caching the results
is often worthwhile.

As I thought about the problem, I realized that this probably holds true
for any sufficiently dynamic site where you employ caching because:

- the cost of processing exceeds the cost of retrieval from cache by at
  least an order of magnitude
- there are many more readers than writers

Question:  How can we ensure cache coherency?

For example, you might have a ZPT that includes the results of several
long-running PythonScripts, whose rendered result is cached.  What happens
when the code for those PythonScripts changes?  Worse, what happens when
the SQL data retrieved by the Z SQL Method that the PythonScripts operate
on changes?

Correct me if I am wrong, but today the Zope Cache Management facility
takes into account changes to the cached objects themselves, but not
changes to the objects on which they depend.

One strategy for dealing with this problem is to invalidate objects in the
cache after a fixed interval.  That way objects are out of date for at most
the length of the interval.  This could be called the "pull" or reactive
model.
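
A minimal sketch of the interval-based approach, in plain Python rather
than against the actual Zope cache manager API.  IntervalCache, its
max_age_seconds setting, and the injectable clock are hypothetical names
used only for illustration:

```python
import time


class IntervalCache:
    """Hypothetical time-based cache; not the Zope cache manager API."""

    def __init__(self, max_age_seconds, clock=time.time):
        self.max_age_seconds = max_age_seconds
        self.clock = clock      # injectable so expiry can be exercised in tests
        self._entries = {}      # key -> (stored_at, value)

    def get(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.max_age_seconds:
            return entry[1]     # still fresh: serve the cached result
        value = compute()       # missing or expired: pay the rendering cost
        self._entries[key] = (now, value)
        return value
```

Note that the cache never learns that a dependency changed; it only bounds
how long a stale result can be served.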

Alternatively, Cacheable objects might somehow be aware of the objects on
which they depend, and invalidate themselves in the cache when one of those
dependencies changes.  This could be called the "push" or proactive model.
Depending on some parameters, they might even recalculate their results
proactively so they could be re-cached immediately.  Why make the unlucky
user pay the price?
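
A rough sketch of the dependency-aware approach, again in plain Python.
DependencyCache, notify_changed, and the eager flag are hypothetical; in
particular, this assumes something calls notify_changed whenever a
dependency is edited, which is exactly the hook in question:

```python
from collections import defaultdict


class DependencyCache:
    """Hypothetical push-style cache; names are illustrative, not Zope APIs."""

    def __init__(self):
        self._values = {}                    # key -> cached result
        self._recompute = {}                 # key -> (compute callable, eager flag)
        self._dependents = defaultdict(set)  # dependency id -> keys that used it

    def store(self, key, compute, depends_on, eager=False):
        self._values[key] = compute()
        self._recompute[key] = (compute, eager)
        for dep in depends_on:
            self._dependents[dep].add(key)

    def get(self, key):
        if key not in self._values:          # invalidated earlier: rebuild lazily
            compute, _ = self._recompute[key]
            self._values[key] = compute()
        return self._values[key]

    def notify_changed(self, dep):
        """Called when a dependency changes; invalidates or refreshes dependents."""
        for key in self._dependents.get(dep, ()):
            compute, eager = self._recompute[key]
            if eager:
                self._values[key] = compute()   # refresh now, before anyone asks
            else:
                self._values.pop(key, None)     # drop; next reader recomputes
```

With eager=True the cost of recomputation moves from the next reader to
the writer who changed the dependency.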

The latter alternative is not infeasible.  DTML and ZPT scripts must parse
their contents in order to render, so the information is available
somewhere.
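
To illustrate the idea of recovering dependencies from parsed source, here
is a small sketch using Python's standard ast module on script-like source
text.  It assumes dependencies are reached as attributes of a variable
named "context"; referenced_context_names is a hypothetical helper, and it
would miss anything looked up dynamically:

```python
import ast


def referenced_context_names(source):
    """Return the sorted attribute names accessed on a variable named `context`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id == "context"):
            names.add(node.attr)
    return sorted(names)
```

The resulting names could serve as the dependency list for a cached
object, though only as a static approximation.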

The question is: can/should this be addressed for Zope2?  What about Zope3?
Is one model more appropriate for a development setting (equal numbers of
writers and readers) vs production (many more readers, few or no writers)?

Any thoughts or sage advice on this topic would be much appreciated!

Regards,

--Craeg