Paul Everitt wrote:
Jonathan wrote:
Idea: Put a proxying cache with content negotiation, rewriting of requests etc. in front of Zope.
[snip]
The problem is: what is a page?
[snip]
The win these days is in smarter caching which is more finely-grained than at the page level.
Exactly. If you're accustomed to thinking of a web site as a collection of HTML documents in a file system, squid/apache/roll-your-own caching sounds attractive. Your pages change only when you tell them to, which will be sporadically and infrequently (relative to page hits, if not absolutely). Once you really absorb the possibilities and practices of a tool like Zope or PHP, though, you begin to see your site as a collection of templates and interfaces, combining static snippets with data queries and views. It makes no more sense to cache many of the pages from such a site than it would to cache windows from an accounting program or word processor. You're moving into the land of the web application.

As you take more and more advantage of the leverage that Zope gives you, you will realize that there *are* things you would like to cache, but they aren't pages any more. They're dribs and drabs of data, such as a menu generated by walking an object tree, or a database-driven bit of output which rarely changes but is relatively expensive to run. Typically, each such cacheable bit will be the value returned by a single method/object, possibly varying by parameters/context.

Sometimes, as with the entry page of an active message board, you may want to provide customized views to each user, yet recompute the message summary only every minute or so, rather than with every hit. Other times, you may want to flush an item from the cache only if some 'upstream' object is modified. Then again, there are situations where only manual flushing will do, as it is impractical to try to automatically discover when the cache is stale.

One way to handle this is to tag an object whose output you wish to cache with a set of rules, such as a minimum or maximum cache lifetime, and to provide a 'flush from cache' method. Trying to automatically track dependencies is probably not workable, since acquisition and the request environment provide so many sources of variable data.
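To make that concrete, here's a rough sketch of such a tagged cache in plain Python: a maximum lifetime plus a manual flush method, keyed by method name (all names are mine for illustration, not anything Zope actually ships):

```python
import time

class MethodCache:
    """Sketch: cache a method's return value with a maximum
    lifetime and a manual 'flush from cache' method."""

    def __init__(self, max_age=60):
        self.max_age = max_age      # seconds before an entry goes stale
        self._store = {}            # key -> (timestamp, value)

    def call(self, key, func, *args, **kw):
        entry = self._store.get(key)
        now = time.time()
        if entry is not None:
            stamp, value = entry
            if now - stamp < self.max_age:
                return value        # still fresh; skip recomputation
        value = func(*args, **kw)   # stale or missing: recompute
        self._store[key] = (now, value)
        return value

    def flush(self, key=None):
        """Drop one entry, or everything when no key is given."""
        if key is None:
            self._store.clear()
        else:
            self._store.pop(key, None)
```

The message-board example above falls out naturally: hand call() the summary method with a 60-second max_age, and each user's customized page wraps a summary that is recomputed at most once a minute.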
On the other hand, with careful design, it may be possible to specify a set of values or a formula which can be used as a cache key for particular objects. This key could be computed by a user-defined method, or provided as an expression list if it's simple enough.

Often several or many objects share common cache characteristics. They may depend on the same inputs, or simply have the same 'freshness' requirements. Rather than attach cache settings to particular objects, it might be a good idea to attach them to Cache Policy objects, and simply assign cacheable objects a Policy. This is roughly similar to the ZSQL Method/Database Connector division of labor. A single call to a Policy method could clear the cache of all objects with that Policy, and a cache key method/formula might only need to be calculated once.

I'm not sure how Policies should best be assigned to objects. One way would be to provide Cache containers which encapsulate the objects to be cached. Another is to make some classes cache-aware, just as they can be ZCatalog-aware. Yet another is to provide Cache Manager objects, which can control cache Policy assignments for sibling objects.

One of these days, I may care enough about performance to set this down in code, but not yet. Python underlies Zope, and its philosophy on the subject is a good one: solve the problem now with clear, well-chosen algorithms and only worry about 'optimizing' if performance measurably suffers. If you try to guess where to optimize in advance, you'll probably waste your time and produce gnarly, bug-enhanced code.

Cheers,

Evan @ 4-am
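P.S. For anyone who wants to see the Policy idea in miniature, here's a rough, untested-against-real-Zope sketch (every name here is hypothetical): cacheable objects delegate to a shared Policy, which holds the key formula and the cache, so one call clears everything assigned to it.

```python
class CachePolicy:
    """Shared cache settings, roughly analogous to the
    ZSQL Method / Database Connector division of labor."""

    def __init__(self, key_func=None):
        # the key 'formula', computed once per object here
        self.key_func = key_func or (lambda obj: obj.id)
        self._cache = {}            # cache key -> cached value

    def lookup(self, obj, compute):
        key = self.key_func(obj)
        if key not in self._cache:
            self._cache[key] = compute()   # miss: run the expensive bit
        return self._cache[key]

    def flush_all(self):
        """One call clears every object assigned this Policy."""
        self._cache.clear()


class Cacheable:
    """A cache-aware object, assigned a Policy rather than
    carrying its own cache settings."""

    def __init__(self, id, policy):
        self.id = id
        self.policy = policy

    def render(self):
        return self.policy.lookup(self, self._render_uncached)

    def _render_uncached(self):
        # stand-in for a tree walk or database query
        return "rendered:%s" % self.id
```

A Cache Manager or Cache container would then just be a way of deciding which objects get handed which Policy.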