[ZODB-Dev] Re: ZODB Caching Questions
Toby Dickenson
tdickenson@geminidataloggers.com
Fri, 22 Mar 2002 11:08:19 +0000
On Friday 22 March 2002 10:41 am, Martin Gfeller wrote:
>last October, I sent some questions about ZODB caching, but never got
>any answer. As it is still important to us, and I've noticed some recent
>traffic and work in this area, especially from Toby Dickenson,
>I'd like to ask you again:
>
>We're using a number of ZODB databases to store financial deal objects
>and assorted static data objects.
>
>A reference to each object in a database is kept in a 'root' object,
>which is a PersistentMapping.
>1. If objects are referenced from a root object, they can never
> be deallocated (just deactivated), because at least one
> reference is always kept. Is this so,
Correct.
> and what are alternative
> ways to do it?
This will be bad using the original cache implementation, because it doesn't
distinguish between ghost (deactivated) and non-ghost objects. It will always
see that the cache size is 100001 (100000 deal objects plus the root, maybe
a few more too). There's nothing it can do to reduce that number, so it will
thrash. To help with this you need to use a BTree instead of a
PersistentMapping. However, this may still not perform satisfactorily using
the old cache.
Under my new cache your current implementation may 'just work'. The cache
controls the number of non-ghost objects. Let's say you set the target size to
be 500. It will keep the 500 most recently used objects activated, and 99500
as ghosts. No thrashing.
Ghosts are tiny, but their overhead is not zero. Maybe you need to consider a
BTree anyway. I will investigate further if BTree+my cache is not a complete
solution.
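A toy model of the behaviour described above may make the numbers concrete.
This is only a sketch in plain Python, not the actual cache code: the class
ToyGhostCache and its methods add, access and collect are hypothetical names
invented for the illustration, and object ids are plain integers.

```python
from collections import OrderedDict


class ToyGhostCache:
    """Hypothetical model: every object stays referenced by the cache, but
    only `target_size` of them are kept activated; the rest are ghosts."""

    def __init__(self, target_size):
        self.target_size = target_size
        self.active = OrderedDict()  # oid -> True, least recently used first
        self.ghosts = set()          # oids of deactivated objects

    def add(self, oid):
        # In this model, newly registered objects start out as ghosts.
        self.ghosts.add(oid)

    def access(self, oid):
        # Touching an object activates it and marks it most recently used.
        self.ghosts.discard(oid)
        self.active.pop(oid, None)
        self.active[oid] = True

    def collect(self):
        # Turn the least recently used objects back into ghosts until only
        # `target_size` activated objects remain.
        while len(self.active) > self.target_size:
            oid, _ = self.active.popitem(last=False)
            self.ghosts.add(oid)


cache = ToyGhostCache(target_size=500)
for oid in range(100000):
    cache.add(oid)
for oid in range(2000):      # the application touches 2000 deals
    cache.access(oid)
cache.collect()
print(len(cache.active), len(cache.ghosts))
```

The point of the model is that the total number of objects never shrinks;
only the split between activated objects and ghosts changes.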
>2. If an object is a simple Python object, instead of being derived
> from Persistent, cache control never seems to touch it.
> Is this correct?
Yes. It will be persisted inside every persistent object that references it,
and removed from memory when the last reference to it is lost.
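The "persisted inside every persistent object" part can be illustrated with
the standard pickle module alone. This is a sketch under an assumption: the
two dicts deal_a and deal_b stand in for two persistent objects, each of which
gets its own stored record, and chf stands in for the plain Python object.
None of these names come from your application.

```python
import pickle

# A plain (non-persistent) object shared by two "persistent" objects.
chf = {"code": "CHF"}
deal_a = {"name": "A", "currency": chf}
deal_b = {"name": "B", "currency": chf}

# Each stand-in persistent object is serialized into its own record,
# and the plain object is embedded in both records.
record_a = pickle.dumps(deal_a)
record_b = pickle.dumps(deal_b)

# After loading, each deal carries its own private copy of the currency.
loaded_a = pickle.loads(record_a)
loaded_b = pickle.loads(record_b)
print(loaded_a["currency"] is loaded_b["currency"])
```

So a plain object shared in memory is stored once per referencing object, and
the sharing is lost after a reload.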
>3. The cache statistics cache_mean_deal and cache_mean_deac never
> seem to show anything other than 0.0, although tracing shows that
> deactivations occur.
I don't think those stats ever recorded a useful metric.
One hack I found useful when cache-tuning my application is to change the
cacheGC and _incrgc functions in Connection.py to record the cache sizes
before and after each pack.
Under my new cache, the size after will always be the target size. The
difference (before-target) is the number of objects activated since the last
one. Plotting this as a histogram is a good illustration of memory pressure.
(Hmmm; I might tidy up this patch for Zope 2.6 if I have time)
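As a rough sketch of the histogram idea, assuming the before/after sizes have
already been recorded somehow as (before, after) pairs: the function names and
the sample numbers below are made up for illustration, and the hook in
Connection.py that would collect them is not shown.

```python
from collections import Counter


def activation_histogram(samples, bucket_width=100):
    """Bucket the number of objects activated between collections.

    `samples` is an iterable of (size_before, size_after) pairs.
    Returns a dict mapping each bucket's lower bound to a count.
    """
    counts = Counter()
    for before, after in samples:
        activated = before - after
        bucket = (activated // bucket_width) * bucket_width
        counts[bucket] += 1
    return dict(sorted(counts.items()))


def render(hist, bucket_width=100):
    """Format the histogram as text, one row of '#' marks per bucket."""
    rows = []
    for start, n in hist.items():
        rows.append("%5d-%-5d %s" % (start, start + bucket_width - 1, "#" * n))
    return "\n".join(rows)


# Invented sample data: cache size before and after each collection,
# with a target size of 500.
samples = [(520, 500), (575, 500), (810, 500), (530, 500), (1240, 500)]
hist = activation_histogram(samples)
print(render(hist))
```

A long tail of high buckets would suggest the target size is too small for
the working set.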
>4. If we have PersistentMappings of up to 100'000 entries, indexed
> by a (string,string) tuple, should we use a BTree instead?
See, you already knew the answer.