Cache code in Shared/DC/ZRDB/DA.py
Hi Jim,

I'm wondering if you can still remember the rationale behind the cache code at around lines 355-387 of:

http://svn.zope.org/Zope/trunk/lib/python/Shared/DC/ZRDB/DA.py?rev=68158&vie...

This code is pretty old (checked in 5th Dec 1997) but has started causing a few people problems under high load:

http://mail.zope.org/pipermail/zope-db/2006-September/004684.html
http://www.zope.org/Collectors/Zope/2212

In particular:

- in line 368, why is len(cache)>max_cache/2 used as a trigger to start cache clearing? (the /2 in particular)
- does it matter that IOBTree.Bucket has gone away and that tcache is now a simple dictionary? It certainly seems to make the keys.reverse() on line 370 superfluous and the keys[-1]<t on line 371 less reliable.

More generally, do you or does anyone else have any attachment to this code, or would anyone mind if I ripped it out and replaced it with something simpler, with more comments and unit tests?

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk
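[A sketch of the second question, for readers without the source to hand. This is illustrative code, not the actual DA.py implementation; the variable names follow the thread's description of tcache.]

```python
# With the original IOBTree.Bucket, keys() came back sorted by timestamp,
# so reverse() put the oldest timestamp last and keys[-1] < t asked
# "is the oldest entry expired?".  A plain dict's keys() carries no such
# guarantee (and was effectively arbitrary in the Python versions of the
# time), so the same test can end up looking at any entry.
tcache = {}
for ts in (1005, 1001, 1009, 1003):   # insertion timestamps, out of order
    tcache[ts] = 'cached-result-%d' % ts

keys = list(tcache.keys())
keys.reverse()                        # only meaningful if keys() was sorted
oldest = min(tcache)                  # what the expiry test actually needs
assert oldest == 1001
```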
Chris Withers schrieb:
Hi Jim,
I'm wondering if you can still remember the rationale behind the cache code at around lines 355-387 of:
http://svn.zope.org/Zope/trunk/lib/python/Shared/DC/ZRDB/DA.py?rev=68158&vie...
This code is pretty old (checked in 5th Dec 1997) but has started causing a few people problems under high load:
http://mail.zope.org/pipermail/zope-db/2006-September/004684.html http://www.zope.org/Collectors/Zope/2212
In particular:
- in line 368, why is len(cache)>max_cache/2 used as a trigger to start cache clearing? (the /2 in particular)
- does it matter that IOBTree.Bucket has gone away and that tcache is now a simple dictionary? It certainly seems to make the keys.reverse() on line 370 superfluous and the keys[-1]<t on line 371 less reliable.
More generally, do you or does anyone else have any attachment to this code or would anyone mind if I ripped it out and replaced it with something simpler, with more comments and unit tests?
I have a replacement started, which uses the Cacheable mixin instead. The advantage is that with a RAM Cache Manager you can see the hits your method causes and invalidate the cache per object. I also introduced an execution time threshold, so you can configure it to only cache queries with runtimes above, say, 1 sec, leaving more room for long-running queries.

I think I should pack up what I have now (rough edges) so you can have a look at it.

Tino.
Tino Wildenhain wrote:
I have a replacement started, which uses the Cacheable mixin instead. The advantage is that with a RAM Cache Manager you can see the hits your method causes and invalidate the cache per object. I also introduced an execution time threshold, so you can configure it to only cache queries with runtimes above, say, 1 sec, leaving more room for long-running queries.
I think I should pack up what I have now (rough edges) so you can have a look at it.
I have to be honest, I'm looking to simplify, and this sounds a LOT more complicated...

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk
Chris Withers schrieb:
Tino Wildenhain wrote:
I have a replacement started, which uses the Cacheable mixin instead. The advantage is that with a RAM Cache Manager you can see the hits your method causes and invalidate the cache per object. I also introduced an execution time threshold, so you can configure it to only cache queries with runtimes above, say, 1 sec, leaving more room for long-running queries.
I think I should pack up what I have now (rough edges) so you can have a look at it.
I have to be honest, I'm looking to simplify, and this sounds a LOT more complicated...
No, actually it isn't. I ripped out a lot of the old code; all the other stuff is easily handled by the cache code anyway.

ZSQL Methods and all that stuff are horribly old and full of calamities. Apart from the cache management nightmare, there is also a lot in the ZMI that is really wrong: you don't see the right encoding because the test pages ignore it, there is a "pretty formatter" for column names - hiding their real names, so you always have to wonder what the variable in the result set is actually called ("fOO_bar" becomes "FOO bar", for example) - and so on :-)

I think I'll just upload what I have so you can have a look at it :-)
Chris Withers schrieb:
Tino Wildenhain wrote:
I have a replacement started, which uses the Cacheable mixin instead. The advantage is that with a RAM Cache Manager you can see the hits your method causes and invalidate the cache per object. I also introduced an execution time threshold, so you can configure it to only cache queries with runtimes above, say, 1 sec, leaving more room for long-running queries.
I think I should pack up what I have now (rough edges) so you can have a look at it.
I have to be honest, I'm looking to simplify, and this sounds a LOT more complicated...
And here it is (it may take a couple of minutes to become visible):

http://www.zope.org/Members/tino/CachedZSQLMethods
http://www.zope.org/Members/tino/CachedZSQLMethods/CachedZSQLMethodsPrerelea...

Greets
Tino
Chris Withers wrote:
I have to be honest, I'm looking to simplify, and this sounds a LOT more complicated...
Okay, I've written unit tests and refactored the code on the 2.9 branch, 2.10 branch and trunk. Please let me know if you have any problems...

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk
Chris Withers schrieb:
Chris Withers wrote:
I have to be honest, I'm looking to simplify, and this sounds a LOT more complicated...
Okay, I've written unit tests and refactored the code on the 2.9 branch, 2.10 branch and trunk.
Please let me know if you have any problems...
Uh oh - which code? :-)

Greets
Tino
Chris Withers wrote at 2006-11-16 17:29 +0000:
.... I'm wondering if you can still remember the rationale behind the cache code at around lines 355-387 of:
http://svn.zope.org/Zope/trunk/lib/python/Shared/DC/ZRDB/DA.py?rev=68158&vie...
This code is pretty old (checked in 5th Dec 1997) but has started causing a few people problems under high load:
http://mail.zope.org/pipermail/zope-db/2006-September/004684.html http://www.zope.org/Collectors/Zope/2212
In particular:
- in line 368, why is len(cache)>max_cache/2 used as a trigger to start cache clearing? (the /2 in particular)
You may want to start getting rid of keys that are old enough that you would not use them anyway, even before you reach max_cache. I do not know why that is not always done -- an "XYBucket" has an efficient method to determine the minimal key, so we could get rid of the "len(cache) > max_cache / 2" check. And of course, it is quite stupid to determine the keys and reverse them, even when no key is old enough....
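[The eviction scheme Dieter describes can be sketched with stdlib pieces. The real fix would use a BTrees Bucket, whose minKey() gives the same cheap minimum; a heap plays that role below, and all names and thresholds are illustrative.]

```python
import heapq

# Keep a time index with a cheap "minimal key" lookup and evict expired
# entries on every access, instead of waiting for len(cache) > max_cache/2
# and reversing all the keys.  (Re-inserting the same query leaves a stale
# heap entry behind; the pop below tolerates that via cache.pop(..., None).)
cache = {}        # query -> result
theap = []        # (timestamp, query) pairs, oldest first

def put(query, result, now):
    cache[query] = result
    heapq.heappush(theap, (now, query))

def evict_expired(now, max_age):
    # Does no work at all when the minimal timestamp is still fresh.
    while theap and theap[0][0] < now - max_age:
        ts, query = heapq.heappop(theap)
        cache.pop(query, None)

put('SELECT 1', ['row'], now=100)
put('SELECT 2', ['row'], now=200)
evict_expired(now=205, max_age=50)    # expires everything older than t=155
assert 'SELECT 1' not in cache and 'SELECT 2' in cache
```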
- does it matter that IOBTree.Bucket has gone away and that tcache is now a simple dictionary? It certainly seems to make the keys.reverse() on line 370 superfluous and the keys[-1]<t on line 371 less reliable.
Yes. You want to kill the dict and use the "IOBucket"...
More generally, do you or does anyone else have any attachment to this code or would anyone mind if I ripped it out and replaced it with something simpler, with more comments and unit tests?
Very good idea. While you are at it: the idea to have this cache maintained via a "_v_" attribute is insane. A module-level cache would be much more efficient and would save memory as well. You may want to look at my "CCZSQLMethods"...

--
Dieter
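[A sketch of Dieter's point. A _v_ (volatile) attribute lives on one persistent object instance, so every ZODB connection - roughly, every worker thread - grows its own copy of the cache. A module-level mapping is shared process-wide, so each result is stored once. The key scheme, function names, and paths below are all illustrative, and locking is omitted for brevity.]

```python
# Process-wide cache, shared by every connection/thread in the process,
# keyed by something identifying the method plus its query.
_module_cache = {}   # (method_path, query) -> result

def cached_call(method_path, query, run_query):
    key = (method_path, query)
    if key not in _module_cache:
        _module_cache[key] = run_query(query)
    return _module_cache[key]

calls = []
def run(query):
    calls.append(query)
    return 'rows for ' + query

cached_call('/site/sql/get_users', 'SELECT 1', run)
cached_call('/site/sql/get_users', 'SELECT 1', run)  # served from cache
assert calls == ['SELECT 1']                         # query ran only once
```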
Dieter Maurer wrote:
- in line 368, why is len(cache)>max_cache/2 used as a trigger to start cache clearing? (the /2 in particular)
You may want to start getting rid of keys that are old enough that you would not use them anyway, even before you reach max_cache.
Yeah, I've changed some of the other code now so that when an entry is found to be stale, it's dumped out of the cache immediately.
And of course, it is quite stupid to determine the keys and reverse them, even when no key is old enough....
Yes, two things:
- the original Bucket class died so long ago that just a simple dict was being used, and this has no meaningful order.
- there's no need to reverse the keys anyway; just delete from the other end.
- does it matter that IOBTree.Bucket has gone away and that tcache is now a simple dictionary? It certainly seems to make the keys.reverse() on line 370 superfluous and the keys[-1]<t on line 371 less reliable.
Yes. You want to kill the dict and use the "IOBucket"...
Actually I used an OOBucket. The silly casting of time to an int, just so it could be used as the key for an IOBucket, was the source of two memory leaks.
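[A sketch of the kind of leak the int cast invites; the names are illustrative and the mappings below stand in for the real Bucket classes. When two cache writes land in the same second, int(time.time()) gives both the same IOBucket key, so the second write overwrites the first purge record and that cache entry is never cleaned up. Keying on the full float, as an OOBucket allows, keeps one purge record per write.]

```python
t = 1000.0                      # a fixed example timestamp
tcache_int = {}                 # stand-in for the old int-keyed IOBucket
tcache_float = {}               # stand-in for the float-keyed OOBucket

for i, query in enumerate(('SELECT a', 'SELECT b')):
    now = t + i * 0.25          # two writes within the same second
    tcache_int[int(now)] = query    # key collision: first record is lost
    tcache_float[now] = query       # distinct keys: both records survive

assert len(tcache_int) == 1     # 'SELECT a' lost its purge record -> leak
assert len(tcache_float) == 2
```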
If you are at it.
The idea to have this cache maintained via a "_v_" attribute is insane. A module level cache would be much more efficient and save memory as well.
Can you explain this a bit more? I'm unlikely to have further time to do this as well, but you're welcome to pick up where I left off...

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk
Chris Withers wrote at 2006-11-20 07:46 +0000:
... Actually I used OOBucket. The silly casting-time-to-int just so it could be used as the key for an IOBucket was a source of 2 memory leaks.
If you are at it.
The idea to have this cache maintained via a "_v_" attribute is insane. A module level cache would be much more efficient and save memory as well.
Can you explain this a bit more? I'm unlikely to have further time to do this as well, but you're welcome to pick up where I left off...
I will not do anything with ZSQLMethods -- as there are my "CCZSQLMethods". Anyone who wants sane ZSQLMethods can use "CCZSQLMethods"...

--
Dieter
participants (3)
- Chris Withers
- Dieter Maurer
- Tino Wildenhain