I have a couple hundred Articles (class instances) in my ZODB. Most instances are 200 kB, some are much larger. One of the Article properties is Number of times the article was read. Any increment causes ZODB to grow by the size of the given instance, due to the fact that ZODB will append a new version of the whole 200 kB or 2MB instance just because a single byte was changed.

Consequently ZODB grows several hundred MB in a single day even if no new article is added.

This calls for some form of non-undoable storage.

One possibility is to move the "NumberOfReaders" attribute from ZODB to my SQL server. Not a problem, but I'm not sure what is the best way to maintain connection between ZODB instance (it can appear at different places in ZODB directory structure and it can be moved from place to place with Cut and Paste in ZMI) and SQL table. Connection means SQL key column. Instance ids cannot be used as a SQL key, because they are nonunique - each folder has a sequence of instances numbered 1...x. So there are over ten instances named "1". Instance addresses = URLs could be used but I would have to write my own methods / interfaces for moving Articles in the directory structure to maintain the connection between ZODB and SQL table.

Is there an obvious elegant solution that I am missing?

-- Milos Prudek
On Mon, Apr 18, 2005 at 02:38:01PM +0200, Milos Prudek wrote:
> I have a couple hundred Articles (class instances) in my ZODB. Most instances are 200 kB, some are much larger. One of the Article properties is Number of times the article was read. Any increment causes ZODB to grow by the size of the given instance, due to the fact that ZODB will append a new version of the whole 200 kB or 2MB instance just because a single byte was changed.
> Consequently ZODB grows several hundred MB in a single day even if no new article is added.
> This calls for some form of non-undoable storage.
> One possibility is to move the "NumberOfReaders" attribute from ZODB to my SQL server. Not a problem, but I'm not sure what is the best way to maintain connection between ZODB instance (it can appear at different places in ZODB directory structure and it can be moved from place to place with Cut and Paste in ZMI) and SQL table. Connection means SQL key column. Instance ids cannot be used as a SQL key, because they are nonunique - each folder has a sequence of instances numbered 1...x. So there are over ten instances named "1". Instance addresses = URLs could be used but I would have to write my own methods / interfaces for moving Articles in the directory structure to maintain the connection between ZODB and SQL table.
> Is there an obvious elegant solution that I am missing?
You might consider replacing the NumberOfReaders attribute with a first-class persistent object, e.g. a PersistentList or IIBTree or some such. This would prevent Zope from saving the entire Article object when that one attribute changes. You'd still get some bloat from all the historical revisions of the attribute, but it would be MUCH less.

Something like:

    # Older ZODB releases used ZODB.PersistentList instead.
    from persistent.list import PersistentList

    class Article(...):

        def __init__(self):
            ...
            # The counter lives in its own persistent object.
            self._numberOfReaders = PersistentList([0])

        def getNumberOfReaders(self):
            return self._numberOfReaders[0]

        def incrementNumberOfReaders(self):
            # do NOT set self._p_changed = 1
            self._numberOfReaders[0] += 1

You could then have NumberOfReaders become a ComputedAttribute to allow client code to keep accessing it as an attribute. (This is like a read-only "property" in recent versions of Python, not to be confused with Zope's "properties"; Zope doesn't support pythonic properties yet.) E.g., inside the class body:

    from ComputedAttribute import ComputedAttribute

    NumberOfReaders = ComputedAttribute(getNumberOfReaders)

But AFAIK, ComputedAttributes don't support write methods, so client code can't write "someArticle.NumberOfReaders += 1". But then, you wouldn't be able to do that with a SQL-based solution either.

-- Paul Winkler http://www.slinkp.com
Paul Winkler <pw_lists@slinkp.com> wrote:
> On Mon, Apr 18, 2005 at 02:38:01PM +0200, Milos Prudek wrote:
>> I have a couple hundred Articles (class instances) in my ZODB. Most instances are 200 kB, some are much larger. One of the Article properties is Number of times the article was read. Any increment causes ZODB to grow by the size of the given instance, due to the fact that ZODB will append a new version of the whole 200 kB or 2MB instance just because a single byte was changed.
>> Consequently ZODB grows several hundred MB in a single day even if no new article is added.
>> This calls for some form of non-undoable storage.
>> Is there an obvious elegant solution that I am missing?
> You might consider replacing the NumberOfReaders attribute with a first-class persistent object. e.g. you could use a PersistentList or IIBTree or some such. This would prevent Zope from saving the entire Article object when that one attribute changes. You'd still get some bloat from all the historical revisions of the attribute, but it would be MUCH less.
A better candidate, rather than PersistentList, would be a BTrees.Length. Florent -- Florent Guillaume, Nuxeo (Paris, France) CTO, Director of R&D +33 1 40 33 71 59 http://nuxeo.com fg@nuxeo.com
> A better candidate, rather than PersistentList, would be a BTrees.Length.
Never heard about this. Why is it better? What does it do? Where can I find more info about it? -- Milos Prudek http://www.spoxdesign.com - your web usability testing
Milos Prudek <prudek@bvx.cz> wrote:
>> A better candidate, rather than PersistentList, would be a BTrees.Length.
> Never heard about this. Why is it better? What does it do? Where can I find more info about it?
It's designed to be a counter and does automatic conflict resolution. It was mentioned a number of times in the past on the zope and zodb-dev lists. Florent -- Florent Guillaume, Nuxeo (Paris, France) CTO, Director of R&D +33 1 40 33 71 59 http://nuxeo.com fg@nuxeo.com
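For readers who have not met it, here is a minimal sketch of such a counter in use. As far as I know, `Length`, `change()`, calling the object to read its value, and `_p_resolveConflict()` are the real BTrees.Length interface. The fallback class is only a stand-in so the snippet runs without BTrees installed, and `Article` here is a hypothetical plain class, not the poster's actual code.

```python
try:
    from BTrees.Length import Length  # the real persistent counter
except ImportError:
    # Stand-in (assumption, for illustration only) mirroring the small
    # interface used below, so the sketch runs without BTrees installed.
    class Length(object):
        def __init__(self, v=0):
            self.value = v

        def change(self, delta):
            self.value += delta

        def __call__(self):
            return self.value

        def _p_resolveConflict(self, old, s1, s2):
            # Merge two concurrent updates: apply both deltas to the old value.
            return s1 + s2 - old


class Article(object):
    """Hypothetical stand-in for the poster's Article class."""

    def __init__(self):
        # With a Persistent Article stored in ZODB, this counter would be
        # saved as its own small record, separate from the large Article.
        self._numberOfReaders = Length(0)

    def getNumberOfReaders(self):
        return self._numberOfReaders()

    def incrementNumberOfReaders(self):
        self._numberOfReaders.change(1)


article = Article()
article.incrementNumberOfReaders()
article.incrementNumberOfReaders()
reads = article.getNumberOfReaders()

# Two transactions both started from 10 reads; one saved 11, the other 13.
# The merge rule keeps both increments instead of raising a conflict.
merged = Length()._p_resolveConflict(10, 11, 13)
```

The merge rule is what makes this a better fit than a PersistentList for a hit counter: two simultaneous page views update the same object, and the counter can combine them rather than forcing one request to retry.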
> It's designed to be a counter and does automatic conflict resolution. It was mentioned a number of times in the past on the zope and zodb-dev lists.
Thank you. I found detailed info in the mailing list archive, just like you said. -- Milos Prudek http://www.spoxdesign.com - your web usability testing
> from ZODB import PersistentList
Interesting. I'll look at this idea. I never heard of this class.
> You could then have NumberOfReaders become a ComputedAttribute
> But AFAIK, ComputedAttributes don't support write methods.
Um, are you saying that the solution would actually not work for my use case?
> So client code can't write "someArticle.NumberOfReaders += 1". But then, you wouldn't be able to do that with a SQL-based solution either.
Why not? It's easy to write ZSQL method to update data. Probably we do not understand each other. -- Milos Prudek
On Mon, Apr 18, 2005 at 06:29:49PM +0200, Milos Prudek wrote:
>> from ZODB import PersistentList
> Interesting. I'll look at this idea. I never heard of this class.
>> You could then have NumberOfReaders become a ComputedAttribute
>> But AFAIK, ComputedAttributes don't support write methods.
> Um, are you saying that the solution would actually not work for my use case?
If your use case is that a writable attribute is part of your class' API, then yes I am saying that. If your count is always updated by methods of the class anyway, then you're fine.
>> So client code can't write "someArticle.NumberOfReaders += 1". But then, you wouldn't be able to do that with a SQL-based solution either.
> Why not? It's easy to write ZSQL method to update data. Probably we do not understand each other.
Probably not. If you had an SQL solution, and some client code said someArticle.NumberOfReaders += 1, how would you get that assignment to fire off your SQL code? -- Paul Winkler http://www.slinkp.com
> Probably not. If you had an SQL solution, and some client code said someArticle.NumberOfReaders += 1, how would you get that assignment to fire off your SQL code?
I would have to "calculate" the key, so simple assignment would not work, that's true. -- Milos Prudek http://www.spoxdesign.com - your web usability testing
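To make the point about assignment concrete, here is a rough sketch in plain Python (not Zope, whose classes did not support properties at the time, as noted earlier in the thread). It shows what it takes for "someArticle.NumberOfReaders += 1" to fire SQL: a property with both a getter and a setter. The table name, the `_key` attribute, and the use of a UUID as a stable key stored on the instance are all assumptions made for illustration; nobody in the thread proposed them.

```python
import sqlite3
import uuid


class Article(object):
    """Hypothetical plain-Python class; all names are illustrative."""

    def __init__(self, db):
        self._db = db
        # Assumed stable key kept on the instance, independent of id or URL,
        # so moving the object does not break the link to the SQL row.
        self._key = uuid.uuid4().hex
        db.execute(
            "INSERT INTO readers (article_key, n) VALUES (?, 0)", (self._key,)
        )

    def _get_readers(self):
        row = self._db.execute(
            "SELECT n FROM readers WHERE article_key = ?", (self._key,)
        ).fetchone()
        return row[0]

    def _set_readers(self, value):
        self._db.execute(
            "UPDATE readers SET n = ? WHERE article_key = ?", (value, self._key)
        )

    # Reading calls the getter; assigning calls the setter.
    NumberOfReaders = property(_get_readers, _set_readers)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readers (article_key TEXT PRIMARY KEY, n INTEGER)")

someArticle = Article(db)
someArticle.NumberOfReaders += 1  # getter runs, then setter with the new value
someArticle.NumberOfReaders += 1
count = someArticle.NumberOfReaders
```

Note that "+=" here is a read followed by a write, so two concurrent requests could lose an update; a real counter would more likely issue a single "UPDATE ... SET n = n + 1" from a method.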
participants (3): Florent Guillaume, Milos Prudek, Paul Winkler