A question about __setstate__ in Shared/DC/ZRDB/Connection.py
Hi all. I'm working on an application which uses Zope (2.8, at the moment) and ZPsycopgDA (toghter with a number of other products). While writing an acceptance test, I encountered a strange problem: the test locks up. A further investigation shown that there were two connections at the database; one of them was not committed, the other one was blocked waiting for the other to commit. I therefore used the pdb in order to stop the execution of the test inside the connect method of the ZPsycopgDA.DA. Once I had that breakpoint, I was able to get the logs of the two transactions on the database, and I had the confirmation that indeed there were two different transactions. So, I wondered what could possibily happen, I mean why during a test there could be a second connect to the database. I issued a "bt" to see the stack of calls leading to the connect, and what I could see was that the coonect was called inside the __setstate__ method of Shared/DC/ZRDB/Connection.py. I assume therefore that the ZPsycopgDA object has been "ghostified", during the transaction. But this "assumption" is not supported by any evidence. In particular, it is not supported by my knowledge of the internal behaviour of ZODB on objects during a single transaction. Can anyone provide suggestion on this topic? Regards Marco -- Marco Bizzarri http://notenotturne.blogspot.com/ http://iliveinpisa.blogspot.com/
On Fri, Sep 19, 2008 at 9:23 AM, Marco Bizzarri <marco.bizzarri@gmail.com> wrote:
I did further investigation on the topic, and I think I've pinned down the problem. I don't know the solution, but I can reproduce it with a small sample. Here is the sample:

    import os
    import sys
    import unittest

    if __name__ == '__main__':
        execfile(os.path.join(sys.path[0], '../framework.py'))

    from Testing import ZopeTestCase
    from OFS import Image
    from Products.ZPsycopgDA.DA import manage_addZPsycopgConnection
    from Products.ZSQLMethods import SQL

    class DoubleTransactionTest(ZopeTestCase.ZopeTestCase):

        def _add_big_image(self, value, data):
            Image.manage_addFile(self.app, "f%06s" % value, data, "a title")

        def test_showdouble(self):
            manage_addZPsycopgConnection(self.app, "db_connection", "",
                                         "host=localhost user=postgres dbname=template1")
            self.app._setObject('sql', SQL.SQL("sql", "", "db_connection", "",
                                               "select * from pg_tables"))
            self.app.sql()
            data = "*" * (1 << 20)
            for x in range(1000):
                self._add_big_image(x, data)
                print "Added %s" % x
            self.app.sql()

    if __name__ == '__main__':
        unittest.main()

I'm doing four things here:

- creating a db connection;
- making a query to the db (this causes a transaction to begin);
- creating a lot of "big" files (specifically, larger than 2 * 2^16 bytes);
- making another query to the db.

Once I create a big file, I fall past the following branch inside OFS.Image._read_data:

    if size <= 2*n:
        seek(0)
        if size < n:
            return read(size), size
        return Pdata(read(size)), size

    # Make sure we have an _p_jar, even if we are a new object, by
    # doing a sub-transaction commit.
    transaction.savepoint(optimistic=True)

This eventually calls ZODB.Connection.savepoint which, just before returning, triggers a cacheGC; this, I'm afraid, causes the db_connection to be evicted from the cache itself, thus leaving it without its _v_ attributes.

Hope this can help in giving suggestions.

Regards
Marco
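[Editorial note: a back-of-envelope check of the numbers in the sample above, using the 2^16-byte chunk size implied by the "2 * 2^16" figure and the 1 MiB files created by the test:]

```python
# Why the test trips the savepoint path: files larger than 2*n bytes
# skip the simple branch of OFS.Image._read_data and are split into
# Pdata chunks, with a savepoint (and hence a possible cacheGC) along
# the way. Sizes below are taken from the sample test above.
n = 1 << 16                      # assumed chunk size: 64 KiB (2^16)
size = 1 << 20                   # each test file is 1 MiB

takes_pdata_path = size > 2 * n  # the "size <= 2*n" branch is skipped
chunks_per_file = size // n      # Pdata objects created per file
total_chunks = 1000 * chunks_per_file

print(takes_pdata_path)          # -> True
print(chunks_per_file)           # -> 16
print(total_chunks)              # -> 16000 Pdata objects in one test run
```

With tens of thousands of Pdata objects passing through a cache that holds only a few hundred, it is plausible that the db_connection object gets evicted at one of the savepoints.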
Marco Bizzarri wrote:
Thanks for digging further into it; I couldn't imagine how that was occurring. In this case, the large number of created Pdata objects (one per 64k chunk of each of your images) are causing your connection object to be evicted from the cache at one of the savepoints, and thus ghostified (which is where it loses its volatiles).

There is a special 'STICKY' state which prevents ghostifying, but it can't be set from Python code. You could, however, set '_p_changed' on the connection at the beginning of the method, and then delete it at the end: changed objects can't be ghostified. E.g.:

    def my_method(self):
        self.connection._p_changed = 1
        try:
            self.sql()  # now do the stuff which used to ghostify the connection
        finally:
            del self.connection._p_changed

It is a nasty workaround, but should help prevent the lockup.

Tres.
--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver@palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
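[Editorial note: the set-then-delete pattern above can also be packaged as a hypothetical `pinned` context manager. The sketch below demonstrates only the bookkeeping against a plain stand-in object; on a real Persistent object the `_p_changed` assignment is intercepted by the C base class:]

```python
from contextlib import contextmanager

@contextmanager
def pinned(obj):
    # Hypothetical helper wrapping the workaround: mark the object
    # changed so the cache will not ghostify it, and undo the mark
    # afterwards even if the body raises.
    obj._p_changed = 1
    try:
        yield obj
    finally:
        del obj._p_changed

# Plain stand-in, NOT a real ZRDB connection.
class FakeConnection:
    pass

conn = FakeConnection()
with pinned(conn):
    during = hasattr(conn, '_p_changed')  # True inside the block
after = hasattr(conn, '_p_changed')       # False once the block exits
print(during, after)                      # -> True False
```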
On Fri, Sep 19, 2008 at 3:33 PM, Tres Seaver <tseaver@palladion.com> wrote:
Marco Bizzarri wrote:
Thanks for digging further into it; I couldn't imagine how that was occurring. In this case, the large number of created Pdata objects (one per 64k chunk of each of your images) are causing your connection object to be evicted from the cache at one of the savepoints, and thus ghostified (which is where it loses its volatiles).
There is a special 'STICKY' state which prevents ghostifying, but it can't be set from Python code. You could, however, set '_p_changed' on the connection at the beginning of the method, and then delete it at the end: changed objects can't be ghostified. E.g.:
    def my_method(self):
        self.connection._p_changed = 1
        try:
            self.sql()  # now do the stuff which used to ghostify the connection
        finally:
            del self.connection._p_changed
It is a nasty workaround, but should help prevent the lockup.
Tres.
Thanks for the suggestion, Tres, I'm trying it right now.

I think this could be responsible for the problem I had a few months ago, under the name: "Asking advice on a Zope "stuck" (or: what did I do wrong?)"

Do you think there will be some sort of "general" solution to the problem? I mean, the problem is actually that there are some objects which should not be ghostified, or am I wrong?

Regards
Marco
Marco Bizzarri wrote:
Thanks for the suggestion, Tres, I'm trying it right now.
I think this could be responsible for the problem I had a few months ago, under the name: "Asking advice on a Zope "stuck" (or: what did I do wrong?)"
Do you think there will be some sort of "general" solution to the problem? I mean, the problem is actually that there are some objects which should not be ghostified, or am I wrong?
There are two problems here:

- Some objects need to be able to mark themselves as "sticky" for at least the duration of a transaction; my workaround is hackish, because if you omit the 'del conn._p_changed' it causes the object to be written needlessly; likewise, if the conn object *is* actually written to during the transaction, those changes will be discarded.

- We need a way to keep the Pdata objects from evicting "precious" objects; ideally, Pdata instances would never be added to the cache at all. I worked a bit on a spike in which the Pdata iterator part would use a one-off connection with a zero-sized cache, but got stuck somewhere; maybe somebody else can make it work.

Tres.
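[Editorial note: a toy LRU cache -- NOT the real ZODB pickle cache, and assuming a cache target of 400 objects for illustration -- showing how a flood of Pdata-like entries evicts a "precious" object at garbage-collection time:]

```python
from collections import OrderedDict

# Toy LRU cache illustrating the eviction problem described above.
class ToyCache:
    def __init__(self, size):
        self.size = size
        self.items = OrderedDict()

    def add(self, oid):
        # Most-recently-used entries live at the end of the dict.
        self.items[oid] = True
        self.items.move_to_end(oid)

    def gc(self):
        # Evict least-recently-used entries down to the target size,
        # as cacheGC does at savepoint time.
        while len(self.items) > self.size:
            self.items.popitem(last=False)

cache = ToyCache(size=400)        # assumed cache target, illustrative
cache.add('db_connection')        # the "precious" DA object
for i in range(16000):            # one Pdata per 64 KiB chunk
    cache.add('pdata-%d' % i)
cache.gc()                        # savepoint-time garbage collection
print('db_connection' in cache.items)   # -> False: it was ghostified
```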
Tres Seaver wrote at 2008-9-19 10:15 -0400:
... There are two problems here:
- We need a way to keep the Pdata objects from evicting "precious" objects; ideally, Pdata instances would never be added to the cache at all. I worked a bit on a spike in which the Pdata iterator part would use a one-off connection with a zero-sized cache, but got stuck somewhere; maybe somebody else can make it work.
But other large operations, too, can flush objects from the cache -- e.g. large scale "catalog_object". Thus, special treatment of "Pdata" can only reduce the risk but not remove it. -- Dieter
Tres Seaver wrote at 2008-9-19 09:33 -0400:
... There is a special 'STICKY' state which prevents ghostifying, but it can't be set from Python code. You could, however, set '_p_changed' on the connection at the beginning of the method, and then delete it at the end: changed objects can't be ghostified. E.g.:
    def my_method(self):
        self.connection._p_changed = 1
        try:
            self.sql()  # now do the stuff which used to ghostify the connection
        finally:
            del self.connection._p_changed
Are you sure that this works?

According to my (not very clear) memory, "_p_changed" is a C-level attribute (that is definite) which can be set to "1" from application level but not reset (of that I am not sure).

-- Dieter
On Sat, Sep 20, 2008 at 8:24 AM, Dieter Maurer <dieter@handshake.de> wrote:
Tres Seaver wrote at 2008-9-19 09:33 -0400:
... There is a special 'STICKY' state which prevents ghostifying, but it can't be set from Python code. You could, however, set '_p_changed' on the connection at the beginning of the method, and then delete it at the end: changed objects can't be ghostified. E.g.:
    def my_method(self):
        self.connection._p_changed = 1
        try:
            self.sql()  # now do the stuff which used to ghostify the connection
        finally:
            del self.connection._p_changed
Are you sure that this works?
According to my (not very clear) memory, "_p_changed" is a C-level attribute (that is definite) which can be set to "1" from application level but not reset (of that I am not sure).
-- Dieter
As I said in my previous post, I modified my test case to check whether this works, but I'm afraid it does not (i.e. I can still see two connections to the database).

Regards
Marco
Sorry to bother you guys: is there any suitable workaround for this? I tried the suggested one, but it does not work.

Regards
Marco
Marco Bizzarri wrote at 2008-9-19 09:23 +0200:
... I assume therefore that the ZPsycopgDA object has been "ghostified", during the transaction. But this "assumption" is not supported by any evidence. In particular, it is not supported by my knowledge of the internal behaviour of ZODB on objects during a single transaction.
Can anyone provide suggestion on this topic?
Cache garbage collection can happen at savepoint time. Then, volatile attributes can be lost.

There is an age-old proposal, "http://wiki.zope.org/ZODB/VolatileAttributeLifetimeGarantee", which would allow avoiding this problem.

As we have used this feature for ages in our local Zope/ZODB version, I could provide an implementing patch (for ZODB 3.8).

-- Dieter
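[Editorial note: one robustness pattern, an assumed alternative not proposed in the thread, is to keep the native connection in a module-level pool keyed by a stable id rather than in a `_v_` attribute, so ghostifying the persistent wrapper cannot drop it. A toy sketch:]

```python
# Sketch of an assumed alternative to _v_ storage: the native DB
# connection lives in a module-level pool keyed by a stable id, so
# losing the wrapper's in-memory state neither closes the connection
# nor opens a duplicate one.
_pool = {}

class ToyDA:
    def __init__(self, conn_id):
        self.conn_id = conn_id        # persistent; survives pickling

    def connection(self, factory):
        if self.conn_id not in _pool: # connect only once per id
            _pool[self.conn_id] = factory()
        return _pool[self.conn_id]

opened = []
factory = lambda: opened.append('conn') or len(opened)

da = ToyDA('db_connection')
first = da.connection(factory)

# Simulate ghostification: rebuild the wrapper from persistent state.
da = ToyDA('db_connection')
second = da.connection(factory)

print(first == second, len(opened))   # -> True 1: still one connection
```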
On Sat, Sep 20, 2008 at 8:21 AM, Dieter Maurer <dieter@handshake.de> wrote:
Marco Bizzarri wrote at 2008-9-19 09:23 +0200:
... I assume therefore that the ZPsycopgDA object has been "ghostified", during the transaction. But this "assumption" is not supported by any evidence. In particular, it is not supported by my knowledge of the internal behaviour of ZODB on objects during a single transaction.
Can anyone provide suggestion on this topic?
Cache garbage collection can happen at savepoint time. Then, volatile attributes can be lost.
There is an age-old proposal, "http://wiki.zope.org/ZODB/VolatileAttributeLifetimeGarantee", which would allow avoiding this problem.
As we have used this feature for ages in our local Zope/ZODB version, I could provide an implementing patch (for ZODB 3.8).
-- Dieter
I'm working right now with Zope 2.8, which I don't think is running on that version of ZODB; is it possible to backport such a patch to Zope 2.8? I'm not asking you to do the work, I'm just asking if, in theory, it is possible to do it, or if it relies on something which has been introduced in newer releases of ZODB.

Regards
Marco
Marco Bizzarri wrote at 2008-9-20 08:41 +0200:
... I'm working right now with Zope 2.8, which I don't think is running on that version of ZODB; is it possible to backport such a patch to Zope 2.8? I'm not asking you to do the work, I'm just asking if, in theory, it is possible to do it, or if it relies on something which has been introduced in newer releases of ZODB.
Sure. The proposal is age-old -- and we have been using the implementation for years -- also in Zope 2.8.

It is a bit more difficult for me to provide a clean patch for the feature against a stock Zope 2.8 -- as besides this, we have many other modifications/improvements and it is not easy to separate them.

-- Dieter
participants (3)
- Dieter Maurer
- Marco Bizzarri
- Tres Seaver