Hi

Currently I have some problems with our application (Zope 2.8.4) and with conflict errors in sessions. In general, if we have a few concurrent requests that sometimes run for 3-4 minutes (and they touch the session inside), I get a lot of conflict errors with Increaser, OOBTree, Length2 etc. These are errors like:

    serial started with xxx serial currently committed xxxx

Of course I know that it is best to move such long processes to something external with Async, lovely.Remotetask or built-in Oracle jobs, but so far I have to deal with this state of our application (as I'm not the author of this but rather something like a Zope admin/supporter for this app).

I think that the ConflictErrors are caused by the sessions implementation, especially: timeslices, the "current" bucket etc. as written in Transience/HowTransienceWorks.stx. Changing session_resolution_seconds to a big value helps here. By default we had session_resolution_seconds set to 300.

I wonder how this happens. If I have two requests that at the beginning modify their sessions like: session.set('aa', 1), and then call long-running ZSQLMethods, then (if meanwhile the timeslice has changed because of a too short session_resolution_seconds):

- the first request that finishes finds that there is a new 'current' bucket, moves its session object and the second request's session object to the new 'current' bucket, and commits this
- the second request finishes and finds that its session object is not the same as it was at the beginning (because it was moved to the 'current' bucket)??

Can anybody say if I'm right here?

I also tried the Faster product to manage sessions but it behaves in a similar way (I mean it causes conflict errors in such a situation).

I found 'mcdutils' too. Tres Seaver said on zope-dev (a long time ago) that it is supposed to have no conflict errors: http://mail.zope.org/pipermail/zope-dev/2006-May/027555.html

Mcdutils: http://agendaless.com/Members/tseaver/software/mcdutils

There is only a 0.1 version. What is its current state? It seems to be dead? Can I take a look at this, or is it better not to even touch it?

-- Maciej Wisniowski
On 4/12/07, Maciej Wisniowski <maciej.wisniowski@coig.katowice.pl> wrote:
> Hi
>
> Currently I have some problems with our application (Zope 2.8.4) and with conflict errors in sessions. In general, if we have a few concurrent requests that sometimes run for 3-4 minutes (and they touch the session inside), I get a lot of conflict errors with Increaser, OOBTree, Length2 etc. These are errors like:
>
>     serial started with xxx serial currently committed xxxx
>
> Of course I know that it is best to move such long processes to something external with Async, lovely.Remotetask or built-in Oracle jobs, but so far I have to deal with this state of our application (as I'm not the author of this but rather something like a Zope admin/supporter for this app).
>
> I think that the ConflictErrors are caused by the sessions implementation, especially: timeslices, the "current" bucket etc. as written in Transience/HowTransienceWorks.stx. Changing session_resolution_seconds to a big value helps here. By default we had session_resolution_seconds set to 300.

You could keep experimenting with values to reduce the chance of conflicts: perhaps sessions that last for days, with a resolution of hours, and with inband housekeeping disabled. Note that a session-timeout-minutes of 0 enables a slightly different approach which has a little less "active" structure.
> I wonder how this happens. If I have two requests that at the beginning modify their sessions like: session.set('aa', 1), and then call long-running ZSQLMethods, then (if meanwhile the timeslice has changed because of a too short session_resolution_seconds):
>
> - the first request that finishes finds that there is a new 'current' bucket, moves its session object and the second request's session object to the new 'current' bucket, and commits this
> - the second request finishes and finds that its session object is not the same as it was at the beginning (because it was moved to the 'current' bucket)??
>
> Can anybody say if I'm right here?

I don't think the session mechanics operate like that at the end of a transaction. More generally, what is happening is that the second transaction is trying to commit data that was changed by an earlier transaction after the second transaction read that data. In this case the data is various bits of the internals that make up sessions and the transience storage.

Very careful use of explicit transaction commits may be all that you need in your application. For example, make all your edits of the session data early in the request and then commit the transaction. Then proceed with the longer operation. That might destroy the consistency of your application data, though.

-- Michael
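The read-then-commit race described in this reply can be sketched with a toy model. This is not ZODB code: `ToyStore`, `begin` and `commit` are hypothetical names, and a single integer serial stands in for the per-object serials seen in the log messages. The second half shows the suggested "edit the session early, commit, then do the slow work" pattern under the same assumptions.

```python
class ConflictError(Exception):
    """Raised when the shared object changed after the transaction read it."""


class ToyStore:
    """One shared object guarded by a serial number (toy stand-in for an oid)."""

    def __init__(self):
        self.serial = 0
        self.value = {}

    def begin(self):
        # A transaction remembers the serial it started with and works on a copy.
        return {"start_serial": self.serial, "value": dict(self.value)}

    def commit(self, txn):
        # The commit succeeds only if nobody else committed in the meantime.
        if txn["start_serial"] != self.serial:
            raise ConflictError(
                "serial this txn started with %d, serial currently committed %d"
                % (txn["start_serial"], self.serial))
        self.value = txn["value"]
        self.serial += 1


def run_demo():
    store = ToyStore()
    outcomes = []

    # Long-running pattern: both requests read the shared data, then work.
    first, second = store.begin(), store.begin()
    first["value"]["a"] = 1
    second["value"]["b"] = 2
    store.commit(first)              # the first finisher wins
    try:
        store.commit(second)         # the second one read stale data
        outcomes.append("second committed")
    except ConflictError:
        outcomes.append("second conflicted")

    # Early-commit pattern: write, commit at once, then do the slow work.
    third = store.begin()
    third["value"]["c"] = 3
    store.commit(third)
    fourth = store.begin()           # begins after third has committed
    fourth["value"]["d"] = 4
    store.commit(fourth)
    outcomes.append("early commits succeeded")
    return outcomes


print(run_demo())
```

The early commit does not remove the race; it only shrinks the window between reading and committing from minutes to almost nothing.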
> You could keep experimenting with values to reduce the chance of conflicts: perhaps sessions that last for days, with a resolution of hours, and with inband housekeeping disabled.
> Note that a session-timeout-minutes of 0 enables a slightly different approach which has a little less "active" structure.

Yes, setting high values for the timeout and session resolution seconds, or disabling the session timeout by setting it to '0', reduces the rate of conflicts. I tried disabling inband housekeeping but this didn't help in this case.

> I don't think the session mechanics operate like that at the end of a transaction. More generally, what is happening is that the second transaction is trying to commit data that was changed by an earlier transaction after the second transaction read that data. In this case the data is various bits of the internals that make up sessions and the transience storage.

Right, but I would like to know exactly how this goes, e.g. when I can expect conflicts. So far I'm still not sure when and why a conflict will appear.

Thanks for the answer

-- Maciej Wisniowski
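One way to reason about the timeslice hypothesis raised in this thread is to ask how often a request of a given length spans a timeslice boundary. The sketch below assumes, as a simplification, that a timeslice is the clock time rounded down to a multiple of session_resolution_seconds. The function names are hypothetical and this is not the Transience code. It only counts boundary crossings; whether each crossing really produces a ConflictError is not established here.

```python
def timeslice(now, period):
    """Start of the period-second window that contains `now` (simplified)."""
    return int(now) - int(now) % period


def crosses_boundary(start, duration, period):
    """True if a request starting at `start` finishes in a later timeslice."""
    return timeslice(start, period) != timeslice(start + duration, period)


def crossing_fraction(duration, period):
    """Fraction of possible start seconds for which the request crosses."""
    hits = sum(crosses_boundary(s, duration, period) for s in range(period))
    return hits / period


# A 4-minute request with the 300-second resolution mentioned in the thread.
print(crossing_fraction(duration=240, period=300))
# The same request with a one-hour resolution.
print(crossing_fraction(duration=240, period=3600))
```

Under this model a request crosses a boundary whenever it starts within `duration` seconds of the end of a slice, so lengthening the resolution lowers the fraction, which is consistent with the observation that large values help.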
Maciej Wisniowski wrote:
> I also tried the Faster product to manage sessions but it behaves in a similar way (I mean it causes conflict errors in such a situation).
>
> I found 'mcdutils' too. Tres Seaver said on zope-dev (a long time ago) that it is supposed to have no conflict errors: http://mail.zope.org/pipermail/zope-dev/2006-May/027555.html
>
> Mcdutils: http://agendaless.com/Members/tseaver/software/mcdutils
>
> There is only a 0.1 version. What is its current state? It seems to be dead? Can I take a look at this, or is it better not to even touch it?
'mcdutils' is in a "worked for me under light load" state; I never ended up deploying it in production. Because it is not based on the ZODB, it is certainly not going to raise any ZODB conflict errors. However, using memcache as the only backing store for your data is *not* a recommended practice by the memcache developers: they designed it as a *cache*, not an atomic storage.

I dropped further development on it once I evaluated the cost of having session data disappear (or become inaccessible) when new memcache servers were added, or old ones removed.

I *do* use 'faster' in production, with the session storage mounted across ZEO (a configuration which nobody in their right mind would do with the standard session storage). Some conflicts are possible in the default configuration, although I never see them in practice at the user level. However, that application does *not* set up multiple long-running transactions which attempt to mutate the same session data.

If you are *sharing* mutable session data between multiple long-running requests, and expect to have no conflicts, you are in for a disappointment: unless your application can supply resolution logic, you *want* conflict errors in such cases.

If you believe that 'faster' is raising conflicts due to its own internal data structures (OOBTree bucket splits), rather than in the application-defined session data, there is a conflict-free alternative available: we found that it was slower than the other, and therefore didn't scale as well, even given the possibility of conflicts.

To enable the conflict-free storage, you need to patch the '_BUCKET_TYPE' class-level variable of the storage to use 'AppendOnlyDict' rather than 'OOBTree'. E.g.:

    from Products.faster.sessiondata import CBSessionDataContainer
    from Products.faster.appendict import AppendOnlyDict
    CBSessionDataContainer._BUCKET_TYPE = AppendOnlyDict

I had actually meant to make that a zope.conf-tweakable setting, but never got around to it.

Tres.
--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver@palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
> I dropped further development on it once I evaluated the cost of having session data disappear (or become inaccessible) when new memcache servers were added, or old ones removed.

Thanks for clearing that up.
> If you are *sharing* mutable session data between multiple long-running requests, and expect to have no conflicts, you are in for a disappointment: unless your application can supply resolution logic, you *want* conflict errors in such cases.

I'm not sharing session data between requests. I have two distinct requests (e.g. from different browsers) and I expect each of them to have its own session object. So if request 1 puts something into its session, I would expect that this has nothing to do with the second request's session, but this is not always true (I think it happens when a new timeslice appeared meanwhile). Moreover, it is not even necessary to change the session data! Even just calling self.REQUEST.SESSION causes conflicts.
A simple test I did is to create an external method like:

    import time

    def testme(self):
        print 'testme started'
        self.REQUEST.SESSION    # only accesses the session, stores nothing
        time.sleep(5)
        print 'testme after sleep'
        return 'finished'

and call this manually from two different browsers. It DOESN'T change session data!

I have a Zope 2.9 instance with the default setting of session resolution seconds == 20. I started zopectl fg with the logger level set to debug. After calling my external method I see output like below:

##### Default Zope session: I entered localhost:8081/testme from two browsers ##########

    2007-04-12 08:09:03 DEBUG txn.-1276576864 new transaction
    testme started
    2007-04-12 08:09:03 DEBUG txn.-1251398752 new transaction
    testme started
    testme after sleep
    2007-04-12 08:09:08 DEBUG txn.-1276576864 commit <Connection at b588ea8c>
    2007-04-12 08:09:08 DEBUG txn.-1276576864 commit
    testme after sleep
    2007-04-12 08:09:08 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x0d, class BTrees._OOBTree.OOBTree, serial this txn started with 0x036ce3f0fca76611 2007-04-12 06:08:59.215758, serial currently committed 0x036ce3f122de5f11 2007-04-12 06:09:08.172337) (1 conflicts (0 unresolved) since startup at Thu Apr 12 08:08:51 2007)
    2007-04-12 08:09:09 DEBUG txn.-1251398752 abort
    2007-04-12 08:09:09 DEBUG txn.-1251398752 new transaction
    testme started
    testme after sleep
    2007-04-12 08:09:14 DEBUG txn.-1251398752 commit <Connection at b6307e8c>
    2007-04-12 08:09:14 DEBUG txn.-1251398752 commit

##################################

With Faster (Resolution (seconds): 20) and time.sleep(5) in my external method I had no conflicts, but after changing to time.sleep(25) I get:

##################################

    2007-04-12 08:29:44 DEBUG txn.-1268184160 new transaction
    testme started
    2007-04-12 08:29:45 DEBUG txn.-1276576864 new transaction
    testme started
    testme after sleep
    2007-04-12 08:30:09 DEBUG txn.-1268184160 commit <Connection at b588ea8c>
    2007-04-12 08:30:09 DEBUG txn.-1268184160 commit
    testme after sleep
    2007-04-12 08:30:10 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x2d, class Products.faster.sessiondata.SessionDataContainer, serial this txn started with 0x036ce4051fa32844 2007-04-12 06:29:07.415000, serial currently committed 0x036ce406266df400 2007-04-12 06:30:09.006915) (4 conflicts (0 unresolved) since startup at Thu Apr 12 08:08:51 2007)
    2007-04-12 08:30:11 DEBUG txn.-1276576864 abort
    2007-04-12 08:30:11 DEBUG txn.-1276576864 new transaction
    testme started
    testme after sleep
    2007-04-12 08:30:36 DEBUG txn.-1276576864 commit <Connection at b6307e8c>
    2007-04-12 08:30:36 DEBUG txn.-1276576864 commit

As you can see, the external method makes no change to the session data at all.
> If you believe that 'faster' is raising conflicts due to its own internal data structures (OOBTree bucket splits), rather than in the application-defined session data, there is a conflict-free alternative available: we found that it was slower than the other, and therefore didn't scale as well, even given the possibility of conflicts.

Thanks for that clue.
-- Maciej Wisniowski
> If you believe that 'faster' is raising conflicts due to its own internal data structures (OOBTree bucket splits), rather than in the application-defined session data, there is a conflict-free alternative available: we found that it was slower than the other, and therefore didn't scale as well, even given the possibility of conflicts.
>
> To enable the conflict-free storage, you need to patch the '_BUCKET_TYPE' class-level variable of the storage to use 'AppendOnlyDict' rather than 'OOBTree'. E.g.:
>
>     from Products.faster.sessiondata import CBSessionDataContainer
>     from Products.faster.appendict import AppendOnlyDict
>     CBSessionDataContainer._BUCKET_TYPE = AppendOnlyDict
I've changed to:

    _BUCKET_TYPE = AppendOnlyDict

in sessiondata.py for SessionDataContainer and CBSessionDataContainer.

Unfortunately, when I run an external method like the one below concurrently from 3 different browsers (Opera, FF, Konqueror) I still get conflict errors.

    import time

    def testme(self):
        print 'testme started'
        self.REQUEST.SESSION    # only accesses the session, stores nothing
        time.sleep(50)
        print 'testme after sleep'
        return 'finished'

Faster 'Resolution secs' is set to 20. Console output:

    2007-04-12 10:31:50 DEBUG txn.-1260708960 new transaction
    testme started
    2007-04-12 10:31:51 DEBUG txn.-1277494368 new transaction
    testme started
    2007-04-12 10:31:54 DEBUG txn.-1252316256 new transaction
    testme started
    testme after sleep
    2007-04-12 10:32:40 DEBUG txn.-1260708960 commit <Connection at b562e9ac>
    2007-04-12 10:32:40 DEBUG txn.-1260708960 commit
    testme after sleep
    2007-04-12 10:32:41 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x2b, class Products.faster.sessiondata.SessionDataContainer, serial this txn started with 0x036ce4767ce55799 2007-04-12 08:22:29.272469, serial currently committed 0x036ce480ad6d3044 2007-04-12 08:32:40.646840) (3 conflicts (0 unresolved) since startup at Thu Apr 12 10:17:09 2007)
    2007-04-12 10:32:41 DEBUG txn.-1277494368 abort
    2007-04-12 10:32:41 DEBUG txn.-1277494368 new transaction
    testme started
    testme after sleep
    2007-04-12 10:32:44 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x2b, class Products.faster.sessiondata.SessionDataContainer, serial this txn started with 0x036ce4767ce55799 2007-04-12 08:22:29.272469, serial currently committed 0x036ce480ad6d3044 2007-04-12 08:32:40.646840) (4 conflicts (0 unresolved) since startup at Thu Apr 12 10:17:09 2007)
    2007-04-12 10:32:45 DEBUG txn.-1252316256 abort
    2007-04-12 10:32:45 DEBUG txn.-1252316256 new transaction
    testme started
    testme after sleep
    2007-04-12 10:33:31 DEBUG txn.-1277494368 commit <Connection at b62a8d8c>
    2007-04-12 10:33:31 DEBUG txn.-1277494368 commit
    testme after sleep
    2007-04-12 10:33:35 DEBUG txn.-1252316256 commit <Connection at b55b982c>
    2007-04-12 10:33:35 DEBUG txn.-1252316256 commit

Also, after a few different tries, e.g. with Resolution secs == 300, I encountered conflicts with AppendOnlyDict like:

    2007-04-12 10:53:07 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x4c, class Products.faster.appendict.AppendOnlyDict, serial this txn started with 0x036ce49409621366 2007-04-12 08:52:02.199166, serial currently committed 0x036ce49515ed0477 2007-04-12 08:53:05.138871) (9 conflicts (0 unresolved) since startup at Thu Apr 12 10:17:09 2007)

Is there a point in faster where I can put a debug message to see that bucket splits (or something like that) happened?

-- Maciej Wisniowski
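The ZPublisher.Conflict entries quoted in this message share a fixed shape, so a small script can tally which object and class each conflict names. The following is a rough sketch: the regular expression is derived only from the log lines shown in this thread, and `tally_conflicts` is a hypothetical helper, not part of Zope.

```python
import re
from collections import Counter

# Matches the "ConflictError at <path>: database conflict error (oid ..., class ...,"
# prefix of the ZPublisher.Conflict lines quoted in this thread.
CONFLICT_RE = re.compile(
    r"ConflictError at (?P<path>[^\s:]+): database conflict error "
    r"\(oid (?P<oid>0x[0-9a-f]+), class (?P<cls>[\w.]+),"
)


def tally_conflicts(log_text):
    """Count conflict log entries per (oid, class) pair."""
    return Counter(
        (m.group("oid"), m.group("cls")) for m in CONFLICT_RE.finditer(log_text)
    )


# Three entries copied from the console output in this thread.
sample = """\
2007-04-12 10:32:41 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x2b, class Products.faster.sessiondata.SessionDataContainer, serial this txn started with 0x036ce4767ce55799 2007-04-12 08:22:29.272469, serial currently committed 0x036ce480ad6d3044 2007-04-12 08:32:40.646840) (3 conflicts (0 unresolved) since startup at Thu Apr 12 10:17:09 2007)
2007-04-12 10:32:44 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x2b, class Products.faster.sessiondata.SessionDataContainer, serial this txn started with 0x036ce4767ce55799 2007-04-12 08:22:29.272469, serial currently committed 0x036ce480ad6d3044 2007-04-12 08:32:40.646840) (4 conflicts (0 unresolved) since startup at Thu Apr 12 10:17:09 2007)
2007-04-12 10:53:07 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x4c, class Products.faster.appendict.AppendOnlyDict, serial this txn started with 0x036ce49409621366 2007-04-12 08:52:02.199166, serial currently committed 0x036ce49515ed0477 2007-04-12 08:53:05.138871) (9 conflicts (0 unresolved) since startup at Thu Apr 12 10:17:09 2007)
"""

for (oid, cls), count in sorted(tally_conflicts(sample).items()):
    print(oid, cls, count)
```

Grouping by oid makes it easier to see whether the conflicts fall on a container shared by all sessions or on one session's own data.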
Maciej Wisniowski wrote:
> I've changed to:
>
>     _BUCKET_TYPE = AppendOnlyDict
>
> in sessiondata.py for SessionDataContainer and CBSessionDataContainer.
>
> Unfortunately, when I run an external method like the one below concurrently from 3 different browsers (Opera, FF, Konqueror) I still get conflict errors.
>
>     import time
>
>     def testme(self):
>         print 'testme started'
>         self.REQUEST.SESSION
>         time.sleep(50)
>         print 'testme after sleep'
>         return 'finished'
Did you configure the faster SDC to disable the 'lazy' flag? That flag disables modification of the SDC's linked list until the session is actually modified, rather than being just accessed as you are doing here.
> Also, after a few different tries, e.g. with Resolution secs == 300, I encountered conflicts with AppendOnlyDict like:
>
>     2007-04-12 10:53:07 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x4c, class Products.faster.appendict.AppendOnlyDict, serial this txn started with 0x036ce49409621366 2007-04-12 08:52:02.199166, serial currently committed 0x036ce49515ed0477 2007-04-12 08:53:05.138871) (9 conflicts (0 unresolved) since startup at Thu Apr 12 10:17:09 2007)
That log message says you *are* modifying the AppendOnlyDict from multiple transactions, and it is resolving those conflicts; there are therefore no retried or failed transactions here.
> Is there a point in faster where I can put a debug message to see that bucket splits (or something like that) happened?

There are no bucket splits when using AOD. You could put a 'pdb.set_trace()' inside its '_p_resolveConflict' method, to see when it was being called.

Tres.
--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver@palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
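For readers unfamiliar with the hook mentioned here: ZODB calls `_p_resolveConflict(oldState, savedState, newState)` on a persistent object when a commit conflicts, and the method either returns a merged state or raises a conflict error. The sketch below shows what a three-way merge for an append-only mapping could look like. It is an illustration under assumed rules (keys are only ever added), written as a plain function so it runs without ZODB; it is not the code of faster's AppendOnlyDict, and `resolve_append_only` is a hypothetical name.

```python
class ConflictError(Exception):
    """Stand-in for ZODB's ConflictError so the sketch needs no ZODB."""


def resolve_append_only(old_state, saved_state, new_state):
    """Three-way merge for a mapping that only ever gains keys.

    old_state:   state both transactions started from
    saved_state: state already committed by the other transaction
    new_state:   state this transaction is trying to commit
    """
    added_here = {k: v for k, v in new_state.items() if k not in old_state}
    added_there = {k: v for k, v in saved_state.items() if k not in old_state}
    # The merge is only safe if the two sides did not add the same key
    # with different values.
    clashes = [k for k in added_here
               if k in added_there and added_here[k] != added_there[k]]
    if clashes:
        raise ConflictError("both transactions added key(s) %r" % clashes)
    merged = dict(saved_state)
    merged.update(added_here)
    return merged


old = {"session-1": "a"}
saved = {"session-1": "a", "session-2": "b"}   # committed by another request
new = {"session-1": "a", "session-3": "c"}     # what this request wants
print(resolve_append_only(old, saved, new))
```

A log line ending in "(0 unresolved)" is consistent with a merge of this kind having succeeded, which matches the explanation given in this reply.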
> Did you configure the faster SDC to disable the 'lazy' flag? That flag disables modification of the SDC's linked list until the session is actually modified, rather than being just accessed as you are doing here.

I tried both, with and without the 'lazy' setting. I used _BUCKET_TYPE = AppendOnlyDict, timeout set to 1200 and timeout resolution: 20
I did three concurrent requests (from three browsers) twice (the first round was just to initially create the session data objects). In the second round, lazy or not, every time I get errors like the ones attached below. The function that is called sleeps for 50 seconds and just accesses the SESSION object; it doesn't put anything into the session.

I've added print statements to _p_resolveConflict too, but it seems this function is only called once in this situation.

In general, tuning up session resolution seconds helps here.

Thanks for your answers. I've read about the Faster storage and I think I understand this much better now, although I'm not sure about the conflicts below.

(...)

    2007-04-12 21:43:28 DEBUG txn.1090525536 new transaction
    testme started
    2007-04-12 21:43:29 DEBUG txn.1098918240 new transaction
    testme started
    2007-04-12 21:43:30 DEBUG txn.1082132832 new transaction
    testme started
    testme after sleep
    2007-04-12 21:44:18 DEBUG txn.1090525536 commit <Connection at 2aaab3330110>
    2007-04-12 21:44:18 DEBUG txn.1090525536 commit
    testme after sleep
    2007-04-12 21:44:19 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x2b, class Products.faster.sessiondata.SessionDataContainer, serial this txn started with 0x036ce71f217604aa 2007-04-12 19:43:07.842424, serial currently committed 0x036ce7204e9b7a44 2007-04-12 19:44:18.423594) (21 conflicts (0 unresolved) since startup at Thu Apr 12 19:55:58 2007)
    2007-04-12 21:44:20 DEBUG txn.1098918240 abort
    2007-04-12 21:44:20 DEBUG txn.1098918240 new transaction
    testme started
    testme after sleep
    2007-04-12 21:44:20 INFO ZPublisher.Conflict ConflictError at /testme: database conflict error (oid 0x2b, class Products.faster.sessiondata.SessionDataContainer, serial this txn started with 0x036ce71f217604aa 2007-04-12 19:43:07.842424, serial currently committed 0x036ce7204e9b7a44 2007-04-12 19:44:18.423594) (22 conflicts (0 unresolved) since startup at Thu Apr 12 19:55:58 2007)
    2007-04-12 21:44:21 DEBUG txn.1082132832 abort
    2007-04-12 21:44:21 DEBUG txn.1082132832 new transaction
    testme started
    testme after sleep
    2007-04-12 21:45:10 DEBUG txn.1098918240 commit <Connection at 2aaab31052d0>
    2007-04-12 21:45:10 DEBUG txn.1098918240 commit
    testme after sleep
    inside of _p_resolveConflict for SessionDataContainer
    2007-04-12 21:45:11 DEBUG txn.1082132832 commit <Connection at 2aaab1c75710>
    2007-04-12 21:45:11 DEBUG txn.1082132832 commit

(...)

-- Maciej Wisniowski
Maciej Wisniowski wrote at 2007-4-11 20:28 +0200:
> Currently I have some problems with our application (Zope 2.8.4) and with conflict errors in sessions. In general, if we have a few concurrent requests that sometimes run for 3-4 minutes (and they touch the session inside), I get a lot of conflict errors with Increaser, OOBTree, Length2 etc.

Session conflict errors are very common -- for the following reason:

While individual sessions are usually used only by a single request at a time, they share common administrative data structures (e.g. the OOBTree that holds the session objects).

The administrative data structures usually have conflict resolution *BUT* conflict resolution requires sufficient history -- and the temporary storage usually holding the sessions only has minimal history. Therefore, conflict resolution is usually ineffective (impossible) and you get lots of conflicts.

-- Dieter
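The point about history can be made concrete with a toy model. A three-way merge needs the state the transaction started from, so if the storage has already discarded that revision, resolution cannot even be attempted. The class below is a hypothetical illustration with made-up retention numbers; it does not reproduce how the temporary storage actually prunes revisions.

```python
class ToyHistoryStore:
    """Keeps only the newest `keep` committed revisions of one object."""

    def __init__(self, keep):
        self.keep = keep
        self.current = 0
        self.revisions = {0: {}}        # serial -> committed state

    def commit(self, state):
        self.current += 1
        self.revisions[self.current] = dict(state)
        # Discard every revision that falls outside the retention window.
        for serial in list(self.revisions):
            if serial <= self.current - self.keep:
                del self.revisions[serial]

    def can_resolve(self, start_serial):
        # Resolution needs the old state the transaction started from.
        return start_serial in self.revisions


def old_state_available(keep, commits_during_request):
    """Does a long request still find its starting revision at commit time?"""
    store = ToyHistoryStore(keep)
    start_serial = store.current
    for i in range(commits_during_request):
        store.commit({"n": i})          # other requests committing meanwhile
    return store.can_resolve(start_serial)


print(old_state_available(keep=1, commits_during_request=3))
print(old_state_available(keep=10, commits_during_request=3))
```

In this model a long request that overlaps several other commits loses its starting revision when little history is kept, which is the situation described above for long-running requests against the session storage.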
participants (4)
- Dieter Maurer
- Maciej Wisniowski
- Michael Dunstan
- Tres Seaver