Zope 2.6.x, 2.7.x session problems, please help (fwd)
I saw Chris's post at:

http://mail.zope.org/pipermail/zope-dev/2003-March/019116.html

As far as I can see, this bug still exists in Zope 2.6.4 and Zope 2.7.0.

It's strange, because according to the Zope change history it seems to have been fixed in 2.6.2:

Zope 2.6.2 beta 2
Bugs Fixed

* TemporaryStorage (which is used by TemporaryFolder, and thus the default sessioning configuration) no longer uses a "LowConflictConnection" database connection. This fixes a bug in which data structures used for session housekeeping data could become desynchronized; the symptom for this was KeyErrors being raised from TransientObjectContainer's get method. As a result, many more conflicts will be raised under high session load, but desynchronization will not occur.

Could you please at least point me to where I can dig to get it fixed? I will try to read the Python code in Transience.Transience, but I am not sure I am qualified in this area.

The bug is reported at zope.org as http://zope.org/Collectors/Zope/848, but so far there have been no comments that help me fix it.

Maybe something needs to be changed in my code? At the beginning of many scripts I have the following:

    request=container.REQUEST
    session=request.SESSION

Please give me a hint.

Thanks.

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
Yes, it's still broke. No, there is no fix. Sorry.

- C
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Hi

On Tue, 24 Feb 2004, Chris McDonough wrote:

Yes, it's still broke. No, there is no fix. Sorry.

Thanks, Chris. That's very bad.

From an end-user perspective it is very bad that such a critical bug was found more than a year ago and nobody at Zope has fixed the code in all that time.

I am trying to read everything you discussed in March 2003, but it is very difficult to understand how to fix this. I also doubt I will be able to, because if experienced Zope developers cannot fix it, my chances are low.

I just have no choice: we have already developed a product for Zope, so I have to support it, even if that requires hacking the application server itself.

I see that fix:

#db.klass = LowConflictConnection

but it does not work for us. At least, we keep seeing the session errors with every Zope version: we tried Zope 2.6.4 and Zope 2.7.0 with the same results.

So, as far as I can see, as soon as Zope starts using sessions and there are many customers, the site is dead for some users.

Once the 'get' error occurs for a specific session, that customer is blocked and cannot see any page on the site: all pages produce this get error.

This is a critical, blocker bug for Zope. I do not understand how people can use Zope on sites with high load.

The only thing that works is asking the customer to delete all cookies, so that the session ends and a new one starts. This is a very ugly fix, you understand...

I am not a Zope hacker myself. We just build sites; most of the time we are not developers of server solutions, so it takes time to study the Zope code well enough to fix this...

I will keep working on this, but... it's just very difficult.

Could anybody experienced with Zope development help me? I am sure this problem could happen to anyone running a big Zope site. Is anybody interested in helping to resolve this blocker Zope+Sessions issue?

The more hits a Zope-based site gets, the more customers will be unable to see it because of the session problem.

I do not know how many customers of our site are getting this, but we see it every day, several times a day per person. So, in my understanding, Zope sessions are completely broken. They work fine for several customers per hour, but at around 1000 visits per hour they just stop functioning properly.

I will try to study the Zope code... That is what Open Source is for, and I am a big fan of it. But of course, it does not help much if you are not experienced with the product itself. It takes time to learn it.

But I will try. If anybody is interested in helping, it would be great. At the moment I am trying to reproduce the problem reliably, but it seems to happen only on a site with high load.

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
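Since the failure is said to show up only under high load, one way to try to reproduce it is to drive many requests from several threads at once and record which ones fail. The following is only a minimal sketch in modern Python, not Zope 2.x code: `hammer` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever issues one session-touching request against a test site.

```python
import threading


def hammer(fetch, n_threads=8, n_requests=50):
    """Call fetch(worker_id, i) concurrently and collect every failure.

    `fetch` is a hypothetical callable supplied by the caller; in a real
    test it would issue one HTTP request that touches the session.
    """
    errors = []
    lock = threading.Lock()

    def worker(worker_id):
        for i in range(n_requests):
            try:
                fetch(worker_id, i)
            except Exception as exc:
                # Record which worker and which request failed, and how.
                with lock:
                    errors.append((worker_id, i, repr(exc)))

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

The returned list can be written to a log together with timestamps, which makes it easier to line failures up with conflict errors in the server log.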
Hi

As far as I understand, the error happens in this code.

Question to Python developers:

As far as I understand, index does contain b, and data does not contain 'b'.

Simple question: why does the line:

v = self._data[b].get(k, notfound)

throw KeyError in any case? get takes a second (default) argument, and if the key is not found, it will return [], right? It seems I do not understand the Python code at this place. Please advise.

Alex

--------------------------------- [ code from Transience.py -------

    def get(self, k, default=_marker):
        self.lock.acquire()
        try:
            DEBUG and TLOG('get: called with k=%s' % k)
            notfound = []
            current = self._getCurrentBucket()
            DEBUG and TLOG('get: current is %s' % current)
            if default is _marker: default=None
            index = self._getIndex()
            b = index.get(k, notfound)
            if b is notfound:
                # it's not here, this is a genuine miss
                DEBUG and TLOG('bucket was notfound for %s' %k)
                return default
            else:
                v = self._data[b].get(k, notfound)
                if v is notfound:
                    DEBUG and TLOG(
                        'get: %s was not found in index bucket (%s)' % (k, b))
                    return default

--------------------------------- [ end code from Transience.py ----

-------------- [error] ----------------

* Module ZPublisher.HTTPRequest, line 1218, in __getattr__
* Module ZPublisher.HTTPRequest, line 1178, in get
* Module Products.Sessions.SessionDataManager, line 93, in getSessionData
* Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
* Module Products.Transience.Transience, line 176, in new_or_existing
* Module Products.Transience.Transience, line 809, in get

KeyError: 1077572580

-------------- [end of error] ---------

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
On Sat, 2004-02-28 at 04:51, alex@halogen-dg.com wrote:
Hi
As far as I understand, the error happens in this code.
Question to Python developers:
As far as I understand, index does contain b, and data does not contain 'b'.
Simple question: why does the line:
v = self._data[b].get(k, notfound)
throw KeyError in any case?
The bit of that expression that throws the KeyError is "self._data[b]", not the ".get(k, notfound)"
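The point above can be checked with plain dictionaries. This is a minimal sketch, not the Transience code: `lookup`, `safe_lookup`, `index` and `data` are illustrative names, and ordinary dicts stand in for the index and the bucket storage. It shows that the subscript raises before `.get()` is ever reached when the index names a bucket that the data mapping no longer holds.

```python
_NOTFOUND = []  # unique sentinel, like `notfound` in the excerpt


def lookup(data, index, key, default=None):
    """Two-step lookup: index maps key -> bucket id, data maps id -> bucket."""
    bucket_id = index.get(key, _NOTFOUND)
    if bucket_id is _NOTFOUND:
        return default
    # data[bucket_id] is a plain subscript: it raises KeyError when the
    # bucket is missing, so the default passed to .get() never applies.
    return data[bucket_id].get(key, default)


def safe_lookup(data, index, key, default=None):
    """Same lookup, but a missing bucket is treated as a miss."""
    bucket_id = index.get(key, _NOTFOUND)
    if bucket_id is _NOTFOUND:
        return default
    bucket = data.get(bucket_id, _NOTFOUND)
    if bucket is _NOTFOUND:
        return default
    return bucket.get(key, default)


index = {"session-a": 1077572580}
data = {}  # the bucket is gone: index and data are out of sync

try:
    lookup(data, index, "session-a")
    raised = False
except KeyError:
    raised = True
```

Note that `safe_lookup` only hides the symptom. It does not explain how the two structures got out of step, which is the actual bug under discussion.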
On Sat, 2004-02-28 at 04:07, alex@halogen-dg.com wrote:
Hi
On Tue, 24 Feb 2004, Chris McDonough wrote:
Yes, it's still broke. No, there is no fix. Sorry.
Thanks, Chris. That's very bad.
Yes it is.
From an end-user perspective it is very bad that such a critical bug was found more than a year ago and nobody at Zope has fixed the code in all that time.
Yes it is. Although this is not really Zope Corporation's fault, this is my fault. I should have tested the sessioning machinery better under high load and found that it was broken before putting it in (way back in 2.5). Sessions should just not be in there now, as they fall over under high load, but it's not reasonable to take them out. However, they work just fine under light load, so no one but the people who have high load see the issue. So I'm hoping that someone will take it upon themselves to fix the problem or provide the resources necessary to fix it. I just haven't been able to do so yet.
I am trying to read everything you discussed in March 2003, but it is very difficult to understand how to fix this. I also doubt I will be able to, because if experienced Zope developers cannot fix it, my chances are low.
I just have no choice: we have already developed a product for Zope, so I have to support it, even if that requires hacking the application server itself.
I see that fix:
#db.klass = LowConflictConnection
but it does not work for us. At least, we keep seeing the session errors with every Zope version: we tried Zope 2.6.4 and Zope 2.7.0 with the same results.
Right. Doesn't help, apparently.
So, as far as I can see, as soon as Zope starts using sessions and there are many customers, the site is dead for some users.
Once the 'get' error occurs for a specific session, that customer is blocked and cannot see any page on the site: all pages produce this get error.
Well, you could work around this in your app code, but obviously it should not be necessary.
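One possible reading of "work around this in your app code" is to catch the KeyError where the session is first touched and then force a fresh session, which is what deleting cookies achieves by hand. The sketch below only illustrates that pattern with stand-in objects. `FakeRequest`, `get_session` and `expire_session_cookie` are hypothetical and are not the Zope API; how a session cookie is actually expired in Zope is not shown in this thread.

```python
class FakeRequest:
    """Hypothetical stand-in for a request whose session lookup may fail."""

    def __init__(self, broken=False):
        self.broken = broken
        self.cookie_expired = False

    def get_session(self):
        # Stands in for the session access that raised KeyError above.
        if self.broken:
            raise KeyError(1077572580)
        return {"cart": []}

    def expire_session_cookie(self):
        # Stands in for whatever makes the browser start a new session.
        self.cookie_expired = True


def session_or_fresh(request):
    """Return the session, or None after asking for a fresh one.

    A None result means the caller should redirect the user so that the
    next request starts with a new session instead of the stuck one.
    """
    try:
        return request.get_session()
    except KeyError:
        request.expire_session_cookie()
        return None
```

This keeps a user from being locked out of every page, at the cost of losing that user's session data.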
This is a critical, blocker bug for Zope. I do not understand how people can use Zope on sites with high load.
They don't use sessions under high load.
The only thing that works is asking the customer to delete all cookies, so that the session ends and a new one starts. This is a very ugly fix, you understand...
I am not a Zope hacker myself. We just build sites; most of the time we are not developers of server solutions, so it takes time to study the Zope code well enough to fix this...
I will keep working on this, but... it's just very difficult.
Right, otherwise it would be fixed by now.
Could anybody experienced with Zope development help me? I am sure this problem could happen to anyone running a big Zope site. Is anybody interested in helping to resolve this blocker Zope+Sessions issue?
The more hits a Zope-based site gets, the more customers will be unable to see it because of the session problem.
I do not know how many customers of our site are getting this, but we see it every day, several times a day per person. So, in my understanding, Zope sessions are completely broken. They work fine for several customers per hour, but at around 1000 visits per hour they just stop functioning properly.
I will try to study the Zope code... That is what Open Source is for, and I am a big fan of it. But of course, it does not help much if you are not experienced with the product itself. It takes time to learn it.
But I will try. If anybody is interested in helping, it would be great. At the moment I am trying to reproduce the problem reliably, but it seems to happen only on a site with high load.
I can't promise anything, but I will try to help by answering pointed questions where possible. - C
Chris McDonough wrote:
This is a critical, blocker bug for Zope. I do not understand how people can use Zope on sites with high load.
They don't use sessions under high load.
Or they don't use the standard session machinery. We use my SQLSession stuff, and it's fine under load.

-- Anthony Baxter <anthony@interlink.com.au>
It's never too late to have a happy childhood.
alex@halogen-dg.com wrote at 2004-2-28 11:07 +0200:
...
Yes, it's still broke. No, there is no fix. Sorry.
Thanks, Chris. That's very bad. From an end-user perspective it is very bad that such a critical bug was found more than a year ago and nobody at Zope has fixed the code in all that time.
When we started using Zope sessions, we experienced a set of session problems. I started fixing them, but it turned out that the implementation was very ambitious, too ambitious to get it safe.

I, therefore, switched plans and implemented an alternative "Transience" module -- much simpler and less ambitious. We have used it since then in a medium load production environment and have not seen any more problems.

I asked my boss whether I can release this implementation as "Open Source". He said "sure".

I still have to hash out a few organisational questions but I expect the module will be available within 2 weeks.

-- Dieter
On Sat, 2004-02-28 at 16:32, Dieter Maurer wrote:
When we started using Zope sessions, we experienced a set of session problems. I started fixing them, but it turned out that the implementation was very ambitious, too ambitious to get it safe.
You're very kind to put it that way. ;-)
I, therefore, switched plans and implemented an alternative "Transience" module -- much simpler and less ambitious. We have used it since then in a medium load production environment and have not seen any more problems.
I asked my boss whether I can release this implementation as "Open Source". He said "sure".
I still have to hash out a few organisational questions but I expect the module will be available within 2 weeks.
I am also trying to reimplement the transientobjectcontainer code at the moment, without an index. - C
When we started using Zope sessions, we experienced a set of session problems. I started fixing them, but it turned out that the implementation was very ambitious, too ambitious to get it safe. I, therefore, switched plans and implemented an alternative "Transience" module -- much simpler and less ambitious. We have used it since then in a medium load production environment and have not seen any more problems.
I asked my boss whether I can release this implementation as "Open Source". He said "sure".
I still have to hash out a few organisational questions but I expect the module will be available within 2 weeks.
Oh, this is really good news!!
Please announce it loudly :-)
Thanks a lot
-- Santi Camps http://zetadb.sourceforge.net
Well, I was also shamed into creating a new implementation of the sessioning stuff that might work better under higher load.

The people that are now having problems should try to replace the file in their Zope software home named "lib/python/Products/Transience/Transience.py" with this one:

http://cvs.zope.org/*checkout*/Zope/lib/python/Products/Transience/Transienc...

(be sure to make a backup of the original file first)

After putting the new file in place, restart Zope and see what happens. This stuff was just checked in, so there's sure to be some edge case bugs, but all the unit tests pass and the UI still works. Please report results back here, if possible.

Thanks!

- C
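The backup-then-replace step described above can be scripted so that the original file is never lost. This is a rough sketch in modern Python under stated assumptions: `replace_with_backup`, `restore_backup` and the `.orig` suffix are illustrative choices, the paths are supplied by the caller, and restarting Zope afterwards is left out.

```python
import shutil
from pathlib import Path


def replace_with_backup(target, replacement, backup_suffix=".orig"):
    """Copy `target` to a backup beside it, then overwrite it.

    Refuses to run if the backup already exists, so a second run cannot
    overwrite the pristine original with an already-replaced file.
    """
    target = Path(target)
    backup = target.with_name(target.name + backup_suffix)
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.copy2(target, backup)       # keep the original
    shutil.copy2(replacement, target)  # drop in the new file
    return backup


def restore_backup(target, backup_suffix=".orig"):
    """Put the saved original back in place of `target`."""
    target = Path(target)
    backup = target.with_name(target.name + backup_suffix)
    shutil.copy2(backup, target)
    return target
```

Here `target` would be the installed Transience.py and `replacement` the downloaded file. Calling `restore_backup` on the same target reverts the change if problems appear.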
Hi Chris,
Well, I was also shamed into creating a new implementation of the sessioning stuff that might work better under higher load.
Thank you very much for a fast fix/replacement of Transience :)
The people that are now having problems should try to replace the file in their Zope software home named "lib/python/Products/Transience/Transience.py" with this one:
http://cvs.zope.org/*checkout*/Zope/lib/python/Products/Transience/Transienc...
(be sure to make a backup of the original file first)
I installed the new Transience.py. In my small test it works fine, but the real test will be on Monday, when students start logging in as complete classes; sometimes hundreds of them log on simultaneously, so we will see. I kept a copy of the original file, so in case of problems I will put it back and restart Zope.
After putting the new file in place, restart Zope and see what happens. This stuff was just checked in, so there's sure to be some edge case bugs, but all the unit tests pass and the UI still works. Please report results back here, if possible.
I will report if any problem occurs.

Alex
I installed the new Transience.py. In my small test it works fine, but the real test will be on Monday, when students start logging in as complete classes; sometimes hundreds of them log on simultaneously, so we will see.
Any news? ;-)
Chris,

I'm not sure if you'd even planned to, but have you looked over this replacement "Transience.py"? Do you have an opinion on the re-implementation? I plan on looking over it myself, but even if I like it, I'd sleep better while it runs on my customer's servers if I knew it had your "blessing".

Thanks,
Steve
Er, yes... I wrote it. I think you might have mixed up message headers, as I was the one who sent the message out to the list announcing it.
Doh! Scrap my last message. I was confused. The file he's testing is the one Chris wrote (so I guess he's seen it, duh).
Hi Chris,

So far we have not had any errors with the new Transience.py :) It just works; no problems found under high load.

Alex
Great, I'm going to consider that a resounding endorsement and check it in soon; please do let me know if you see anything odd come up.

If anyone else has been having issues with the old Transience module, and would like to provide feedback on the newer implementation, please get this file:

http://cvs.zope.org/*checkout*/Products/Transience/Transience.py?rev=1.32.12...

... and temporarily replace Zope's lib/python/Products/Transience/Transience.py with this newer version to help test it out, and report back the results here.

Thanks!

- C
Chris,

No, just a few minutes ago I got this again:

Time 2004/03/03 07:45:04.662 GMT
User Name (User Id) Anonymous User (None)
Request URL http://www.chalkface.com/catalog/html/custom/index_html
Exception Type KeyError
Exception Value 1078236460

Traceback (innermost last):

* Module ZPublisher.Publish, line 100, in publish
* Module ZPublisher.mapply, line 88, in mapply
* Module ZPublisher.Publish, line 40, in call_object
* Module OFS.DTMLDocument, line 128, in __call__
  <DTMLDocument instance at 41c33890>
  URL: http://www.chalkface.com/custom/index_html/manage_main
  Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index_html
* Module DocumentTemplate.DT_String, line 474, in __call__
* Module OFS.DTMLDocument, line 121, in __call__
  <DTMLDocument instance at 41c337a0>
  URL: http://www.chalkface.com/custom/index.html/manage_main
  Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index.html
* Module DocumentTemplate.DT_String, line 474, in __call__
* Module DocumentTemplate.DT_Let, line 76, in render
* Module OFS.DTMLDocument, line 121, in __call__
  <DTMLDocument instance at 41c2b080>
  URL: http://www.chalkface.com/catalog/html/zwarehouse_html_header/manage_main
  Physical Path:/www.chalkface.com/ZWarehouse_0.8/catalog/html/zwarehouse_html_header
* Module DocumentTemplate.DT_String, line 474, in __call__
* Module DocumentTemplate.DT_Util, line 201, in eval
  __traceback_info__: cart_functions
* Module <string>, line 1, in <expression>
* Module Shared.DC.Scripts.Bindings, line 306, in __call__
* Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
* Module Products.PythonScripts.PythonScript, line 318, in _exec
* Module None, line 16, in setSessionByRequest.py
  <PythonScript at /www.chalkface.com/ZWarehouse_0.8/catalog/cart_functions/setSessionByRequest.py>
  Line 16
* Module ZPublisher.HTTPRequest, line 1218, in __getattr__
* Module ZPublisher.HTTPRequest, line 1178, in get
* Module Products.Sessions.SessionDataManager, line 93, in getSessionData
* Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
* Module Products.Transience.Transience, line 491, in new_or_existing
* Module Products.Transience.Transience, line 322, in get
* Module Products.Transience.Transience, line 198, in _move_item
* Module Products.Transience.Transience, line 419, in _gc

KeyError: 1078236460

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
(boldly crossposting this to zodb-dev, please respond on one list or the other but not both) That error *appears* to be caused by reaching a state that is impossible to reach. The code in question is: for key in list(self._data.keys(None, max_ts)): assert(key <= max_ts) STRICT and _assert(self._data.has_key(key)) for v in self._data[key].values(): to_notify.append(v) del self._data[key] The line that says "for v in self._data[key].values()" is the line that throws the KeyError. But it should be impossible for the code to throw a KeyError for the expression "self._data[key]" because the "keys" method of the _data IOBTree just told us that the key named by "key" was one of its keys via the range search; it should be an invariant. Note that in the line above that starts "STRICT and _assert...", I do the paranoid check there as there *have* been cases where BTrees range searches lied in the past. STRICT is not true in your case (it's turned off), so that check never gets run on your system, but if it had, it might have raised an assertion error. I haven't been able to provoke that kind of thing in my own stress tests, unfortunately. I have been proven to be at fault about this sort of thing before, but I've been a good boy and I believe I've applied all of the lessons I learned in the past to the newest code, so I unfortunately again have to reach the conclusion that there is something afoul in the BTrees code, provoked only under high stress scenarios. It's also appears to be very difficult to reproduce. In the end, this means to you that... well.. you've got two choices. a) continue using ZODB-based sessions and helping us (me) to track it down, living with the consequences of the errors in the meantime or b) use a different session implementation. 
I would prefer "a" but I do need to warn you that this might *never* get solved because the failure mode appears to be so intermittent that it's extremely expensive (in the dollars-and-cents sense) to pin down and ultimately fix, and that may prevent me (and ZC) from doing so. But with a lot of help from other interested people (like yourself) we may be able to coax the failure out of obscurity and squish it without breaking the bank. ;-) Assuming you're interested, what can you do? Well, you could find out a little about the BTrees module in Zope CVS, particularly the "check" module which has code to check a BTree for corruption, and instrument the Transience code to run the check code in the places it seems to be coming up with errors before bombing out. If it's not corrupt, well.. I'm not sure what that means, but it would appear to be a problem with the BTrees range search functions. If it is corrupt, it might exonerate the range search functions. Rinse, lather, repeat with other checks in the code, such as reporting the internal state of the BTree when the error occurs (which I've forgotten how to do, but a maillist search should help), providing information about when conflict errors were raised right before the error, and so on. It's very difficult to provide a concrete "type this, type that" set of steps for this sort of thing due to the latency involved in remote debugging an extremely hard to reproduce failure, so if you want to help best, since you're the person who has access to the machine where the failure appears to be reproducible (and hopefully the motive to want to fix it), you should familiarize yourself with the Transience code and the BTrees APIs and use a bit of inductive logic to help me isolate the problem. If you'd rather not, I can understand that too. ;-) HTH, - C On Wed, 2004-03-03 at 03:18, alex@halogen-dg.com wrote:
Chris,
No, just a few minutes ago I got this again:
Time 2004/03/03 07:45:04.662 GMT User Name (User Id) Anonymous User (None) Request URL http://www.chalkface.com/catalog/html/custom/index_html Exception Type KeyError Exception Value 1078236460
Traceback (innermost last):
  * Module ZPublisher.Publish, line 100, in publish
  * Module ZPublisher.mapply, line 88, in mapply
  * Module ZPublisher.Publish, line 40, in call_object
  * Module OFS.DTMLDocument, line 128, in __call__
    <DTMLDocument instance at 41c33890>
    URL: http://www.chalkface.com/custom/index_html/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index_html
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module OFS.DTMLDocument, line 121, in __call__
    <DTMLDocument instance at 41c337a0>
    URL: http://www.chalkface.com/custom/index.html/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index.html
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Let, line 76, in render
  * Module OFS.DTMLDocument, line 121, in __call__
    <DTMLDocument instance at 41c2b080>
    URL: http://www.chalkface.com/catalog/html/zwarehouse_html_header/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/catalog/html/zwarehouse_html_header
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Util, line 201, in eval
    __traceback_info__: cart_functions
  * Module <string>, line 1, in <expression>
  * Module Shared.DC.Scripts.Bindings, line 306, in __call__
  * Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
  * Module Products.PythonScripts.PythonScript, line 318, in _exec
  * Module None, line 16, in setSessionByRequest.py
    <PythonScript at /www.chalkface.com/ZWarehouse_0.8/catalog/cart_functions/setSessionByRequest.py>
    Line 16
  * Module ZPublisher.HTTPRequest, line 1218, in __getattr__
  * Module ZPublisher.HTTPRequest, line 1178, in get
  * Module Products.Sessions.SessionDataManager, line 93, in getSessionData
  * Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
  * Module Products.Transience.Transience, line 491, in new_or_existing
  * Module Products.Transience.Transience, line 322, in get
  * Module Products.Transience.Transience, line 198, in _move_item
  * Module Products.Transience.Transience, line 419, in _gc
KeyError: 1078236460
On Wed, 3 Mar 2004, Chris McDonough wrote:
Great, I'm going to consider that a resounding endorsement and check it in soon; please do let me know if you see anything odd come up.
If anyone else has been having issues with the old Transience module, and would like to provide feedback on the newer implementation, please get this file:
http://cvs.zope.org/*checkout*/Products/Transience/Transience.py?rev=1.32.12...
... and temporarily replace Zope's lib/python/Products/Transience/Transience.py with this newer version to help test it out, and report back the results here.
Thanks!
- C
On Wed, 2004-03-03 at 02:14, alex@halogen-dg.com wrote:
Hi Chris,
Until now, we have not had any errors with the new Transience.py :) It just works; no problems found under high load.
Alex
On Mon, 1 Mar 2004, Chris McDonough wrote:
I installed the new Transience.py. During my little test it worked fine. But the real test will be on Monday, when students start logging in as complete classes; sometimes hundreds of them log on simultaneously, so we will see.
Any news? ;-)
-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
On Wed, 2004-03-03 at 04:55, Chris McDonough wrote:
(boldly crossposting this to zodb-dev, please respond on one list or the other but not both)
That error *appears* to be caused by reaching a state that is impossible to reach. The code in question is:
    for key in list(self._data.keys(None, max_ts)):
        assert(key <= max_ts)
        STRICT and _assert(self._data.has_key(key))
        for v in self._data[key].values():
            to_notify.append(v)
        del self._data[key]
I don't have much context for this question. It's definitely the case that in a corrupt BTree there can be keys you can reach using keys(), which follows the bucket next pointers, that you can't reach using a lookup, which follows child pointers down through the interior nodes.

It would help if you could call the check functions on the BTrees in question. That's object._check() to check C internals and BTrees.check.check() to check value-based consistency.

So how is the BTree in question used? If the test is failing here, it seems most likely that the BTree was corrupted by a write somewhere else.

Jeremy
Chris McDonough wrote at 2004-3-3 04:55 -0500:
(boldly crossposting this to zodb-dev, please respond on one list or the other but not both)
That error *appears* to be caused by reaching a state that is impossible to reach. The code in question is:
    for key in list(self._data.keys(None, max_ts)):
        assert(key <= max_ts)
        STRICT and _assert(self._data.has_key(key))
        for v in self._data[key].values():
            to_notify.append(v)
        del self._data[key]
The line that says "for v in self._data[key].values()" is the line that throws the KeyError. But it should be impossible for the code to throw a KeyError for the expression "self._data[key]" because the "keys" method of the _data IOBTree just told us that the key named by "key" was one of its keys via the range search; it should be an invariant.
If we had a low conflict connection, I would understand how this could happen:

All BTree leaves are chained together. The "keys" method uses this chain to enumerate the keys. "[]", on the other hand, uses the tree structure to locate its key.

Assume that a parallel thread removes a key and commits the transaction while we are in the for loop above. We may read the old state (with the later deleted key) for "keys". During the "for", we receive invalidations for the nodes affected by the deletion. When we access "[key]" we may try to load a tree node which is not yet in the ZODB cache and meanwhile invalidated. When we suppress the resulting "ReadConflictError", we may not find "key" (as it is by now deleted).

In my "Transience" implementation, I ignore this exceptional case. I do use a low conflict connection and have to be prepared for this situation. Furthermore, the situation is not problematic: I want to determine sessions that should be deleted. Someone else already did it -- this is fine. No need to do it twice...

-- Dieter
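The "ignore this exceptional case" approach Dieter describes might look roughly like the sketch below. It is only an illustration: a plain dict stands in for the `_data` IOBTree, and the function name `collect_expired` is hypothetical, not part of Transience.py.

```python
def collect_expired(data, max_ts):
    """Remove every timeslice with key <= max_ts and return its values.

    `data` maps integer timeslices to dicts of session objects (a plain
    dict here, standing in for the IOBTree). A key that disappears between
    the key scan and the lookup is treated as already collected.
    """
    to_notify = []
    # Snapshot the candidate keys first, as the original loop does.
    for key in [k for k in list(data.keys()) if k <= max_ts]:
        try:
            bucket = data[key]
        except KeyError:
            # Someone else already removed this timeslice; nothing to do.
            continue
        to_notify.extend(bucket.values())
        # pop() with a default also tolerates a concurrent delete.
        data.pop(key, None)
    return to_notify
```

Note that swallowing the KeyError only makes sense under Dieter's assumption that a missing key means another thread already expired it; if the tree is actually corrupt, this would hide the problem rather than fix it.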
Hi Chris, On Wed, 3 Mar 2004, Chris McDonough wrote:
(boldly crossposting this to zodb-dev, please respond on one list or the other but not both)
That error *appears* to be caused by reaching a state that is impossible to reach. The code in question is:
    for key in list(self._data.keys(None, max_ts)):
        assert(key <= max_ts)
        STRICT and _assert(self._data.has_key(key))
        for v in self._data[key].values():
            to_notify.append(v)
        del self._data[key]
I was not working yesterday; now I have found a big thread about the problem here :) It's good that people are interested in resolving this bug. I will read all the mails now and try to help resolve it, since we have a system where high load causes such problems.

By the way, just a few minutes ago I found another session error, with a slightly different traceback than the one reported before, so I am posting it here in case it helps you understand the problem. I am still wondering whether something may be wrong with my code.

-------------------------- traceback ------------------------------
Site Error

An error was encountered while publishing this resource.

KeyError
Sorry, a site error occurred.

Traceback (innermost last):
  * Module ZPublisher.Publish, line 163, in publish_module_standard
  * Module Products.iHotfix, line 80, in new_publish
  * Module ZPublisher.Publish, line 127, in publish
  * Module Zope.App.startup, line 203, in zpublisher_exception_hook
  * Module ZPublisher.Publish, line 100, in publish
  * Module ZPublisher.mapply, line 88, in mapply
  * Module ZPublisher.Publish, line 40, in call_object
  * Module OFS.DTMLDocument, line 128, in __call__
    <DTMLDocument instance at 41bcf6e0>
    URL: http://www.chalkface.com/custom/index_html/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index_html
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module OFS.DTMLDocument, line 121, in __call__
    <DTMLDocument instance at 41bcf5f0>
    URL: http://www.chalkface.com/custom/index.html/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index.html
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Let, line 76, in render
  * Module OFS.DTMLDocument, line 121, in __call__
    <DTMLDocument instance at 41b5d770>
    URL: http://www.chalkface.com/catalog/html/zwarehouse_html_header/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/catalog/html/zwarehouse_html_header
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Util, line 201, in eval
    __traceback_info__: cart_functions
  * Module <string>, line 1, in <expression>
  * Module Shared.DC.Scripts.Bindings, line 306, in __call__
  * Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
  * Module Products.PythonScripts.PythonScript, line 318, in _exec
  * Module None, line 16, in setSessionByRequest.py
    <PythonScript at /www.chalkface.com/ZWarehouse_0.8/catalog/cart_functions/setSessionByRequest.py>
    Line 16
  * Module ZPublisher.HTTPRequest, line 1218, in __getattr__
  * Module ZPublisher.HTTPRequest, line 1178, in get
  * Module Products.Sessions.SessionDataManager, line 93, in getSessionData
  * Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
  * Module Products.Transience.Transience, line 494, in new_or_existing
  * Module Products.Transience.Transience, line 304, in __setitem__
KeyError: 1078473960

(Also, an error occurred while attempting to render the standard error message.)

Troubleshooting Suggestions
  * The URL may be incorrect.
  * The parameters passed to this resource may be incorrect.
  * A resource that this resource relies on may be encountering an error.

For more detailed information about the error, please refer to error log.

If the error persists please contact the site maintainer. Thank you for your patience.

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
alex@halogen-dg.com wrote:
* Module Products.Transience.Transience, line 419, in _gc
KeyError: 1078236460
(on the other Chris's behalf) waaaaagh! Chris ;-) -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Hi Chris, On Wed, 3 Mar 2004, Chris McDonough wrote:
Great, I'm going to consider that a resounding endorsement and check it in soon; please do let me know if you see anything odd come up.
If anyone else has been having issues with the old Transience module, and would like to provide feedback on the newer implementation, please get this file:
http://cvs.zope.org/*checkout*/Products/Transience/Transience.py?rev=1.32.12...
... and temporarily replace Zope's lib/python/Products/Transience/Transience.py with this newer version to help test it out, and report back the results here.
I am using the new Transience.py, and my temp_folder is on a Sessions.fs ZODB now. I have one problem with it: it does not seem to delete old expired sessions this way. The number of objects grows and grows, and today we reached the limit.

I think I have to delete Sessions.fs every night and restart Zope. Is this expected behavior when using file storage? I thought the only problem with this kind of storage was the need to pack the database sometimes.

---------- Forwarded message ----------
Date: Tue, 20 Apr 2004 08:52:00 +0100
From: atest@localhost.localdomain
To: alex@halogen-dg.com
Subject: test failed: http://www.chalkface.com/catalog/html/custom/index.html?c_category_id=1

Testing URL http://www.chalkface.com/catalog/html/custom/index.html?c_category_id=1 ...
test #1 - failure, code 500
test #2 - failure, code 500

------------ [Details] -----------
Site Error

An error was encountered while publishing this resource.

MaxTransientObjectsExceeded
Sorry, a site error occurred.

Traceback (innermost last):
  * Module ZPublisher.Publish, line 163, in publish_module_standard
  * Module Products.iHotfix, line 80, in new_publish
  * Module ZPublisher.Publish, line 127, in publish
  * Module Zope.App.startup, line 203, in zpublisher_exception_hook
  * Module ZPublisher.Publish, line 100, in publish
  * Module ZPublisher.mapply, line 88, in mapply
  * Module ZPublisher.Publish, line 40, in call_object
  * Module OFS.DTMLDocument, line 128, in __call__
    <DTMLDocument instance at 41156d40>
    URL: http://www.chalkface.com/custom/index.html/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index.html
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Let, line 76, in render
  * Module OFS.DTMLDocument, line 121, in __call__
    <DTMLDocument instance at 4114fa40>
    URL: http://www.chalkface.com/catalog/html/zwarehouse_html_header/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/catalog/html/zwarehouse_html_header
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Util, line 201, in eval
    __traceback_info__: cart_functions
  * Module <string>, line 1, in <expression>
  * Module Shared.DC.Scripts.Bindings, line 306, in __call__
  * Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
  * Module Products.PythonScripts.PythonScript, line 318, in _exec
  * Module None, line 16, in setSessionByRequest.py
    <PythonScript at /www.chalkface.com/ZWarehouse_0.8/catalog/cart_functions/setSessionByRequest.py>
    Line 16
  * Module ZPublisher.HTTPRequest, line 1218, in __getattr__
  * Module ZPublisher.HTTPRequest, line 1178, in get
  * Module Products.Sessions.SessionDataManager, line 93, in getSessionData
  * Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
  * Module Products.Transience.Transience, line 494, in new_or_existing
  * Module Products.Transience.Transience, line 300, in __setitem__
MaxTransientObjectsExceeded: 10000 exceeds maximum number of subobjects 10000

(Also, an error occurred while attempting to render the standard error message.)

Troubleshooting Suggestions
  * The URL may be incorrect.
  * The parameters passed to this resource may be incorrect.
  * A resource that this resource relies on may be encountering an error.

For more detailed information about the error, please refer to error log.

If the error persists please contact the site maintainer. Thank you for your patience.

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
On Tue, 2004-04-20 at 10:28, alex@halogen-dg.com wrote:
Hi Chris,
On Wed, 3 Mar 2004, Chris McDonough wrote:
Great, I'm going to consider that a resounding endorsement and check it in soon; please do let me know if you see anything odd come up.
If anyone else has been having issues with the old Transience module, and would like to provide feedback on the newer implementation, please get this file:
http://cvs.zope.org/*checkout*/Products/Transience/Transience.py?rev=1.32.12...
... and temporarily replace Zope's lib/python/Products/Transience/Transience.py with this newer version to help test it out, and report back the results here.
I am using the new Transience.py, and my temp_folder is on a Sessions.fs ZODB now. I have one problem with it: it does not seem to delete old expired sessions this way. The number of objects grows and grows, and today we reached the limit.
You reached a disk space limit? Or a number of session objects limit?
I think I have to delete Sessions.fs every night and restart Zope. Is this expected behavior when using file storage? I thought the only problem with this kind of storage was the need to pack the database sometimes.
That was the intent. You did pack and it didn't reduce the file size?
---------- Forwarded message ---------- Date: Tue, 20 Apr 2004 08:52:00 +0100 From: atest@localhost.localdomain To: alex@halogen-dg.com Subject: test failed: http://www.chalkface.com/catalog/html/custom/index.html?c_category_id=1
Testing URL http://www.chalkface.com/catalog/html/custom/index.html?c_category_id=1 ... test #1 - failure, code 500 test #2 - failure, code 500 ------------ [Details] -----------
Site Error
An error was encountered while publishing this resource.
MaxTransientObjectsExceeded Sorry, a site error occurred.
Traceback (innermost last):
  * Module ZPublisher.Publish, line 163, in publish_module_standard
  * Module Products.iHotfix, line 80, in new_publish
  * Module ZPublisher.Publish, line 127, in publish
  * Module Zope.App.startup, line 203, in zpublisher_exception_hook
  * Module ZPublisher.Publish, line 100, in publish
  * Module ZPublisher.mapply, line 88, in mapply
  * Module ZPublisher.Publish, line 40, in call_object
  * Module OFS.DTMLDocument, line 128, in __call__
    <DTMLDocument instance at 41156d40>
    URL: http://www.chalkface.com/custom/index.html/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/custom/index.html
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Let, line 76, in render
  * Module OFS.DTMLDocument, line 121, in __call__
    <DTMLDocument instance at 4114fa40>
    URL: http://www.chalkface.com/catalog/html/zwarehouse_html_header/manage_main
    Physical Path:/www.chalkface.com/ZWarehouse_0.8/catalog/html/zwarehouse_html_header
  * Module DocumentTemplate.DT_String, line 474, in __call__
  * Module DocumentTemplate.DT_Util, line 201, in eval
    __traceback_info__: cart_functions
  * Module <string>, line 1, in <expression>
  * Module Shared.DC.Scripts.Bindings, line 306, in __call__
  * Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
  * Module Products.PythonScripts.PythonScript, line 318, in _exec
  * Module None, line 16, in setSessionByRequest.py
    <PythonScript at /www.chalkface.com/ZWarehouse_0.8/catalog/cart_functions/setSessionByRequest.py>
    Line 16
  * Module ZPublisher.HTTPRequest, line 1218, in __getattr__
  * Module ZPublisher.HTTPRequest, line 1178, in get
  * Module Products.Sessions.SessionDataManager, line 93, in getSessionData
  * Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
  * Module Products.Transience.Transience, line 494, in new_or_existing
  * Module Products.Transience.Transience, line 300, in __setitem__
MaxTransientObjectsExceeded: 10000 exceeds maximum number of subobjects 10000 (Also, an error occurred while attempting to render the standard error message.) _________________________________________________________________
Troubleshooting Suggestions * The URL may be incorrect. * The parameters passed to this resource may be incorrect. * A resource that this resource relies on may be encountering an error.
For more detailed information about the error, please refer to error log.
If the error persists please contact the site maintainer. Thank you for your patience.
-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
Hi Chris, On Tue, 20 Apr 2004, Chris McDonough wrote:
I am using the new Transience.py, and my temp_folder is on a Sessions.fs ZODB now. I have one problem with it: it does not seem to delete old expired sessions this way. The number of objects grows and grows, and today we reached the limit.
You reached a disk space limit? Or a number of session objects limit?
We have more than 10 GB of free disk space. No, I reached the session objects limit. It was set to 10000; I have now set it to 50000, and the counter is going higher every day.
I think I have to delete Sessions.fs every night and restart Zope. Is this expected behavior when using file storage? I thought the only problem with this kind of storage was the need to pack the database sometimes.
That was the intent. You did pack and it didn't reduce the file size?
Yes, I packed it and the size was reduced, but the number of session objects is still the same, and it keeps growing.

This morning's stats (nobody is working now; people are still sleeping in England):

12567 items are in this transient object container.
Data object timeout value in minutes: 20
Maximum number of subobjects: 50000

Yesterday there were only 10000 session objects. Now I am packing the ZODB:

--- before pack ---
Database Location: /home/zope/current2/var/Sessions.fs
Database Size: 6.2M
Transient Object Container at /temp_folder/session_data
12568 items are in this transient object container.

--- after pack ----
Database Location: /home/zope/current2/var/Sessions.fs
Database Size: 59.8K
Transient Object Container at /temp_folder/session_data
12570 items are in this transient object container.

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
What do you have the transient object timeout set for? On Wed, 2004-04-21 at 02:57, alex@halogen-dg.com wrote:
Hi Chris,
On Tue, 20 Apr 2004, Chris McDonough wrote:
I am using the new Transience.py, and my temp_folder is on a Sessions.fs ZODB now. I have one problem with it: it does not seem to delete old expired sessions this way. The number of objects grows and grows, and today we reached the limit.
You reached a disk space limit? Or a number of session objects limit?
We have more than 10 GB of free disk space. No, I reached the session objects limit. It was set to 10000; I have now set it to 50000, and the counter is going higher every day.
I think I have to delete Sessions.fs every night and restart Zope. Is this expected behavior when using file storage? I thought the only problem with this kind of storage was the need to pack the database sometimes.
That was the intent. You did pack and it didn't reduce the file size?
Yes, I packed it and the size was reduced, but the number of session objects is still the same, and it keeps growing.
This morning's stats (nobody is working now; people are still sleeping in England):
12567 items are in this transient object container.
Data object timeout value in minutes: 20
Maximum number of subobjects: 50000
Yesterday there were only 10000 session objects. Now I am packing the ZODB:

--- before pack ---
Database Location: /home/zope/current2/var/Sessions.fs
Database Size: 6.2M
Transient Object Container at /temp_folder/session_data
12568 items are in this transient object container.

--- after pack ----
Database Location: /home/zope/current2/var/Sessions.fs
Database Size: 59.8K
Transient Object Container at /temp_folder/session_data
12570 items are in this transient object container.
-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
Hi Chris, On Wed, 21 Apr 2004 11:36:59 -0400, Chris McDonough <chrism@plope.com> wrote:
What do you have the transient object timeout set for?
Do you mean this (/temp_folder/session_data):
Data object timeout value in minutes: 20
Also, here is a part of zope.conf for your reference:

# from Chris
<zodb_db temporary>
    # Temporary storage database (for sessions)
    <filestorage>
      path $INSTANCE/var/Sessions.fs
    </filestorage>
    mount-point /temp_folder
    container-class Products.TemporaryFolder.TemporaryContainer
</zodb_db>

Regards.

-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
The data object timeout value was what I needed. I'm afraid though that in this case I don't have a good answer for why they're not being expired. I will put looking into this on my todo list. On Tue, 2004-05-04 at 09:32, Alex V. Koval wrote:
Hi Chris,
On Wed, 21 Apr 2004 11:36:59 -0400, Chris McDonough <chrism@plope.com> wrote:
What do you have the transient object timeout set for?
Do you mean this (/temp_folder/session_data):
Data object timeout value in minutes: 20
Also, here is a part of zope.conf for your reference:

# from Chris
<zodb_db temporary>
    # Temporary storage database (for sessions)
    <filestorage>
      path $INSTANCE/var/Sessions.fs
    </filestorage>
    mount-point /temp_folder
    container-class Products.TemporaryFolder.TemporaryContainer
</zodb_db>
Regards.
No, that's not the problem; in THEORY that's what is happening, but in reality there is no way that this is the case. We just unrolled a registration system with participation rates at or around 100 to 200 participants per month. At any given time, monitoring the session data container, there are *at most* 1 or 2 items in the transient object container--EXCEPT when it spikes...

The problem is, of course, that when it floods, the MaxTransientObjectsExceeded error occurs site-wide (the whole site crumbles and returns the error). Because it's just passing back to Exception, there aren't really any useful details that get carried forward to the log (why it occurred, the last script that executed it, etc.).

So, question: is there a way for these errors to occur internally (i.e. an improperly looping script setting null values into the session, on error demanding that the session create a new object to try it again, ad infinitum), or is it possible that an external barrage of requests (denial-of-service?) is flooding the transient object container?

ideas?

k

Chris McDonough wrote:
The data object timeout value was what I needed. I'm afraid though that in this case I don't have a good answer for why they're not being expired. I will put looking into this on my todo list.
On Tue, 2004-05-04 at 09:32, Alex V. Koval wrote:
Hi Chris,
On Wed, 21 Apr 2004 11:36:59 -0400, Chris McDonough <chrism@plope.com> wrote:
What do you have the transient object timeout set for?
Do you mean this (/temp_folder/session_data):
Data object timeout value in minutes: 20
Also, here is a part of zope.conf for your reference:

# from Chris
<zodb_db temporary>
    # Temporary storage database (for sessions)
    <filestorage>
      path $INSTANCE/var/Sessions.fs
    </filestorage>
    mount-point /temp_folder
    container-class Products.TemporaryFolder.TemporaryContainer
</zodb_db>
Regards.
On 7/05/2004, at 5:15 AM, Kris Erickson wrote:
No, that's not the problem; in THEORY that's what is happening, but in reality there is no way that this is the case; We just unrolled a registration system with participation rates at or around 100 to 200 participants per month; At any given time, monitoring the session data container, there are *at most* 1 or 2 items in the transient object container--EXCEPT when it spikes...
I have seen such spikes occur (in a corner case) where some breeds of web robots were aggressively hitting a page that used sessions. These robots did not bother to return the cookie handed out by the server. Each page hit effectively constructs a new session.

Have a look through your access logs to see if you can see signs of something similar happening.

Not all web robots are created equal. I ended up sniffing for the user agent and returning a page that does not use sessions for the offending robots. (From memory, robots.txt was not useful for this breed.) Alternatively you can set the maximum-number-of-session-objects to something a lot higher and see if you can just live through the bot invasion.

Michael.
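The user-agent sniffing Michael mentions could be sketched along these lines. The marker strings, the page names, and the choice to treat a missing header as a robot are all assumptions for illustration; in a Zope script the header would presumably be read from the request, for example with `REQUEST.get('HTTP_USER_AGENT', '')`.

```python
# Hypothetical substrings; replace them with what the access logs actually show.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def looks_like_robot(user_agent):
    """Return True when the User-Agent value contains a known robot marker."""
    if not user_agent:
        # Assumption: no header at all is treated as a robot,
        # so that no session gets created for it.
        return True
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def choose_page(user_agent):
    """Pick the session-free page for robots and the normal page otherwise."""
    if looks_like_robot(user_agent):
        return "page_without_session"
    return "page_with_session"
```

A substring match like this is crude and will miss robots that send a browser-like user agent, so checking the access logs first, as suggested above, is still the way to find out which markers matter.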
Michael Dunstan wrote:
On 7/05/2004, at 5:15 AM, Kris Erickson wrote:
No, that's not the problem; in THEORY that's what is happening, but in reality there is no way that this is the case; We just unrolled a registration system with participation rates at or around 100 to 200 participants per month; At any given time, monitoring the session data container, there are *at most* 1 or 2 items in the transient object container--EXCEPT when it spikes...
I have seen such spikes occur (in a corner case) where some breeds of web robots were aggressively hitting a page that used sessions. These robots did not bother to return the cookie handed out by the server. Each page hit effectively constructs a new session.
Have a look through your access logs to see if you can see signs of something similar happening.
Not all web robots are created equal. I ended up sniffing for the user agent and returning a page that does not use sessions for the offending robots. (From memory, robots.txt was not useful for this breed.) Alternatively you can set the maximum-number-of-session-objects to something a lot higher and see if you can just live through the bot invasion.
Even better, avoid writing to the session on each request! Your application will be *much* happier if you write to the session only when the human makes a gesture; neither bots nor casually-browsing humans will consume sessions, but only session keys (which are cheap). Tres. -- =============================================================== Tres Seaver tseaver@zope.com Zope Corporation "Zope Dealers" http://www.zope.com
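Tres's advice can be illustrated with a small wrapper that only allocates session data on the first write. This is a sketch under assumptions: `LazySession` and its `factory` argument are hypothetical names, not part of Zope's Sessions product. The tracebacks earlier in the thread do show that merely evaluating `request.SESSION` goes through `getSessionData` into `new_or_existing`, so the point is to postpone that access until there is something to store.

```python
class LazySession:
    """Defers creating session data until the first write."""

    def __init__(self, factory):
        self._factory = factory  # callable that creates the real session data
        self._data = None        # nothing is allocated until a write happens

    def get(self, key, default=None):
        # Reads never allocate a session object.
        if self._data is None:
            return default
        return self._data.get(key, default)

    def set(self, key, value):
        # First write: only now ask the factory for real session data.
        if self._data is None:
            self._data = self._factory()
        self._data[key] = value
```

With this shape, a page view that only reads (a robot, or a casually browsing human) never triggers the factory, while "add to cart" style gestures do.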
On 7/05/2004, at 4:39 PM, Tres Seaver wrote:
Michael Dunstan wrote:
On 7/05/2004, at 5:15 AM, Kris Erickson wrote:
No, that's not the problem; in THEORY that's what is happening, but in reality there is no way that this is the case; We just unrolled a registration system with participation rates at or around 100 to 200 participants per month; At any given time, monitoring the session data container, there are *at most* 1 or 2 items in the transient object container--EXCEPT when it spikes...

I have seen such spikes occur (in a corner case) where some breeds of web robots were aggressively hitting a page that used sessions. These robots did not bother to return the cookie handed out by the server. Each page hit effectively constructs a new session.

Have a look through your access logs to see if you can see signs of something similar happening.

Not all web robots are created equal. I ended up sniffing for the user agent and returning a page that does not use sessions for the offending robots. (From memory, robots.txt was not useful for this breed.) Alternatively you can set the maximum-number-of-session-objects to something a lot higher and see if you can just live through the bot invasion.
Even better, avoid writing to the session on each request! Your application will be *much* happier if you write to the session only when the human makes a gesture; neither bots nor casually-browsing humans will consume sessions, but only session keys (which are cheap).
Yup - certainly that is a whole lot better if you can arrange for that. Michael.
We're using a shopping cart model; sessions only get created if the user 'adds' a workshop to their cart. Unless there's anything I'm missing in Plone... the _ZopeId cookie doesn't seem to start up a session (lazy data container?) until a script actually says session['key'] = value... or am I missing the boat here?

There is a bot floating around for the univ. search engine, but I still don't think that's it. Again, my guess is the bad looping (i.e. trying to set session values from form values ASSUMING that form values exist). This seems in line with my case: a rapidly developed admin interface with buttons existing for cases that haven't been fleshed out yet.

Anyway, thanks. It hasn't recurred since I cleaned up those loose ends; however, I'm still concerned that the log message didn't give a clear picture of the root of the problem.

cheers,
k

Tres Seaver wrote:
Michael Dunstan wrote:
On 7/05/2004, at 5:15 AM, Kris Erickson wrote:
No, that's not the problem; in THEORY that's what is happening, but in reality there is no way that this is the case; We just unrolled a registration system with participation rates at or around 100 to 200 participants per month; At any given time, monitoring the session data container, there are *at most* 1 or 2 items in the transient object container--EXCEPT when it spikes...
I have seen such spikes occur (in a corner case) where some breeds of web robots were aggressively hitting a page that used sessions. These robots did not bother to return the cookie handed out by the server. Each page hit effectively constructs a new session.
Have a look through your access logs to see if you can see signs of something similar happening.
Not all web robots are created equal. I ended up sniffing for the user agent and returning a page that does not use sessions for the offending robots. (From memory, robots.txt was not useful for this breed.) Alternatively you can set the maximum-number-of-session-objects to something a lot higher and see if you can just live through the bot invasion.
Even better, avoid writing to the session on each request! Your application will be *much* happier if you write to the session only when the human makes a gesture; neither bots nor casually-browsing humans will consume sessions, but only session keys (which are cheap).
Tres.
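The "loose ends" Kris describes, setting session values from form values on the assumption that they exist, could be guarded against with something like the following. It is a minimal sketch: `copy_form_to_session` is a hypothetical helper, and plain dicts stand in for the request form and the session object.

```python
def copy_form_to_session(form, session, fields):
    """Copy only fields that are present and non-empty; return what was stored."""
    stored = []
    for name in fields:
        value = form.get(name)
        if value is None or value == "":
            # Missing or empty input: skip it instead of writing a null value.
            continue
        session[name] = value
        stored.append(name)
    return stored
```

If the returned list is empty, the caller can skip touching the session at all, which keeps half-finished admin buttons from creating session objects for nothing.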
I figured out what this is; it's a genuine bug, sorry. Until I get it fixed, please comment out these lines of Transience.py to make things OK:

    if self._limit and len(self) >= self._limit:
        LOG('Transience', WARNING,
            ('Transient object container %s max subobjects '
             'reached' % self.getId())
            )
        raise MaxTransientObjectsExceeded, (
         "%s exceeds maximum number of subobjects %s" %
         (len(self), self._limit))

On Wed, 2004-04-21 at 02:57, alex@halogen-dg.com wrote:
Hi Chris,
On Tue, 20 Apr 2004, Chris McDonough wrote:
I am using new Transience.py, and my temp_folder is on Sessions.fs ZODB now. I have one problem with it - it does not seems that this way it deletes old expired Sessions. The number of objects grow and grow, and today we reached limit.
You reached a disk space limit? Or a number of session objects limit?
We have more then 10gb of free disk space. No, I reached the session objects limit. It was set as 10000, now I set it as 50000, and the counter is going higher every day.
I think I have to delete Sessions.fs every night and restart Zope. Is it expected expected behavior when using file storage? I was thinking that only problem of this kind of storage is the need to pack the database sometimes.
That was the intent. You did pack and it didn't reduce the file size?
Yes, I packed it and the size was reduced, but the number of session objects is still the same, and it keeps growing.
This morning's stats (nobody is working now, people are still sleeping in England):
12567 items are in this transient object container.
Data object timeout value in minutes: 20
Maximum number of subobjects: 50000
Yesterday there were only 10000 session objects. Now I am packing the ZODB:

--- before pack ---
Database Location: /home/zope/current2/var/Sessions.fs
Database Size: 6.2M
Transient Object Container at /temp_folder/session_data
12568 items are in this transient object container.

--- after pack ---
Database Location: /home/zope/current2/var/Sessions.fs
Database Size: 59.8K
Transient Object Container at /temp_folder/session_data
12570 items are in this transient object container.
-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
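The pack output above shows the file shrinking while the item count stays put, which suggests expired entries are never removed from the container. As a rough illustration of what one would expect instead, here is a toy container that expires stale entries before applying its limit. It is a hypothetical model, not the Transience implementation, and `RuntimeError` stands in for MaxTransientObjectsExceeded.

```python
class ExpiringContainer:
    """Toy session container with a timeout and a maximum item count."""

    def __init__(self, timeout_secs, limit):
        self.timeout_secs = timeout_secs
        self.limit = limit
        self._last_access = {}  # session key -> last access time in seconds

    def housekeep(self, now):
        # Drop every entry whose last access is older than the timeout.
        cutoff = now - self.timeout_secs
        for key in [k for k, t in self._last_access.items() if t < cutoff]:
            del self._last_access[key]

    def touch(self, key, now):
        # Expire stale entries first so they do not count against the limit.
        self.housekeep(now)
        is_new = key not in self._last_access
        if is_new and self.limit and len(self._last_access) >= self.limit:
            raise RuntimeError("maximum number of subobjects reached")
        self._last_access[key] = now

    def __len__(self):
        return len(self._last_access)


container = ExpiringContainer(timeout_secs=20 * 60, limit=3)
for name in ("a", "b", "c"):
    container.touch(name, now=0)
# 21 minutes later the three old sessions have timed out, so a new one fits.
container.touch("d", now=21 * 60)
```

If housekeeping never ran in this model, the fourth `touch` would hit the limit even though every earlier session had timed out, which matches the symptom reported here.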
I've fixed this bug (and several others) and checked the result into the Zope 2.7 branch. You can get it at http://cvs.zope.org/*checkout*/Zope/lib/python/Products/Transience/Transienc...
HTH,
- C
Hi,
I am not sure, but if the locking is a global thing, I think it could happen the following way:
1. Some function calls self.lock.acquire().
2. It calls another function, which makes two calls: self.lock.acquire() and self.lock.release().
Since the locking is done globally, this most probably means that a call to _getIndex, for example, will release the lock, and the processing in the upper function will continue with undefined results (most likely _housekeep in a different thread will delete the key at the same time as we are trying to read the data).
Maybe I am wrong...
The most important thing for me now is to understand why the error persists: once a user sees the session key error, the user keeps getting it all the time, on the whole site, until the _ZopeId cookie for the session is deleted.
-- Alex V. Koval http://www.halogen-dg.com/ http://www.zwarehouse.org/
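Whether the scenario above can happen depends on the kind of lock, which this thread does not state for Transience. The sketch below uses only Python's standard `threading` module (modern Python 3 syntax, not the Python 2 of Zope 2.7) to show the two standard behaviors: a re-entrant `RLock` stays held after a nested acquire/release pair, while a plain `Lock` cannot be acquired a second time by its holder at all.

```python
import threading


def held_after_nested_release(rlock):
    """Return True if `rlock` is still held after a nested acquire/release,
    as observed from a second thread."""
    observed = []

    def probe():
        # A different thread tries to take the lock without blocking.
        got = rlock.acquire(blocking=False)
        if got:
            rlock.release()
        observed.append(not got)

    rlock.acquire()              # outer function takes the lock
    try:
        rlock.acquire()          # inner helper takes it again
        rlock.release()          # inner helper gives up only its own hold
        t = threading.Thread(target=probe)
        t.start()
        t.join()
    finally:
        rlock.release()          # outer function releases
    return observed[0]


still_held = held_after_nested_release(threading.RLock())

# With a plain Lock the nested acquire does not succeed at all; a blocking
# call here would deadlock, so the sketch uses a non-blocking attempt.
plain = threading.Lock()
plain.acquire()
nested_ok = plain.acquire(blocking=False)
plain.release()
```

So with either standard lock type an inner acquire/release pair does not silently hand the lock to another thread; a plain lock would hang instead, and a re-entrant one keeps counting.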
participants (12)
- Alex V. Koval
- alex@halogen-dg.com
- Anthony Baxter
- Chris McDonough
- Chris Withers
- Dieter Maurer
- Jeremy Hylton
- Kris Erickson
- Michael Dunstan
- Santi Camps
- Steve Jibson
- Tres Seaver