I have a single folder in a zope instance that contains about 8000 pdf files. Each of those files is indexed using TextIndexNG (1.09), along with having a Subject and Type index that are both Field indexes. I am using portal_catalog under CMF 1.2, Zope 2.6.1, python 2.1.3.
Querying the catalog has begun to take much more time recently and I am wondering if there are any things that I can check into regarding what might be slowing things down. I have a simple query returned from a MySQL database that in turn drives a search of the catalog on Subject and Type. I am using the following syntax:
    ...
    query = context.getDBSubjectForUser(userID=userID)
    for q in query :
        docResults = context.portal_catalog({'Subject' : q.subject, 'Type' : 'Template'})
    ...
This query is taking quite a while and occasionally errs out with a ZODB conflict error for certain subjects returned from the database (traceback below):
----------------------------------------------
Site Error
An error was encountered while publishing this resource.
*ZODB.POSException.ConflictError*
Sorry, a site error occurred.
Traceback (innermost last):
  * Module ZPublisher.Publish, line 150, in publish_module
  * Module ZPublisher.Publish, line 127, in publish
  * Module ZPublisher.Publish, line 127, in publish
  * Module ZPublisher.Publish, line 127, in publish
  * Module ZPublisher.Publish, line 122, in publish
  * Module Zope.App.startup, line 142, in zpublisher_exception_hook
  * Module ZPublisher.Publish, line 102, in publish
  * Module Zope.App.startup, line 200, in commit
  * Module ZODB.Transaction, line 235, in commit
  * Module ZODB.Transaction, line 349, in _commit_objects
  * Module ZODB.Connection, line 391, in commit
    __traceback_info__: (('Products.Transience.Transience', 'Increaser'), '\x00\x00\x00\x00\x00\x00\x00\x06', '')
  * Module Products.TemporaryFolder.TemporaryStorage, line 134, in store
ConflictError: database conflict error (oid 0000000000000006, serial was 0353d48778e55b88, now 0353d485d3dba833)
----------------------------------------------
Any ideas what could be causing this type of behavior?
Thanks,
Kevin
It sounds like this system has a high write rate. These writes are causing the catalog to get updated often. This could explain both the poor performance and the errors. When a write occurs while a query is being performed, it can cause a read conflict in the query because one of the objects in the index changed while the query was being processed. This causes the query to be retried. If another conflict is detected, the query will be retried again (up to 3 times). These retries slow things down a lot, and if conflicts continue to happen, they will eventually cause the error you reported.
You say you have 8000 pdfs in a single folder. What type of folder is this? How often are new files added, or old files deleted? If you are using a standard zope folder (not a BTreeFolder), then updates to it will be *very* slow with that many children. Slow transactions like that tend to cause conflicts elsewhere (or get retried themselves a lot), which exacerbates the problem.
The query you mention (against two field indexes) should be pretty cheap; however, it's unclear how many times it gets executed in the loop. It would be cheaper to do this instead of querying in the loop:
    query = context.getDBSubjectForUser(userID=userID)
    subjects = [q.subject for q in query]
    docResults = context.portal_catalog(
        {'Subject' : subjects, 'Type' : 'Template'})
Also, what are you doing with docResults when you get it back? Are you calling getObject on the results returned?
-Casey
On Thu, 18 Mar 2004 13:15:45 -0500 Kevin Carlson <khcarlso@bellsouth.net> wrote:
I have a single folder in a zope instance that contains about 8000 pdf files. [...]
_______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
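To make the retry behaviour described in Casey's reply concrete, here is a toy model in Python. It is not ZPublisher's code: `ConflictError`, `publish_with_retries`, and `make_flaky_query` are stand-ins invented for illustration, and the limit of 3 retries is simply the figure quoted in the reply.

```python
class ConflictError(Exception):
    """Stand-in for ZODB's conflict error (hypothetical, for illustration only)."""


def publish_with_retries(request, max_retries=3):
    """Run `request`, starting over whenever it raises ConflictError.

    Once `max_retries` retries have been used up, the error propagates,
    which is roughly the point at which a user would see a "Site Error" page.
    """
    retries = 0
    while True:
        try:
            return request()
        except ConflictError:
            if retries >= max_retries:
                raise
            retries += 1


def make_flaky_query(conflicts_before_success):
    """Build a fake request that conflicts a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def query():
        state["calls"] += 1
        if state["calls"] <= conflicts_before_success:
            raise ConflictError("an object changed while the request was running")
        return "rendered after %d attempts" % state["calls"]

    return query
```

Each retry reruns the whole request, so a page that conflicts twice costs about three times as much as one that does not, which is one way a busy site can feel slow long before any error becomes visible.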
The traceback indicates that the conflict is coming out of a Transience, which would lead me to think that the catalog isn't at fault. Sounds like more of a sessioning write issue.
On Thu, 2004-03-18 at 14:03, Casey Duncan wrote:
It sounds like this system has a high write rate. These writes are causing the catalog to get updated often. This could explain both the poor performance and the errors. [...]
Chris McDonough wrote:
The traceback indicates that the conflict is coming out of a Transience, which would lead me to think that the catalog isn't at fault. Sounds like more of a sessioning write issue.
I guess that I don't understand why sessioning is causing this issue. The product is not writing to the session explicitly so I don't get why there would be these write conflicts. In any event, it looks like sessions might be such a significant performance hit that we should not use them. What alternatives do you suggest instead? Cookies? Other? If sessions aren't the root of all evil, how can I go about solving this session problem? Thanks very much for any advice.
On Fri, 2004-03-19 at 02:02, Kevin Carlson wrote:
Chris McDonough wrote:
I guess that I don't understand why sessioning is causing this issue.
Neither do I, but I haven't really seen the code you're using, so it's not really possible to deduce what's happening.
The product is not writing to the session explicitly so I don't get why there would be these write conflicts.
Are you using Archetypes? I found that it writes to the session on every traversal of one of its objects. Something is using sessions to cause this, although it's hard to know what.
In any event, it looks like sessions might be such a significant performance hit that we should not use them. What alternatives do you suggest instead? Cookies? Other?
Probably no easy solution without trying to figure out what the problem is first.
If sessions aren't the root of all evil, how can I go about solving this session problem?
Find whatever is using sessioning and disable it or work around it is the best answer I can give at the moment. - C
On Fri, 2004-03-19 at 07:46, Kevin Carlson wrote:
Chris McDonough wrote:
On Fri, 2004-03-19 at 02:02, Kevin Carlson wrote:
Chris McDonough wrote:
I guess that I don't understand why sessioning is causing this issue.
Neither do I, but I haven't really seen the code you're using, so it's not really possible to deduce what's happening.
The code that I am using only checks values of variables held in the session. I need field level access control for some of the forms that I am displaying so I pass around a field level access code along with some other user information in the session. I have a bit of code in each of these forms (written in DTML) that checks the session variables and then displays only the items that they are allowed to see. The code never alters the information in the session, it is a read only operation.
FWIW, sessions are inherently not read-only because when they are accessed, even only to read, they do write to the database to expire old sessions. But I did read your other mail where you disabled everything and it still happens so I am stumped.
But if you are getting the exact same traceback you reported before, *something* is using sessions or at least Transience, upon which sessions are based. Were I you, and I didn't want to dig into what exactly is causing the sessioning machinery to be invoked, I might try a bit of voodoo by replacing the temporary storage that backs the sessioning database with a filestorage by replacing the <zodb_db temporary> stanza in your zope.conf with:
    <zodb_db temporary>
        # Temporary storage database (for sessions)
        <filestorage>
          path $INSTANCE/var/Sessions.fs
        </filestorage>
        mount-point /temp_folder
        container-class Products.TemporaryFolder.TemporaryContainer
    </zodb_db>
This will use a FileStorage to back the sessioning database instead of a TemporaryStorage. This database file will grow and grow and grow, unlike a TemporaryStorage, which does garbage collection of unused nodes and stores its data in RAM. You will need to pack the sessioning database every so often. One other thing about this is that you will be able to see the Sessions.fs file grow and that will tell you whether things are using sessioning or not.
I'm not sure what else to suggest at the moment other than stepping through all of the code in the code path invoked when you render the page to figure out where sessions are invoked to disable them.
- C
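The "watch the Sessions.fs file grow" check could be scripted along these lines. This is only a sketch in plain Python: `watch_growth` and `has_grown` are hypothetical helper names, and the path you pass in is whatever your own instance uses, not something taken from this thread.

```python
import os
import time


def watch_growth(path, checks=3, interval=1.0):
    """Sample the size of `path` `checks` times, `interval` seconds apart."""
    sizes = []
    for i in range(checks):
        sizes.append(os.path.getsize(path))
        if i < checks - 1:
            time.sleep(interval)
    return sizes


def has_grown(sizes):
    """True if the file was larger at the last sample than at the first."""
    return sizes[-1] > sizes[0]
```

For example, request the suspect page a few times while `watch_growth("/path/to/instance/var/Sessions.fs", checks=5, interval=10)` is running (the path here is a placeholder). If `has_grown` reports True for the samples, something on that page is touching the sessioning machinery.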
Chris McDonough wrote:
But if you are getting the exact same traceback you reported before, *something* is using sessions or at least Transience, upon which sessions are based. Were I you, and I didn't want to dig into what exactly is causing the sessioning machinery to be invoked, I might try a bit of voodoo by replacing the temporary storage that backs the sessioning database with a filestorage by replacing the <zodb_db temporary> stanza in your zope.conf with:
I am using Zope 2.6.1. There's not a zope.conf file in the path. Is this something that comes with an upgrade to 2.7.x?
Kevin Carlson wrote:
If sessions aren't the root of all evil, how can I go about solving this session problem?
I just want to add that I too have had weird exceptions caused by sessions recently.
It was while writing a product, so I assumed that I had some of the blame. But my traceback looks rather similar.
I have only had it under Zope 2.7.0, not 2.6.x, though I am not sure that I have tested it under the very latest 2.6.x's.
Win 2k
Zope 2.7.0
Plone 2.0 final
It was with my mxmDynamicPage product, which is pretty catalog intensive.
regards Max M
####################################
Site Error
An error was encountered while publishing this resource.
KeyError
Sorry, a site error occurred.
Traceback (innermost last):
  * Module ZPublisher.Publish, line 163, in publish_module_standard
  * Module Products.PlacelessTranslationService.PatchStringIO, line 45, in new_publish
  * Module ZPublisher.Publish, line 127, in publish
  * Module Zope.App.startup, line 203, in zpublisher_exception_hook
  * Module ZPublisher.Publish, line 100, in publish
  * Module ZPublisher.mapply, line 88, in mapply
  * Module ZPublisher.Publish, line 40, in call_object
  * Module Shared.DC.Scripts.Bindings, line 306, in __call__
  * Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
  * Module Products.CMFCore.FSPageTemplate, line 191, in _exec
  * Module Products.CMFCore.FSPageTemplate, line 124, in pt_render
  * Module Products.PageTemplates.PageTemplate, line 96, in pt_render
    <FSPageTemplate at /x/DynamicPage_editLists used for /x/Members/maxm/y>
  * Module TAL.TALInterpreter, line 189, in __call__
  * Module TAL.TALInterpreter, line 233, in interpret
  * Module TAL.TALInterpreter, line 663, in do_useMacro
  * Module TAL.TALInterpreter, line 233, in interpret
  * Module TAL.TALInterpreter, line 408, in do_optTag_tal
  * Module TAL.TALInterpreter, line 393, in do_optTag
  * Module TAL.TALInterpreter, line 388, in no_tag
  * Module TAL.TALInterpreter, line 233, in interpret
  * Module TAL.TALInterpreter, line 552, in do_insertTranslation
  * Module TAL.TALInterpreter, line 615, in translate
  * Module Products.PageTemplates.TALES, line 263, in translate
  * Module Products.PlacelessTranslationService.PlacelessTranslationService, line 109, in translate
  * Module Products.PlacelessTranslationService.PlacelessTranslationService, line 407, in translate
  * Module Products.PlacelessTranslationService.PlacelessTranslationService, line 338, in getCatalogsForTranslation
  * Module Products.PlacelessTranslationService.PlacelessTranslationService, line 444, in negotiate_language
  * Module Products.PlacelessTranslationService.Negotiator, line 257, in negotiate
  * Module Products.PlacelessTranslationService.Negotiator, line 262, in _negotiate
  * Module Products.PlacelessTranslationService.Negotiator, line 54, in getLangPrefs
  * Module Products.PlacelessTranslationService.Negotiator, line 168, in getAccepted
  * Module ZPublisher.HTTPRequest, line 1218, in __getattr__
  * Module ZPublisher.HTTPRequest, line 1178, in get
  * Module Products.Sessions.SessionDataManager, line 93, in getSessionData
  * Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
  * Module Products.Transience.Transience, line 176, in new_or_existing
  * Module Products.Transience.Transience, line 799, in get
  * Module Products.Transience.Transience, line 548, in _getCurrentBucket
  * Module ZODB.Connection, line 561, in setstate
  * Module tempstorage.TemporaryStorage, line 94, in load
KeyError: '\x00\x00\x00\x00\x00\x00\x00q'
(Also, an error occurred while attempting to render the standard error message.)
--
hilsen/regards Max M, Denmark
http://www.mxm.dk/
IT's Mad Science
Yes, there are weirdnesses with TemporaryStorage it seems. I'd suggest as an interim fix trying to use a filestorage to back your sessioning database rather than a TemporaryStorage. You will need to pack this database from time to time or risk it growing and filling up your partition (something that TemporaryStorage was designed to avoid as it stores data in RAM and autopacks). To do so, replace the current <zodb_db temporary> stanza, which looks like this:
    <zodb_db temporary>
        # Temporary storage database (for sessions)
        <temporarystorage>
          name temporary storage for sessioning
        </temporarystorage>
        mount-point /temp_folder
        container-class Products.TemporaryFolder.TemporaryContainer
    </zodb_db>
with one that uses a filestorage instead:
    <zodb_db temporary>
        # Temporary storage database (for sessions)
        <filestorage>
          path $INSTANCE/var/Sessions.fs
        </filestorage>
        mount-point /temp_folder
        container-class Products.TemporaryFolder.TemporaryContainer
    </zodb_db>
On Fri, 2004-03-19 at 02:22, Max M wrote:
Kevin Carlson wrote:
If sessions aren't the root of all evil, how can I go about solving this session problem?
I just want to add that I too have had weird exceptions caused by sessions recently. [...]
Casey Duncan wrote:
You say you have 8000 pdfs in a single folder. What type of folder is this?
The folder containing the objects is a BTreeFolder2.
The query you mention (against two field indexes) should be pretty cheap, however it's unclear how many times it gets executed in the loop. It would be cheaper to do this instead of querying in the loop.
    query = context.getDBSubjectForUser(userID=userID)
    subjects = [q.subject for q in query]
    docResults = context.portal_catalog(
        {'Subject' : subjects, 'Type' : 'Template'})
OK. I can try that. My only issue is that each docResult is going to have to be correlated to an individual user so this may not work well in this situation.
Also, what are you doing with docResults when you get it back? Are you calling getObject on the results returned?
I am not calling getObject. I am just getting the id of the document and passing that to the page displaying the results of the above script. Thanks for the response. Any help is greatly appreciated.
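On the correlation concern raised above: one possible way to keep the single batched query and still know which result belongs to which subject is to regroup the results afterwards. The sketch below is plain Python and rests on assumptions not confirmed in this thread, namely that each catalog result exposes `Subject` and `getId` as metadata attributes. `group_ids_by_subject` and `FakeBrain` are hypothetical names used only for the demonstration.

```python
def group_ids_by_subject(brains, wanted_subjects):
    """Return {subject: [document ids]} for every subject in wanted_subjects."""
    grouped = {}
    for subject in wanted_subjects:
        grouped[subject] = []
    for brain in brains:
        subjects = brain.Subject
        if isinstance(subjects, str):
            subjects = [subjects]  # a single keyword stored as a plain string
        for subject in subjects:
            if subject in grouped:
                grouped[subject].append(brain.getId)
    return grouped


class FakeBrain:
    """Hypothetical stand-in for a catalog result row, used only for the demo."""

    def __init__(self, doc_id, subjects):
        self.getId = doc_id
        self.Subject = subjects


demo_brains = [
    FakeBrain("a.pdf", ["Tax"]),
    FakeBrain("b.pdf", ["Law", "Tax"]),
    FakeBrain("c.pdf", "Law"),
]
demo_grouped = group_ids_by_subject(demo_brains, ["Tax", "Law", "HR"])
```

With the demo data, `demo_grouped` maps "Tax" to a.pdf and b.pdf, "Law" to b.pdf and c.pdf, and "HR" to an empty list, so the per-subject lists can be handed to the page without running one catalog query per subject.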
After much trial and error, I have concluded that the issue has nothing to do with sessions. Here is the process by which I came to that conclusion:
First, I removed all of the code in the particular page that used sessions and avoided any pages in the site that would create a session variable by traversing directly to the page in question. I commented out the call to the script that creates the data for the page and the page rendered just fine.
Second, I commented out the code in the script that called the catalog and added the call to this script back in the page. The page still rendered fine. If I added the code that queries the catalog back into the script, I got the error.
Now for some new strangeness that I noticed. If I keep the catalog queries in the code and run the page from within the ZMI, it renders fine. In fact, it returns almost immediately. That got me thinking about one other item that may cause this. I am using a SiteAccessRule that is checking the domain name in the URL and calling setupSkin. Could that be causing this strange behavior?
Kevin
On Fri, Mar 19, 2004 at 08:48:24AM -0500, Kevin Carlson wrote:
Now for some new strangeness that I noticed. If I keep the catalog queries in the code and run the page from within the ZMI, it renders fine. In fact, it returns almost immediately. That got me thinking about one other item that may cause this. I am using a SiteAccessRule that is checking the domain name in the URL and calling setupSkin. Could that be causing this strange behavior?
Maybe. IIRC, there were performance issues with using setupSkin as you describe. I don't remember what the issue was exactly. With CMF 1.3.1 or later, you should use changeSkin instead.
An easy way to check would be to uncomment this line in your zope.conf, and then restart:
    # suppress-all-access-rules on
--
Paul Winkler
http://www.slinkp.com
Look! Up in the sky! It's THE NEW VOLUPTOUS 1-900! (random hero from isometric.spaceninja.com)
Kevin Carlson wrote:
I removed all of the code in the particular page that used sessions and avoided any pages in the site that would create a session variable by traversing directly to the page in question.
I'd only believe it if you hack your Zope so the sessioning machinery raises an error whenever something tries to write to the temp storage...
about one other item that may cause this. I am using a SiteAccessRule that is checking the domain name in the URL and calling setupSkin. Could that be causing this strange behavior?
setupSkin or something in that area may well be writing to the session :-S
Chris
--
Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Kevin Carlson wrote at 2004-3-18 13:15 -0500:
I have a single folder in a zope instance that contains about 8000 pdf files. Each of those files is indexed using TextIndexNG (1.09), along with having a Subject and Type index that are both Field indexes. I am using portal_catalog under CMF 1.2 , Zope 2.6.1, python 2.1.3.
Querying the catalog has begun to take much more time recently and I am wondering if there are any things that I can check into regarding what might be slowing things down. I have a simple query returned from a MySQL database that in turn drives a search of the catalog on Subject and Type. I am using the following syntax:
    ...
    query = context.getDBSubjectForUser(userID=userID)
    for q in query :
        docResults = context.portal_catalog({'Subject' : q.subject, 'Type' : 'Template'})
    ...
This query is taking quite a while and occasionally errs out with a ZODB conflict error for certain subjects returned from the database(traceback below) :
I suggest you use a profiler (e.g. my "ZopeProfiler") to find out where exactly the time is spent. You can find "ZopeProfiler" at <http://www.dieter.handshake.de/pyprojects/zope> -- Dieter
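As a generic illustration of the profiling idea, and explicitly not of ZopeProfiler itself, the following sketch uses the `cProfile` and `pstats` modules from a current Python standard library (they postdate the Python 2.1.3 mentioned in this thread). `profile_call` is a hypothetical helper and `slow_lookup` is a made-up stand-in for the code that queries the catalog.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)  # show the ten costliest entries
    return result, stream.getvalue()


def slow_lookup(n):
    """Hypothetical stand-in for the script that runs the catalog queries."""
    return sum(i * i for i in range(n))


demo_result, demo_report = profile_call(slow_lookup, 1000)
```

Printing `demo_report` lists the functions that account for the most cumulative time, which is the kind of breakdown needed to tell whether the catalog query, the database lookup, or something else entirely is the slow part.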
participants (7)
- Casey Duncan
- Chris McDonough
- Chris Withers
- Dieter Maurer
- Kevin Carlson
- Max M
- Paul Winkler