Andreas Krasa wrote:
Hi Tres,
thank you very much for your reply!
On 29.11.09 21:57, Tres Seaver wrote:
----- Original Message ----- From: "Andreas Krasa" <andreas.krasa@wu-wien.ac.at>
we're right in the process of tracking down the error outside of ZOPE.
We have installed a completely new server from scratch with RHEL 5.4 and have re-installed python 2.4.6 and the latest versions of libxml2 and libxslt there. We double-checked the LD config and made sure that the correct shared objects get loaded (via lsof).
We also reinstalled a few other modules that contain C code (such as python-ldap), which we need for authentication.
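The lsof check described above can also be approximated from Python on Linux by reading the process's memory map. This is only a sketch: the helper names `matching_libraries` and `loaded_libraries` are made up for illustration, and it assumes a Linux `/proc` filesystem and the PID of the running Zope process.

```python
import os
import re


def matching_libraries(maps_lines, pattern=r"libxml2|libxslt|libexslt"):
    """Return sorted paths of mapped files whose file name matches pattern."""
    matcher = re.compile(pattern)
    paths = set()
    for line in maps_lines:
        # Fields: address, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if matcher.search(os.path.basename(path)):
                paths.add(path)
    return sorted(paths)


def loaded_libraries(pid="self"):
    """Read /proc/<pid>/maps (Linux only) and list the matching libraries."""
    with open("/proc/%s/maps" % pid) as maps:
        return matching_libraries(maps)
```

Calling `loaded_libraries(pid)` with the PID of the Zope process should list each libxml2 / libxslt shared object exactly once; two different paths for the same library would suggest a mixed installation.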
Unfortunately that didn't really help much. We still experience crashes.
Are there any known issues with Zope 2.11.2, LibXML2 and/or LibXSLT that could cause these problems?
The only thing we re-used is the Data.fs, which we have to, because we're talking about a production system here.
Also note that we have used exactly the same setup for a long time now, even on the same hardware, without any of these troubles. The problems only started when we switched over to a new (and probably more resource-intensive) layout.
We're unfortunately still not able to reproduce these crashes.
Can you set 'ulimit -c' to get a core file? That might at least help point to the extension which is to blame (although it may just show the "downstream" victim of a heap munge).
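For reference, the effect of 'ulimit -c' can also be achieved from inside a Python process with the standard `resource` module. The sketch below is an illustration only: the function name `enable_core_dumps` is made up, and where such a call would go in a Zope start-up script is an assumption. It raises the soft core-file limit to whatever hard limit the process already has.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the current hard limit.

    Returns the (soft, hard) pair in effect afterwards. If the hard limit
    is 0, core files stay disabled and the limit has to be raised in the
    parent shell or the system configuration instead.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

The limit is inherited by child processes, so it has to be set in the process that crashes or in one of its ancestors.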
What versions of libxml2 / libxslt are you using? How about lxml?
Yes, we did set the ulimit and were indeed able to produce a coredump for each crash (each between 300 and 700 MB in size). We tried to debug them using gdb, but unfortunately they only reveal two cases in which the crashes occur:
1) During garbage collection, where the gc tries to clean up damaged python objects
2) During some "ceval" process, also related to accessing damaged python objects
Unfortunately it doesn't reveal what exactly trashes the objects. To us it seems that this could happen some time earlier before either of the two processes mentioned above tries to access the objects and crashes ZOPE.
For now, we don't really see a reproducible pattern, as it seems to be a somewhat more complex user behavior which leads to this. We were able to extract a few URLs out of the coredumps, but directly accessing those does nothing. Directly accessing the last request logged in the Z2.log before the coredump triggers nothing either.
We're running ZOPE-2.11.2 with an eggified version of ZODB3-3.8.4 plus libxml2-2.7.6, libxslt-1.1.26 and lxml-2.2.4 now; the crashes still happen. Previously we've been running with ZOPE-2.11.2, libxml2-2.7.3, libxslt-1.1.24 and lxml-2.1.5. That also crashed ZOPE occasionally.
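Since a single URL does not trigger the crash, one option is to look at every request logged in a window before each coredump rather than only the last one. The following is a rough sketch, assuming Z2.log lines follow the Apache combined log format with a bracketed timestamp; the names `LOG_LINE` and `requests_before` are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape: ... [29/Nov/2009:21:57:00 +0100] "GET /path HTTP/1.1" ...
LOG_LINE = re.compile(r'\[(?P<when>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+)')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def requests_before(lines, crash_time, window_seconds=60):
    """Return (time, method, path) for requests logged in the window
    ending at crash_time. crash_time must be timezone-aware."""
    start = crash_time - timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        when = datetime.strptime(match.group("when"), TIME_FORMAT)
        if start <= when <= crash_time:
            hits.append((when, match.group("method"), match.group("path")))
    return hits
```

Comparing the windows of several crashes may show a request, or a sequence of requests, that they have in common.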
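The versions listed above can be cross-checked from inside the interpreter that runs Zope, because lxml exposes both the library versions it was compiled against and the ones it actually loaded at runtime. A minimal sketch (the function name `xml_stack_versions` is made up; the version constants are attributes of `lxml.etree`):

```python
def xml_stack_versions():
    """Return the lxml / libxml2 / libxslt version tuples, or None if
    lxml cannot be imported in this interpreter."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2 (runtime)": etree.LIBXML_VERSION,
        "libxml2 (compiled)": etree.LIBXML_COMPILED_VERSION,
        "libxslt (runtime)": etree.LIBXSLT_VERSION,
        "libxslt (compiled)": etree.LIBXSLT_COMPILED_VERSION,
    }
```

A difference between the "compiled" and "runtime" entries would mean that lxml was built against one copy of a library but loads another one when it runs.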
Does your application ever use the libxml2 / libxslt Python bindings directly? If so, I would go over that part of your app with a microscope: it is incredibly easy to trigger segfaults from those bindings. If not, then I would look for help on the lxml mailing list.
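As a starting point for that review, a quick scan can list the files that import the bindings directly. This is only a sketch: the helper name `find_direct_binding_imports` is made up, it only looks at `.py` files on disk, and it only catches statements that begin with `import libxml2`, `import libxslt`, or the `from ...` equivalents.

```python
import os
import re

# Matches "import libxml2", "from libxslt import ..." at the start of a line.
DIRECT_IMPORT = re.compile(
    r"^\s*(?:import|from)\s+(libxml2|libxslt)\b", re.MULTILINE
)


def find_direct_binding_imports(root):
    """Return sorted (path, [module names]) for .py files under root
    that import the libxml2 / libxslt bindings directly."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as source:
                modules = sorted(set(DIRECT_IMPORT.findall(source.read())))
            if modules:
                found.append((path, modules))
    return sorted(found)
```

Code that only goes through lxml will not show up here, which matches the distinction made above between the direct bindings and lxml.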
This only happened since we switched to a new layout (probably in combination with a few minor Silva updates).
By "new layout", do you mean a new site theme? If so, how do lxml / libxml2 / libxslt interact with your application to generate the theme? What is structurally different about the new theme?
We have been using the same system software (RHEL5), hardware, python version and libxml2/libxslt/lxml versions with our old layout, where everything worked fine for years.
I would be happy to paste any particular gdb outputs if that is of any help...?
I'm afraid that won't help: the GC segfaults indicate somebody is munging the heap way before the segfault is triggered.
Tres.
--
Tres Seaver +1 540-429-0999 tseaver@palladion.com
Palladion Software "Excellence by Design" http://palladion.com