[Zope] Frequent ZOPE crashes
Tres Seaver
tseaver at palladion.com
Mon Nov 30 10:58:16 EST 2009
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andreas Krasa wrote:
> Hi Tres,
>
> thank you very much for your reply!
>
> Am 29.11.09 21:57, schrieb Tres Seaver:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>>> ----- Original Message ----- From: "Andreas Krasa"
>>>> <andreas.krasa at wu-wien.ac.at>
>>>>
>>> we're right in the process of tracking down the error outside of ZOPE.
>>>
>>> We have completely installed a new server from scratch with RHEL 5.4 and
>>> have re-installed python 2.4.6 and the latest versions of libxml2 and
>>> libxslt there. We double checked the LD config, and made sure that the
>>> correct shared objects get loaded (via lsof).
>>>
>>> We also reinstalled a few other modules that contain C-code (such as
>>> python-ldap) which we need for being able to do authentication.
>>>
>>> Unfortunately that didn't really help much. We still experience crashes.
>>>
>>> Are there any known issues with Zope 2.11.2, LibXML2 and/or LibXSLT that
>>> could cause these problems?
>>>
>>> The only thing we re-used is the Data.fs, which we have to, because
>>> we're talking about a production system here.
>>>
>>> Also note that we have used exactly the same setup for a long time now,
>>> even on the same hardware, without any of these troubles. The problems
>>> only started when we switched over to a new (and probably more
>>> resource-intensive) layout.
>>>
>>> We're unfortunately still not able to reproduce these crashes.
>> Can you set 'ulimit -c' to get a core file, which might at least help
>> point to the extension which is to blame (although it may just show the
>> "downstream" victim of a heap munge).
>>
>> What versions of libxml2 / libxslt are you using? How about lxml?
>
> Yes, we did set the ulimit and were indeed able to produce a coredump
> for each crash happening (each having something between 300 and 700 MB).
> We tried to debug using "gdb" but unfortunately they only reveal two
> cases when the crashes occur:
>
> 1) During garbage collection where the gc tries to clean up damaged
> python objects
> 2) During some "ceval" process, also related to accessing damaged python
> objects
>
> Unfortunately it doesn't reveal what exactly trashes the objects. To us
> it seems that this could happen some time earlier before either of the
> two processes mentioned above tries to access the objects and crashes ZOPE.
>
> For now, we don't really see a reproducible pattern as it seems to be a
> somewhat more complex user behavior which leads to this. We were able to
> extract a few URLs out of the coredumps but directly accessing those
> does nothing. Also the last logged access in the Z2.log before the
> coredump triggers nothing, when directly accessing it.
>
> We're running ZOPE-2.11.2 with an eggified version of ZODB3-3.8.4 plus
> libxml2-2.7.6, libxslt-1.1.26 and lxml-2.2.4 now, the crashes still
> happen. Previously we've been running with ZOPE-2.11.2, libxml2-2.7.3,
> libxslt-1.1.24 and lxml-2.1.5. That also crashed ZOPE occasionally.
Does your application ever use the libxml2 / libxslt Python bindings
directly? If so, I would go over that part of your app with a
microscope: it is incredibly easy to trigger segfaults from those
bindings. If not, then I would look for help on the lxml mailing list.
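
As a starting point for that review, here is a minimal sketch that lists the
places where application code imports the bindings directly. It is only an
illustration: the helper names (find_binding_imports, scan_tree) and the list
of module names are assumptions made for this example, not part of Zope,
Silva, or lxml, and the line-based match will miss less common import forms
such as "import os, libxml2".

```python
import os
import re

# Module names treated as "direct binding" imports (an assumption of this sketch).
BINDING_MODULES = ("libxml2", "libxslt", "lxml")

# Matches lines such as "import libxml2" or "from lxml import etree".
_IMPORT_RE = re.compile(
    r"^\s*(?:import|from)\s+(%s)\b" % "|".join(BINDING_MODULES)
)


def find_binding_imports(source):
    """Return (line_number, module) pairs for lines that import a binding."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        match = _IMPORT_RE.match(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits


def scan_tree(root):
    """Yield (path, line_number, module) for every .py file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                # Decode leniently so an odd byte does not abort the scan.
                text = handle.read().decode("utf-8", "replace")
            for lineno, module in find_binding_imports(text):
                yield path, lineno, module
```

Running scan_tree over the Products directory of the instance (or wherever the
site-specific code lives) gives a short list of files worth reading first; any
hit on libxml2 or libxslt rather than lxml deserves the closest look.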
> This only happened since we switched to a new layout (probably in
> combination with a few minor Silva updates).
By "new layout", do you mean a new site theme? If so, how do lxml /
libxml2 / libxslt interact with your application to generate the theme?
What is structurally different about the new theme?
> We have been using the same system software (RHEL5), hardware, python
> version and libxml2/libxslt/lxml versions with our old layout, where
> everything worked fine for years.
>
> I would be happy to paste any particular gdb outputs if that is of any
> help...?
I'm afraid that won't help: the GC segfaults indicate somebody is
munging the heap way before the segfault is triggered.
Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAksT65cACgkQ+gerLs4ltQ5swACgsSuScLIAfFtd1d9TMznaQEeu
7JEAoJBetJHX3KOCbinGlyV5F/7DWjqK
=qGv5
-----END PGP SIGNATURE-----