Hi there in the last days my Zope server seems to hang and the CPU is almost every time at 100 %. I installed DeadlockDebugger and see a lot of these entries: ----------------------------------------------------------------------------- File "/usr/local/Zope-2.10.5/lib/python/OFS/Traversable.py", line 288, in unrestrictedTraverse return obj File "/usr/local/Zope-2.10.5/lib/python/ZODB/Connection.py", line 767, in setstate raise File "/usr/local/Zope-2.10.5/lib/python/ZODB/Connection.py", line 820, in _setstate obj._p_serial = serial File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 605, in setGhostState obj.__setstate__(state) File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 597, in getState return unpickler.load() File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 471, in _persistent_load return self.load_oid(reference) File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 537, in load_oid return self._conn.get(oid) File "/usr/local/Zope-2.10.5/lib/python/ZODB/Connection.py", line 225, in get return obj File "/usr/local/Zope-2.10.5/lib/python/ZODB/FileStorage/FileStorage.py", line 562, in load self._lock_release() File "/usr/local/lib/python2.4/threading.py", line 109, in acquire return rc ----------------------------------------------------------------------------- I am wondering what the last line File "/usr/local/lib/python2.4/threading.py", line 109, in acquire return rc does exactly mean. Are the threads blocked? My configuration: -Zope 2.10.5 running with 8 threads on a 2-CPU machine with 8 GB of RAM -Python 2.4.4 -LDAPUserfolder -Silva CMS 2.0.8 -ZODB size: ~6 Gbyte -Only 6-10 people work in the CMS and no public access (spy bots, crawlers) access the Zope server Regards Nico
Yes, and you can use the Control_Panel/DebugInfo as well and scroll down to the Connections table. Out of the 8 threads most of them should be vacant (ie. None) and if there are some requests logged there they should in the range of < 1 sec. If any of them are longer than 1 sec they are either stuck or doing something is supposed to take a long time (e.g. render a complex report, do a big upload, etc.) I once had a similar problem and I hacked the LDAPUserFolder to do a timing of the connections and found that connecting to the LDAP server was super slow and thus each connection cost too much time and it subsequently blocked the threads. Peter 2008/9/30 Nico Grubert <nicogrubert@gmail.com>:
Hi there
in the last days my Zope server seems to hang and the CPU is almost every time at 100 %.
I installed DeadlockDebugger and see a lot of these entries: ----------------------------------------------------------------------------- File "/usr/local/Zope-2.10.5/lib/python/OFS/Traversable.py", line 288, in unrestrictedTraverse return obj File "/usr/local/Zope-2.10.5/lib/python/ZODB/Connection.py", line 767, in setstate raise File "/usr/local/Zope-2.10.5/lib/python/ZODB/Connection.py", line 820, in _setstate obj._p_serial = serial File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 605, in setGhostState obj.__setstate__(state) File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 597, in getState return unpickler.load() File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 471, in _persistent_load return self.load_oid(reference) File "/usr/local/Zope-2.10.5/lib/python/ZODB/serialize.py", line 537, in load_oid return self._conn.get(oid) File "/usr/local/Zope-2.10.5/lib/python/ZODB/Connection.py", line 225, in get return obj File "/usr/local/Zope-2.10.5/lib/python/ZODB/FileStorage/FileStorage.py", line 562, in load self._lock_release() File "/usr/local/lib/python2.4/threading.py", line 109, in acquire return rc -----------------------------------------------------------------------------
I am wondering what the last line File "/usr/local/lib/python2.4/threading.py", line 109, in acquire return rc does exactly mean. Are the threads blocked?
My configuration: -Zope 2.10.5 running with 8 threads on a 2-CPU machine with 8 GB of RAM -Python 2.4.4 -LDAPUserfolder -Silva CMS 2.0.8 -ZODB size: ~6 Gbyte -Only 6-10 people work in the CMS and no public access (spy bots, crawlers) access the Zope server
Regards Nico _______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
-- Peter Bengtsson, work www.fry-it.com home www.peterbe.com hobby www.issuetrackerproduct.com
Yes, and you can use the Control_Panel/DebugInfo as well and scroll down to the Connections table. Out of the 8 threads most of them should be vacant (ie. None) and if there are some requests logged there they should in the range of < 1 sec. If any of them are longer than 1 sec they are either stuck or doing something is supposed to take a long time (e.g. render a complex report, do a big upload, etc.)
I once had a similar problem and I hacked the LDAPUserFolder to do a timing of the connections and found that connecting to the LDAP server was super slow and thus each connection cost too much time and it subsequently blocked the threads.
Hi Peter thanks for your reply. At Control_Panel/DebugInfo I see --------------------------------------------------------------------------- None (44921) Tue Sep 30 13:24:18 2008 (0.37s) (... 'ACTUAL_URL': 'http://fog:8086/Control_Panel/DebugInfo/manage_main'}) (641) None (57523) None (69795) None (47637) Tue Sep 30 13:24:17 2008 (1.18s) (... path/to/any/doc')}) (48335) --------------------------------------------------------------------------- I sometimes see a LDAP call in the DeadlockProfiler output so it could be the reason. Do you still have the LDAP patch around? I'd like to do a timing of the connections, too. Regards Nico
2008/9/30 Nico Grubert <nicogrubert@gmail.com>:
Yes, and you can use the Control_Panel/DebugInfo as well and scroll down to the Connections table. Out of the 8 threads most of them should be vacant (ie. None) and if there are some requests logged there they should in the range of < 1 sec. If any of them are longer than 1 sec they are either stuck or doing something is supposed to take a long time (e.g. render a complex report, do a big upload, etc.)
I once had a similar problem and I hacked the LDAPUserFolder to do a timing of the connections and found that connecting to the LDAP server was super slow and thus each connection cost too much time and it subsequently blocked the threads.
Hi Peter
thanks for your reply.
At Control_Panel/DebugInfo I see --------------------------------------------------------------------------- None (44921) Tue Sep 30 13:24:18 2008 (0.37s) (... 'ACTUAL_URL': 'http://fog:8086/Control_Panel/DebugInfo/manage_main'}) (641) None (57523) None (69795) None (47637) Tue Sep 30 13:24:17 2008 (1.18s) (... path/to/any/doc')}) (48335) ---------------------------------------------------------------------------
I sometimes see a LDAP call in the DeadlockProfiler output so it could be the reason.
Do you still have the LDAP patch around? I'd like to do a timing of the connections, too.
It was work on at a client's site. But I think you should look for somehing like this: conn.simple_bind_s(user_dn, user_pwd) in LDAPDelegate.py so it looks something like this: import time T0 = time.time() conn.simple_bind_s(user_dn, user_pwd) T1 = time.time() open('/tmp/timings.log','a').write('%s\n' % (T1-T0))
Regards Nico _______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
-- Peter Bengtsson, work www.fry-it.com home www.peterbe.com hobby www.issuetrackerproduct.com
Nico Grubert wrote at 2008-9-30 11:49 +0200:
in the last days my Zope server seems to hang and the CPU is almost every time at 100 %.
When your CPU is almost at 100 % over an extended period, then you do not have a typical "hanging" problem but rather something like a busy loop.
I installed DeadlockDebugger and see a lot of these entries: ... File "/usr/local/Zope-2.10.5/lib/python/ZODB/FileStorage/FileStorage.py", line 562, in load self._lock_release() File "/usr/local/lib/python2.4/threading.py", line 109, in acquire return rc
These entries should not be the one that cause your problems (they are not computationally expensive). Do you see other entries as well?
-----------------------------------------------------------------------------
I am wondering what the last line File "/usr/local/lib/python2.4/threading.py", line 109, in acquire return rc does exactly mean. Are the threads blocked?
It is indeed strange that a "_lock_release" should end in an "acquire". Delete all "*.pyc" and "*.pyo" files and see whether these peculiarity disappear. -- Dieter
These entries should not be the one that cause your problems (they are not computationally expensive). Do you see other entries as well?
Thank you very much for your answer, Dieter. There are 2 entries from in the DeadlockDebugger's output that may block all other threads. Here is the full output of both entries: 1. --------------------------------------------------------------------------- Thread 1098918208 (GET /VirtualHostBase/http/intra:80/intranet/VirtualHostRoot/projects/index_DE): File "/usr/local/Zope-2.10.5/lib/python/ZServer/PubCore/ZServerPublisher.py", line 25, in __init__ response=b) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 401, in publish_module environ, debug, request, response) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 202, in publish_module_standard response = publish(request, module_name, after_list, debug=debug) File "/zope/intranet/Products/PlacelessTranslationService/PatchStringIO.py", line 51, in new_publish x = Publish.old_publish(request, module_name, after_list, debug) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 119, in publish request, bind=1) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/mapply.py", line 88, in mapply if debug is not None: return debug(object,args,context) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 42, in call_object result=apply(object,args) # Type s<cr> to step into published object. File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PythonScripts/PythonScript.py", line 327, in _exec result = f(*args, **kw) File "Script (Python)", line 20, in index_html File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 330, in _exec result = self.pt_render(extra_context=bound_names) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 426, in pt_render result = PageTemplate.pt_render(self, source, extra_context) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/PageTemplate.py", line 89, in pt_render return super(PageTemplate, self).pt_render(c, source=source) File "/usr/local/Zope-2.10.5/lib/python/zope/pagetemplate/pagetemplate.py", line 117, in pt_render strictinsert=0, sourceAnnotations=sourceAnnotations)() File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 271, in __call__ self.interpret(self.program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 891, in do_useMacro self.interpret(macro) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 949, in do_defineSlot self.interpret(slot) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 534, in do_optTag_tal self.no_tag(stuff[-2], stuff[-1]) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 516, in no_tag self.interpret(program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 586, in do_setLocal_tal self.engine.setLocal(name, self.engine.evaluateValue(expr)) File "/usr/local/Zope-2.10.5/lib/python/zope/tales/tales.py", line 696, in evaluate return expression(self) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZRPythonExpr.py", line 49, in __call__ return eval(self._code, vars, {}) File "PythonExpr", line 1, in <expression> File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PythonScripts/PythonScript.py", line 327, in _exec result = f(*args, **kw) File "Script (Python)", line 11, in getBoxes_lokal File "/zope/intranet/Products/Silva/Folder.py", line 718, in get_public_tree self._get_public_tree_helper(l, 0, depth) File "/zope/intranet/Products/Silva/Folder.py", line 768, in _get_public_tree_helper if self.service_toc_filter.filter(item): File "/zope/intranet/Products/Silva/tocfilter.py", line 36, in filter return False File "/zope/intranet/Products/Silva/tocfilter.py", line 19, in hideFromTOC return context.service_metadata.getMetadataValue( File "/zope/intranet/Products/SilvaMetadata/MetadataTool.py", line 180, in getMetadataValue return binding.get(set_id, element_id) File "/zope/intranet/Products/SilvaMetadata/Binding.py", line 301, in get return data.get(element_id) File "/zope/intranet/Products/SilvaMetadata/Binding.py", line 500, in _getData return data File "/zope/intranet/Products/SilvaMetadata/Element.py", line 207, in getDefault return self.field.get_value('default', content=content) File "/zope/intranet/Products/Formulator/Field.py", line 142, in get_value return value File "/zope/intranet/Products/Formulator/TALESField.py", line 44, in __call__ return getEngine().getContext(kw).evaluate(expr) File "/usr/local/Zope-2.10.5/lib/python/zope/tales/tales.py", line 696, in evaluate return expression(self) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZRPythonExpr.py", line 49, in __call__ return eval(self._code, vars, {}) File "PythonExpr", line 1, in <expression> File "/zope/intranet/Products/Silva/Security.py", line 272, in sec_get_creator_info return self.sec_get_member(self.getOwner().getId()) File "/usr/local/Zope-2.10.5/lib/python/AccessControl/Owned.py", line 104, in getOwner return user File "/zope/intranet/Products/LDAPUserFolder/LDAPUserFolder.py", line 740, in getUserById return user File "/zope/intranet/Products/LDAPUserFolder/LDAPUserFolder.py", line 720, in getUserByAttr return user_obj File "/zope/intranet/Products/LDAPUserFolder/LDAPUserFolder.py", line 313, in _lookupuserbyattr return roles, dn, user_attrs, groups File "/zope/intranet/Products/LDAPUserFolder/LDAPDelegate.py", line 403, in search return result File "/zope/intranet/Products/LDAPUserFolder/LDAPDelegate.py", line 193, in connect return conn File "/zope/intranet/Products/LDAPUserFolder/SharedResource.py", line 114, in __call__ self._lock.release() File "/usr/local/lib/python2.4/site-packages/ldap/ldapobject.py", line 759, in simple_bind_s return SimpleLDAPObject.simple_bind_s(self,*args,**kwargs) File "/usr/local/lib/python2.4/site-packages/ldap/ldapobject.py", line 176, in simple_bind_s return self.result(msgid,all=1,timeout=self.timeout) File "/usr/local/lib/python2.4/site-packages/ldap/ldapobject.py", line 169, in simple_bind return self._ldap_call(self._l.simple_bind,who,cred,EncodeControlTuples(serverctrls),EncodeControlTuples(clientctrls)) File "/usr/local/lib/python2.4/site-packages/ldap/ldapobject.py", line 104, in _ldap_call return result --------------------------------------------------------------------------- 2. --------------------------------------------------------------------------- Thread 1115703616 (GET /VirtualHostBase/http/intra:80/intranet/VirtualHostRoot/projects/index_DE): File "/usr/local/Zope-2.10.5/lib/python/ZServer/PubCore/ZServerPublisher.py", line 25, in __init__ response=b) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 401, in publish_module environ, debug, request, response) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 202, in publish_module_standard response = publish(request, module_name, after_list, debug=debug) File "/zope/intranet/Products/PlacelessTranslationService/PatchStringIO.py", line 51, in new_publish x = Publish.old_publish(request, module_name, after_list, debug) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 119, in publish request, bind=1) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/mapply.py", line 88, in mapply if debug is not None: return debug(object,args,context) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 42, in call_object result=apply(object,args) # Type s<cr> to step into published object. File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PythonScripts/PythonScript.py", line 327, in _exec result = f(*args, **kw) File "Script (Python)", line 20, in index_html File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 330, in _exec result = self.pt_render(extra_context=bound_names) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 426, in pt_render result = PageTemplate.pt_render(self, source, extra_context) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/PageTemplate.py", line 89, in pt_render return super(PageTemplate, self).pt_render(c, source=source) File "/usr/local/Zope-2.10.5/lib/python/zope/pagetemplate/pagetemplate.py", line 117, in pt_render strictinsert=0, sourceAnnotations=sourceAnnotations)() File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 271, in __call__ self.interpret(self.program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 891, in do_useMacro self.interpret(macro) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 949, in do_defineSlot self.interpret(slot) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 534, in do_optTag_tal self.no_tag(stuff[-2], stuff[-1]) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 516, in no_tag self.interpret(program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 586, in do_setLocal_tal self.engine.setLocal(name, self.engine.evaluateValue(expr)) File "/usr/local/Zope-2.10.5/lib/python/zope/tales/tales.py", line 696, in evaluate return expression(self) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZRPythonExpr.py", line 49, in __call__ return eval(self._code, vars, {}) File "PythonExpr", line 1, in <expression> File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PythonScripts/PythonScript.py", line 327, in _exec result = f(*args, **kw) File "/zope/intranet/Products/Silva/Folder.py", line 719, in get_public_tree return l File "/zope/intranet/Products/Silva/Folder.py", line 776, in _get_public_tree_helper l.append((indent, item)) File "/zope/intranet/Products/Silva/tocfilter.py", line 36, in filter return False File "/zope/intranet/Products/Silva/tocfilter.py", line 19, in hideFromTOC return context.service_metadata.getMetadataValue( File "/zope/intranet/Products/SilvaMetadata/MetadataTool.py", line 180, in getMetadataValue return binding.get(set_id, element_id) File "/zope/intranet/Products/SilvaMetadata/Binding.py", line 301, in get return data.get(element_id) File "/zope/intranet/Products/SilvaMetadata/Binding.py", line 500, in _getData return data File "/zope/intranet/Products/SilvaMetadata/Element.py", line 207, in getDefault return self.field.get_value('default', content=content) File "/zope/intranet/Products/Formulator/Field.py", line 142, in get_value return value File "/zope/intranet/Products/Formulator/TALESField.py", line 44, in __call__ return getEngine().getContext(kw).evaluate(expr) File "/usr/local/Zope-2.10.5/lib/python/zope/tales/tales.py", line 696, in evaluate return expression(self) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZRPythonExpr.py", line 49, in __call__ return eval(self._code, vars, {}) File "PythonExpr", line 1, in <expression> File "/zope/intranet/Products/Silva/Security.py", line 272, in sec_get_creator_info return self.sec_get_member(self.getOwner().getId()) File "/zope/intranet/Products/Silva/Security.py", line 227, in sec_get_member return self.service_members.get_cached_member(userid) File "/zope/intranet/Products/Silva/SimpleMembership.py", line 193, in get_cached_member return cloneMember(self.get_member(userid)).__of__(self) File "/zope/intranet/Products/Silva/Membership.py", line 184, in cloneMember editor=member.editor()) File "/zope/intranet/Products/Silva/SimpleMembership.py", line 111, in editor return getattr(self, '_editor', self.default_editor()) File "/zope/intranet/Products/Silva/SimpleMembership.py", line 118, in default_editor return 'kupu' File "/zope/intranet/Products/FileSystemSite/DirectoryView.py", line 426, in __of__ return res File "/zope/intranet/Products/FileSystemSite/DirectoryView.py", line 177, in getContents return self.data, self.objects File "/zope/intranet/Products/FileSystemSite/DirectoryView.py", line 156, in _changed return 0 --------------------------------------------------------------------------- After I called the URL that is shown in those 2 threads (/VirtualHostBase/http/intra:80/intranet/VirtualHostRoot/projects/index_DE) these entry will not disappear from the DeadlockDebugger's output. Maybe they cause the busy loop. I reaööy don't know which part excactly cause the loop since there are some Silva-CMS-specific calls. Regards Nico
Nico Grubert wrote at 2008-10-2 10:13 +0200:
These entries should not be the one that cause your problems (they are not computationally expensive). Do you see other entries as well?
Thank you very much for your answer, Dieter. There are 2 entries from in the DeadlockDebugger's output that may block all other threads.
If your tracebacks are reliable, then neither of these entries should cause "hanging" (apart from this: CPU usage of about 100 percent does not indicate hanging but some other problem). Tracebacks do not need to be reliable. They can be unreliable when the the "*.pyc/*.pyo" files do not correspond to the source. Therefore, an initial step may be to delete all "*.pyc/*.pyo" files and try again. There is one case when "hanging" could apparently happen in a "return": the operation before takes a long time and does not release the GIL. Then, the deadlock debugger will get active only when the operation finished -- and then sees an apparently harmless operation (such as a "return"). We have had severe garbage collection problems (GC, too, holds the GIL for a long time). To detect such problems, we have added a GIL monitor to our system. This is an additional thread running in the system which schedules itself at regular times and measures the difference between the expected scheduling time and the actual activation time. If the difference is too large, this is logged. If your system gets slow but still occasionally responds, you can use "ZopeProfiler" to find out, where the time is spend. Ensure to use "real time" as the timer (not cpu time). -- Dieter
If your tracebacks are reliable, then neither of these entries should cause "hanging" (apart from this: CPU usage of about 100 percent does not indicate hanging but some other problem).
Tracebacks do not need to be reliable. They can be unreliable when the the "*.pyc/*.pyo" files do not correspond to the source. Therefore, an initial step may be to delete all "*.pyc/*.pyo" files and try again.
There is one case when "hanging" could apparently happen in a "return": the operation before takes a long time and does not release the GIL. Then, the deadlock debugger will get active only when the operation finished -- and then sees an apparently harmless operation (such as a "return").
We have had severe garbage collection problems (GC, too, holds the GIL for a long time). To detect such problems, we have added a GIL monitor to our system. This is an additional thread running in the system which schedules itself at regular times and measures the difference between the expected scheduling time and the actual activation time. If the difference is too large, this is logged.
If your system gets slow but still occasionally responds, you can use "ZopeProfiler" to find out, where the time is spend. Ensure to use "real time" as the timer (not cpu time).
Hi Dieter thank you very much for the information. This morning the server did not respond and all 8 threads in the DeadlockDebugger's output look the same: Thread 1098918208 File "/usr/local/Zope-2.10.5/lib/python/ZServer/PubCore/ZServerPublisher.py", line 25, in __init__ response=b) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 401, in publish_module environ, debug, request, response) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 202, in publish_module_standard response = publish(request, module_name, after_list, debug=debug) File "/zope/intranet/Products/PlacelessTranslationService/PatchStringIO.py", line 51, in new_publish x = Publish.old_publish(request, module_name, after_list, debug) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 119, in publish request, bind=1) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/mapply.py", line 88, in mapply if debug is not None: return debug(object,args,context) File "/usr/local/Zope-2.10.5/lib/python/ZPublisher/Publish.py", line 42, in call_object result=apply(object,args) # Type s<cr> to step into published object. File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PythonScripts/PythonScript.py", line 327, in _exec result = f(*args, **kw) File "Script (Python)", line 20, in index_html File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 330, in _exec result = self.pt_render(extra_context=bound_names) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZopePageTemplate.py", line 426, in pt_render result = PageTemplate.pt_render(self, source, extra_context) File "/zope/wwnet/Products/PTProfiler/ProfilerPatch.py", line 66, in __patched_render__ ret = self._patching_class._org_method(self, source, extra_context) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/PageTemplate.py", line 89, in pt_render return super(PageTemplate, self).pt_render(c, source=source) File "/usr/local/Zope-2.10.5/lib/python/zope/pagetemplate/pagetemplate.py", line 117, in pt_render strictinsert=0, sourceAnnotations=sourceAnnotations)() File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 271, in __call__ self.interpret(self.program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 891, in do_useMacro self.interpret(macro) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 949, in do_defineSlot self.interpret(slot) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 534, in do_optTag_tal self.no_tag(stuff[-2], stuff[-1]) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 516, in no_tag self.interpret(program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 855, in do_condition self.interpret(block) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 536, in do_optTag_tal self.do_optTag(stuff) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 521, in do_optTag return self.no_tag(start, program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 516, in no_tag self.interpret(program) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 346, in interpret handlers[opcode](self, args) File "/usr/local/Zope-2.10.5/lib/python/zope/tal/talinterpreter.py", line 586, in do_setLocal_tal self.engine.setLocal(name, self.engine.evaluateValue(expr)) File "/usr/local/Zope-2.10.5/lib/python/zope/tales/tales.py", line 696, in evaluate return expression(self) File "/zope/wwnet/Products/PTProfiler/ProfilerPatch.py", line 25, in __patched_call__ ret = self._patching_class._org_method(self, econtext) File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZRPythonExpr.py", line 49, in __call__ return eval(self._code, vars, {}) File "PythonExpr", line 1, in <expression> File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PythonScripts/PythonScript.py", line 327, in _exec result = f(*args, **kw) File "Script (Python)", line 109, in get_navigation_items File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 313, in __call__ return self._bindAndExec(args, kw, None) File "/usr/local/Zope-2.10.5/lib/python/Shared/DC/Scripts/Bindings.py", line 350, in _bindAndExec return self._exec(bound_data, args, kw) File "/usr/local/Zope-2.10.5/lib/python/Products/PythonScripts/PythonScript.py", line 327, in _exec result = f(*args, **kw) File "Script (Python)", line 31, in getSubdocumentsIncludingGhosts File "/zope/intranet/Products/Silva/Folder.py", line 718, in get_public_tree self._get_public_tree_helper(l, 0, depth) File "/zope/intranet/Products/Silva/Folder.py", line 768, in _get_public_tree_helper if self.service_toc_filter.filter(item): File "/zope/intranet/Products/Silva/tocfilter.py", line 36, in filter return False File "/zope/intranet/Products/Silva/tocfilter.py", line 19, in hideFromTOC return context.service_metadata.getMetadataValue( File "/zope/intranet/Products/SilvaMetadata/MetadataTool.py", line 180, in getMetadataValue return binding.get(set_id, element_id) File "/zope/intranet/Products/SilvaMetadata/Binding.py", line 301, in get return data.get(element_id) File "/zope/intranet/Products/SilvaMetadata/Binding.py", line 500, in _getData return data File "/zope/intranet/Products/SilvaMetadata/Element.py", line 207, in getDefault return self.field.get_value('default', content=content) File "/zope/intranet/Products/Formulator/Field.py", line 142, in get_value return value File "/zope/intranet/Products/Formulator/TALESField.py", line 44, in __call__ return getEngine().getContext(kw).evaluate(expr) File "/usr/local/Zope-2.10.5/lib/python/zope/tales/tales.py", line 696, in evaluate return expression(self) File "/zope/wwnet/Products/PTProfiler/ProfilerPatch.py", line 31, in __patched_call__ return ret File "/usr/local/Zope-2.10.5/lib/python/Products/PageTemplates/ZRPythonExpr.py", line 49, in __call__ return eval(self._code, vars, {}) File "PythonExpr", line 1, in <expression> File "/zope/intranet/Products/Silva/Security.py", line 272, in sec_get_creator_info return self.sec_get_member(self.getOwner().getId()) File "/zope/intranet/Products/Silva/Security.py", line 227, in sec_get_member return self.service_members.get_cached_member(userid) File "/zope/intranet/Products/Silva/SimpleMembership.py", line 193, in get_cached_member return cloneMember(self.get_member(userid)).__of__(self) File "/zope/intranet/Products/Silva/SimpleMembership.py", line 187, in get_member return member File "/zope/intranet/Products/Silva/SimpleMembership.py", line 174, in is_user return userid in self.get_valid_userids() File "/usr/local/Zope-2.10.5/lib/python/AccessControl/Role.py", line 360, in get_valid_userids return tuple(keys) File "/usr/local/Zope-2.10.5/lib/python/AccessControl/User.py", line 984, in user_names return self.getUserNames() File "/zope/intranet/Products/LDAPUserFolder/LDAPUserFolder.py", line 576, in getUserNames return self._misc_cache().get('loginlist') File "/zope/intranet/Products/LDAPUserFolder/LDAPUserFolder.py", line 1846, in _misc_cache return getResource('%s-misc_cache' % self._hash, SharedObject, ()) File "/zope/intranet/Products/LDAPUserFolder/SharedResource.py", line 50, in getResource _ResourceLock.release() I am not sure if all the problems are caused by the LDAPUserfolder. Regards Nico
Nico Grubert wrote at 2008-10-9 10:17 +0200:
...
If your system gets slow but still occasionally responds, you can use "ZopeProfiler" to find out, where the time is spend. Ensure to use "real time" as the timer (not cpu time).
Hi Dieter
thank you very much for the information.
This morning the server did not respond and all 8 threads in the DeadlockDebugger's output look the same:
Thread 1098918208 ... File "/zope/intranet/Products/LDAPUserFolder/LDAPUserFolder.py", line 1846, in _misc_cache return getResource('%s-misc_cache' % self._hash, SharedObject, ()) File "/zope/intranet/Products/LDAPUserFolder/SharedResource.py", line 50, in getResource _ResourceLock.release()
Your tracebacks really look strange: they almost all seem to come from the release of looks. Now, releasing a lock is a very fast operation. An independent observer should rarely see the system in such a state. That you see it there so often indicates that your observer ("DeadlockDebugger") is not independent (we already knew that) but in addition gets seriously confused -- thus, that we cannot trust its observations when analysing the problem at hand. You need to use a different observer. As already written: if your Zope does not hang completely, "ZopeProfiler" may be such an observer. It, too, is not independent, but different things confuse it. Thus, we some luck, it may reveal the cause of the current problems. If your Zope hangs completely, then things get a bit more difficult. You must attach the Zope process with a C level debugger. This observer will not get confused by the Python GIL and is able to observe threads that hold the GIL during long periods of time. Unfortunately, quite a bit of preconditions must be met and the analysis requires lots of knowledge about Python internas. Therefore, I hope that you do not need an observer of this kind. -- Dieter
participants (3)
-
Dieter Maurer -
Nico Grubert -
Peter Bengtsson