[Zope] Re: Non-responsive objects reprise
Florent Guillaume
fg at nuxeo.com
Sun Nov 13 14:18:29 EST 2005
You probably have a network problem: all the Zope logs show that
everything completed normally (your points 1 and 3). Your problem may be
tied to packet size or keepalives. A network trace, for instance using
Ethereal, will probably help you more than anything.
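
To double-check that Zope really finishes every request during a bad spell,
the trace.log can be scanned for requests that begin but never end. This is
only a rough sketch: the record layout is inferred from the sample in your
point 3, and parse_trace, unfinished, duration_seconds and SAMPLE are names
made up for the example.

```python
from datetime import datetime

STAMP = "%Y-%m-%dT%H:%M:%S"

def parse_trace(lines):
    """Group trace.log records by request id.

    Assumed layout, inferred from the sample in point 3:
      B <id> <time> <method> <path>   request begins
      I <id> <time> <input bytes>     input received
      A <id> <time> <status> <bytes>  response produced
      E <id> <time>                   response sent
    """
    requests = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 3 or parts[0] not in ("B", "I", "A", "E"):
            continue  # skip blank or unrecognised lines
        code, req_id = parts[0], parts[1]
        try:
            when = datetime.strptime(parts[2], STAMP)
        except ValueError:
            continue  # skip lines whose timestamp does not parse
        entry = requests.setdefault(req_id, {})
        entry[code] = when
        if code == "B" and len(parts) >= 5:
            entry["path"] = parts[4]
        elif code == "A" and len(parts) >= 5:
            entry["status"] = int(parts[3])
            entry["bytes"] = int(parts[4])
    return requests

def unfinished(requests):
    """Return ids of requests that logged a B record but no E record."""
    return [rid for rid, e in requests.items() if "B" in e and "E" not in e]

def duration_seconds(entry):
    """Seconds between the B and E records, or None if either is missing."""
    if "B" in entry and "E" in entry:
        return (entry["E"] - entry["B"]).total_seconds()
    return None

# The records from point 3, with the wrapped B line rejoined.
SAMPLE = [
    "B -1348776468 2005-11-13T10:46:37 GET /VirtualHostBase/http/www.domain.org:80/portal/html/VirtualHostRoot/resources/contact",
    "I -1348776468 2005-11-13T10:46:37 0",
    "A -1348776468 2005-11-13T10:46:38 200 14938",
    "E -1348776468 2005-11-13T10:46:38",
]
```

If unfinished() stays empty while Apache is reporting proxy errors, that
points away from Zope and toward the connection between the two.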
Florent
Garth B. wrote:
> Hello everyone, this is from an older thread which I'm resurrecting
> with more information.
>
> Despite Dieter's helpful pointers I'm no closer to solving this
> problem but do have more information about it in case anyone can lend
> a hand.
>
> To quickly recap: Periodically, when visiting our Zope site, certain
> objects appear not to respond. It's consistently the same objects,
> ranging from a Page Template in one folder to an
> image somewhere else. The site is running on Zope 2.8.1, Python 2.3.5
> and sitting behind a VHM and Apache 2.0.46 using the usual ReWrite
> rules. This problem suddenly started
> several months ago with the site having been running smoothly for many
> months prior. This is all on Red Hat Enterprise Linux ES release 3
> (Taroon Update 2). The server is a
> dual processor with 1GB RAM, 300GB of hard disk space, hosted by
> Rackspace. The site is relatively large and reasonably active. Its
> content is largely made up of Page
> Templates with a few supporting Python scripts and Script (Python) objects.
> There are also a few ZClass-based objects that offer no real unique
> functionality other than providing an
> interface for the admins to create "News" or "Feature" items. The site
> also utilizes a MySQL database.
>
> I've noticed the following things about this problem:
>
> =================
> 1) DeadlockDebugger shows no problems when one of the objects appears
> not to be responding. Everything appears normal.
>
> 2) I can ALWAYS successfully get to the non-responsive objects by
> bypassing Apache and directly viewing the Zope server's equivalent
> :8080 address.
>
> 3) While tailing the trace.log when an object is seizing through
> Apache, I can see the request come to Zope and go right back out with
> no problem. I think that's what
> this is illustrating:
>
> B -1348776468 2005-11-13T10:46:37 GET
> /VirtualHostBase/http/www.domain.org:80/portal/html/VirtualHostRoot/resources/contact
> I -1348776468 2005-11-13T10:46:37 0
> A -1348776468 2005-11-13T10:46:38 200 14938
> E -1348776468 2005-11-13T10:46:38
>
> 4) Turning on debugging output for Apache shows the following proxy
> errors when trying to access an offending object. I've searched for
> related information about this proxy error and
> only found one hit from the ZODB-DEV list from 2004 with no responses.
> The errors:
>
> [Sat Nov 12 00:33:33 2005] [error] [client xx.xx.xx.xx] proxy: error
> reading status line from remote server localhost
> [Sat Nov 12 00:33:33 2005] [error] [client xx.xx.xx.xx] proxy: Error
> reading from remote server returned by /contact
> [Sat Nov 12 00:34:02 2005] [error] [client xx.xx.xx.xx] proxy: error
> reading status line from remote server localhost
> [Sat Nov 12 00:34:02 2005] [error] [client xx.xx.xx.xx] proxy: Error
> reading from remote server returned by /resources/index_html
>
> I removed the client IP. Keep #2 and #3 in mind in the context of this problem.
>
> 5) In case there was something in one of the templates that was
> screwing things up, I methodically removed portions of a page (or its
> inherited template). When the page suddenly started responding
> through Apache I thought I hit paydirt, but then I noticed in one
> instance that all I removed was a block of plain HTML (no METAL/TALES
> statements) and that put me back at square one. I think #2 and #3
> make this point irrelevant, and certain images will get hung up, too.
>
> 6) The server is also running Mailman (using the same Python as Zope).
> It uses a separate virtual host container in Apache to expose its
> administrative interface. One of my co-workers swears that when he
> experiences the seizing, he soon after gets several emails from one of
> the Mailman lists which is supposed to be a once-a-day broadcast-only
> list.
> I think this is more of a coincidence though, and I haven't gotten a
> big enough sample size of occurrences to rely on this report.
>
> 7) Restarting Zope *usually* corrects the problem (on Friday,
> restarting it several times didn't help).
>
> 8) Restarting Apache sometimes corrects the problem without needing to
> restart Zope.
>
> 9) On one occasion killing Mailman suddenly made one of the offending
> objects respond for a little while, then stop.
>
> 10) On rare occasions we have had to physically reboot the server
> (as on Friday).
>
> 11) After the server was rebooted on Friday, memory usage for Zope
> went from about 3% to 20+% as reported by 'top' over a period of about
> 12 hours. I don't know whether that is indicative of a leak or just
> general memory consumption. Restarting Zope appears to return that
> memory back to the OS. This memory usage is what we normally see for
> this site.
>
> 12) Upgrading from Zope 2.7.6 to 2.8.1 appeared to help for a little
> while, but the problem either came back or never left.
>
> 13) I briefly enabled mod_disk_cache in Apache for this site in case
> Zope was getting too stressed out. It appeared to work wonders, but
> some file objects, like PDFs, would
> periodically be reported as corrupted by Acrobat after being
> downloaded. I assume this was a failure to configure mod_disk_cache
> appropriately, and we've since disabled it (at
> which point Acrobat stopped complaining about corrupted PDFs). The
> seizing problem looked as though it disappeared while mod_disk_cache
> was enabled. Indeed, watching the Apache and Zope logs showed
> requests more often being fulfilled by Apache alone than by Zope.
> Perhaps the proxy problems in #4 are indicative of a loaded Zope that
> needs caching. We are not running ZEO or anything like that. Perhaps
> we should.
> =================
>
> Apologies for the long email but I have no idea what's going on... if
> ANYONE has ANY suggestions or ideas on what else I could investigate
> it would be GREATLY appreciated!
>
> Thank you!
>
> Garth
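
To see whether it really is always the same objects that fail, the proxy
errors from point 4 could be tallied per path. A rough sketch, assuming each
entry sits on a single line in the real Apache error log (the excerpt above
is wrapped by the mail client) and that the message text is exactly as
quoted; failing_paths and SAMPLE_LOG are names made up for the example.

```python
import re
from collections import Counter

# Matches the second of the two messages quoted in point 4 and captures
# the path that Apache was trying to proxy.
PROXY_ERROR = re.compile(
    r"proxy: Error reading from remote server returned by (\S+)"
)

def failing_paths(lines):
    """Count proxy read errors per requested path."""
    counts = Counter()
    for line in lines:
        match = PROXY_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The entries from point 4, with the wrapped lines rejoined.
SAMPLE_LOG = [
    "[Sat Nov 12 00:33:33 2005] [error] [client xx.xx.xx.xx] proxy: error reading status line from remote server localhost",
    "[Sat Nov 12 00:33:33 2005] [error] [client xx.xx.xx.xx] proxy: Error reading from remote server returned by /contact",
    "[Sat Nov 12 00:34:02 2005] [error] [client xx.xx.xx.xx] proxy: error reading status line from remote server localhost",
    "[Sat Nov 12 00:34:02 2005] [error] [client xx.xx.xx.xx] proxy: Error reading from remote server returned by /resources/index_html",
]
```

Comparing the timestamps of these entries with a network trace taken at the
same moment should show what actually happened on the wire.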
--
Florent Guillaume, Nuxeo (Paris, France) Director of R&D
+33 1 40 33 71 59 http://nuxeo.com fg at nuxeo.com