[Zope-dev] Zope Server hanging :-(

Chris McDonough chrism@digicool.com
Wed, 2 May 2001 12:08:19 -0400


The docs (available via the --help switch) go into some of this detail, but
I agree that a narrative explaining how to approach it from a functional
perspective would be a good thing.


----- Original Message -----
From: "Tim McLaughlin" <tim@BCSWebservices.net>
To: "'Chris McDonough'" <chrism@digicool.com>; "Chris Withers"
<chrisw@nipltd.com>
Cc: "Tim McLaughlin" <tim@BCSWebservices.net>; <zope-dev@zope.org>
Sent: Wednesday, May 02, 2001 12:05 PM
Subject: RE: [Zope-dev] Zope Server hanging :-(


> Nice info.  This could be useful for many, many things.  you oughtta put
> this in a how-to (if you haven't already).  thanks Chris.
>
> I'll use it if we have any more issues.  Currently, as I told ChrisW, the
> issues have stopped. But I'm still keeping my eye on it.
>
> -----Original Message-----
> From: Chris McDonough [mailto:chrism@digicool.com]
> Sent: Wednesday, May 02, 2001 12:00 PM
> To: Chris Withers
> Cc: Tim McLaughlin; zope-dev@zope.org
> Subject: Re: [Zope-dev] Zope Server hanging :-(
>
>
> One of the more important bits is the "active" count when using
> the --detailed option to requestprofiler.  This tells you how many *other*
> requests were "unfinished" at the end of a particular request.
>
> So for instance, if you run requestprofiler like this:
>
>   python requestprofiler.py log.file.name --detailed
>
> And you see a profile line like (header added for reference):
>
> start                win wout wend tot S   osize  code act url
> 2001-04-19T01:19:42    0   12    0  12 E   26024  200    5
> /Catalog/checkrep?P
>
> This means that:
>
>  - this request started at 2001-04-19 1:19:42 UTC ("start")
>
>  - zserver waited 0 secs for input from the client ("win")
>
>  - zserver waited 12 secs for output from the publisher ("wend")
>
>  - the total time from start to end for the request was 12 secs ("tot")
>
>  - the request reached the end ("E") stage ("S", one of "E", "A", "I",
"B")
>
>  - the output size in bytes was 26024 ("osize")
>
>  - the http code was 200 ("code")
>
>  - the number of pending requests at the end of this request was 5
> ("active")
>
>  - the URL of the request ("url")
>
> Hangs are also interesting.
>
> Use detailed reporting to sort by end stage:
>
>   requestprofiler.py your.log.file --detailed --sort=endstage
>
> Any requests that don't show a stage of "E" for their endstage were "hung"
> requests.
>
> You can also use cumulative reporting to sort by "hangs":
>
>   requestprofiler.py your.log.file --cumulative --sort=hangs
>
> This will show the methods that "hung" most frequently.  Much of the time,
> you can find problems this way by seeing methods that enter an infinte
loop
> or cause a segfault.
>
> Total time is also interesting:
>
>   requestprofiler.py your.log.file --detailed --sort=total
>
>   Shows you individual calls to methods that took a long time.
>
>   - AND -
>
>   requestprofiler.py your.log.file --cumulative --sort=total
>
>    Shows you where most of the request time is spent.
>
> The goal is to indentify methods or requests that take a long time, don't
> return, or are accessed frequently.  If you can match this data up with
> specific problems you've experienced (possibly by way of times/dates), it
> becomes easier to form postulations about what is happening to Zope under
> load.  Then investigate those postulations by refactoring code or by doing
> more detailed logging in the places where you think the problems are.
>
> - C
>
>
>
>
>
> > Hmmm, not really sure what I should be looking for :-S
> >
> > What ways should I look to run it and what should I do with the output?
> >
> > cheers,
> >
> > Chris
> >
>