[Zope] System performance threads/processes & random crashes (SIGPIPE)

Chris McDonough chrism@zope.com
Thu, 21 Mar 2002 11:07:53 -0500


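SIGPIPE is raised by the OS when a process writes to a UNIX pipe (or
socket) whose other end has already been closed.  UNIX takes this
condition seriously, which is why it sends the signal to the process
telling it "you've got a broken pipe".  Unless the signal is handled or
ignored, its default action is to kill the process.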
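
As a rough illustration of the broken-pipe condition (a minimal sketch in
modern Python 3 on a UNIX-like system, not anything taken from Zope or the
database adapter): the hypothetical helper write_to_broken_pipe() closes
the read end of a pipe and then writes to it.  The sketch ignores SIGPIPE
explicitly, so the failure surfaces as an EPIPE error instead of killing
the process.

```python
import errno
import os
import signal


def write_to_broken_pipe():
    """Write to a pipe whose read end is closed; return the errno seen."""
    # Ignore SIGPIPE so the failed write is reported as an error code
    # rather than terminating the process.
    previous = signal.signal(signal.SIGPIPE, signal.SIG_IGN)
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader is left: the pipe is now "broken"
    try:
        os.write(write_fd, b"hello")
        return None  # not expected: the write should fail
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
        if previous is not None:
            signal.signal(signal.SIGPIPE, previous)  # restore old setting


print(write_to_broken_pipe() == errno.EPIPE)  # prints True
```

With the signal left at its default disposition instead, the same write
would terminate the process, which matches the crash described in the
original message.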

Since you say it started happening when you began using the database
adapter, it may be that some piece of the database adapter opens a pipe
that is later broken (for whatever reason, that's the $10,000 question
;-), causing the OS to send Zope a SIGPIPE.

It may be possible to install a signal handler for SIGPIPE to get rid of
the problem, but I'm not exactly sure what it should/would do during this
failure state, and it would be more useful to try to pin down the pipe
that is getting broken by making the problem replicable.
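
For what it's worth, here is a minimal sketch of what installing such a
handler could look like in modern Python 3 on a UNIX-like system.  It is
not Zope code: the names on_sigpipe, provoke_sigpipe and caught are made
up for illustration, and the handler only records that the signal
arrived, which could help narrow down when the pipe breaks without
deciding what the right recovery is.

```python
import os
import signal

caught = []  # signal numbers recorded by the handler (illustrative only)


def on_sigpipe(signum, frame):
    """Record the signal instead of letting it kill the process."""
    caught.append(signum)


def provoke_sigpipe():
    """Install the handler, break a pipe on purpose, return what was seen."""
    previous = signal.signal(signal.SIGPIPE, on_sigpipe)
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # closing the reader breaks the pipe
    try:
        os.write(write_fd, b"x")  # triggers SIGPIPE and fails with EPIPE
    except BrokenPipeError:
        pass  # the write still fails; the handler only prevents the kill
    finally:
        os.close(write_fd)
        # Restore whatever was installed before (Python ignores SIGPIPE
        # by default, so fall back to that if nothing was recorded).
        signal.signal(signal.SIGPIPE,
                      previous if previous is not None else signal.SIG_IGN)
    return list(caught)


print(signal.SIGPIPE in provoke_sigpipe())
```

A handler like this keeps the process alive, but the failed write still
has to be dealt with wherever it happens, so finding the broken pipe
remains the real fix.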

The ZODB pool_size parameter is controlled via the pool_size argument to
ZODB.DB.DB's constructor.  It signifies how many database connections it's
willing to place in the pool.  When Zope starts up, each Zope thread needs
to use its own database connection.  So you should likely never have a
smaller pool_size than the number of threads (the -t parameter to z2.py).
Adjusting these values up and down may improve performance, but to this
day there have not been any empirical studies of how performance is
impacted when you do.  It's probably something you need to try out in a
load testing environment.  If you find something interesting, let us
know! ;-)
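
To make the pool_size versus thread count relationship concrete, here is
a toy model using only the Python standard library.  It is NOT ZODB's
pool implementation, and the function name threads_left_waiting is
invented for this sketch.  It assumes every thread wants one connection
at the same moment and holds it until all threads have tried, and it
counts how many threads find the pool empty.

```python
import threading


def threads_left_waiting(pool_size, n_threads):
    """Count threads that cannot get a connection from a toy pool."""
    pool = threading.Semaphore(pool_size)   # one permit per pooled connection
    barrier = threading.Barrier(n_threads)  # keeps all threads alive together
    starved = []
    lock = threading.Lock()

    def worker():
        got = pool.acquire(blocking=False)  # try to take a connection
        if not got:
            with lock:
                starved.append(threading.current_thread().name)
        barrier.wait()  # hold the connection until every thread has tried
        if got:
            pool.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(starved)


print(threads_left_waiting(pool_size=4, n_threads=4))  # prints 0
print(threads_left_waiting(pool_size=2, n_threads=4))  # prints 2
```

What a real thread does when the pool is exhausted (wait, or open an
extra connection) depends on the ZODB version, so treat the numbers only
as a picture of the mismatch, not as a performance prediction.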

----- Original Message -----
From: "Doyon, Jean-Francois" <Jean-Francois.Doyon@CCRS.NRCan.gc.ca>
To: <zope@zope.org>
Sent: Thursday, March 21, 2002 9:57 AM
Subject: [Zope] System performance threads/processes & random crashes
(SIGPIPE)


Hello,

I'm running into random crashes of my zope processes, but I'm not finding
any reference anywhere in the mailing list archives or on the site about
this specific one:

I'm getting:

2002-03-21T14:48:52 ERROR(200) zdaemon zdaemon: Thu Mar 21 09:48:52 2002:
Aiieee! 20070 exited with error code: 13

Every now and then, for no apparent reason.  Signal 13 is a SIGPIPE ...
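
One way to double-check that reading, sketched in modern Python 3 on a
UNIX-like system: if the number in the log is assumed to be the raw
status returned by wait() (an assumption about zdaemon, not something
the log states), the standard os.WIF* helpers can decode it.  The
function name describe_wait_status is made up for this sketch.

```python
import os
import signal


def describe_wait_status(status):
    """Turn a raw wait() status into a short human-readable description."""
    if os.WIFSIGNALED(status):
        # The child was terminated by a signal; look up its name.
        return "killed by " + signal.Signals(os.WTERMSIG(status)).name
    if os.WIFEXITED(status):
        # The child called exit(); report its exit code.
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "unknown status %d" % status


print(describe_wait_status(13))
```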

This is Zope 2.5.0 with CMF 1.2 on a severely upgraded/updated/patched
RH6.2 ... with a Python 2.1.2 built with defaults. It runs with FastCGI to
Apache 1.3.2x ...

Usually I just wait a couple of seconds, hit refresh in my browser and
things come back to normal, but it's still annoying, and doesn't look good
to the public.  Note that when this happens, it usually seems to happen to
ALL processes.  It looks to me like the pipes between the master Zope
process and its children die, and they all have to restart for some
reason. Could this be? And if so, why?

Note that I started noticing this when I for the first time started using
Psycopg to create RDBMS connections to my PostgreSQL ... Could there be a
relation somehow?

On a slightly similar topic, how do I manage performance? I plan on using
Zope for a fairly high demand web site ... I noticed I can control how many
processes/threads start, but then I also read something about the ZODB
pool_size ... What is the relation between the two exactly?

Thank you,

Jean-François Doyon
Internet Service Development and Systems Support
GeoAccess Division
Canadian Center for Remote Sensing
Natural Resources Canada
http://atlas.gc.ca
Phone: (613) 992-4902
Fax: (613) 947-2410

