[Zope-CMF] cmf.zope.org is OFFLINE
Tres Seaver
tseaver@zope.com
06 Sep 2002 12:20:05 -0400
On Fri, 2002-09-06 at 11:02, Paul Winkler wrote:
> On Fri, Sep 06, 2002 at 07:41:11AM -0500, Jeffrey_Franks@i-o.com wrote:
> >
> > How do you setup monitoring? This might be useful info for some of us.
>
> Yeah, i'd be interested in other people's strategies...
>
> look up daemontools if you want a way to automatically restart
> a service that's died.
>
> if you just want to see if a remote server is responding,
> a simple shell script can be fine - e.g. use wget to grab a simple test
> page; if it times out (probably use the -T option),
> send yourself email.
> Run that script every so often with cron, and there you go.

A quick thumbnail of the monitoring within the zope.org cluster:
- We have to monitor a minimum of three servers for each distinct
"Zope instance": two app servers (ZEO clients) plus the storage
server.
- In addition, there are "infrastructure" processes (Apache) and
servers (the load balancer).
- The monitoring process uses a simple HTTP GET of a dummy page
to test that the appserver processes are up and not hung.
- The initial problem with the cmf.zope.org site was the result of a
sysadmin mistake: the admin working on packing the www.zope.org
storage (which is up over 9 GB at the moment) accidentally killed the
CMF storage server. For most of yesterday, the appservers were
running purely off their ZEO caches. Finally, they got hosed enough
that they all died.
- Our normal monitoring for the dogbowl had been switched off due to a
ton of what seemed like "false positives" during the packing
attempt, and didn't get re-enabled until y'all pointed out that the
site was down.
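
For anyone who wants to try the "HTTP GET of a dummy page" approach, here is a
minimal sketch in Python. It is not the script the zope.org cluster runs; the
function names, the default timeout, and the injectable `opener` parameter are
all assumptions made for illustration. A hung appserver is treated the same as
a dead one: if no 200 response arrives within the timeout, the check fails.

```python
import http.client
import urllib.request


def is_alive(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if a GET of `url` answers with HTTP 200 within `timeout` seconds.

    `opener` is a hypothetical hook (not part of any Zope tooling) so the
    check can be exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, connection refusals, and timeouts are all
        # OSError subclasses; a hung process typically shows up as a timeout.
        return False


def failed_checks(urls, timeout=10.0, opener=urllib.request.urlopen):
    """Return the subset of `urls` whose dummy page did not answer in time."""
    return [url for url in urls if not is_alive(url, timeout, opener)]
```

Run from cron with the dummy-page URL of each appserver, a non-empty result
from `failed_checks` is the cue to send mail. Note that this only covers
processes that speak HTTP (the appservers, Apache); a ZEO storage server
would need a different kind of probe.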
Tres.
--
===============================================================
Tres Seaver tseaver@zope.com
Zope Corporation "Zope Dealers" http://www.zope.com