[Zope] Urgent help needed: Zope falls over under moderate load
Michael Fraase
mfraase@farces.com
Tue, 20 Nov 2001 15:39:26 -0600
Okay, Chris, that's a lot to chew on. Thanks very much for your patience
and time.
--
Michael Fraase
ARTS & FARCES LLC
mfraase@farces.com
www.farces.com
PGP Fingerprint:
3D85 F3F4 9E65 4949 176A 260C CB47 190D C864 9A96
> -----Original Message-----
> From: Chris McDonough [mailto:chrism@zope.com]
> Sent: Tuesday, November 20, 2001 3:39 PM
> To: mfraase@farces.com; 'Chris Withers'
> Cc: zope@zope.org
> Subject: Re: [Zope] Urgent help needed: Zope falls over under
> moderate load
>
>
> > I guess I'm confused. Everything that *could* be cached *was* cached.
> > And no, I don't run a caching server or a proxy server or anything else
> > in front of Zope. I'm a writer, not a programmer.
>
> OK, fair enough.
>
> But your profession still doesn't absolve you from needing to
> cache more in order to survive a Slashdotting. ;-) Either
> that or you'll need to start developing your site with static
> pages only. That'd work too.
>
> > The /. piece hit about 1:00 AM. By 1:01 AM Zope had folded like a cheap
> > suit. It's still going down about every 40 minutes or so.
> >
> > Now remember, my outbound bandwidth is limited to 512Kb.
>
> If a 512Kb/s link is filled with as many 300-byte requests as
> possible, this translates (without taking latency or response
> traffic into account) into a potential inbound rate of about 213
> requests per second. That's still a lot of requests. For
> comparison, at peak normal load
> Slashdot gets about 180 requests/sec. The 512Kb/s isn't much
> of a throttle.
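
The 213 figure above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, and it assumes "512Kb/s" means 512,000 bits per second and ignores latency, protocol overhead and response traffic, as the estimate above does.

```python
def max_requests_per_second(link_kilobits_per_s, request_bytes):
    """Upper bound on requests/s that fit through a link of the given size.

    Assumes 1 Kb = 1000 bits and that the link carries nothing but requests.
    """
    bytes_per_second = link_kilobits_per_s * 1000 / 8
    return bytes_per_second / request_bytes


rate = max_requests_per_second(512, 300)
print(round(rate))  # 213
```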
>
> And this is assuming that your inbound bandwidth is limited
> to 512Kb/s; you only mentioned your outbound in this mail.
> If inbound is higher, it's even more of a problem.
>
> > Am I correct in my understanding that Zope can't handle even 512Kb of
> > demand without some technical doohickey in front of it so it doesn't
> > fall down?
>
> Your pipe is fat enough to allow lots of requests in, and
> what you're serving is probably sufficiently complex to be
> very slow. Squishdot is really not known for its speed.
>
> "Raw" Zope itself could almost certainly handle it, however,
> if what you were returning is a DTML method that said
> "<html>this is a simple page</html>". But this isn't what
> you're returning; Squishdot has a big say in what shows up.
>
> > No offense intended, but I think two internal Squishdot pages meet the
> > definition of pretty dang simple.
>
> Maybe conceptually it's simple, but apps like Squishdot do
> lots of stuff in order to generate these pages. For fun, you
> should try to set up a "barebones" Squishdot with the default
> homepage, and hit it repeatedly with a load generator like
> Apache's "ab". Then try the same thing with a Zope page that
> is "<html>Hello!</html>". You will see a big difference. On
> an 850MHz box at ZC, I can get Zope to serve about 152
> requests/s with the simple page.
>
> Anybody want to try this with an out of the box Squishdot
> homepage? Or a Squishdot story page? The guy from the KDE dot
> (http://dot.kde.org) claimed he could only get about 2
> requests/second out of a Squishdot home page. After setting
> up caching properly, he was able to get about 2000.
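
For anyone who wants to try the comparison without "ab", a rough equivalent can be sketched in modern Python. This is not the tool used for the numbers above, and the function names and URLs are placeholders chosen for illustration; it simply times a batch of concurrent fetches and reports requests per second.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Fetch one page and return its size in bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return len(response.read())


def requests_per_second(do_request, total=200, concurrency=10):
    """Call do_request() `total` times across `concurrency` threads and
    return completed calls per second of wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() waits for every call to finish and re-raises any error.
        list(pool.map(lambda _: do_request(), range(total)))
    elapsed = time.perf_counter() - start
    return total / elapsed


# Example use (hypothetical URL, substitute your own pages):
#   simple = requests_per_second(lambda: fetch("http://localhost:8080/hello"))
#   print("%.1f requests/s" % simple)
```

Running it once against the trivial page and once against a Squishdot page on the same box gives the two numbers to compare.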
>
> > And why does it fall over anyway? This just doesn't make any sense to
> > me. I can see it getting slow and timing out, but giving up completely
> > and just bailing? What's that about? Explain it to me like I'm an
> > intelligent, non-technical friend. Thanks.
>
> The biggest "bang for the buck" solution is caching.
> Assuming that you had no problems *before* the slashdotting,
> that will solve your problem because it will cause Zope to
> need to serve far fewer requests, closer to the number of
> requests you normally get. And this is (I assume) the
> outcome that you actually want. I highly recommend setting
> up a caching proxy in front of Zope if this sort of load will
> be recurring. It's way faster and cheaper than trying to
> understand the problem deeply. ;-) Most commercial sites
> are developed using this principle, AFAICT.
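
Why caching cuts the load so sharply can be shown with a toy example. The sketch below is not Zope's cache manager or any real proxy; `PageCache` and `slow_render` are hypothetical names, and the 60-second lifetime is an arbitrary assumption. It keeps each rendered page for a fixed time, so a burst of hits on one story triggers only one expensive render.

```python
import time


class PageCache:
    """Keep rendered pages for ttl_seconds so repeat hits skip rendering."""

    def __init__(self, render, ttl_seconds=60, clock=time.monotonic):
        self.render = render            # the expensive page generator
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}               # path -> (expiry time, page body)

    def get(self, path):
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]             # still fresh: no render needed
        body = self.render(path)        # missing or expired: render once
        self.entries[path] = (now + self.ttl_seconds, body)
        return body


render_calls = []

def slow_render(path):
    # Stand-in for the application building a page.
    render_calls.append(path)
    return "<html>story at %s</html>" % path

cache = PageCache(slow_render, ttl_seconds=60)
for _ in range(1000):
    cache.get("/story/1")
print(len(render_calls))  # 1
```

A caching proxy in front of Zope applies the same idea at the HTTP level, which is how a page that renders twice a second can still be served far more often than that.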
>
> But if you're as interested in understanding the phenomenon as
> you are in solving the problem, and you'd like to help the
> current Squishdot maintainer and ZC improve their products'
> behavior under load, it'd be necessary to know more details
> about how it was failing and what happened during
> the failures. I would be interested in these results. It
> could be a memory leak, a Zope bug, a Squishdot
> bug, or just about anything. You need forensic
> information, and you need to let it fail under load in order to get it.
>
> Usually, you can get this info by turning on "big M" logging
> (by passing "-M detailed.log" at the end of your start.bat
> script, maybe). On Linux, I'd recommend also using the
> ForensicLogger product (see
> http://www.zope.org/Members/mcdonc) to gather more details
> such as memory utilization and CPU utilization; it doesn't
> work on Windows, however. If you're willing to do this, let
> it fail under load, then send the log with the failure in it
> to me and I will try to analyze it.
>
> Note that you *might* be able to make use of the AutoLance
> product at http://www.zope.org/Members/mcdonc to autorestart
> your machine for you if you've got a memory leak.
>
> HTH,
>
> - C
>
>
>