Hello,
I am working on a solution that has a very high number of
users and a significant amount of traffic. Using a ZEO configuration, we
have been running into a few bottlenecks while trying to improve our load
testing results. The problem we see is that when we put our ZEO configuration
under load, Zope does not close its connections and the application server
ends up running out of connections for Apache to connect to. Once opened,
the connections idle indefinitely. (Apache is running on a
dedicated server, separate from our ZEO instances.) We are using round-robin
(RR) between 2 VM instances, with 10 ZEO clients per VM, to distribute load.
The bottleneck is occurring on the Apache server because it's keeping TCP
connections in TIME_WAIT status.
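To make the symptom easier to compare between test runs, here is a rough Python sketch of how the per-state connection counts on the Apache box could be tallied. It assumes the Windows `netstat -an` column layout (Proto, Local Address, Foreign Address, State). The function names `count_tcp_states` and `snapshot` are made up for this post and are not part of Zope or Apache.

```python
import subprocess


def count_tcp_states(netstat_output):
    """Count TCP connections per state from Windows `netstat -an` text."""
    counts = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed Windows row layout: TCP  local  remote  STATE
        # Header rows and UDP rows (which have no state column) are skipped.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            state = fields[3].upper()
            counts[state] = counts.get(state, 0) + 1
    return counts


def snapshot():
    """Run netstat on this machine and return the per-state counts."""
    proc = subprocess.Popen(["netstat", "-an"], stdout=subprocess.PIPE)
    output = proc.communicate()[0]
    return count_tcp_states(output.decode("ascii", "replace"))
```

Calling `snapshot()` every few seconds during a load test and logging the `TIME_WAIT` and `ESTABLISHED` counts should show whether the pile-up tracks the request rate.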
We have completed load tests on both Windows 2003 and
2008. In 2003 we were able to adjust the registry so the OS would terminate
connections after 15 seconds of idle time; in 2008, however, the minimum is
30 seconds. At the upper levels of testing, the OS runs out of TCP
connections because it can’t close them fast enough, and it begins to fail
requests. Ideally we don’t want to close connections forcibly via the
TCP stack because Zope keeps them open. We’d hope that Zope would
manage this cleanup gracefully.
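As a back-of-the-envelope check on why the timeout matters, the sketch below estimates the highest rate of new outbound connections a machine can sustain before it runs out of ports. It rests on several assumptions that are not measurements from our setup: every request opens a fresh connection from Apache, each port stays unavailable for the full wait period after closing, and the dynamic port ranges are the Windows defaults (1025-5000 on Server 2003, 49152-65535 on Server 2008). The function name is hypothetical.

```python
# Default dynamic (ephemeral) port counts; assumed, not read from our servers.
WIN2003_DEFAULT_PORTS = 5000 - 1025 + 1      # 3976 ports
WIN2008_DEFAULT_PORTS = 65535 - 49152 + 1    # 16384 ports


def max_new_connections_per_second(ephemeral_ports, time_wait_seconds):
    """Upper bound on sustained new connections per second.

    Each closed connection is assumed to hold its port for
    time_wait_seconds, so at most ephemeral_ports connections can be
    opened within any window of that length.
    """
    if time_wait_seconds <= 0:
        raise ValueError("time_wait_seconds must be positive")
    return float(ephemeral_ports) / time_wait_seconds


if __name__ == "__main__":
    print(max_new_connections_per_second(WIN2003_DEFAULT_PORTS, 15))
    print(max_new_connections_per_second(WIN2008_DEFAULT_PORTS, 30))
```

Under these assumptions the ceiling is roughly 265 new connections per second on 2003 with a 15-second wait and roughly 546 per second on 2008 with a 30-second wait. A larger port range or reusing connections between Apache and Zope would change the picture.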
One option we are considering is using IIS 7.5 and ARR as a
replacement load balancing/rewrite method. This could allow us to check
the health of destinations before forwarding a request. It may also give us
more control over closing connections at the OS level.
One other detail we think might be part of the issue is that Zope is
not initiating the connection close.
Does anyone have any experience or knowledge they can lend
to help out?
Configuration:
Windows 2003/2008 server
Apache 2.1
Zope 2.12
MS SQL 2005
Python 2.6.6
SQLAlchemy 0.6.5
z3c.sqlalchemy=1.4.0
zope.sqlalchemy=0.6
Jimmy Small (mallaice)