[Zope] large installations and conflict errors
Andrew Langmead
alangmead at boston.com
Mon Aug 8 20:03:36 EDT 2005
On Aug 8, 2005, at 10:01 AM, M. Krainer wrote:
> So far our story, but what I really wonder is, if there's anyone out
> there who has a similar large installation. Please let me know how
> large your zope instance is and what you have done to increase your
> (write) performance. Also any ideas that may help us are welcome.
We have a ZODB that packs down to about 30 gigs. The unpacked size
grows by about 10 gigs a week, which shows that there is a lot of
write activity in our environment too. We have three Zope instances
as ZEO clients (1.4GHz PIIIs, each with about two gigs of RAM). A
load balancer in front of those machines is set to favor certain URL
prefixes towards the same machine. This somewhat unbalanced setup for
the load balancer improves the chance that the ZEO client cache will
have the appropriate object, avoiding a trip to the ZODB for it. (On
these machines, the ZEO client cache is set to 2GB, and a cache flip
occurs maybe twice a week.)
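If your load balancer can't express that kind of affinity directly,
the idea is simple enough to sketch. This is only an illustration of
the routing rule, not our actual configuration; the backend addresses
and prefixes below are made up:

# Illustration only (hypothetical names): pin each top-level URL
# prefix to a fixed backend, so repeated requests under that prefix
# keep hitting a ZEO client whose cache is already warm for it.
PINNED = {
    "news": "zeo-client-1:8080",
    "sports": "zeo-client-2:8080",
}
DEFAULT_BACKEND = "zeo-client-3:8080"

def pick_backend(path):
    # "/news/story42" -> "news"
    prefix = path.lstrip("/").split("/", 1)[0]
    return PINNED.get(prefix, DEFAULT_BACKEND)

Any stable mapping works; the point is just that requests for the
same part of the site land on a client that already has those objects
cached.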
Three other machines handle purely non-interactive tasks (either
through wget or through the Scheduler product). Where possible, these
machines are set up with a single Zope thread and a large memory
cache. (Instead of the standard setup with four threads of x MB each,
it is one thread of x*4 MB.) Not only does this help with the speed
of a request, it also prevents each thread's private object cache
from holding duplicate copies of the same object. (These machines
also have a 2GB ZEO client cache, but it flips daily.)
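For reference, the relevant knobs live in zope.conf. This is a hedged
sketch in the Zope 2.8-era format; the numbers are placeholders, not
our production values:

# zope.conf sketch: one worker thread, a big per-connection object
# cache, and a large on-disk ZEO client cache. Numbers are placeholders.
zserver-threads 1

<zodb_db main>
    mount-point /
    # target number of objects held in the (single) connection cache
    cache-size 100000
    <zeoclient>
        server zeo-server:8100
        storage 1
        # on-disk ZEO client cache size
        cache-size 2GB
    </zeoclient>
</zodb_db>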
A ZCatalog has an index that is a single large Zope object; losing it
from the cache will cause a lot of pain when you need it again.
Although we don't use QueueCatalog, I can see the advantage of having
it concentrate a lot of catalog work in a single thread and
transaction.
Zope's optimistic transactions assume that a request will complete
relatively quickly, and that the likelihood of two entirely separate
requests accessing the same object is slim. I like to think of it as
the assumption that it is hard for two lightning bolts to hit the
same place at the same time. The two ways you can run afoul of this
assumption are to have one object whose modification is greatly
favored over others, or to have requests that take much longer than
average.
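A minimal, self-contained way to see the write-conflict half of that
in action (using a throwaway in-memory storage rather than ZEO; the
structure is the point, not the setup):

# Two connections stand in for two Zope request threads. Both modify
# the same persistent object; the second commit raises ConflictError.
from ZODB import DB
from ZODB.MappingStorage import MappingStorage
from ZODB.POSException import ConflictError
import transaction

db = DB(MappingStorage())

tm1 = transaction.TransactionManager()
tm2 = transaction.TransactionManager()
conn1 = db.open(transaction_manager=tm1)
conn2 = db.open(transaction_manager=tm2)

conn1.root()['hits'] = 0
tm1.commit()
conn2.sync()              # make sure conn2 sees the committed state

# Both "requests" read and modify the same object (the root mapping).
conn1.root()['hits'] += 1
conn2.root()['hits'] += 1

tm1.commit()              # first writer wins
try:
    tm2.commit()          # second writer conflicts
except ConflictError:
    tm2.abort()           # Zope would retry the whole request here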
I've had to investigate object hotspots before, and what I've found
useful is running fsdump.py on an unpacked version of the database:
fsdump.py var/storage/var/Data.fs | sed -n 's/.*data #[0-9]*//p' \
    | sort | uniq -c | sort -n
and then looking for particular oids that occur in the output much
more frequently than the rest. Once you've found the hot objects, you
can look back through the fsdump.py log to find the transactions they
belong to and the URLs associated with them. Once you've found the
code paths that are all modifying the same object, the changes needed
to make the object less hot are application specific.
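If the shell pipeline above gets unwieldy, the same digging can be
done in a few lines of Python. A sketch, assuming the usual fsdump.py
layout where a transaction's description='...' line (which for Zope
requests typically contains the request path) precedes its
data #... oid=... lines; the exact format varies a bit between ZODB
versions:

# Sketch: rank oids by how often they appear in a saved fsdump.py
# listing, and show which transaction descriptions (usually request
# URLs) touched the hottest ones. Pass the listing as the argument.
import re
import sys
from collections import Counter, defaultdict

counts = Counter()
descs = defaultdict(set)
current_desc = ""

with open(sys.argv[1]) as dump:
    for line in dump:
        m = re.search(r"description='([^']*)'", line)
        if m:
            current_desc = m.group(1)
        m = re.search(r"data #\d+ oid=(\S+)", line)
        if m:
            counts[m.group(1)] += 1
            descs[m.group(1)].add(current_desc)

for oid, n in counts.most_common(10):
    print(n, oid, sorted(descs[oid])[:3])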
For requests that are taking so long that they start to interfere
with other requests: those can probably be found with
requestprofiler.py or the ZopeProfiler product. Once they are found,
standard code optimization techniques are needed to speed them up.
That's about all I can think of writing at the moment, but if you
have anything you want to ask me, give me a yell.