[ZODB-Dev] Horizon for highly-available ZODB storage?
sean.upton@uniontrib.com
Thu, 03 Jan 2002 12:12:43 -0800
Happy New Year, all!
I wanted to query this list to get some input on what options are likely to
be available for maintaining a highly-available ZEO storage server within
the next 6-7 months. My company has a very large, highly-demanding project
we are just getting started on that will heavily utilize multiple ZODB
instances on a ZEO cluster with a 2-box ZSS strategy. We currently do this
for some less-demanding projects and attain high-availability across 2
concurrently running ZSS nodes using the Linux-HA project's Heartbeat
clustering software. We replicate our FileStorage Data.fs from ZSS #1 to
ZSS #2 with a simple daily file transfer (FTP) followed by a restart of
the ZSS process. This is OK, since the data is usually updated just once
daily. But our upcoming projects will involve bigger ODBs that update far
more often, so I wanted to get some input on what options are both available
now, and also what new strategies will likely be available later in the
upcoming year.
Specifically, I am wondering about three items with regard to a 2-box
cluster, with a primary ZSS and a hot-backup:
1 - Toby Dickenson's Replicated FileStorage (Available Now)
http://www.zope.org/Members/htrd/ReplicatedFileStorage
2 - Standby Storage (Project Status?)
http://www.zope.org/Wikis/ZODB/StandbyStorage
3 - DirectoryTreeStorage (Proposal) + InterMezzo FS (I'm dreaming, aren't I?)
http://dev.zope.org/Wikis/DevSite/Proposals/DirectoryTreeStorage
http://www.inter-mezzo.org/
My hunch would be that DirectoryTreeStorage could be designed with
InterMezzo in mind for decent, simple 1-way replication... in theory, of
course.
I particularly like the IDEA of the 3rd (and most vaporous) option, and have
the feeling that it could work, provided your clustering software restarted
the ZSS process. Problems with a few pickle files in a
DirectoryTreeStorage, caused by InterMezzo replicating files incompletely
after a machine fault, would then still be handled OK, at least if I
understand the implications of ChrisM's proposal: "If it finds evidence of a
failed transaction, it will revert any files it needs to within the
directory to their pre-transaction state by using the data in the log."
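As a rough illustration of that recovery step, here is a minimal Python
sketch of reverting a directory from an undo log at startup. It is not
ChrisM's actual design: the log file name, the JSON log format, and the
begin/commit/recover functions are all hypothetical, made up only to show
the idea.

```python
import json
import os

LOG_NAME = "txn.log"  # hypothetical log file name, not from the proposal


def begin(directory, names):
    """Save the pre-transaction contents of the files about to change."""
    entries = {}
    for name in names:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            with open(path, "rb") as f:
                entries[name] = f.read().hex()
        else:
            entries[name] = None  # file did not exist before the transaction
    with open(os.path.join(directory, LOG_NAME), "w") as f:
        json.dump(entries, f)
        f.flush()
        os.fsync(f.fileno())  # log must reach disk before any file changes


def commit(directory):
    """Removing the log marks the transaction as complete."""
    os.remove(os.path.join(directory, LOG_NAME))


def recover(directory):
    """On startup, a leftover log is evidence of a failed transaction.

    Revert every logged file to its pre-transaction state and return True;
    return False if there was nothing to recover.
    """
    log_path = os.path.join(directory, LOG_NAME)
    if not os.path.exists(log_path):
        return False
    with open(log_path) as f:
        entries = json.load(f)
    for name, old_hex in entries.items():
        path = os.path.join(directory, name)
        if old_hex is None:
            if os.path.exists(path):
                os.remove(path)  # file was created by the failed transaction
        else:
            with open(path, "wb") as f:
                f.write(bytes.fromhex(old_hex))
    os.remove(log_path)
    return True
```

In this sketch the hot-backup box would call recover() on the replicated
directory each time the clustering software restarts the ZSS process. It
assumes the log file itself was replicated intact, which a real design
would also have to guarantee.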
Of course I'm grounded in the reality that I need to eventually deploy a
solution in a production environment, but I'm interested in hearing some
thoughts, and perhaps sparking some discussion on this issue, as I imagine I
am not alone in the need for a solution.
Thanks,
Sean
=========================
Sean Upton
Site Technology Supervisor
Development & Integration
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================