ZODB/ZSS High Availability, was: RE: [Zope] Zope Myths?

tomas@fabula.de tomas@fabula.de
Fri, 13 Sep 2002 00:12:51 +0200


On Thu, Sep 12, 2002 at 02:21:28PM -0700, sean.upton@uniontrib.com wrote:
> I have been doing a lot of thinking about odb/storage/zss replication
> lately, but I haven't had a chance to implement these practices yet, so your
> mileage, insights, and opinions may vary from these thoughts...
> 
> If the thing that makes replication hard is constant change of lots of
> interdependent data, a meaningful snapshot system as close to the database
> software as possible (i.e. DirectoryStorage's snapshots, not LVM's) likely
> mitigates that risk by providing reasonable assurance of atomicity.

Yep. The replication system has to know what a transaction is. You might
be able to live with the loss of a (couple of) transactions, but not with
the loss of half a transaction.
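
To illustrate what "knowing what a transaction is" buys you, here is a toy
sketch in Python. It assumes a made-up append-only log of (kind, payload)
records in which transactions are written one after another and each ends
with a "commit" record. This is NOT the real FileStorage or DirectoryStorage
format, and the function name is hypothetical.

```python
def complete_prefix(records):
    """Return the records up to and including the last 'commit' record.

    Anything after the last commit belongs to a half-written transaction
    and is dropped, so the copy loses whole transactions at worst.
    """
    last_commit = -1
    for i, (kind, _payload) in enumerate(records):
        if kind == "commit":
            last_commit = i
    return records[: last_commit + 1]


log = [
    ("begin", 1), ("write", "a"), ("commit", 1),
    ("begin", 2), ("write", "b"),          # transfer died here
]
usable = complete_prefix(log)              # only transaction 1 survives
```

A replica trimmed this way may be slightly stale, but it never contains
half a transaction.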

>                                                                      If the
> replication process itself has problems part way through transfer (as a
> low-tech solution like find+cpio over NFS would),

Rsync. I keep saying rsync is your friend :-)

>                                               it is up to the sysadmin to
> write scripts to:
> 	1 - Keep multiple areas for replication
> 		-> Stage the entire replication in a temp 
> 		   dir before putting it in the place that
> 		   it is used by ZSS software
> 			-> since there is no way to do a 
> 			   transactional file copy of multiple
> 			   files, how about using symlinks, and
> 			   moving the symlink on completion of a
> 			   full, atomic transfer and completed
> 			   storage consistency check?

Hmmm. The whole problem seems to be to get a copy of your set with no
(or with bearable) data `skew'. But then you must know the innards of
your database (or maybe have a sort of `freeze point' in time akin to
a `meta transaction' checkpoint).
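
The symlink flip itself is the easy part. A minimal sketch in Python,
assuming a POSIX filesystem where rename() replaces the destination
atomically. The function name and the `check` callback (standing in for
whatever storage consistency check you have) are made up for illustration.

```python
import os


def publish_snapshot(staged_dir, current_link, check=None):
    """Repoint current_link at staged_dir in one atomic step.

    The optional check callable is a placeholder for a storage
    consistency check; if it returns False the link is left untouched.
    """
    staged_dir = os.path.abspath(staged_dir)
    if check is not None and not check(staged_dir):
        raise ValueError("staged copy failed consistency check: %s" % staged_dir)

    # Build the new symlink under a temporary name, then rename it over
    # the old one. Readers see either the old target or the new one.
    tmp_link = "%s.tmp.%d" % (current_link, os.getpid())
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(staged_dir, tmp_link)
    os.replace(tmp_link, current_link)
```

You would call it only after the transfer into the staging directory has
finished, e.g. publish_snapshot("/some/staging/dir", "/some/current") with
whatever paths your setup uses. It does nothing about skew inside the
staged copy; that still needs the snapshot or checkpoint.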

> 	2 - Have clustering software resource takeover scripts
> 	    (i.e. heartbeat resource scripts) evaluate:
> 		a. if the storage it is about to use is good, &
> 		b. if the last transfer failed, use the last
> 		   _good_ full replicated set of files.
> 		c. The above two checks must be done before starting
> 		   the ZSS process on the backup server node.

Sounds quite difficult without having access to the innards of the DB
(I am using the word DB loosely here, more as `data set with some
consistency restrictions', that may be a bunch of files or whatever).
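
The bookkeeping half of 2(a)/2(b) can at least be sketched, here in Python.
Assumptions, all of them mine: every replicated set lives in its own
directory under one root, directory names sort by age (e.g. a date stamp),
and the replication script drops a marker file into a set only after the
transfer completed. The marker name and the `is_consistent` hook are
hypothetical; the hook is exactly where the knowledge of the DB innards
would have to go.

```python
import os

MARKER = "COMPLETE"  # hypothetical marker written after a full transfer


def last_good_set(replica_root, is_consistent=lambda path: True):
    """Return the newest replicated set that is complete and consistent.

    Returns None if no usable set exists, in which case the takeover
    script should refuse to start the storage server.
    """
    names = sorted(
        (d for d in os.listdir(replica_root)
         if os.path.isdir(os.path.join(replica_root, d))),
        reverse=True,  # newest first, given sortable names
    )
    for name in names:
        path = os.path.join(replica_root, name)
        if os.path.exists(os.path.join(path, MARKER)) and is_consistent(path):
            return path
    return None
```

A heartbeat resource script would call this before starting ZSS on the
backup node and point the server at whatever directory comes back.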

> Mostly, I can't see how shared storage (DAS/SAN) can provide the same
> risk-avoidance levels that could be done with the above practices, unless
> you have some ways of mirroring the last good copy of your odb storage
> within the same shared storage (replication between two places on the same
> storage; I assume snapshots and scripts on the secondary node to check
> consistency of storage/db like 2(a) above could come in handy for this too)?

It boils down to: know thy application -- doesn't it?

Back to Zope -- does anyone know what the prospects for the ZODB are?
Thanks
-- tomas