[Zope] replication (was Zope: 5.4, jboss 0.3 million hits with google)
Toby Dickenson
tdickenson@geminidataloggers.com
Thu, 13 Feb 2003 11:18:42 +0000
On Thursday 13 February 2003 1:53 am, Paul Winkler wrote:
> > how can we replicate with DirStorage? would you mind doing a little
> > brain dump? with rsync?
rsync has to stat every inode in the directory. That sucks.
Version 1.0 has a whatsnew.py script that uses the normal undo log information
to work out what files have changed since a historic transaction id. This is
used by the incremental backup tool in version 1.0 and the replication script
in 1.1, and makes them maximally efficient in I/O terms.
(just remember to keep enough history when packing to cover your
backup/replication interval)
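To illustrate the idea (this is not the actual whatsnew.py, and the undo log is modelled here as a hypothetical in-memory list rather than the real on-disk format), a minimal sketch of "what changed since transaction X" might look like this. It also shows why packing away too much history is a problem: if the reference transaction is gone, the list of changes can no longer be trusted.

```python
from typing import List, Set, Tuple

# Hypothetical stand-in for the undo log: (transaction id, files touched),
# oldest first. The real DirStorage format is not described in this post.
UndoLog = List[Tuple[int, List[str]]]


def changed_since(log: UndoLog, since_tid: int) -> Set[str]:
    """Return the files touched by transactions newer than since_tid.

    Raises ValueError if since_tid is no longer in the log (for example
    because it was packed away), since the result would be incomplete.
    """
    if since_tid not in [tid for tid, _ in log]:
        raise ValueError("transaction %r not in history; "
                         "a full copy is needed" % since_tid)
    changed: Set[str] = set()
    for tid, files in log:
        if tid > since_tid:
            changed.update(files)
    return changed
```

Only the files named in the newer transactions need to be read, so the cost scales with the amount of change and not with the size of the storage.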
> I am not really the person to ask, as I haven't actually done it.
But you know you want to ;-)
> But here's the official way to do it as of version 1.1:
> http://dirstorage.sourceforge.net/replica.html
That document was updated in the last week and now has a more detailed
howto. Essentially, on the replica machine run:
"replica.py masterzeohost:/var/master /var/replica"
and it should "just work"
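If you want to run that on a schedule without cron, a rough sketch of a wrapper could look like the following. The function names are made up for illustration, it assumes replica.py is on the PATH, and it assumes a nonzero exit status means failure, which the post does not state.

```python
import subprocess
import time


def replica_command(master, replica_dir, script="replica.py"):
    """Build the argument list for the command quoted above."""
    return [script, master, replica_dir]


def replicate_every(interval_seconds, master, replica_dir,
                    runner=subprocess.call, sleep=time.sleep, max_runs=None):
    """Run the replication command repeatedly; return the failure count.

    runner and sleep are parameters so the loop can be exercised without
    a real master. max_runs=None means loop forever.
    """
    runs = 0
    failures = 0
    while max_runs is None or runs < max_runs:
        # Assumption: a nonzero exit status signals a failed run.
        if runner(replica_command(master, replica_dir)) != 0:
            failures += 1
        runs += 1
        sleep(interval_seconds)
    return failures
```

For example, `replicate_every(60, "masterzeohost:/var/master", "/var/replica")` would attempt a replication roughly once per minute.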
> I would naively assume that you *should* be able to replicate by
> 1) putting the "master" into snapshot mode
> 2) running rsync
> 3) taking the "master" out of snapshot mode
>
> ... but there may be hidden issues with that; I would kind of
> assume so, since Toby Dickenson bothered to write the replication
> tool. Toby, are you reading this? Care to comment?
That will kinda work, apart from the performance issues mentioned above. Take
care over locking on the replica; you don't want replication to restart when
the master comes back up after an outage, with the storage still running on
the slave.
The big problem with this is that rsync is not atomic. If the master explodes
halfway through an rsync then the replica may contain half of the most
recent transaction.
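A toy model of that failure, using plain dicts in place of directories (nothing here is DirStorage or rsync code, and the function names are invented for the example): copying file by file leaves a mixed state when the source disappears part way through, while staging the whole change first leaves the replica untouched.

```python
def copy_file_by_file(master, replica, fail_after=None):
    """Copy entries one at a time, like a file-level sync.

    fail_after=N simulates the master dying after N files were copied.
    """
    for n, (name, data) in enumerate(sorted(master.items())):
        if n == fail_after:
            raise ConnectionError("master went away")
        replica[name] = data


def copy_all_or_nothing(master, replica, fail_after=None):
    """Stage every file first, then apply the staged set in one step."""
    staged = {}
    # If this raises, the replica has not been modified at all.
    copy_file_by_file(master, staged, fail_after)
    replica.update(staged)
```

In this model the final `update` stands in for whatever single step makes a whole transaction visible at once; on a real filesystem that step has to be chosen with care.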
1.1 might still be in alpha, but I am sure it is more stable than anything
based on rsync. As always, I am already using it in production, replicating
once per minute and performing a full check on the replica storage once per
hour. It is looking good so far.
--
Toby Dickenson
http://www.geminidataloggers.com/people/tdickenson