[ZODB-Dev] Re: ZEO Replicated Storage

Jim Fulton jim at zope.com
Thu Jul 1 16:01:20 EDT 2004


Note that I'm following up to zodb-dev

Eugene wrote:
> Hello Rob,
> 
> RP> Andreas' idea will work but it doesn't create a 'zero downtime' 
> RP> environment as the copy and the rsync take some amount of time during
> RP> which other transactions can be applied to the ZODB.
> Maybe I don't understand you well, but I see a problem here with a
> lot of extra work for the admin in detecting problems in the DB.
> I have this situation: one database has crashed several times.
> Each time it was restored from backup, and some time later it crashed
> again. Of course, this db was tested in every way, using all the FS
> utilities I've found on the Internet and all recipes from zopelabs or
> other people. None of the utilities found any trouble in the DB; all is
> OK, except that in a few days the db crashes again.

If your data is getting corrupted and utilities like
fstest don't catch the problem, we'd like to get a copy of your
database so we can fix whatever is wrong with fstest that is causing the
problem to go undetected.

ZRS deals with things like servers going down. It doesn't
directly deal with corruption because, frankly, that hasn't
been a problem for us or our customers. :)

It does deal indirectly with corruption because, AFAIK, corruption
is generally caused by hardware or system failures and ZRS lets you use
multiple systems.

A significant difference between ZRS and rsync is that ZRS replicates
at the transaction level, not at the file level.  If your file gets
corrupted, rsync, or any other backup mechanism, will happily duplicate
the corruption.  ZRS, on the other hand, independently applies transactions
to each replicated storage.
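
As a rough illustration of that difference, here is a toy sketch in
Python. It is not ZRS or ZODB code: ToyStorage, apply and
file_level_copy are hypothetical names invented for this example, and
a "disk fault" is simulated by overwriting a record in place.

```python
class ToyStorage:
    """Toy append-only storage: a list of (tid, data) records.

    Hypothetical stand-in for a storage; not the ZODB or ZRS API.
    """

    def __init__(self):
        self.records = []

    def apply(self, tid, data):
        # Each storage writes its own copy of the transaction.
        self.records.append((tid, bytes(data)))


def file_level_copy(source):
    # rsync-style backup: duplicates whatever is stored, damaged or not.
    clone = ToyStorage()
    clone.records = list(source.records)
    return clone


primary, secondary = ToyStorage(), ToyStorage()
for tid, data in [(1, b"alpha"), (2, b"beta")]:
    # Transaction-level replication: every storage applies the
    # transaction itself instead of copying the other storage's file.
    primary.apply(tid, data)
    secondary.apply(tid, data)

# Simulate a disk fault damaging the primary after the write succeeded.
primary.records[1] = (2, b"b\x00ta")

# A file-level backup taken now inherits the damage ...
backup = file_level_copy(primary)

# ... while the independently written replica still has the good data.
print(backup.records[1])
print(secondary.records[1])
```

In this toy model the damage happens after the transaction was
written, so only copies made from the damaged file are affected.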

> So I cannot be sure my server works well and that there are no potential
> problems in the DB. Persistent online monitoring is not the case,
> especially as I cannot catch the moment when an error gets into my DB. How do
> I find out whether the backup from yesterday or the day before has the error?
> 
> RP> Zope Replication Services (ZRS) minimizes cluster downtime while 
> RP> maximizing the transactional integrity of the ZODB.  Downtime is 
> RP> limited to the time necessary to detect failure and transition to the
> RP> secondary ZRS storage server.  Transactions are saved on the primary
> RP> storage server *and* sent into the ZRS cloud for storage on some number
> RP> of secondary servers.
> I'm looking for a utility which can detect errors automatically.
> If some write operation caused an error, I want to find it when the error
> appears, not when my db has become inoperable.

We are pretty sure that write operations don't cause the corruption.
The only way to detect the corruption is to periodically re-read
the data.  If you are having frequent corruption problems, I suspect
you have a system problem.
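
A minimal sketch of what periodic re-reading could look like, in
Python. This is a generic illustration, not how FileStorage or fstest
work: it assumes each record carries a CRC32 checksum computed at write
time, and make_record and scan are hypothetical helpers.

```python
import zlib


def make_record(tid, data):
    # Assumption: a checksum is stored next to the data at write time.
    return {"tid": tid, "data": data, "crc": zlib.crc32(data)}


def scan(records):
    """Re-read every record; return the tids whose data no longer matches."""
    return [r["tid"] for r in records if zlib.crc32(r["data"]) != r["crc"]]


records = [make_record(1, b"alpha"), make_record(2, b"beta")]

# Right after writing, a scan finds nothing wrong.
clean = scan(records)

# Simulate later damage to stored data (e.g. a failing disk).
records[1]["data"] = b"b\x00ta"

# Only a later re-read notices it.
damaged = scan(records)
print(clean, damaged)
```

Run from a scheduled job, a scan like this narrows down when the damage
appeared to the interval between two runs.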

> My interest in ZRS is based on its ability to detect troubles
> automatically and turn off bad databases. Something like a mirrored
> RAID array.

It doesn't detect problems.  Rather, it provides a warm backup if problems
occur. Importantly for you, if corruption occurs in one database,
it's extremely unlikely that replicated databases will be corrupted,
so recovery is very fast.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

