Hello, We have several different Zope instances running on our servers, and currently the only backup we do is a daily copy of the Data.fs. The main problem with this backup approach is that the Zope instance is not stopped for the backup at all, so the Data.fs might not be up to date. Is it safe in general to copy the Data.fs while the instance is running? Or might I end up with a broken Data.fs which Zope is not able to read at all? And which other ways do you know to back up an entire instance? Maybe sync it with ZEO to another server in real time? Or use some export method? thanks in advance, jonas
----- Original Message ----- From: "Jonas Meurer" <jonas@freesources.org> To: "zope-users" <zope@zope.org> Sent: Friday, October 26, 2007 10:57 AM Subject: [Zope] backup full instances
Hello,
We have several different Zope instances running on our servers, and currently the only backup we do is a daily copy of the Data.fs.
The main problem with this backup approach is that the Zope instance is not stopped for the backup at all, so the Data.fs might not be up to date.
Is it safe in general to copy the Data.fs while the instance is running? Or might I end up with a broken Data.fs which Zope is not able to read at all?
And which other ways do you know to back up an entire instance? Maybe sync it with ZEO to another server in real time? Or use some export method?
Have a look at: http://wiki.zope.org/ZODB/FileStorageBackup Jonathan
--On 26. Oktober 2007 16:57:02 +0200 Jonas Meurer <jonas@freesources.org> wrote:
Is it safe in general to copy Data.fs while the instance is running? Or might I end up with a broken Data.fs, which zope is not able to read at all?
If you create a local copy, it is safe. If you rsync a running Data.fs, it is not safe. In this case you create a local copy and rsync the copy.
And which other ways do you know to back up an entire instance? Maybe sync it with ZEO to another server in real time? Or use some export method?
<http://www.zope.com/products/zope_replication_services.html> Also check out "zeoraid" from gocept. For incremental backups, check the repozo.py script. -aj
On 26/10/2007 Andreas Jung wrote:
Is it safe in general to copy Data.fs while the instance is running? Or might I end up with a broken Data.fs, which zope is not able to read at all?
If you create a local copy, it is safe. If you rsync a running Data.fs, it is not safe. In this case you create a local copy and rsync the copy.
what's the difference between a local copy and rsync, except for the time it takes? but if rsync is unsafe only because it takes a long time, and changes to the instance during the copy process could cause a corrupted Data.fs to be backed up, then theoretically this could happen with a local copy as well, right? only the probability would be far smaller.
For incremental backups: check the repozo.py script.
That one look interesting. thanks for the hint! greetings, jonas
--On 27. Oktober 2007 16:29:18 +0200 Jonas Meurer <jonas@freesources.org> wrote:
On 26/10/2007 Andreas Jung wrote:
Is it safe in general to copy Data.fs while the instance is running? Or might I end up with a broken Data.fs, which zope is not able to read at all?
If you create a local copy, it is safe. If you rsync a running Data.fs, it is not safe. In this case you create a local copy and rsync the copy.
what's the difference between a local copy and rsync, except for the time it takes? but if rsync is unsafe only because it takes a long time, and changes to the instance during the copy process could cause a corrupted Data.fs to be backed up, then theoretically this could happen with a local copy as well, right?
Think twice about your last sentence. What should cause a local *copy* to be changed?? -aj
On 27/10/2007 Andreas Jung wrote:
If you create a local copy, it is safe. If you rsync a running Data.fs, it is not safe. In this case you create a local copy and rsync the copy.
but if rsync is unsafe only because it takes a long time, and changes to the instance during the copy process could cause a corrupted Data.fs to be backed up, then theoretically this could happen with a local copy as well, right?
Think twice about your last sentence. What should cause a local *copy* to be changed??
zope might write to the Data.fs while it is copied, thus an inconsistent copy would be backed up, even inside one filesystem.

i've asked in #debian on freenode as i wasn't sure, here's the log:

< mejo> if i copy a large file inside a mounted filesystem (ext3), is it possible that the file is changed during the copy process?
< Wyzard> mejo: Yes, it's possible that something else can write to the file while you're copying it
< mejo> because i asked on the zope-users mailinglist if i could backup the global Data.fs (database) while the daemon is running, and someone answered:
< mejo> If you create a local copy, it is safe. If you rsync a running Data.fs it is not safe. In this case you create a local copy and rsync the copy.
< Wyzard> mejo: Making a local copy is faster, so it'd be safer, but still not completely safe
< mejo> exactly, that's what i thought as well.
< mejo> but when i wrote that, he replied:
< mejo> Think twice about your last sentence. What should cause a local *copy* to be changed??
< mejo> so he's wrong?
< Wyzard> I'd say he's wrong
< Wyzard> A local copy isn't instantaneous, and Zope changes the file while it's being read
< wols_> he is wrong yes
< mejo> thanks. is it ok for you if i quote you in my reply mail?
< Wyzard> mejo: sure
< wols_> mejo: while copying zope could change the database and create an inconsistent state

greetings, jonas
On 10/27/07, Jonas Meurer <jonas@freesources.org> wrote:
zope might write to the Data.fs while it is copied, thus an inconsistent copy would be backed up, even inside one filesystem.
if you insist on copying a file, _and_ happen to be on Linux with some LVM volumes, you can snapshot the volume with the Data.fs on it (perhaps throw in a sync before; I don't know if it makes a difference). That way you get a copy from a single point in time. You can also rsync from that. --knitti
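For reference, the snapshot approach might look roughly like this. This is only a sketch: the volume group, logical volume, mount point and backup host names are all placeholders, and it needs root privileges plus free extents in the volume group.

```shell
# snapshot_backup: rsync Data.fs from a read-only LVM snapshot, so the
# source file is frozen at a single point in time while Zope keeps running.
# All device, mount and host names below are placeholders.
snapshot_backup() {
    sync                                          # flush pending writes first
    lvcreate --snapshot --size 1G --name zodb-snap /dev/vg0/zope || return 1
    mkdir -p /mnt/zodb-snap
    mount -o ro /dev/vg0/zodb-snap /mnt/zodb-snap
    rsync -a /mnt/zodb-snap/var/Data.fs backuphost:/backups/
    umount /mnt/zodb-snap
    lvremove -f /dev/vg0/zodb-snap                # snapshots cost space; drop it
}
```

Since the snapshot is read-only and frozen, rsyncing from it avoids the moving-file problem entirely.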
--On 27. Oktober 2007 18:18:23 +0200 Jonas Meurer <jonas@freesources.org> wrote:
On 27/10/2007 Andreas Jung wrote:
If you create a local copy, it is safe. If you rsync a running Data.fs, it is not safe. In this case you create a local copy and rsync the copy.
but if rsync is unsafe only because it takes a long time, and changes to the instance during the copy process could cause a corrupted Data.fs to be backed up, then theoretically this could happen with a local copy as well, right?
Think twice about your last sentence. What should cause a local *copy* to be changed??
zope might write to the Data.fs while it is copied, thus an inconsistent copy would be backed up, even inside one filesystem.
This is not the point. rsync will run into trouble if the file being synced changes during the sync operation. This will happen with the live Data.fs - it won't happen with a static copy. An inconsistent copy of the Data.fs is not the problem, since invalid transaction entries will be discarded by the ZODB. So rsyncing a copy of the Data.fs is the way to go. -aj
On 27/10/2007 Andreas Jung wrote:
This is not the point. rsync will run into trouble if the file being synced changes during the sync operation. This will happen with the live Data.fs - it won't happen with a static copy. An inconsistent copy of the Data.fs is not the problem, since invalid transaction entries will be discarded by the ZODB. So rsyncing a copy of the Data.fs is the way to go.
Ah, now I got it. Thanks for your patience ;-) It's interesting that the current backups (rsyncing the Data.fs directly) never caused any problems. I guess that's because the Data.fs in fact never changed during a backup yet. I'll change that to back up a local copy of the Data.fs in the future. A simple solution would be to run 'cp Data.fs Data.fs.safe' for every instance just before backuppc starts the rsync process. thanks for your help, jonas
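A pre-backup hook along these lines could make that copy automatically for every instance. A minimal sketch, assuming all instances live under one common root with the usual <name>/var/Data.fs layout (the paths are assumptions, adjust them for your setup):

```shell
# snapshot_datafs <instances-root>: copy each instance's live Data.fs to
# Data.fs.safe so the backup tool reads a static file, never the moving one.
snapshot_datafs() {
    for datafs in "$1"/*/var/Data.fs; do
        [ -f "$datafs" ] || continue
        cp "$datafs" "$datafs.safe"   # cp reads linearly; Zope only appends
    done
}

# Example: run just before backuppc starts (path is an assumption):
# snapshot_datafs /var/lib/zope/instances
```

backuppc (or rsync) would then be pointed at the Data.fs.safe files instead of the live ones.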
On Sun, 28 Oct 2007 20:15:27 +0100, Jonas Meurer wrote:
i'll change that to backup a local copy of the Data.fs instead in future. a simple solution would be to run 'cp Data.fs Data.fs.safe' for every instance just before backuppc starts the rsync process.
We use the repozo backup (and restore) tool that comes with Zope and include the backup repository in our normal backup regime (which uses backuppc).

<zopeinstalldir>/bin/repozo.py

Read inside the file for the doco. This tool is designed to work with live Data.fs files. Apart from some sort of live replication, it is the only way to go as far as I'm concerned, unless you are unconcerned about the possibility of corrupted backups. Cheers, Sam.
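For illustration, a wrapped invocation might look like this. The flags (-B backup mode, -r backup repository, -f storage file, -z gzip the output) match repozo.py's built-in help, but every path here is an assumption; check `repozo.py --help` on your own installation:

```shell
# repozo_backup <repozo.py> <repository-dir> <Data.fs>
# Incremental backup of a (possibly live) Data.fs into a repository dir.
repozo_backup() {
    if [ -x "$1" ]; then
        "$1" -B -z -r "$2" -f "$3"
    else
        echo "repozo not found: $1" >&2
        return 1
    fi
}

# Example (paths are assumptions):
# repozo_backup /opt/zope/bin/repozo.py /var/backups/zodb /path/to/var/Data.fs
# To restore the latest state:
# /opt/zope/bin/repozo.py -R -r /var/backups/zodb -o /path/to/Data.fs
```

The repository directory of full and incremental deltas is then a set of ordinary static files, which rsync or backuppc can pick up safely.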
On 28/10/2007 Sam Stainsby wrote:
We use the repozo backup (and restore) tool that comes with Zope and include the backup repository in our normal backup regime (which uses backuppc).
<zopeinstalldir>/bin/repozo.py
Read inside the file for the doco. This tool is designed to work with live Data.fs files. Apart from some sort of live replication, it is the only way to go as far as I'm concerned, unless you are unconcerned about the possibility of corrupted backups.
Hey,
From what others said, a local copy of the 'live Data.fs' is safe as well, as zope appends new transactions to the end of Data.fs.
To quote Andreas Jung: "An inconsistent copy of the Data.fs is not the problem since invalid transaction entries will be discarded by the ZODB." greetings, jonas
On Mon, 29 Oct 2007 15:15:51 +0100, Jonas Meurer wrote:
From what others said, a local copy of the 'live Data.fs' is safe as well, as zope appends new transactions to the end of Data.fs.
To quote Andreas Jung:
"An inconsistent copy of the Data.fs is not the problem since invalid transaction entries will be discarded by the ZODB."
That's the theory. But why trust a higher-risk strategy when there is a perfectly good tool for doing full and incremental backups that produces compact, timestamped backup files as a coherent set in a backup repository? Not using it is just laziness. What happens, for instance, if by coincidence the database is being packed by some automated script at the same time as your automated backup runs? Cheers, Sam.
On 29/10/2007 Sam Stainsby wrote:
To quote Andreas Jung:
"An inconsistent copy of the Data.fs is not the problem since invalid transaction entries will be discarded by the ZODB."
That's the theory. But why trust a higher-risk strategy when there is a perfectly good tool for doing full and incremental backups that produces compact, timestamped backup files as a coherent set in a backup repository? Not using it is just laziness. What happens, for instance, if by coincidence the database is being packed by some automated script at the same time as your automated backup runs?
You're right. I just have to find a way to implement it in the backuppc backups. It would be great if the 'restore' function from backuppc would still work for the Data.fs directly, but that's only possible if the Data.fs backed up by repozo.py is stored as instance/<name>/var/Data.fs in backuppc. I doubt that this is possible in an easy way. I'll see. But for sure something has to be changed ;-) ... jonas
Hi, Jonas Meurer wrote:
On 27/10/2007 Andreas Jung wrote:
If you create a local copy, it is safe. If you rsync a running Data.fs, it is not safe. In this case you create a local copy and rsync the copy.

but if rsync is unsafe only because it takes a long time, and changes to the instance during the copy process could cause a corrupted Data.fs to be backed up, then theoretically this could happen with a local copy as well, right?

Think twice about your last sentence. What should cause a local *copy* to be changed??
zope might write to the Data.fs while it is copied, thus an inconsistent copy would be backuped, even inside one filesystem.
i've asked in #debian on freenode as i wasn't sure, here's the log:
< mejo> if i copy a large file inside a mounted filesystem (ext3), is it possible that the file is changed during the copy process?
< Wyzard> mejo: Yes, it's possible that something else can write to the file while you're copying it
< mejo> because i asked on the zope-users mailinglist if i could backup the global Data.fs (database) while the daemon is running, and someone answered:
< mejo> If you create a local copy, it is safe. If you rsync a running Data.fs it is not safe. In this case you create a local copy and rsync the copy.
< Wyzard> mejo: Making a local copy is faster, so it'd be safer, but still not completely safe
< mejo> exactly, that's what i thought as well.
< mejo> but when i wrote that, he replied:
< mejo> Think twice about your last sentence. What should cause a local *copy* to be changed??
< mejo> so he's wrong?
< Wyzard> I'd say he's wrong
< Wyzard> A local copy isn't instantaneous, and Zope changes the file while it's being read
< wols_> he is wrong yes
< mejo> thanks. is it ok for you if i quote you in my reply mail?
< Wyzard> mejo: sure
< wols_> mejo: while copying zope could change the database and create an inconsistent state
Nobody is wrong. Your #debian guys just didn't have all the information. Zope always appends to the Data.fs, and copy (cp) works linearly, so it will always preserve a consistent state of the file. (It either copies a given byte range before or after the last append operation.)

rsync, on the other hand, is very efficient because it calculates only the differences of the file contents to be copied. This may or may not follow the order of blocks in the file. In the latter case it could try to sync wrong information. (You would need a special rsync which would only transfer new blocks at the end, in their given order - you could script something like this using dd, gzip/zcat and ssh.)

Regards Tino
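A toy version of that "special rsync" could look like this: it transfers only the bytes appended to the source since the destination was last written, which is valid precisely because Data.fs is append-only, so bytes already copied never change. This is a sketch, not a production tool:

```shell
# append_sync <src> <dst>: copy only the tail of src that dst is missing.
# Valid for an append-only file: existing bytes never change, so no diffing.
append_sync() {
    [ -f "$2" ] || : > "$2"                 # create dst on first run
    have=$(($(wc -c < "$2")))               # bytes already synced
    # copy src from byte offset $have onward into dst at the same offset
    dd if="$1" of="$2" bs=1 skip="$have" seek="$have" conv=notrunc 2>/dev/null
}
```

bs=1 keeps the sketch portable but slow; a real script would use larger blocks and would also have to handle a pack, which rewrites the whole file and so requires a fresh full copy.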
On 27/10/2007 Tino Wildenhain wrote:
Nobody is wrong. Your #debian guys just didn't have all the information. Zope always appends to the Data.fs, and copy (cp) works linearly, so it will always preserve a consistent state of the file. (It either copies a given byte range before or after the last append operation.)
rsync, on the other hand, is very efficient because it calculates only the differences of the file contents to be copied. This may or may not follow the order of blocks in the file. In the latter case it could try to sync wrong information. (You would need a special rsync which would only transfer new blocks at the end, in their given order - you could script something like this using dd, gzip/zcat and ssh.)
that finally made the difference between rsync and a local copy clear to me ;-) I'll change the backups to rsync a local copy in the future. thanks, jonas
(Fri, Oct 26, 2007 at 04:57:02PM +0200) Jonas Meurer wrote/schrieb/egrapse:
And which other ways do you know to back up an entire instance? Maybe sync it with ZEO to another server in real time? Or use some export method?
repozo.py Also don't forget to back up all the products and zope.conf once in a while. Regards, Sascha
Jonas Meurer wrote at 2007-10-26 16:57 +0200:
... The main problem with this backup approach is that the Zope instance is not stopped for the backup at all, so the Data.fs might not be up to date.
Why do you think that this is a problem? Of course, your backup process may read a partial transaction record -- it will be dropped automatically when the storage file is opened. Of course, your backup may not capture the state at the end of the backup -- but it will have captured the state at the start of the backup. Thus, the only problem you can have is: the effective backup may not exceed the state as it was when the backup started. I think this should be good enough. -- Dieter
participants (8)
- Andreas Jung
- Dieter Maurer
- Jonas Meurer
- Jonathan
- knitti
- Sam Stainsby
- Sascha Welter
- Tino Wildenhain