[ZODB-Dev] [ zodb-Bugs-953518 ] repozo.py may include dat file in recovering ZODB file

SourceForge.net noreply at sourceforge.net
Sat May 22 01:23:50 EDT 2004


Bugs item #953518, was opened at 2004-05-13 15:03
Message generated for change (Comment added) made by tim_one
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=115628&aid=953518&group_id=15628

Category: None
Group: None
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: James Henderson (henderj)
Assigned to: Tim Peters (tim_one)
Summary: repozo.py may include dat file in recovering ZODB file

Initial Comment:
I have an identical repository on two different machines.  On one 
repozo.py -R works fine; on the other it includes a .dat file and 
produces a corrupt ZODB file:

[THIS IS CORRECT]
[zope at www2 zope]$ repozo.py -v -r backup -Ro recovered.fs
looking for files b/w last full backup and 2004-05-13-17-20-50...
files needed to recover state as of 2004-05-13-17-20-50:
        backup/2004-05-13-15-03-23.fs
        backup/2004-05-13-15-07-48.deltafs
        backup/2004-05-13-15-44-06.deltafs
        backup/2004-05-13-15-48-10.deltafs
Recovering file to recovered.fs
Recovered 561743152 bytes, md5: 6cb9f177aab9cd09fcd7792076cdcc76

[THIS INCORRECTLY INCLUDES THE .dat FILE]
[zope at localhost zope]$ repozo.py -v -r backup -Ro recovered.fs
looking for files b/w last full backup and 2004-05-13-17-34-24...
files needed to recover state as of 2004-05-13-17-34-24:
        backup/2004-05-13-15-03-23.fs
        backup/2004-05-13-15-03-23.dat
        backup/2004-05-13-15-07-48.deltafs
        backup/2004-05-13-15-44-06.deltafs
        backup/2004-05-13-15-48-10.deltafs
Recovering file to recovered.fs
Recovered 561743535 bytes, md5: 85508f41b0025454b92b5fb276ca43a4

It turns out that the find_files() function can return different lists on 
different machines.  This is because os.listdir() lists files in arbitrary 
order (as per the documentation), and the sort compares only the part of 
each name before the extension.  The .fs file and its .dat file share 
that basename, so they compare equal and the Python sort preserves 
whatever relative order os.listdir() gave them.  The workings of 
find_files() are illustrated interactively below:

[.fs LISTED BEFORE .dat]
[zope at www2 zope]$ python
Python 2.3.3 (#2, Mar 15 2004, 10:16:17)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, pprint
>>> def rootcmp(x, y):
...     return cmp(os.path.splitext(y)[0], os.path.splitext(x)[0])
...
>>> all = os.listdir('backup')
>>> pprint.pprint(all)
['2004-05-13-15-03-23.fs',
 '2004-05-13-15-03-23.dat',
 '2004-05-13-15-07-48.deltafs',
 '2004-05-13-15-44-06.deltafs',
 '2004-05-13-15-48-10.deltafs']
>>> all.sort(rootcmp)
>>> pprint.pprint(all)
['2004-05-13-15-48-10.deltafs',
 '2004-05-13-15-44-06.deltafs',
 '2004-05-13-15-07-48.deltafs',
 '2004-05-13-15-03-23.fs',
 '2004-05-13-15-03-23.dat']

[.dat LISTED BEFORE .fs]
[zope at localhost zope]$ python
Python 2.3.3 (#1, Feb 13 2004, 15:14:01)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, pprint
>>> def rootcmp(x, y):
...     return cmp(os.path.splitext(y)[0], os.path.splitext(x)[0])
...
>>> all = os.listdir('backup')
>>> pprint.pprint(all)
['2004-05-13-15-07-48.deltafs',
 '2004-05-13-15-03-23.dat',
 '2004-05-13-15-03-23.fs',
 '2004-05-13-15-44-06.deltafs',
 '2004-05-13-15-48-10.deltafs']
>>> all.sort(rootcmp)
>>> pprint.pprint(all)
['2004-05-13-15-48-10.deltafs',
 '2004-05-13-15-44-06.deltafs',
 '2004-05-13-15-07-48.deltafs',
 '2004-05-13-15-03-23.dat',
 '2004-05-13-15-03-23.fs']
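
The two sessions above can be reproduced without touching the file 
system.  The following is only a simplified sketch in current Python 
(which has no cmp()), not the real find_files(): it assumes the function 
walks the sorted list newest-first and stops at the first full backup 
(.fs or .fsz), and it leaves out the date cutoff.  The names root() and 
files_needed() are made up for the illustration.

```python
import os

def root(name):
    # The part of the file name before the extension; this is all that
    # rootcmp() compares.
    return os.path.splitext(name)[0]

def files_needed(names):
    # Stable descending sort on the root only: names with equal roots
    # (the .fs file and its .dat file) keep their input order.
    needed = []
    for name in sorted(names, key=root, reverse=True):
        needed.append(name)
        # Assumed stop condition: the first full backup ends the search.
        if os.path.splitext(name)[1] in ('.fs', '.fsz'):
            break
    needed.reverse()
    return needed

# Listing order seen on www2 (.fs before .dat).
www2 = ['2004-05-13-15-03-23.fs',
        '2004-05-13-15-03-23.dat',
        '2004-05-13-15-07-48.deltafs',
        '2004-05-13-15-44-06.deltafs',
        '2004-05-13-15-48-10.deltafs']

# Listing order seen on localhost (.dat before .fs).
localhost = ['2004-05-13-15-07-48.deltafs',
             '2004-05-13-15-03-23.dat',
             '2004-05-13-15-03-23.fs',
             '2004-05-13-15-44-06.deltafs',
             '2004-05-13-15-48-10.deltafs']

print(files_needed(www2))       # the .dat file is not in the result
print(files_needed(localhost))  # the .dat file follows the .fs file
```

With the www2 ordering the loop reaches the .fs file first and stops, so 
the .dat file is never seen.  With the localhost ordering the .dat file 
is appended before the loop reaches the .fs file.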

I attach a patch to fix this.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2004-05-22 01:23

Message:
Logged In: YES 
user_id=31435

It's fine to comment on a closed bug, but in this case it 
would be better to do so at zope.org (as noted before here, 
the StandaloneZODB project, and this tracker, are effectively 
dead).

I'm personally not interested in making repozo.py more 
forgiving of crap in the backup directory -- I'm fixing critical 
bugs for the soon-to-be Zope 2.7.1 release, and can't make 
time now for "and it would (merely) be *nice* if ..." things.

If those interest you, please open feature requests on the 
zope.org tracker.

----------------------------------------------------------------------

Comment By: James Henderson (henderj)
Date: 2004-05-21 22:08

Message:
Logged In: YES 
user_id=687101

Thanks for fixing this!  I hope it's not too bad to comment on a 
closed bug (people do it to me on my project anyway :).

I agree an include list is safer but I had only just started 
looking at the code so didn't feel qualified to compile it.  
Interestingly, I had already been bitten by my habit of creating 
recovery files in the backup directory and giving them .fs 
extensions.  Due to the same bug (as I now realize) this 
resulted in repozo doing a full, rather than incremental, backup 
every time, and your list of legal extensions wouldn't avoid 
this.  I didn't report this as a bug because I thought it was fair 
that people shouldn't add random junk to a repository, and I 
still think this is arguable, though a note in the help text might 
help.  Otherwise, a regular expression to match basenames 
created by repozo would be quite straightforward....
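
A rough sketch of such a regular expression, in current Python.  The 
timestamp layout (YYYY-MM-DD-HH-MM-SS) is inferred only from the file 
names shown in this report, and the extension list is the one given in 
the fix; BACKUP_NAME and is_repozo_name() are hypothetical names, not 
part of repozo.py.

```python
import re

# Assumed shape of a repozo-created name: six dash-separated numeric
# fields followed by one of the data extensions.
BACKUP_NAME = re.compile(
    r'\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}\.(?:fs|fsz|deltafs|deltafsz)')

def is_repozo_name(name):
    # fullmatch() requires the whole name to fit the pattern.
    return BACKUP_NAME.fullmatch(name) is not None

print(is_repozo_name('2004-05-13-15-03-23.fs'))   # True
print(is_repozo_name('recovered.fs'))             # False
print(is_repozo_name('2004-05-13-15-03-23.dat'))  # False
```

Unlike a bare extension check, this would also skip a hand-made 
recovered.fs left in the backup directory.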

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-05-21 14:25

Message:
Logged In: YES 
user_id=31435

Repaired on the 2.7 branch, and HEAD, by filtering the file list 
to include only files with the data extensions repozo.py 
creates (.fs, .fsz, .deltafs, and .deltafsz).  This is a bit safer 
than just rejecting .dat files, assuming anything else must be 
OK (e.g., I had a .dat~ file sitting in one of my backup 
directories!).
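
A minimal sketch of that kind of filter, in current Python.  This is not 
the checked-in code; DATA_EXTENSIONS and data_files() are names made up 
for the illustration, and only the four extensions come from the 
description above.

```python
import os

# The data extensions repozo.py creates, as listed above.
DATA_EXTENSIONS = ('.fs', '.fsz', '.deltafs', '.deltafsz')

def data_files(names):
    # Keep only names whose extension is on the include list; everything
    # else (.dat, .dat~, editor droppings, ...) is ignored.
    return [name for name in names
            if os.path.splitext(name)[1] in DATA_EXTENSIONS]

listing = ['2004-05-13-15-03-23.fs',
           '2004-05-13-15-03-23.dat',
           '2004-05-13-15-03-23.dat~',
           '2004-05-13-15-07-48.deltafs']

print(data_files(listing))
```

Filtering before the sort means the tie between the .fs file and its 
.dat file never arises, whatever order os.listdir() returns.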

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-05-13 16:27

Message:
Logged In: YES 
user_id=31435

You're lucky I'm still subscribed to this list <wink>.  The 
StandaloneZODB project is effectively dead.  ZODB is very 
much alive, but all development takes place at zope.org now.  
I opened a bug in its tracker to point to this bug:

repozo.py -R can create corrupt .fs
http://zope.org/Collectors/Zope/1330


----------------------------------------------------------------------

Comment By: James Henderson (henderj)
Date: 2004-05-13 15:05

Message:
Logged In: YES 
user_id=687101

Oops, here's the patch.

----------------------------------------------------------------------
