I have just sent a message to the list asking how to fix my corrupt Data.fs. Now I am wondering how I got into this mess.

Until now I had always reassured myself that I could regenerate the Data.fs for any day in the past month, but it does not seem that this is going to help me very much. My log file tells me that an instance of Zope came up on October 22 with one ConflictError and several CorruptedDataErrors. This Zope (or maybe another instance [1]) kept serving requests on the Data.fs for two weeks until the server crashed again. I now think that I will not be able to recover the data from these requests unless I plough through the source code to decipher the ZODB format and then cut and paste the transactions into a usable Data.fs by hand.

So I have two questions for the list:

1) Is it, indeed, possible for Zope to keep serving requests from a Data.fs that will not be usable when Zope restarts?

2) How do people sleep at night knowing this to be the case?

--- Alastair

Footnotes:
[1] I think my scripts to automatically start up Zope in event of a crash may have led to two instances of Zope trying to serve the same Data.fs.
On Mon, Nov 05, 2001 at 05:18:19PM +0100, Alastair Burt wrote:
> Footnotes:
> [1] I think my scripts to automatically start up Zope in event of a crash may have led to two instances of Zope trying to serve the same Data.fs.
Zope exits on startup if it can't lock the database. The code that handles this is in lib/python/ZODB/lock_file.py. Having looked at it, I must say it doesn't exactly make me feel safe. It looks like it should work on Unix and Windows, but if it fails to define a working lock_file function, it silently defines lock_file as a no-op. Is it just me, or is this dangerous?

It does seem to work properly on Linux at least. Example:

    [pw@roaddog Zope]$ ./start &
    [1] 6959
    [pw@roaddog Zope]$ ./start &
    [2] 6964
    [pw@roaddog Zope]$
    [2]+ Done ./start

A look at my log file shows this:

    2001-11-05T16:26:58 PANIC(300) z2 Startup exception
    Traceback (innermost last):
    (snip)
    StorageSystemError: Could not lock the database file. There must be another process that has opened the file.

--
paul winkler
home: http://www.slinkp.com
music: http://www.reacharms.com
calendars: http://www.calendargalaxy.com
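To make the concern about a silent no-op concrete, here is a minimal sketch of a lock helper that refuses to continue when it cannot lock. It is not the code in lib/python/ZODB/lock_file.py; the names `LockUnavailableError` and `lock_file` as written here are hypothetical, and it assumes a Unix-like system where Python's standard `fcntl.flock` is available.

```python
import os
import tempfile

try:
    import fcntl  # Unix only; missing on platforms without flock support
except ImportError:
    fcntl = None


class LockUnavailableError(RuntimeError):
    """Hypothetical error: no locking primitive exists, or the lock is held."""


def lock_file(file):
    """Take an exclusive, non-blocking lock on an open file, or fail loudly."""
    if fcntl is None:
        # Assumption: refusing to start is safer than a silent no-op.
        raise LockUnavailableError("no file locking support on this platform")
    try:
        fcntl.flock(file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        raise LockUnavailableError("could not lock %s" % file.name) from exc


# Demo: a second, independent open of the same file cannot take the lock.
path = os.path.join(tempfile.mkdtemp(), "Data.fs.lock")
first = open(path, "w")
lock_file(first)
second = open(path, "w")
try:
    lock_file(second)
    second_locked = True
except LockUnavailableError:
    second_locked = False
```

The design choice is simply that a missing lock primitive becomes a startup error rather than an unlocked database. Locks of this kind are also known to be unreliable on some network file systems, so this sketch says nothing about a Data.fs shared between machines.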
Alastair Burt writes:
> 1) Is it, indeed, possible for Zope to keep serving requests from a Data.fs that will not be usable when Zope restarts?

Sure: Zope is usually interested in only some of the transaction records in the ZODB (the current versions), and it keeps track of them in an internal data structure. If a transaction record is corrupted later, Zope can continue to work, but after a restart it can no longer reconstruct this data structure...
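The effect described above can be illustrated with a toy append-only log and an in-memory index. This is only a sketch under stated assumptions: the record layout, the CRC check, and the names `append`, `read_at`, `rebuild_index` and `CorruptRecord` are all invented for illustration and are not the real Data.fs format or ZODB API.

```python
import struct
import zlib

# Toy record header: key length, value length, CRC32 of key + value.
HEADER = struct.Struct(">HII")


class CorruptRecord(Exception):
    """Raised when a record fails its checksum (hypothetical name)."""


def append(log, index, key, value):
    """Append one record and point the in-memory index at it."""
    offset = len(log)
    payload = key + value
    log.extend(HEADER.pack(len(key), len(value), zlib.crc32(payload)) + payload)
    index[key] = offset


def read_at(log, offset):
    """Read and verify the record at offset; return (key, value, next_offset)."""
    klen, vlen, crc = HEADER.unpack_from(log, offset)
    start = offset + HEADER.size
    payload = bytes(log[start:start + klen + vlen])
    if len(payload) != klen + vlen or zlib.crc32(payload) != crc:
        raise CorruptRecord("bad record at offset %d" % offset)
    return payload[:klen], payload[klen:], start + klen + vlen


def rebuild_index(log):
    """What a restart has to do here: scan every record from the start."""
    index, offset = {}, 0
    while offset < len(log):
        key, _value, next_offset = read_at(log, offset)
        index[key] = offset
        offset = next_offset
    return index


# A running process: two revisions of one object, index points at the newer.
log, index = bytearray(), {}
append(log, index, b"page", b"v1")
append(log, index, b"page", b"v2")

# Damage one byte of the OLD record while the process is still running.
log[HEADER.size] ^= 0xFF

# The live index never touches the damaged record, so reads still succeed.
_key, current_value, _next = read_at(log, index[b"page"])

# A restart must rescan the whole log and trips over the damaged record.
try:
    rebuild_index(log)
    restart_ok = True
except CorruptRecord:
    restart_ok = False
```

In this toy, the running process keeps answering from its index while the restart fails, which is the same shape of failure as a server that runs for two weeks on a file it could not reopen.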
> 2) How do people sleep at night knowing this to be the case?

I would expect that with good hardware it should be very rare for file content to change spontaneously. I have seen 2 cases in 14 years...

> ....
> Footnotes:
> [1] I think my scripts to automatically start up Zope in event of a crash may have led to two instances of Zope trying to serve the same Data.fs.

It should not be possible for two Zopes to work on the same "Data.fs", because Zope locks this file. An exception may be when "Data.fs" is shared across processors and cross-processor locks are not handled correctly.
Dieter
participants (3)
- Alastair Burt
- Dieter Maurer
- Paul Winkler