[Zope-dev] Coroner's toolkit for zope, or how to figure out what
went wrong.
Romain Slootmaekers
romain@zzict.com
Mon, 12 Aug 2002 20:10:56 +0200
Jim Fulton wrote:
> Romain Slootmaekers wrote:
>
>> Yo,
>>
>> we had a nasty crash of our zope server that we use for a b2b web
>> application. The Data.fs ZODB lost a significant amount of data.
>
>
> What sort of crash? Was this a hardware failure, or a software failure?
Software.
Strictly speaking, the server didn't crash, but our applications could
no longer function because some objects that really have to exist
were gone.
The Data.fs was NOT corrupted,
but (as far as I can tell) a bug in the conflict resolution code caused
our object (the one on which we set self._p_changed=1) to end up empty.
This object is a container of other objects that are themselves
Persistent, and at this point we don't believe the conflict resolution
mechanism handles these cases correctly.
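To make the suspicion concrete, here is a minimal sketch of the kind of
three-way merge a container would need when two transactions change it
at the same time. It is NOT our application code and NOT the Zope
implementation. The argument order mirrors ZODB's
_p_resolveConflict(oldState, savedState, newState) hook, but the function
name, the "_items" key, the "oid-N" placeholder strings and the
UnresolvableConflict exception are all made up for illustration. In real
ZODB the states would hold references to the persistent subobjects, and
a failed resolution would raise ConflictError.

```python
class UnresolvableConflict(Exception):
    """Hypothetical stand-in for ZODB's ConflictError."""


def resolve_container_conflict(old_state, saved_state, new_state):
    """Three-way merge of a container's state dictionaries.

    old_state:   state both transactions started from
    saved_state: state already committed by the other transaction
    new_state:   state the current transaction wants to commit
    """
    old = old_state["_items"]
    saved = saved_state["_items"]
    new = new_state["_items"]

    # Start from what is already committed and replay our own changes.
    merged = dict(saved)
    for key in set(old) | set(new):
        in_old, in_new = key in old, key in new
        if in_new and not in_old:
            # We added this key; fail if the other side added it differently.
            if key in saved and saved[key] != new[key]:
                raise UnresolvableConflict(key)
            merged[key] = new[key]
        elif in_old and not in_new:
            # We removed this key; fail if the other side changed it.
            if key in saved and saved[key] != old[key]:
                raise UnresolvableConflict(key)
            merged.pop(key, None)
        elif old[key] != new[key]:
            # We changed this key; fail if the other side touched it too.
            if key not in saved or saved[key] != old[key]:
                raise UnresolvableConflict(key)
            merged[key] = new[key]

    result = dict(saved_state)
    result["_items"] = merged
    return result


if __name__ == "__main__":
    old = {"_items": {"a": "oid-1"}}
    saved = {"_items": {"a": "oid-1", "b": "oid-2"}}  # other transaction added b
    new = {"_items": {"a": "oid-1", "c": "oid-3"}}    # our transaction added c
    print(sorted(resolve_container_conflict(old, saved, new)["_items"]))
```

The point of the sketch: a resolver that simply returns one of the
states, or rebuilds the container without replaying both sides, silently
drops entries. Something along those lines is what we suspect happened
to our container, though we have not confirmed it.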
>
>> At this point, we restored the Data.fs from our last backup and the
>> server is back up and running. (breathing relieved)
>>
>> What worries me is that we have no clue whatsoever about what happened,
>> besides the observation that somehow, somewhere we lost a whole tree
>> of objects.
>
>
> Was this in the backup? Or in the damaged data file?
Nope. The loss of data occurred in the 12 hours after our last backup,
so we only (well, it actually is quite a lot :( ) lost the transactions
that happened between the backup and the restore/restart.
The stack trace in the follow-up mail gives some clue as to where the
problem is located in the code (as well as the exact version of the
Zope installation).
Anyway, Murphy's law is proven once again, as this happened on the
first day of my vacation. :|
Sloot.