Re: [Zope] CorruptTransactionError (Bad news for production site!)
Jim,

Thank you for your prompt response. This is what I love about using Open Source Software: the responses come from people who really know what they are talking about. Further responses in-line.

Jim Fulton <jim@digicool.com> writes:
Richard Taylor wrote:
Today I had to roll back two days of transactions from my production site because when I packed the database I was informed of a CorruptTransactionError.
Did anything else happen previous to this? Did you run out of space or anything like that?
We had been doing some extensive development work and the ZDB had reached about 2 Gbytes, but the disk was not full. I packed the database (down to approx. 10 Mbytes) without any trouble. We then carried on using the system for another two days, and I then packed the database again. This time I got the CorruptTransactionError. I followed the instructions to truncate the database and successfully recovered it. Close examination of the bobo_modification_times on the objects left in the database showed that the error occurred at about the time of the first pack.
You should have been able to use Data.fs.old, which is a copy of the database before the pack to restore the data. Or was the error in there too? I'd be interested in looking at the Data.fs file before the pack to try to figure out what might have gone wrong.
Unfortunately the error occurred during (or just after) the first pack, so the second pack overwrote Data.fs.old with the already corrupt database. The real problem was that the corrupt transaction did not have an immediate effect.
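One way to avoid losing the pre-pack copy to a later pack is to move a timestamped copy of Data.fs.old aside before each pack. The following is a minimal sketch using only the Python standard library; the function name `preserve_old_copy` and the `keep_dir` location are hypothetical, and nothing here is part of Zope itself.

```python
import os
import shutil
import time


def preserve_old_copy(old_path, keep_dir):
    """Copy an existing pre-pack backup aside under a timestamped name.

    Returns the path of the new copy, or None if old_path does not exist.
    (Hypothetical helper for illustration; not a Zope or ZODB API.)
    """
    if not os.path.exists(old_path):
        return None
    os.makedirs(keep_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(keep_dir, "Data.fs.old." + stamp)
    # copy2 also carries over the original file's modification time.
    shutil.copy2(old_path, target)
    return target
```

Run before each pack, this leaves one dated copy per pack, so a corruption that only shows up at a later pack can still be examined against an earlier file. Old copies would need pruning by hand or by a separate job.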
(If you send me your Data.fs file, please remember to send it to me privately and to zip or compress it. :)
I would love to send you the Data.fs file, but unfortunately it contains sensitive commercial information for my company and I would be sacked for sending it out. I know how difficult it is to track down bugs when people will not give you repeatable examples, but I just can't send this stuff out.
We are using Zope for an internal knowledge management application where anyone in the organization can add objects. So I have no way of knowing what was added after the fateful transaction and no way of getting any of it back.
Bummer!
Indeed.
I think this raises a few questions about ZDB:
1) We need some tools for selectively removing bad transactions rather than just truncating Data.fs back to the last good one and losing everything that comes after it.
Zope 2.2 has just such a tool. In the ZODB directory, there is a Python script, fsrecover.py, which simply calls the recover function in the FileStorage module. This will work with any 2.x database. It scans from both the beginning and the end of the file until it finds a corrupted section and then removes the corrupted portion from the file. The utility modifies the file in place, so you need to shut down the site, or work on a copy, when you use it.
Fantastic! This is exactly what I was banging on about. Now if only I had not deleted the original corrupt Data.fs file out of disgust, I would be able to get my stuff back. (I think I need a serious talking to.)
2) We could do with a tool that can verify the ZODB offline. This could then be run at regular intervals (maybe once an hour from cron) so that corruptions can be picked up earlier.
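The idea of scanning from both ends and cutting out the damaged middle can be illustrated on a toy format. The sketch below is not fsrecover.py and does not read the real FileStorage layout. It assumes a made-up record layout (a 4-byte length, the payload, then the same length repeated), and all function names are hypothetical.

```python
import struct

LEN = struct.Struct(">I")  # 4-byte big-endian length field (toy format)


def pack_record(payload):
    """Build one toy record: length, payload, length repeated."""
    size = LEN.pack(len(payload))
    return size + payload + size


def scan_forward(data):
    """Return the offset where the run of valid records from the start ends."""
    pos = 0
    while pos + 2 * LEN.size <= len(data):
        (n,) = LEN.unpack_from(data, pos)
        end = pos + 2 * LEN.size + n
        if end > len(data):
            break  # claimed length runs past the end of the data
        (trailer,) = LEN.unpack_from(data, end - LEN.size)
        if trailer != n:
            break  # leading and trailing lengths disagree
        pos = end
    return pos


def scan_backward(data, floor):
    """Return the offset where the run of valid records at the end begins."""
    pos = len(data)
    while pos - 2 * LEN.size >= floor:
        (n,) = LEN.unpack_from(data, pos - LEN.size)
        start = pos - 2 * LEN.size - n
        if start < floor:
            break  # record would overlap the already accepted prefix
        (header,) = LEN.unpack_from(data, start)
        if header != n:
            break
        pos = start
    return pos


def drop_damaged_middle(data):
    """Keep the valid prefix and valid suffix; discard what lies between."""
    good_end = scan_forward(data)
    good_start = scan_backward(data, good_end)
    return data[:good_end] + data[good_start:]
```

The repeated length is what makes the backward pass possible in this toy: each record can be located from its tail as well as from its head. A real recovery tool also has to keep the surviving records consistent with each other, which this sketch ignores.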
You could use a little Python script that did something like:
    import ZODB.FileStorage
    file_name='../../var/Data.fs'
    file=open(file_name, 'r+b')
    index={}
    vindex={}
    tindex=[]
    ZODB.FileStorage.read_index(
        file, file_name, index, vindex, tindex)
This basically reads the FileStorage index as would normally be done during startup.
I shall be installing (and testing) this tonight!
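For the cron use suggested in point 2, the index scan can be wrapped so that the job exits non-zero when the scan fails. This is a rough sketch under two assumptions: that a corrupt file surfaces as a Python exception from the scan, and that `read_index` takes the arguments shown in Jim's snippet (later ZODB releases may differ). The names `run_check` and `read_index_check` are hypothetical.

```python
import sys


def run_check(check, path):
    """Call check(path); return 0 if it completes, 1 if it raises.

    Assumes a damaged storage makes the check raise an exception.
    """
    try:
        check(path)
    except Exception as exc:
        sys.stderr.write("storage check failed for %s: %r\n" % (path, exc))
        return 1
    return 0


def read_index_check(path):
    """Index scan following the snippet in this thread.

    Needs a ZODB of that era importable; the read_index signature is
    taken from the thread, not verified against other releases.
    """
    import ZODB.FileStorage  # lazy import, so run_check works without ZODB

    index, vindex, tindex = {}, {}, []
    with open(path, "r+b") as f:
        ZODB.FileStorage.read_index(f, path, index, vindex, tindex)
```

A cron entry could then run a small script ending in `sys.exit(run_check(read_index_check, path))`, and cron would mail the stderr line on failure. Because the file is opened for update, as in the original snippet, pointing the check at a copy of Data.fs is the cautious choice.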
3) Some way to find out what was added after a corrupt transaction is needed so that at least I could see who I need to ask to re-add their stuff.
The fsrecover script should avoid the need for this.
Agreed.
Jim
Richard