I have several fairly large sites running on my machine, and at the last backup my Data.fs file stood at 340MB. Zope itself only took up 50MB of RAM, so I didn't worry about the size of the file. Disk space is essentially infinite nowadays, right? Yes, but imagine my shock when I looked into why my backup routine took so long and then failed: Data.fs now took up 33GB of space on the disk! Yikes!

I must step back for a moment and salute everyone who made this possible: everyone at Zope, Linus, Guido, Maxtor, AMD, Intel, and the manufacturers of all the little components who have conspired to make a sticky mess like this possible. My Data.fs file grew to 33GB and I DIDN'T NOTICE.

Looking over the Undo lists, I found hundreds of transactions that said simply "by Zope". Some experimentation led me to discover that these were inserted by JRedirector, which summarizes redirections; my assumption that it did so under the Undo radar proved incorrect. I've since disabled JRedirector.
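If you want to eyeball this from outside Zope, something like the following should do it. This is a sketch, not what I actually ran: it assumes FileStorage.undoLog() takes (first, last) indices and returns dicts with 'user_name' and 'description' keys, which is how I remember the API, and the path is made up.

# Count how many recent undoable transactions say "by Zope".
# Sketch: undoLog() arguments and dict keys assumed as described
# above; the path is illustrative.  Note that opening a
# FileStorage scans the file to build its index, so this is
# slow on a 33GB Data.fs.
from ZODB.FileStorage import FileStorage

fs = FileStorage('/var/zope/var/Data.fs', read_only=1)
entries = fs.undoLog(0, 1000)     # most recent 1000 transactions
noise = [e for e in entries if e.get('description') == 'by Zope']
print '%d of %d transactions say "by Zope"' % (len(noise), len(entries))
fs.close()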
Running df, I found that I'd filled 74% of the 55GB data partition, so there was no chance of making a copy locally and figuring it out from there. Serendipitously, I did have a 60GB drive literally sitting on the shelf. I popped it into my Windows PC (I'd been able to get fsrecover.py to run there, but not on the Linux box) and started copying the file over the network, which took a couple of hours. Then I tried to run fsrecover on the big file, and it failed immediately:

python.exe fsrecover.py -v 2 -P 184000 j:\Data.fs j:\Data_recovered_all.fs
Recovering j:\Data.fs into j:\Data_recovered_all.fs
Traceback (most recent call last):
  File "fsrecover.py", line 328, in ?
    if __name__=='__main__': recover()
  File "fsrecover.py", line 221, in recover
    seek(0,2)
IOError: [Errno 22] Invalid argument

I guess there are indeed some limits; that failing seek(0,2) smells like a large-file limit in that Python build. So I wrote a quick program to make a copy of the first 1,000,000,000 bytes of the file.
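The quick program was nothing fancy; roughly this (a sketch, with an assumed output filename and an arbitrary chunk size):

# Copy the first 1,000,000,000 bytes of Data.fs so fsrecover has
# something below whatever size limit I'm hitting.  Plain chunked
# I/O; the output name and chunk size are arbitrary.
CHUNK = 8 * 1024 * 1024           # 8MB per read
LIMIT = 1000000000L               # first billion bytes

src = open('j:\\Data.fs', 'rb')
dst = open('j:\\Data_head.fs', 'wb')
copied = 0L
while copied < LIMIT:
    data = src.read(min(CHUNK, LIMIT - copied))
    if not data:
        break                     # file ended early
    dst.write(data)
    copied = copied + len(data)
dst.close()
src.close()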
I ran fsrecover on that copy without problems. This tells me that the file itself is OK; it just needs to be packed or filtered. I've got thousands of good transactions interspersed with tens of thousands of noise transactions.

Questions:

1) Can I safely pack a database this huge on my live machine?

2) If that doesn't work, can I easily figure out transaction boundaries? If so, I could either chop the big file up into smaller chunks, or serially attach new tails to a file and then fsrecover the whole thing.

3) If I can figure out transaction boundaries, can I filter out transactions by type? I've got approximately 32GB of JRedirector transactions that I'd be happy to remove.

As always, thanks for listening, and I eagerly await your wisdom.

Howard Hansen
http://howardsmusings.com
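P.S. For 2) and 3), here's the shape of what I'm imagining, gleaned from reading fsrecover.py. A sketch only, untested on the 33GB monster: it assumes the FileStorage layout as I understand it (4-byte 'FS21' magic, then per transaction an 8-byte tid, an 8-byte record length, a status byte, and 2-byte user/description/extension lengths, with a redundant 8-byte length trailing each transaction), and it assumes the noise transactions really do carry "by Zope" as their description.

import struct, sys

# Walk transaction boundaries in a FileStorage file and tally the
# "by Zope" noise.  Layout assumed as described above; cribbed
# from reading fsrecover.py, not tested on the real file.
TRANS_HDR = '>8s8scHHH'           # tid, length, status, ul, dl, el
TRANS_HDR_LEN = 23

f = open('j:\\Data.fs', 'rb')
if f.read(4) != 'FS21':
    print 'does not look like a FileStorage file'
    sys.exit(1)

pos = 4L
noise = 0L
while 1:
    f.seek(pos)
    h = f.read(TRANS_HDR_LEN)
    if len(h) < TRANS_HDR_LEN:
        break                     # ran off the end: done
    tid, stl, status, ul, dl, el = struct.unpack(TRANS_HDR, h)
    tl = struct.unpack('>Q', stl)[0]   # length, sans the trailing 8 bytes
    user = f.read(ul)
    desc = f.read(dl)
    print '%12d %s %10d %r' % (pos, status, tl, desc)
    if desc == 'by Zope':         # adjust to whatever the noise says
        noise = noise + tl + 8
    pos = pos + tl + 8            # skip data records + redundant length
f.close()
print 'total noise: %d bytes' % noise

Finding the boundaries looks like the easy part, though. From what I can tell, each data record carries absolute file offsets (a back-pointer to the previous record for the same object, plus the offset of its transaction header), so a filter couldn't just splice byte ranges out; it would have to rewrite the records and fix up those offsets the way fsrecover does when it copies. Hence the questions.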