RE: [Zope] Data.fs falls into 33GB hole
Hmm... I heard somewhere that packing a filestorage used memory proportional to the storage size. I would consider posting to the ZODB-dev list, perhaps for some more pointers on how to get out of this?

Sean

-----Original Message-----
From: Tino Wildenhain [mailto:tino@wildenhain.de]
Sent: Thursday, November 14, 2002 5:15 PM
To: Howard Hansen; zope@zope.org
Subject: Re: [Zope] Data.fs falls into 33GB hole

Hi Howard,

why fsrecover? Your Data.fs is big, but it should not be corrupt. I would even suspect that if you simply pack it, it will drop to around 360-400MB again. Zope packs by renaming the file and then copying the most recent objects over, so you only need enough space for the current Data.fs plus the packed Data.fs. Your last backup was at 360MB, so the packed version should be at about that level. And since you have a backup on your Windows box, you don't have to worry, I'd say.

HTH
Tino Wildenhain
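For reference, "simply pack it" can also be done from a standalone script against the ZODB API. A minimal sketch, assuming a reasonably current ZODB (import paths vary across versions, and the 'Data.fs' path is illustrative):

    import time
    from ZODB.DB import DB
    from ZODB.FileStorage import FileStorage

    # Open the storage and pack away all object revisions that are
    # only reachable through undo history older than one day.
    storage = FileStorage('Data.fs')
    db = DB(storage)
    db.pack(time.time() - 86400)
    db.close()

On a live Zope instance you would normally trigger the pack from the Control Panel instead of opening the storage from a second process.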
--On Thursday, 14 November 2002 13:36 -0800 Howard Hansen <howardahansen@yahoo.com> wrote:

I have several fairly large sites running on my machine, and at the last backup my Data.fs file stood at 340MB. Zope itself only took up 50MB of RAM, so I didn't worry about the size of the file. Disk space is essentially infinite nowadays, right?
Yes, but imagine my shock when I looked into why my backup routine took so long and then failed: I found that Data.fs took up 33GB of space on the disk! Yikes!
I must step back for a moment and salute everyone who made this possible. Everyone at Zope, Linus, Guido, Maxtor, AMD, Intel, and the manufacturers of all of the little components who have conspired to make a sticky mess like this possible.
My Data.fs file grew to 33GB and I DIDN'T NOTICE.
Looking over the Undo lists, I found hundreds of transactions that said simply "by Zope". Some experimentation revealed that these were inserted by JRedirector: it logs redirections, and my assumption that it did so under the Undo radar proved incorrect. I've since disabled JRedirector.
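The same undo information can be read from a script rather than the management screens. A minimal sketch, assuming a reasonably current ZODB and a read-only open of the storage (the path is illustrative):

    import time
    from ZODB.DB import DB
    from ZODB.FileStorage import FileStorage

    # undoLog entries are dicts with 'time', 'user_name' and 'description',
    # which is enough to spot a flood of "by Zope" noise transactions.
    db = DB(FileStorage('Data.fs', read_only=True))
    for entry in db.undoLog(0, 50):  # the 50 most recent transactions
        print(time.ctime(entry['time']), entry['user_name'], entry['description'])
    db.close()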
Running df on my disk, I find that I've filled 74% of the 55GB data partition, so there's no chance of making a copy locally and figuring it out from there. Serendipitously, I had a 60GB drive literally sitting on the shelf. I popped it into my Windows PC (I'd been able to get fsrecover.py to run there, but not on the Linux box) and started copying over the network, which took a couple of hours. I then tried to run fsrecover on the big file, and it failed immediately:
python.exe fsrecover.py -v 2 -P 184000 j:\Data.fs j:\Data_recovered_all.fs
Recovering j:\Data.fs into j:\Data_recovered_all.fs
Traceback (most recent call last):
  File "fsrecover.py", line 328, in ?
    if __name__=='__main__': recover()
  File "fsrecover.py", line 221, in recover
    seek(0,2)
IOError: [Errno 22] Invalid argument
I guess there are indeed some limits (a seek to end-of-file failing with EINVAL smells like a large-file limit somewhere in that Python/Windows combination). I wrote a quick program to make a copy of the first 1,000,000,000 bytes of the file, and fsrecover ran on that without problems. This tells me that the file itself is OK; it just needs to be packed or filtered.
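That quick program is easy to reconstruct. A sketch of a chunked head-copy (names are illustrative, not the original script):

    # Copy the first billion bytes of a huge file in 64KB chunks,
    # never holding more than one chunk in memory.
    def copy_head(src_path, dst_path, limit=1_000_000_000, chunk=64 * 1024):
        remaining = limit
        with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
            while remaining > 0:
                data = src.read(min(chunk, remaining))
                if not data:
                    break  # source file shorter than the limit
                dst.write(data)
                remaining -= len(data)

    copy_head('Data.fs', 'Data_head.fs')

A head cut this way almost certainly ends mid-transaction, which is presumably why fsrecover was the right tool for it: it copies complete transactions and discards a damaged or truncated tail.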
I've got thousands of good transactions interspersed with tens of thousands of noise transactions.
Questions:
1) Can I safely pack a database this huge on my live machine?

2) If that doesn't work, can I easily figure out transaction boundaries? If so, I could chop the big file up into smaller chunks, serially attach new tails to the file, and then fsrecover the whole thing.

3) If I can figure out transaction boundaries, can I filter out transactions by type? I've got approximately 32GB of JRedirector transactions that I'd be happy to remove.
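On questions 2 and 3, the FileStorage transaction layout is simple enough to walk directly: a 4-byte 'FS21' magic, then per transaction an 8-byte id, an 8-byte stored length (the next record starts stored length + 8 bytes past the current one), a status byte, and 2-byte lengths for the user, description and extension strings. A sketch under that assumption; the exact description bytes JRedirector writes are also an assumption:

    import struct

    def iter_txn_headers(path):
        """Yield (offset, tid, status, user, description) per transaction."""
        with open(path, 'rb') as f:
            if f.read(4) != b'FS21':
                raise ValueError('not a ZODB FileStorage file')
            pos = 4
            while True:
                f.seek(pos)
                header = f.read(23)
                if len(header) < 23:
                    break  # end of file, possibly a truncated tail
                tid, tlen, status, ulen, dlen, elen = struct.unpack(
                    '>8sQcHHH', header)
                if tlen < 23:
                    break  # implausible length; stop rather than loop
                user = f.read(ulen)
                desc = f.read(dlen)
                yield pos, tid, status, user, desc
                pos += tlen + 8  # stored length is the record length minus 8

    # Example: collect the offsets of suspected JRedirector noise.
    # noise = [pos for pos, tid, status, user, desc
    #          in iter_txn_headers('Data.fs') if desc == b'by Zope']

Knowing the byte offsets, a filtered copy could write only the wanted transaction ranges to a new file and let fsrecover validate the result.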
As always, thanks for listening and I eagerly await your wisdom.
Howard Hansen http://howardsmusings.com
Well, I've got 128GB of RAM on the server, so it wouldn't matter.

Joking!

It worked just fine. I watched the start of the process in top. It ate less CPU than I expected and didn't appreciably affect RAM usage. So, from empirical evidence, I'd say that if packing uses RAM in proportion to the db size, it's only a few KB per GB. YMMV, though: my Data.fs was filled with an N-squared bunch of transactions, each superseding its predecessor, so it had a very low signal-to-noise ratio. I don't know how it would have worked if it were clearing 350MB of live transactions out of a 33GB database.

Howard Hansen
http://howardsmusings.com

--- sean.upton@uniontrib.com wrote:
Hmm... I heard somewhere that packing a filestorage used memory proportional to the storage size. I would consider posting to the ZODB-dev list, perhaps for some more pointers on how to get out of this?
Sean
Hi,

if I get the theory right, it worked as expected: even the memory usage cannot be much more than the resulting Data.fs, which we expected to be no more than about 360MB. That is, during the packing process ZODB goes from the top to the bottom of the Data.fs, locating each object and keeping it, say, in a list. Every further occurrence of the object lets ZODB skip over it, because the most recent version of the object must have been on top and is already in the list.

HTH
Tino Wildenhain
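Read that way (newest records first, keeping the first occurrence of each object id), the bookkeeping is just a set of seen ids, which would explain why memory tracks the number of distinct objects rather than the file size. A sketch of that idea only, not of ZODB's actual pack code:

    def keep_current_revisions(records_newest_first):
        """Yield only the most recent revision of each object.

        records_newest_first yields (oid, data) pairs, newest first.
        """
        seen = set()
        for oid, data in records_newest_first:
            if oid in seen:
                continue  # an older revision of an object already kept
            seen.add(oid)
            yield oid, data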
On Friday 15 November 2002 3:07 am, sean.upton@uniontrib.com wrote:
Hmm... I heard somewhere that packing a filestorage used memory proportional to the storage size.
Memory usage is roughly proportional to the number of objects in the filestorage, in normal operation as well as during packing. It sounds like this storage contains only a few distinct objects, but many, many revisions.
participants (4):
- Howard Hansen
- sean.upton@uniontrib.com
- Tino Wildenhain
- Toby Dickenson