[ZODB-Dev] Re: Database Corruption.

Ramon Aseniero ramon.aseniero at tryarc.com
Thu Jul 1 15:25:45 EDT 2004


Hi Tim,

We get this corruption 2 or 3 times a month; it has been going on since
we launched the site last May. I am in the process of getting the
Data.fs, index and temp files to you.

fsrecover.py throws an error when I run it against the corrupted
Data.fs, so I've resorted to removing the last transaction by truncating
the database. The problem then seems to go away for about another 2 weeks.
 
Yes, that has been the most common symptom so far.

Yes, I am running other Zope products; see below. I've also installed PIL
(Python Imaging Library).

--------------------zope products----------------
CMFCalendar (Installed product CMFCalendar (CMF-1.4.2))     2004-01-25 14:33
CMFCore (Installed product CMFCore (CMF-1.4.2))     2004-01-25 14:33
CMFDefault (Installed product CMFDefault (CMF-1.4.2))     2004-01-25 14:33
CMFTopic (Installed product CMFTopic (CMF-1.4.2))     2004-01-25 14:33
DCWorkflow (Installed product DCWorkflow (CMF-1.4.2))     2004-01-25 14:33
ExternalMethod (Installed product ExternalMethod (External Method-1-0-0))     2004-01-23 17:09
Hotfix_2002-06-14 (Installed product Hotfix_2002-06-14)     2004-05-13 14:13
LeakFinder (Installed product LeakFinder (LeakFinder-0.1.1))     2004-05-24 03:29
Localizer (Installed product Localizer (Localizer 1.0.1))     2004-02-02 23:49
MIMETools (Installed product MIMETools)     2004-01-23 21:32
MailHost (Installed product MailHost (MailHost-1-3-0))     2004-01-23 17:09
MyScriptModules     2004-03-11 11:41
OFSP (Installed product OFSP (OFSP-1-0-0))     2004-01-23 17:09
PageTemplates (Installed product PageTemplates (PageTemplates-1-4-0))     2004-01-23 17:09
Photo (Installed product Photo (Photo 1.2.3))     2004-04-17 18:17
PluginIndexes (Installed product PluginIndexes)     2004-01-23 21:32
PythonScripts (Installed product PythonScripts (PythonScripts-2-0-0))     2004-01-23 17:09
Sessions (Installed product Sessions)     2004-01-23 17:09
SiteAccess (Installed product SiteAccess (SiteAccess-2-0-0))     2004-01-23 17:09
SiteErrorLog (Installed product SiteErrorLog)     2004-01-23 17:09
StandardCacheManagers (Installed product StandardCacheManagers (StandardCacheManagers-1-1-0))     2004-01-23 17:09
TemporaryFolder (Installed product TemporaryFolder)     2004-01-23 17:09
Transience (Installed product Transience)     2004-01-23 17:09
TranslationService (Installed product TranslationService (0.4.0-1))     2004-02-02 23:48
ZCTextIndex (Installed product ZCTextIndex)     2004-01-23 17:09
ZCatalog (Installed product ZCatalog (ZCatalog-2-2-0))     2004-01-23 17:09
ZGadflyDA (Installed product ZGadflyDA)     2004-01-23 21:32
ZMySQLDA (Installed product ZMySQLDA (ZMySQLDA 2.0.8))     2004-02-07 17:02
ZSQLMethods (Installed product ZSQLMethods)     2004-01-23 17:09
ZopeTutorial (Installed product ZopeTutorial (Zope Tutorial 1.0))
-------------------------------------------------

Thanks,
Ramon

-----Original Message-----
From: Tim Peters [mailto:tim at zope.com] 
Sent: Thursday, July 01, 2004 12:05 PM
To: 'Ramon Aseniero'
Cc: zodb-dev at zope.org; jim at zope.com
Subject: RE: [ZODB-Dev] Re: Database Corruption.

[Ramon Aseniero]
> No I don't truncate the data.fs at a random time and size.

Yes, I was kidding about that.

> The only time I truncate it is when the database gets corrupted and the
> site crashes,

How frequently does that occur?  How long has it been going on?  Some time
with Google suggests you've been seeing it for at least 5 weeks, but haven't
been forthcoming about details that might help to solve it.

> but I followed these instructions
> http://www.zope.org/Members/itamar/CorruptedZODB for truncating the
> data.fs

Meaning you run fsrecover.py?  Or meaning something other/more than that?
Do the problems go away then for some time?  If so, for how long?

> Below are more log messages from event.log
...
> CorruptedDataError:
>
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@
...
> CorruptedDataError:
>
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@
...
> CorruptedDataError:
>
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@

Has that been the most common symptom of corruption in your experience?

That output *suggests* you have unexpected blocks of NUL (0) bytes inside
your Data.fs file.  As Jeremy said, we have seen reports of that before, but
rarely, and they were never traced to a ZODB bug.  Some causes that are
known are outside of ZODB's control, and are sketched in this new message
thread:

    http://mail.zope.org/pipermail/zodb-dev/2004-July/007575.html

So please read that.
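
One way to check whether a copy of Data.fs really does contain large
blocks of NUL bytes is to scan it directly.  The following is only a
minimal sketch and is not part of ZODB: the function name find_nul_runs
and its min_run and chunk_size parameters are made up for illustration,
and it assumes Python 3.  A FileStorage legitimately contains short runs
of zero bytes, so only long runs are reported; the 1024-byte threshold
is an arbitrary assumption.

```python
def find_nul_runs(path, min_run=1024, chunk_size=1 << 20):
    """Return a list of (offset, length) for runs of NUL bytes in the
    file at `path` that are at least `min_run` bytes long.

    Hypothetical helper for illustration; not a ZODB API.
    """
    runs = []
    run_start = None   # file offset where the current NUL run began
    offset = 0         # file offset of the start of the current chunk

    def close_run(end):
        # Record the current run if it is long enough.
        if end - run_start >= min_run:
            runs.append((run_start, end - run_start))

    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            i, n = 0, len(chunk)
            while i < n:
                if chunk[i] == 0:
                    if run_start is None:
                        run_start = offset + i
                    # Jump to the first non-NUL byte (or end of chunk).
                    i = n - len(chunk[i:].lstrip(b"\0"))
                else:
                    if run_start is not None:
                        close_run(offset + i)
                        run_start = None
                    # Jump to the next NUL byte (or end of chunk).
                    nxt = chunk.find(b"\0", i)
                    i = n if nxt == -1 else nxt
            offset += n

    # A run may extend to the very end of the file.
    if run_start is not None:
        close_run(offset)
    return runs
```

Calling find_nul_runs("Data.fs") on a copy of the file (never the live
one) returns the offsets and lengths of any long NUL runs; runs that
span a chunk boundary are counted as one.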

Since the most likely cause of parts of a FileStorage getting overwritten by
NUL bytes is currently believed to be HW or system SW problems (see the
thread referenced above), would it be possible for you to *try* running your
site on a different physical machine?

Also, as Jim asked before, can we get a copy of your .fs and .index files?

Finally, are you running any Zope products or Python extension modules that
aren't included with the core Zope distribution?  If so, what are they?






