Thanks Tim and Chris! - You were both dead on in so many respects.


Chris, There <italic>was</italic> a correlation with flip in the logs


Tim: looks like once a None occurs it sticks.


Source of the problem was an old cache file (1-None-1) that could not
be overwritten by the current uid running Zope. (That's the second
stupid permissions problem in as many days. My advice - if you have
any kind of cache problems occurring after an initial period, look
hard at permissions. Secondly - don't be sloppy and run zope first as
one user then as another. Thirdly grep the event logs for IOError
first. That's the easiest thing to fix.) 


more comments below...


On 23 Jul 2004, at 02:14, Tim Peters wrote:


<excerpt>[Chris McDonough]

<excerpt>...

I think the problem is related to ZEO client storage "cache flips".

</excerpt>

..

anyway, pointing to None.  But this code gives me a headache, and I'm

not sure that can actually happen (despite that I hear you guys saying

it is <<wink>).

</excerpt>I was slow interpreting all of this due to separate event
files.

This piece of bash script was helpful to get a grip of time order
between Clients:

    

     grep ZEC ev*.log | sed 's/:/ /' | sort -k 2 >
interleaved.ZEC.events.log


a multi-line merge script would be helpful. I'll bet someone's created
one.


<excerpt>

<excerpt>But then I turned off persistent ZEO client cachefile storage
(but omitting

the "zeo-client-name" parameter from zope.conf), believing this would
be a

workaround, but it hasn't been.  I gave up at that point and that's
where I am now.

</excerpt>

Did you continue to get errors in the log near cache-flip times?  I

don't see a way for checkSize() to screw up unless an unexpected

exception is raised.

</excerpt>Bingo!

2004-07-21T11:35:07 INFO(0) ZEC:1-None-1 flipping cache files.  new
current = 1

------

2004-07-21T11:35:07 ERROR(200) ZODB Couldn't load state for
0000000000012323

Traceback (most recent call last):

  File "/usr/local/zope/zope271/lib/python/ZODB/Connection.py", line
559, in setstate

    p, serial = self._storage.load(oid, self._version)

  File "/usr/local/zope/zope271/lib/python/ZEO/ClientStorage.py", line
754, in load

    self._cache.checkSize(0)

  File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line
581, in checkSize

    self._f[current] = open(self._p[current],'w+b')

IOError: [Errno 13] Permission denied: '1-None-1'


<excerpt>

<excerpt>...

My theory is that it will happen as often as a Zope client's ZEO
client storage

needs to flip its cache file.  The cache file is only flipped when it
exceeds a

certain size and it only exceeds a certain size after a certain
pattern of usage

causes it to do so (lots of loads from the database of new items,
typically).

</excerpt></excerpt>Yup!


<excerpt>

It appears that once self._f[self._current] is None, all future

attempts by ZEO to store into its client cache will fail the same way.

 So I'd be even more surprised if you saw just one of these occur.

</excerpt>Yup!


<excerpt>

<excerpt>It would be nice if you could confirm this.  Reading the Zope
event log

file of the client that generated the error would be a good start.

</excerpt>

The log is everything here.  The ZEO client cache logs most relevant

messages at info level, producing msgs starting with "ZEC".


</excerpt>

Thanks again for the insights.


--r.

—————————————————————

Russ Ferriday<italic>

</italic><bold>Solution Workshops </bold>for Plone<bold> </bold>

(+44) (0) 7789 338868

http://www.solutionworkshops.com

