ERROR(200) ZODB Couldn't load state for... Memory problem?
Hi all, On OSX 10.3 (Pahter), Python 2.3.4, Zope 2.7.1, I'm frequently seeing this.. 2004-07-22T14:00:24 ERROR(200) ZODB Couldn't load state for 00000000000016ad Traceback (most recent call last): File "/usr/local/zope/zope271/lib/python/ZODB/Connection.py", line 559, in setstate p, serial = self._storage.load(oid, self._version) File "/usr/local/zope/zope271/lib/python/ZEO/ClientStorage.py", line 755, in load self._cache.store(oid, p, s, v, pv, sv) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line 601, in store self._store(oid, p, s, version, pv, sv) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line 631, in _store f.seek(self._pos) AttributeError: 'NoneType' object has no attribute 'seek' It's intermittent at first, then back-to-back. Following thread leads me to look for memory shortage. But I'd expect a different message in that case. http://www.mail-archive.com/zope-dev@zope.org/msg15039.html This thread seems to suggest such messages (in different context) can be ignored. But I can't do that. This is a symptom of a bigger problem. http://mail.zope.org/pipermail/zope/2001-August/097300.html Has anyone seen this in these circumstances? --r. ————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
I have seen this as well but I haven't been able to pin it down. On Thu, 2004-07-22 at 16:16, Russ Ferriday wrote:
Hi all,
On OSX 10.3 (Pahter), Python 2.3.4, Zope 2.7.1, I'm frequently seeingthis.. 2004-07-22T14:00:24 ERROR(200) ZODB Couldn't load state for00000000000016ad Traceback (most recent call last): File "/usr/local/zope/zope271/lib/python/ZODB/Connection.py", line559, in setstate p, serial = self._storage.load(oid, self._version) File "/usr/local/zope/zope271/lib/python/ZEO/ClientStorage.py", line755, in load self._cache.store(oid, p, s, v, pv, sv) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line601, in store self._store(oid, p, s, version, pv, sv) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line631, in _store f.seek(self._pos) AttributeError: 'NoneType' object has no attribute 'seek'
It's intermittent at first, then back-to-back.
Following thread leads me to look for memory shortage. But I'd expecta different message in that case. http://www.mail-archive.com/zope-dev@zope.org/msg15039.html
This thread seems to suggest such messages (in different context) canbe ignored. But I can't do that. This is a symptom of a bigger problem. http://mail.zope.org/pipermail/zope/2001-August/097300.html
Has anyone seen this in these circumstances? --r.
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
______________________________________________________________________ _______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Thanks, Chris. We're fronting two Zopes with Pound for load balancing. I can't see there would be any connection with this error. But I mention it in case there's a pattern. Also, the database came from 2.7.0 and an earlier python. This looks like resource starvation to me. The servers run for an hour or two this problems shows up, becoming worse for a while, then we need a restart. I'm try to debug it tomorrow. What would you look for? Any tips for resource monitoring? --r. On 22 Jul 2004, at 21:49, Chris McDonough wrote:
I have seen this as well but I haven't been able to pin it down.
On Thu, 2004-07-22 at 16:16, Russ Ferriday wrote:
Hi all,
On OSX 10.3 (Pahter), Python 2.3.4, Zope 2.7.1, I'm frequently seeingthis.. 2004-07-22T14:00:24 ERROR(200) ZODB Couldn't load state for00000000000016ad Traceback (most recent call last): File "/usr/local/zope/zope271/lib/python/ZODB/Connection.py", line559, in setstate p, serial = self._storage.load(oid, self._version) File "/usr/local/zope/zope271/lib/python/ZEO/ClientStorage.py", line755, in load self._cache.store(oid, p, s, v, pv, sv) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line601, in store self._store(oid, p, s, version, pv, sv) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line631, in _store f.seek(self._pos) AttributeError: 'NoneType' object has no attribute 'seek'
It's intermittent at first, then back-to-back.
Following thread leads me to look for memory shortage. But I'd expecta different message in that case. http://www.mail-archive.com/zope-dev@zope.org/msg15039.html
This thread seems to suggest such messages (in different context) canbe ignored. But I can't do that. This is a symptom of a bigger problem. http://mail.zope.org/pipermail/zope/2001-August/097300.html
Has anyone seen this in these circumstances? --r.
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
______________________________________________________________________ _______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
On Thu, 2004-07-22 at 18:51, Russ Ferriday wrote:
Thanks, Chris. We're fronting two Zopes with Pound for load balancing. I can't seethere would be any connection with this error. But I mention it incase there's a pattern.
Well, I am too in this particular case, but I doubt the frontend matters too much here.
Also, the database came from 2.7.0 and an earlier python.
FTR that's also the case for me as well, but again I don't think that matters here.
This lookslike resource starvation to me. The servers run for an hour or twothis problems shows up, becoming worse for a while, then we need arestart.
I think the problem is related to ZEO client storage "cache flips". I found an error in the log near the time of the "None has no attribute seek" symptom indicating that the Zope process tried to "flip" a ZEO cache file (by creating a new file) but UNIX file system permissions apparently prevented it. But then I turned off persistent ZEO client cachefile storage (but omitting the "zeo-client-name" parameter from zope.conf), believing this would be a workaround, but it hasn't been. I gave up at that point and that's where I am now.
I'm try to debug it tomorrow. What would you look for? Any tips forresource monitoring?
I think it's resource-related only tangentially; it's a genuine bug that only happens intermittently. My theory is that it will happen as often as a Zope client's ZEO client storage needs to flip its cache file. The cache file is only flipped when it exceeds a certain size and it only exceeds a certain size after a certain pattern of usage causes it to do so (lots of loads from the database of new items, typically). It would be nice if you could confirm this. Reading the Zope event log file of the client that generated the error would be a good start. - C
[Chris McDonough]
... I think the problem is related to ZEO client storage "cache flips".
Me too. The 3.2 ZEO cache alternates between two cache files, in the two-element list self._f. Both elements are initialized to None, and the index of the current file in use (0 or 1) is in self._current. There's an implicit assumption throughout the code that self._f[self._current] is always an actual file object, but the "flip logic" is excruciating and that global invariant certainly isn't self-evident.
I found an error in the log near the time of the "None has no attribute seek" symptom indicating that the Zope process tried to "flip" a ZEO cache file (by creating a new file) but UNIX file system permissions apparently prevented it.
This was a traceback ending somewhere in ClientCache.checkSize()? That's where a cache flip happens. It changes its idea of self._current (from 0 to 1 or from 1 to 0) *before* making sure there's an actual file object in "the other" self._f slot. So, e.g., if self._f started life as [good_file_object, None] and self._current started at 0, and it came time for a cache flip, and a new file object couldn't be created in self._f[1], self._current would end up as 1 anyway, pointing to None. But this code gives me a headache, and I'm not sure that can actually happen (despite that I hear you guys saying it is <wink>).
But then I turned off persistent ZEO client cachefile storage (but omitting the "zeo-client-name" parameter from zope.conf), believing this would be a workaround, but it hasn't been. I gave up at that point and that's where I am now.
Did you continue to get errors in the log near cache-flip times? I don't see a way for checkSize() to screw up unless an unexpected exception is raised.
... My theory is that it will happen as often as a Zope client's ZEO client storage needs to flip its cache file. The cache file is only flipped when it exceeds a certain size and it only exceeds a certain size after a certain pattern of usage causes it to do so (lots of loads from the database of new items, typically).
It appears that once self._f[self._current] is None, all future attempts by ZEO to store into its client cache will fail the same way. So I'd be even more surprised if you saw just one of these occur.
It would be nice if you could confirm this. Reading the Zope event log file of the client that generated the error would be a good start.
The log is everything here. The ZEO client cache logs most relevant messages at info level, producing msgs starting with "ZEC".
Thanks Tim and Chris! - You were both dead on in so many respects. Chris, There was a correlation with flip in the logs Tim: looks like once a None occurs it sticks. Source of the problem was an old cache file (1-None-1) that could not be overwritten by the current uid running Zope. (That's the second stupid permissions problem in as many days. My advice - if you have any kind of cache problems occurring after an initial period, look hard at permissions. Secondly - don't be sloppy and run zope first as one user then as another. Thirdly grep the event logs for IOError first. That's the easiest thing to fix.) more comments below... On 23 Jul 2004, at 02:14, Tim Peters wrote:
[Chris McDonough]
... I think the problem is related to ZEO client storage "cache flips".
.. anyway, pointing to None. But this code gives me a headache, and I'm not sure that can actually happen (despite that I hear you guys saying it is <wink>). I was slow interpreting all of this due to separate event files. This piece of bash script was helpful to get a grip of time order between Clients:
grep ZEC ev*.log | sed 's/:/ /' | sort -k 2 > interleaved.ZEC.events.log a multi-line merge script would be helpful. I'll bet someone's created one.
But then I turned off persistent ZEO client cachefile storage (but omitting the "zeo-client-name" parameter from zope.conf), believing this would be a workaround, but it hasn't been. I gave up at that point and that's where I am now.
Did you continue to get errors in the log near cache-flip times? I don't see a way for checkSize() to screw up unless an unexpected exception is raised.
Bingo! 2004-07-21T11:35:07 INFO(0) ZEC:1-None-1 flipping cache files. new current = 1 ------ 2004-07-21T11:35:07 ERROR(200) ZODB Couldn't load state for 0000000000012323 Traceback (most recent call last): File "/usr/local/zope/zope271/lib/python/ZODB/Connection.py", line 559, in setstate p, serial = self._storage.load(oid, self._version) File "/usr/local/zope/zope271/lib/python/ZEO/ClientStorage.py", line 754, in load self._cache.checkSize(0) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line 581, in checkSize self._f[current] = open(self._p[current],'w+b') IOError: [Errno 13] Permission denied: '1-None-1'
... My theory is that it will happen as often as a Zope client's ZEO client storage needs to flip its cache file. The cache file is only flipped when it exceeds a certain size and it only exceeds a certain size after a certain pattern of usage causes it to do so (lots of loads from the database of new items, typically).
Yup!
It appears that once self._f[self._current] is None, all future attempts by ZEO to store into its client cache will fail the same way. So I'd be even more surprised if you saw just one of these occur.
Yup!
It would be nice if you could confirm this. Reading the Zope event log file of the client that generated the error would be a good start.
The log is everything here. The ZEO client cache logs most relevant messages at info level, producing msgs starting with "ZEC".
Thanks again for the insights. --r. ————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
Thanks a lot for the confirmation! I haven't seen this problem crop up in several days myself, but that may just be due to the fact that the ZEO cache hasn't flipped. (Tim, yes, when it does happen, it only happens once; then the server is restarted by my customer to get the site going again, so it's a bit difficult to tell what's happening, although I am trying to "catch" it in real time). I'd like to chalk this up to improper permission settings; I just need to figure out how to do that in my case because I start Zope from an rc script as the "zope" user and my customer and myself only log in and manage Zope-related processes as the zope user. In fact, to avoid this sort of problem, I actually have a Zope startup regiment that prevents it from possibly even beginning to start as the root user or another user via a wrapper script. Unless someone is doing something behind my back, which is of course possible. Here's a demonstrative traceback: 2004-07-13T12:01:29 INFO(0) ZEC:1-None-1 flipping cache files. new current = 1 ------ 2004-07-13T12:01:29 PANIC(300) ZODB A storage error occurred in the last phase of a two-phase commit. This shouldn't happen. The application will not be allowed to commit until the site/storage is reset by a restart. Traceback (most recent call last): File "/home/zope/dhportal/opt/Zope-2.7.1/lib/python/ZODB/Transaction.py", line 372, in _finish_many jar.tpc_finish(self) File "/home/zope/dhportal/opt/Zope-2.7.1/lib/python/ZODB/Connection.py", line 763, in tpc_finish self._storage.tpc_finish(transaction, callback) File "/home/zope/dhportal/opt/Zope-2.7.1/lib/python/ZEO/ClientStorage.py", line 927, in tpc_finish self._update_cache() File "/home/zope/dhportal/opt/Zope-2.7.1/lib/python/ZEO/ClientStorage.py", line 953, in _update_cache self._cache.checkSize(self._tbuf.get_size()) File "/home/zope/dhportal/opt/Zope-2.7.1/lib/python/ZEO/ClientCache.py", line 581, in checkSize self._f[current] = open(self._p[current],'w+b') IOError: [Errno 13] Permission denied: '1-None-1' One minor thing I noticed: even when a ZEO client is started with settings that should prevent it from using persistent cache files (namely, failing to set a 'zeo-client-name' in zope.conf), the logfile still reports: 2004-07-23T04:02:18 INFO(0) ZEC:1-None-0 ClientCache: storage='1', size=100000000; file[0]='1-None-0' This message is a bit puzzling when you're running without persistent cache files; it makes you think that a file named "1-None-0" is literally trying to be written. This is not the case (it actually goes into a tempfile somewhere). I can imagine a case where some code... Aha. Wait, I think I see what the problem is. It's related to this. The ZEO ClientCache.checkSize code assumes that if there is a non-None value in self._f[current] that you're running with persistent cache files. However, this can also be the case if you're running without persistent cache files; the __init__ method of ClientCache does this in that circumstance: self._f = f = [tempfile.TemporaryFile(suffix='.zec'), None] # If they are temp files just use illustrative names so # that the log messages describe the cache. self._p = p = ["%s-%s-0" % (storage, client), "%s-%s-1" % (storage, client)] f[0].write(magic + '\0' * (headersize - len(magic))) current = 0 (and self._current is set to local current later) So the code in checkSize that tries this when a cache flip is necessary: current = not self._current ... if self._p[current] is not None: # Persistent cache file: remove the old file # before opening the new one, because the old file # may be owned by root (created before setuid()). self._f[current].close() try: os.remove(self._p[current]) except: pass self._f[current] = open(self._p[current],'w+b') ... will be likely to fail at the last line if you're using nonpersistent cache files, because self._p[current] is (bogus) '1-None-0' (relative bogus filename). There should probably be a _using_persistent_cache flag attr rather than trying to inspect self._p to find out if we're using persistent caches. I may try to work up a patch + test for this later. On Fri, 2004-07-23 at 10:44, Russ Ferriday wrote:
Thanks Tim and Chris! - You were both dead on in so many respects.
Chris, There was a correlation with flip in the logs
Tim: looks like once a None occurs it sticks.
Source of the problem was an old cache file (1-None-1) that could notbe overwritten by the current uid running Zope. (That's the secondstupid permissions problem in as many days. My advice - if you haveany kind of cache problems occurring after an initial period, lookhard at permissions. Secondly - don't be sloppy and run zope first asone user then as another. Thirdly grep the event logs for IOErrorfirst. That's the easiest thing to fix.)
more comments below...
On 23 Jul 2004, at 02:14, Tim Peters wrote:
[Chris McDonough] ... I think the problem is related to ZEO client storage "cache flips".
.. anyway, pointing to None. But this code gives me a headache, and I'm not sure that can actually happen (despite that I hear you guys saying it is <wink>). I was slow interpreting all of this due to separate eventfiles. This piece of bash script was helpful to get a grip of time orderbetween Clients:
grep ZEC ev*.log | sed 's/:/ /' | sort -k 2
interleaved.ZEC.events.log
a multi-line merge script would be helpful. I'll bet someone's createdone.
But then I turned off persistent ZEO client cachefile storage(but omitting the "zeo-client-name" parameter from zope.conf), believing this wouldbe a workaround, but it hasn't been. I gave up at that point and that'swhere I am now.
Did you continue to get errors in the log near cache-flip times? I don't see a way for checkSize() to screw up unless an unexpected exception is raised. Bingo! 2004-07-21T11:35:07 INFO(0) ZEC:1-None-1 flipping cache files. newcurrent = 1 ------ 2004-07-21T11:35:07 ERROR(200) ZODB Couldn't load state for0000000000012323 Traceback (most recent call last): File "/usr/local/zope/zope271/lib/python/ZODB/Connection.py", line559, in setstate p, serial = self._storage.load(oid, self._version) File "/usr/local/zope/zope271/lib/python/ZEO/ClientStorage.py", line754, in load self._cache.checkSize(0) File "/usr/local/zope/zope271/lib/python/ZEO/ClientCache.py", line581, in checkSize self._f[current] = open(self._p[current],'w+b') IOError: [Errno 13] Permission denied: '1-None-1'
... My theory is that it will happen as often as a Zope client's ZEOclient storage needs to flip its cache file. The cache file is only flipped when itexceeds a certain size and it only exceeds a certain size after a certainpattern of usage causes it to do so (lots of loads from the database of new items,typically). Yup!
It appears that once self._f[self._current] is None, all future attempts by ZEO to store into its client cache will fail the same way. So I'd be even more surprised if you saw just one of these occur. Yup!
It would be nice if you could confirm this. Reading the Zopeevent log file of the client that generated the error would be a good start.
The log is everything here. The ZEO client cache logs most relevant messages at info level, producing msgs starting with "ZEC".
Thanks again for the insights.
--r. ————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
On 23 Jul 2004, at 17:00, Chris McDonough wrote:
Thanks a lot for the confirmation!
A pint from me the next time I see you. And Tim! The next item is this message - I see many - and need to restart the instance that's spewing them: 2004-07-23T08:21:07 INFO(0) ZEC:1-None-1 load: bad record for oid 0000000000001c8c at position 2082264 in cac he file 1 This problem also occurs after a few hours. Is this caused because I'm on a MP box, with two clients, sharing the same directory for cache files?? if so, I can't see immediately how to put the files elsewhere, per Client. --r. ————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
Russ Ferriday wrote:
On 23 Jul 2004, at 17:00, Chris McDonough wrote:
Thanks a lot for the confirmation!
A pint from me the next time I see you. And Tim!
The next item is this message - I see many - and need to restart the instance that's spewing them: 2004-07-23T08:21:07 INFO(0) ZEC:1-None-1 load: bad record for oid 0000000000001c8c at position 2082264 in cac he file 1
This problem also occurs after a few hours.
Is this caused because I'm on a MP box, with two clients, sharing the same directory for cache files??
if so, I can't see immediately how to put the files elsewhere, per Client.
(Untested): # zope.conf.appserver1 %include zope.conf.common clienthome /path/to/zope/var/appserver1 # zope.conf.appserver2 %include zope.conf.common clienthome /path/to/zope/var/appserver2 Tres. -- =============================================================== Tres Seaver tseaver@zope.com Zope Corporation "Zope Dealers" http://www.zope.com
Thanks Tres, That looks to be about it. We'll limp through today, and try this on the weekend. This will cause a couple of other changes and I don't have the nerve for it at 7PM on a Friday night. I'll let you know, and write up these last few lessons on Zope.org. --r. On 23 Jul 2004, at 18:35, Tres Seaver wrote:
Russ Ferriday wrote:
On 23 Jul 2004, at 17:00, Chris McDonough wrote:
Thanks a lot for the confirmation! A pint from me the next time I see you. And Tim! The next item is this message - I see many - and need to restart the instance that's spewing them: 2004-07-23T08:21:07 INFO(0) ZEC:1-None-1 load: bad record for oid 0000000000001c8c at position 2082264 in cac he file 1 This problem also occurs after a few hours. Is this caused because I'm on a MP box, with two clients, sharing the same directory for cache files?? if so, I can't see immediately how to put the files elsewhere, per Client.
(Untested):
# zope.conf.appserver1 %include zope.conf.common clienthome /path/to/zope/var/appserver1
# zope.conf.appserver2 %include zope.conf.common clienthome /path/to/zope/var/appserver2
Tres. -- =============================================================== Tres Seaver tseaver@zope.com Zope Corporation "Zope Dealers" http://www.zope.com
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
On Fri, Jul 23, 2004 at 06:18:08PM +0100, Russ Ferriday wrote:
On 23 Jul 2004, at 17:00, Chris McDonough wrote:
Thanks a lot for the confirmation!
A pint from me the next time I see you. And Tim!
The next item is this message - I see many - and need to restart the instance that's spewing them: 2004-07-23T08:21:07 INFO(0) ZEC:1-None-1 load: bad record for oid 0000000000001c8c at position 2082264 in cac he file 1
This problem also occurs after a few hours.
Is this caused because I'm on a MP box, with two clients, sharing the same directory for cache files??
it shouldn't matter what directory as long as the filenames are unique...
if so, I can't see immediately how to put the files elsewhere, per Client.
You mean this, from zope.conf.in? # Directive: zeo-client-name # # Description: # If you want a persistent ZEO client cache which retains cache # contents across ClientStorage restarts, you need to define a # zeo-client-name. If you use ZEO and you don't set a # zeo-client-name, the client cache is stored in temporary files # which are removed when the ClientStorage shuts down. The value # of zeo-client-name is used to uniquely identify the local cache # files created if this Zope is a ZEO client. # # Default: unset # # Example: # # zeo-client-name zeo1 -- Paul Winkler http://www.slinkp.com
No, I think Tres nailed it, in his reply. I'll confirm when I know more. Thanks anyway. --r. On 23 Jul 2004, at 18:38, Paul Winkler wrote:
it shouldn't matter what directory as long as the filenames are unique...
if so, I can't see immediately how to put the files elsewhere, per Client.
You mean this, from zope.conf.in?
# Directive: zeo-client-name
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
On Fri, 2004-07-23 at 13:18, Russ Ferriday wrote:
On 23 Jul 2004, at 17:00, Chris McDonough wrote:
Thanks a lot for the confirmation!
A pint from me the next time I see you. And Tim!
The next item is this message - I see many - and need to restart theinstance that's spewing them: 2004-07-23T08:21:07 INFO(0) ZEC:1-None-1 load: bad record for oid0000000000001c8c at position 2082264 in cac he file 1
This problem also occurs after a few hours.
Is this caused because I'm on a MP box, with two clients, sharing thesame directory for cache files??
No, I think it's a real bug in ZEO, introduced by this checkin: <http://cvs.zope.org/Zope/lib/python/ZEO/Attic/ClientCache.py.diff?r1=1.47.4.1&r2=1.47.4.2&only_with_tag=Zope-2_7-branch> ... but to my surprise, it's also fixed by the next checkin! which appears to have been included in Zope 2.7.2: <http://cvs.zope.org/Zope/lib/python/ZEO/Attic/ClientCache.py.diff?r1=1.47.4.2&r2=1.47.4.3&only_with_tag=Zope-2_7-branch> So the easiest "fix" is probably to upgrade to Zope 2.7.2. There really wouldn't have been any way to know this unless you had your eyes peeled on the checkins list because the CHANGES.txt just says: - ZEO/ClientStorage: fixed check for temporary cache files (patch by Dieter Maurer) ... which is uselessly vague. HTH, - C
if so, I can't see immediately how to put the files elsewhere, perClient.
--r.
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
[Chris McDonough]
... self._f[current] = open(self._p[current],'w+b')
.... will be likely to fail at the last line if you're using nonpersistent cache files, because self._p[current] is (bogus) '1-None-0' (relative bogus filename).
Is it really *likely* to fail? It's just a name, and it's opened in 'w' ('+b') mode, not 'r' mode. That is, it creates the file -- no file of that name need already exist (and if one does, it tries to overrwrite it). Running on Windows most days, I'm not usually aware of all the permission bugs Linuxheads delight in torturing themselves with <wink>.
There should probably be a _using_persistent_cache flag attr rather than trying to inspect self._p to find out if we're using persistent caches.
+1. As you later discovered, this "hmm, let's try to guess what we're doing based on obscure droppings" business is a continuing bug factory.
I may try to work up a patch + test for this later.
I'm neutral on whether you try, but +1 on you actually doing it <wink>.
On Fri, 2004-07-23 at 16:21, Tim Peters wrote:
[Chris McDonough]
... self._f[current] = open(self._p[current],'w+b')
.... will be likely to fail at the last line if you're using nonpersistent cache files, because self._p[current] is (bogus) '1-None-0' (relative bogus filename).
Is it really *likely* to fail?
I suppose it depends on the working directory of the shell/process used to start Zope. Zope doesn't mess with the working directory on its own, AFAIK. If you follow Richard Stevens' ("UNIX Network Programming" guy, apparently now dead) advice, he says that "well-behaved" daemon processes should change their working directory to "/". So I suspect there are daemonizers that do this. Guido's zdrun daemon (which "zopectl" uses) gives you an option to set the working directory of the daemonized process, but I don't use it (neither zdrun nor the option, that is). It does nothing to the working directory by default. But I think the common case is that the program is run out of an /etc/init.d "rc" script, and I suspect the working directory is "/" when Zope gets started in that circumstance. Which I guess makes the error understandable.
It's just a name, and it's opened in 'w' ('+b') mode, not 'r' mode. That is, it creates the file -- no file of that name need already exist (and if one does, it tries to overrwrite it). Running on Windows most days, I'm not usually aware of all the permission bugs Linuxheads delight in torturing themselves with <wink>.
Yes. Gotta agree with you there. I don't think a day passes where I don't want to rip the face off the guy who proclaimed that TCP ports below 1024 couldn't be bound to by a user other than root. What a disaster.
There should probably be a _using_persistent_cache flag attr rather than trying to inspect self._p to find out if we're using persistent caches.
+1. As you later discovered, this "hmm, let's try to guess what we're doing based on obscure droppings" business is a continuing bug factory.
Thankfully, Dieter fixed it so it doesn't (at least in this one case).
I may try to work up a patch + test for this later.
I'm neutral on whether you try, but +1 on you actually doing it <wink>.
Too late! It's already fixed. I didn't know either. ;-) This thread was full of sound and fury.... - C
Well, I can understand Tim's comment about droppings. This configuration has made a mockery of my weekend. >:-( Looking on the bright side - it'll soon be Monday ;) First an interesting data point. Assume for now that I'm not naming my zeo-clients - this would be begging for disaster - but all is not lost. For some arcane reason I have not yet tracked down, after running my 2 clients under load, I get these... -rw-r--r-- 1 plone staff 10006534 25 Jul 05:30 /usr/local/zope/instances/prodZope/etc/1-None-0 -rw-r--r-- 1 plone staff 1404198 25 Jul 05:37 /usr/local/zope/instances/prodZope/etc/1-None-1 -rw-r--r-- 1 plone staff 5723516 25 Jul 05:02 /usr/local/zope/instances/prodZope/log/1-None-0 -rw-r--r-- 1 plone staff 10043763 25 Jul 04:20 /usr/local/zope/instances/prodZope/log/1-None-1 Nowhere have I asked it for cache files in either etc or log, but it sure makes sense that zope conspires to spread them around like this. Next for a little scat perusal of my own. We're talking 2.7.2 now. Here's the mystery. I have never seen a cache file, in any of the recent 3 Zope versions I've just tested, with a name other than *-None-*. Has anyone else? Note as well for anyone following on the Multi-Processor path, that Tres's tip about clienthome was slightly off, and yes, Paul Winkler was nearer the mark. Tres wrote..
# zope.conf.appserver1 %include zope.conf.common clienthome /path/to/zope/var/appserver1
Clienthome has no bearing on the cache file location. (AFAICT) On 23 Jul 2004, at 18:38, Paul Winkler wrote:
You mean this, from zope.conf.in?
# Directive: zeo-client-name
Yes, Paul. That's the way to go. And it should be pretty easy. Add a single line to each conf file and restart. Not so. I never saw the names of my cache files change. And they stayed in etc, and log sub dirs. I walk through it here... Starting in /ClientCache.py", line 142, at my pdb_trace() I believe that the client param should be the value of zeo-client-name from my .conf file. But I'm seeing 'None'. So we never even try to make anything by temporary caches. Here's my stack: File "/usr/local/zope/zope/lib/python/Zope/Startup/run.py", line 50, in ? run() File "/usr/local/zope/zope/lib/python/Zope/Startup/run.py", line 19, in run start_zope(opts.configroot) File "/usr/local/zope/zope272/lib/python/Zope/Startup/__init__.py", line 51, in start_zope starter.startZope() File "/usr/local/zope/zope272/lib/python/Zope/Startup/__init__.py", line 230, in startZope Zope.startup() File "/usr/local/zope/zope272/lib/python/Zope/__init__.py", line 47, in startup _startup() File "/usr/local/zope/zope272/lib/python/Zope/App/startup.py", line 57, in startup DB = configuration.dbtab.getDatabase('/', is_root=1) File "/usr/local/zope/zope272/lib/python/DBTab/DBTab.py", line 96, in getDatabase db = self._createDatabase(name, is_root) File "/usr/local/zope/zope272/lib/python/DBTab/DBTab.py", line 113, in _createDatabase db = factory.open() File "/usr/local/zope/zope272/lib/python/Zope/Startup/datatypes.py", line 172, in open DB = self.createDB() File "/usr/local/zope/zope272/lib/python/Zope/Startup/datatypes.py", line 169, in createDB return ZODBDatabase.open(self) File "/usr/local/zope/zope272/lib/python/ZODB/config.py", line 97, in open return ZODB.DB(section.storage.open(), File "/usr/local/zope/zope272/lib/python/ZODB/config.py", line 148, in open read_only_fallback=self.config.read_only_fallback) File "/usr/local/zope/zope272/lib/python/ZEO/ClientStorage.py", line 302, in __init__ client=client, var=var) File "/usr/local/zope/zope/lib/python/ZEO/ClientCache.py"(142)__init__() -> import pdb; pdb.set_trace(); up to /usr/local/zope/zope272/lib/python/ZEO/ClientStorage.py(302)__init__() where I see formal param of client=None, and useful confirmation that client is "A name used to construct persistent cache filenames. Defaults to None" There's no assignment to client in __init__(). So why is a client name not being passed in? up to /usr/local/zope/zope272/lib/python/ZODB/config.py(148)open() where I see client=self.config.client being passed to the constructor of class ZEOClient(BaseConfig) and do : self.config.__dict__ {'read_only': False, 'realm': None, 'name': 'zeostorage', 'storage': '1', '_matcher': <SectionMatcher for type 'zeoclient'>, 'min_disconnect_poll': 5, 'server': [<ZConfig.datatypes.SocketAddress instance at 0x20bdbc0>], 'cache_size': 20000000, 'client': None, 'var': '/usr/local/zope/instances/zeo/var', 'max_disconnect_poll': 300, '_name': None, 'read_only_fallback': False, 'wait': True} All fine, except 'client': None. Interpretation gets a bit messy from there, so I switch to a top-down approach: Starting from my config file: grep zeo-client-name zope_1.conf # Directive: zeo-client-name # zeo-client-name. If you use ZEO and you don't set a # zeo-client-name, the client cache is stored in temporary files # of zeo-client-name is used to uniquely identify the local cache # zeo-client-name zeo1 zeo-client-name zeo1 Fair enough. I'm setting a name. Here are my other critical items: zodb_db temporary> <temporarystorage> name sessions </temporarystorage> mount-point /temp_folder container-class Products.TemporaryFolder.TemporaryContainer </zodb_db> <zodb_db main> mount-point / <zeoclient> server localhost:8800 storage 1 name zeostorage var $ZEO_INSTANCE/var </zeoclient> </zodb_db> So I cd to the zope root and.. grep -r zeo-client-name * lib/python/Zope/Startup/zopeschema.xml: <key name="zeo-client-name" handler="zeo_client_name"> So zeo-client-name is handled by zeo_client_name. Great! I shoved a pdb_trace there. n and l give me 76 def zeo_client_name(value): 77 value and _setenv('ZEO_CLIENT', value) 78 import pdb; pdb.set_trace() 79 -> return value (Pdb) value 'zeo1' Cool! So far so good. But this brings me to the gap.. How is ZEO_CLIENT in the environment going to get into self.config.client in the ZEOClient? Bearing in mind grep -r ZEO_CLIENT * shows me only comments, apart from the handler, where it is set. Any ideas? Anyone seen anything but *-None-* for cache files? Meanwhile, I'm going to see why zope gets creative about spreading the cache files around. --r. On 23 Jul 2004, at 21:37, Chris McDonough wrote:
On Fri, 2004-07-23 at 16:21, Tim Peters wrote:
[Chris McDonough]
... self._f[current] = open(self._p[current],'w+b')
.... will be likely to fail at the last line if you're using nonpersistent cache files, because self._p[current] is (bogus) '1-None-0' (relative bogus filename).
Is it really *likely* to fail?
I suppose it depends on the working directory of the shell/process used to start Zope. Zope doesn't mess with the working directory on its own, AFAIK.
If you follow Richard Stevens' ("UNIX Network Programming" guy, apparently now dead) advice, he says that "well-behaved" daemon processes should change their working directory to "/". So I suspect there are daemonizers that do this.
Guido's zdrun daemon (which "zopectl" uses) gives you an option to set the working directory of the daemonized process, but I don't use it (neither zdrun nor the option, that is). It does nothing to the working directory by default.
But I think the common case is that the program is run out of an /etc/init.d "rc" script, and I suspect the working directory is "/" when Zope gets started in that circumstance. Which I guess makes the error understandable.
It's just a name, and it's opened in 'w' ('+b') mode, not 'r' mode. That is, it creates the file -- no file of that name need already exist (and if one does, it tries to overrwrite it). Running on Windows most days, I'm not usually aware of all the permission bugs Linuxheads delight in torturing themselves with <wink>.
Yes. Gotta agree with you there. I don't think a day passes where I don't want to rip the face off the guy who proclaimed that TCP ports below 1024 couldn't be bound to by a user other than root. What a disaster.
There should probably be a _using_persistent_cache flag attr rather than trying to inspect self._p to find out if we're using persistent caches.
+1. As you later discovered, this "hmm, let's try to guess what we're doing based on obscure droppings" business is a continuing bug factory.
Thankfully, Dieter fixed it so it doesn't (at least in this one case).
I may try to work up a patch + test for this later.
I'm neutral on whether you try, but +1 on you actually doing it <wink>.
Too late! It's already fixed. I didn't know either. ;-) This thread was full of sound and fury....
- C
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
On Sun, 2004-07-25 at 12:43, Russ Ferriday wrote:
Assume for now that I'm not naming my zeo-clients - this would bebegging for disaster - but all is not lost. For some arcane reason Ihave not yet tracked down, after running my 2 clients under load, Iget these... -rw-r--r-- 1 plone staff 10006534 25 Jul 05:30/usr/local/zope/instances/prodZope/etc/1-None-0 -rw-r--r-- 1 plone staff 1404198 25 Jul 05:37/usr/local/zope/instances/prodZope/etc/1-None-1 -rw-r--r-- 1 plone staff 5723516 25 Jul 05:02/usr/local/zope/instances/prodZope/log/1-None-0 -rw-r--r-- 1 plone staff 10043763 25 Jul 04:20/usr/local/zope/instances/prodZope/log/1-None-1
That would be consistent with the bug we identified (that nontemporary cache files are written even when they're not supposed to be in various places on the filesystem) if your working directory was "/usr/local/zope/instances/prodZope/log" or "/usr/local/zope/instances/prodZope/etc" at the time a Zope instance was started. You could confirm this by causing Zope to print "os.getcwd()" during its startup process. It's likely that you were just "cd'ed" into these directories when you started Zope up. And failing to name your zeo clients is hopefully not "begging for disaster".. it's an explicit choice but it shouldn't have noticeably negative effect other than perhaps requiring more time in general for Zope to start and more load on the ZEO server during this starting period.
Nowhere have I asked it for cache files in either etc or log, but itsure makes sense that zope conspires to spread them around like this.
It doesn't make any sense at all except if you're running under 2.7.1 and take into account the bug in ClientCache.py that causes this kind of symptom. Hopefully that's what you're seeing. If you're seeing this under 2.7.2, it could be problematic, otherwise it's just a rehash of the bug we've been rehashing for three days now that has apparently been fixed for more than a week. ;-)
Next for a little scat perusal of my own. We're talking 2.7.2 now.
.. scat perusal... I think that might belong on a different list....
Here's the mystery.
I have never seen a cache file, in any of the recent 3 Zope versionsI've just tested, with a name other than *-None-*. Has anyone else?
Yes; well, before this I hadn't but now I have, due to investigations prompted by your message! ;-)
Note as well for anyone following on the Multi-Processor path, thatTres's tip about clienthome was slightly off, and yes, Paul Winklerwas nearer the mark.
Tres wrote.. # zope.conf.appserver1 %include zope.conf.common clienthome /path/to/zope/var/appserver1
Clienthome has no bearing on the cache file location. (AFAICT)
That appears to be correct. I had always thought it was based on clienthome too, but the code says "use the 'var' path in the <zeoclient> section or use the current directory"; clienthome isn't consulted at all for this (although it maybe should be).
On 23 Jul 2004, at 18:38, Paul Winkler wrote:
You mean this, from zope.conf.in?
# Directive: zeo-client-name
Yes, Paul. That's the way to go. And it should be pretty easy. Add asingle line to each conf file and restart. Not so. I never saw the names of my cache files change. And theystayed in etc, and log sub dirs.
When you say they "stayed" there, were these file contents changed when Zope was started and left running with a persistent cache file setting? I suspect they're just leftover from the last time you started Zope but are otherwise unused (e.g. they won't get changed; they're "dead"). What happens if you delete these files and start Zope with the "zeo-client-name" set? I suspect they won't reappear.
I walk through it here...
Starting in /ClientCache.py", line 142, at my pdb_trace() I believethat the client param should be the value ofzeo-client-name from my .conf file. But I'm seeing 'None'. So wenever even try to make anything by temporary caches.
Grrr... ok, here's the deal. zeo-client-name apparently no longer has any effect on.. the ZEO client name. ;-) I have no idea when this happened; I'm sure interested parties can spelunk the CVS logs to find out. Instead, ZEO uses a "client" key in the <zeoclient> section of a <zodb_db>, e.g.: <zodb_db main> mount-point / cache-size 400000 <zeoclient> cache-size 100000000 server localhost:9999 storage 1 name zeostorage var $INSTANCE/var # here it is! client foo </zeoclient> </zodb_db> This effectively means that "zeo-client-name" is completely useless, AFACIT. It should be removed. Once you start Zope with this client key set, you will see cache files like "c1-foo-0.zec" and so on in your "var" directory.
Any ideas? Anyone seen anything but *-None-* for cache files?
If you see *-None-* for cache files and the first * doesn't start with a "c" and the second * doesn't end with a ".zec", you're staring at the symptom of the "temporary cache file settings produce real cache files" bug or another bug that has the same symptom.
Meanwhile, I'm going to see why zope gets creative about spreading thecache files around.
I think that particular mystery has been solved if you're only seeing it under 2.7.1. - C
On Sun, 2004-07-25 at 14:06, Chris McDonough wrote:
Grrr... ok, here's the deal. zeo-client-name apparently no longer has any effect on.. the ZEO client name. ;-) I have no idea when this happened; I'm sure interested parties can spelunk the CVS logs to find out.
FWIW, I have entered this in as a bug into the Zope collector at http://zope.org/Collectors/Zope/1432 . - C
Russ Ferriday wrote at 2004-7-25 17:43 +0100:
Well, I can understand Tim's comment about droppings. This configuration has made a mockery of my weekend. >:-( Looking on the bright side - it'll soon be Monday ;)
First an interesting data point.
Assume for now that I'm not naming my zeo-clients - this would be begging for disaster - but all is not lost. For some arcane reason I have not yet tracked down, after running my 2 clients under load, I get these... -rw-r--r-- 1 plone staff 10006534 25 Jul 05:30 /usr/local/zope/instances/prodZope/etc/1-None-0
The bug Chris mentioned and fixed by my patch (the "CHANGES.txt" report for which Chris thought uninformative)... -- Dieter
On Sun, 2004-07-25 at 18:15, Dieter Maurer wrote:
The bug Chris mentioned and fixed by my patch (the "CHANGES.txt" report for which Chris thought uninformative)...
The patch is appreciated though! - C
This thread's not over 'til the fat lady sings. OK, so apologies first: Paul Winkler's comment in this thread was nearer the mark than Tres Seavers. clienthome is not relevant to where caches are created, and of course I need zeo-client-home. Problem is now that zeo-client-home seems not to have the intended effect. I set zeo-client-home to zeo1, in my first conf file and restart, but never see any cache files of that name. I traced as far as the parse handler zeo_client_home. It's invoked with the correct value. But it stuffs the value into the env, and I can't see any other reference to ZEO_CLIENT pulling it back out of there again. Working up the stack trace from ClientCache.py, where the files are created (*-None-*), my client value is always None, but there's a gap in the middle that I did not bridge, around BaseConfig and ZeoConfig. Has anyone been successful naming cache files? I'm on OSX Panther 10.3. --r. I've traced through to the zeo_client_name handler, and see the name of On 23 Jul 2004, at 21:37, Chris McDonough wrote:
On Fri, 2004-07-23 at 16:21, Tim Peters wrote:
[Chris McDonough]
... self._f[current] = open(self._p[current],'w+b')
.... will be likely to fail at the last line if you're using nonpersistent cache files, because self._p[current] is (bogus) '1-None-0' (relative bogus filename).
Is it really *likely* to fail?
I suppose it depends on the working directory of the shell/process used to start Zope. Zope doesn't mess with the working directory on its own, AFAIK.
If you follow Richard Stevens' ("UNIX Network Programming" guy, apparently now dead) advice, he says that "well-behaved" daemon processes should change their working directory to "/". So I suspect there are daemonizers that do this.
Guido's zdrun daemon (which "zopectl" uses) gives you an option to set the working directory of the daemonized process, but I don't use it (neither zdrun nor the option, that is). It does nothing to the working directory by default.
But I think the common case is that the program is run out of an /etc/init.d "rc" script, and I suspect the working directory is "/" when Zope gets started in that circumstance. Which I guess makes the error understandable.
It's just a name, and it's opened in 'w' ('+b') mode, not 'r' mode. That is, it creates the file -- no file of that name need already exist (and if one does, it tries to overrwrite it). Running on Windows most days, I'm not usually aware of all the permission bugs Linuxheads delight in torturing themselves with <wink>.
Yes. Gotta agree with you there. I don't think a day passes where I don't want to rip the face off the guy who proclaimed that TCP ports below 1024 couldn't be bound to by a user other than root. What a disaster.
There should probably be a _using_persistent_cache flag attr rather than trying to inspect self._p to find out if we're using persistent caches.
+1. As you later discovered, this "hmm, let's try to guess what we're doing based on obscure droppings" business is a continuing bug factory.
Thankfully, Dieter fixed it so it doesn't (at least in this one case).
I may try to work up a patch + test for this later.
I'm neutral on whether you try, but +1 on you actually doing it <wink>.
Too late! It's already fixed. I didn't know either. ;-) This thread was full of sound and fury....
- C
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
————————————————————— Russ Ferriday Solution Workshops for Plone (+44) (0) 7789 338868 http://www.solutionworkshops.com
participants (6)
-
Chris McDonough -
Dieter Maurer -
Paul Winkler -
Russ Ferriday -
Tim Peters -
Tres Seaver