[Zope] Threads, ZODB, deactivate, setstate, etc...
Olivier Deckmyn
odeckmyn.list@teaser.fr
Thu, 9 Aug 2001 09:38:11 +0200
Hi there!
I am working on my product ZExternalNews
(http://www.zope.org/Members/odeckmyn/ZExternalNews).
I have run into a problem that I had not anticipated!
My product is made of 3 classes:
1/ ZExternalNewsManager (usually one per site): folderish, contains
several ZExternalNewsChannel objects.
2/ ZExternalNewsChannel (several per Manager): itemish. Contains 2
persistent properties (title and url) and _v_items, a volatile list of
ZExternalNewsItem objects.
3/ ZExternalNewsItem: a very simple Python class (no inheritance from Zope
or ZODB). It handles one atomic news item (like a single story on
Slashdot, for example).
When ZExternalNewsManager starts, I also start ("fork") a thread that,
simplified, does:

    while 1:
        MyManager.Refresh()
        sleep(MyManager.delay)

where MyManager.Refresh is a loop that calls refresh on every channel.
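A minimal sketch of such a refresh loop, using only the standard threading module. It is an illustration and not the code of the product: PeriodicUpdater, fake_refresh and the 0.05-second delay are made-up names and values, it is written in modern Python rather than the Python 1.5 used by the attachment, and it does not touch ZODB at all.

```python
import logging
import threading


class PeriodicUpdater(threading.Thread):
    """Calls refresh() repeatedly, pausing `delay` seconds between calls."""

    def __init__(self, refresh, delay):
        threading.Thread.__init__(self)
        self.daemon = True  # do not keep the process alive at exit
        self._refresh = refresh
        self._delay = delay
        self._stop_requested = threading.Event()

    def run(self):
        while not self._stop_requested.is_set():
            try:
                self._refresh()
            except Exception:
                # One failed refresh must not kill the updater thread.
                logging.exception("refresh failed; retrying after the delay")
            # Sleeps for the delay, but wakes up at once when stop() is called.
            self._stop_requested.wait(self._delay)

    def stop(self):
        self._stop_requested.set()


# Usage with a hypothetical stand-in for MyManager.Refresh.
calls = []
first_call = threading.Event()

def fake_refresh():
    calls.append("refreshed")
    first_call.set()

updater = PeriodicUpdater(fake_refresh, delay=0.05)
updater.start()
first_call.wait(2)   # wait until the thread has refreshed at least once
updater.stop()
updater.join(2)
```

Waiting on an Event instead of calling sleep() lets the thread be stopped promptly, which a bare "while 1" loop does not allow.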
This would work in a wonderful land... but not in real life. I have been
searching for hours now...
My problem is:
It works for some time (sometimes 10 seconds, sometimes 2 hours, sometimes
10 minutes...), and then I get a lot of:
2001-08-07T17:36:04 ERROR(200) ZODB Couldn't load state for '\000\000\000\000\000\000WG'
Traceback (innermost last):
  File /isp/zope/newfun/lib/python/ZODB/Connection.py, line 508, in setstate
AttributeError: 'None' object has no attribute 'load'
("a lot" meaning one per ZExternalNewsChannel instance)
and finally:
Exception in thread Thread-4:
Traceback (innermost last):
  File "/usr/local/lib/python1.5/threading.py", line 376, in __bootstrap
    self.run()
  File "/isp/zope/newfun/lib/python/Products/ZExternalNews/ZExternalNews.py", line 454, in run
    self._manager.Refresh()
  File "/isp/zope/newfun/lib/python/ZODB/Connection.py", line 508, in setstate
    p, serial = self._storage.load(oid, self._version)
AttributeError: 'None' object has no attribute 'load'
After hours of thinking, I believe this happens because ZODB asks my objects
to go to sleep (is this what DEACTIVATING means?).
The last change I made was to implement __setstate__ so that the _v_*
attributes are initialized there. I thought that would fix it. It does
not :(
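The idea of re-creating _v_* attributes in __setstate__ can be sketched in plain Python, without ZODB. Everything here is a hypothetical stand-in: the Channel class, the round_trip helper and the example.org URL are made up, and the hand-written save/load cycle only imitates the fact that volatile attributes are not stored. It says nothing about when ZODB actually deactivates an object.

```python
class Channel:
    """Plain-Python stand-in for a persistent channel (hypothetical, not ZODB)."""

    def __init__(self, url):
        self.url = url
        self._v_items = []  # volatile: must never be part of the saved state

    def __getstate__(self):
        # Keep only non-volatile attributes, imitating how _v_ names are skipped.
        return {k: v for k, v in self.__dict__.items()
                if not k.startswith("_v_")}

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._v_items = []  # re-create the volatile attribute on load

    def get_items(self):
        # Defensive fallback in case the attribute is missing anyway.
        if not hasattr(self, "_v_items"):
            self._v_items = []
        return self._v_items


def round_trip(obj):
    """Simulates one save/load cycle by hand (assumption: no real storage)."""
    state = obj.__getstate__()
    clone = obj.__class__.__new__(obj.__class__)
    clone.__setstate__(state)
    return clone


channel = Channel("http://example.org/news.rdf")
channel.get_items().append("headline")
restored = round_trip(channel)
```

After the round trip the persistent url survives while the item list comes back empty, so anything kept only in _v_items has to be downloaded again.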
Please help me! It is quite a frustrating experience! Do not hesitate to
Is there a document I should read to understand this? I think I have read
quite a lot of things so far! ;)
Attached is the latest version of the .py file. The rest of the product can
be downloaded here: http://www.zope.org/Members/odeckmyn/ZExternalNews
Thanks for your support.
Olivier.
(This question was posted 2 days ago on zope-dev without an answer, so I am
trying here ;-D )
[Attachment: ZExternalNews.py]
###########################################################################
## ZExternalNews.py
## (c)2001, Olivier DECKMYN
###########################################################################


"""This module provides a thread-safe class to handle the management (load,
parse, browse, update etc...) of a syndicated news channel in RDF or
RSS format (XML based).
"""

import os, time, string, sys
from threading import Lock, Thread

try:
    from xml.sax import saxlib, saxexts
except ImportError:
    raise ImportError, "PyXML is not installed. Install it from http://pyxml.sourceforge.net"


## Zope Products machinery imports
from Globals import HTMLFile
from Globals import MessageDialog
from Globals import Persistent
from Globals import default__class_init__
from zLOG import LOG, WARNING, INFO, ERROR, DEBUG
import Acquisition
import AccessControl.Role
import OFS.Folder
import OFS.SimpleItem


###########################################################################
## Utility functions
###########################################################################

def clean_string(s):
    """Collapses runs of whitespace (spaces, CR, LF) in a string into single spaces"""
    return string.join(string.split(s))

def FormatDate(s):
    """Transforms a YYYYMMDDhhmmss timestamp string into an international
    string: YYYY/MM/DD hh:mm:ss"""
    (YYYY, MM, DD, hh, mm, ss) = (s[0:4], s[4:6], s[6:8], s[8:10], s[10:12], s[12:14])
    return "%s/%s/%s %s:%s:%s" % (YYYY, MM, DD, hh, mm, ss)


###########################################################################
## Technical classes
###########################################################################

class MyRDFParserHandler(saxlib.HandlerBase):
    """This is a very technical class, providing delegate methods to the
    XML/RSS/RDF parser"""
    def __init__(self, channel):
        self.channel = channel        # Channel we are working on
        self._current_element = ''    # 'item', 'channel' or 'image'
        self._current_property = ''   # Current property for current element
        self._current_item = None     # Current news item, if current element is an item

    def startElement(self, ele, attr):
        """Method called by the parser when an element starts"""
        if ele in ['image', 'channel', 'item']:
            self._current_element = ele
            self._current_property = ''
            if ele == 'item':  # Starting a new item
                self._current_item = self.channel._new_item()
        elif self._current_element == "item" and (ele in ZExternalNewsItem.__Properties__):
            # Working on a sub element of item (a property of item)
            self._current_property = ele
        elif self._current_element == "image" and ele in ["url"]:
            # Working on url of image tag
            self._current_property = ele
        elif self._current_element == "channel" and ele in ["description", "title", "link"]:
            # Working on a property of channel tag
            self._current_property = ele
        elif self._current_element != '':
            # Unknown sub element of a known element: ignore its text
            self._current_property = ""
        else:
            pass
            #print "NO PROPERTY HANDLER FOR ", ele

    def endElement(self, ele):
        """Method called by the parser when an element ends"""
        if self._current_element == "item" and (ele in ZExternalNewsItem.__Properties__):
            self._current_property = ""
        elif ele in ['image', 'channel', 'item']:
            self._current_element = ''
            if ele == 'item':
                self._current_item = None  # end of current item

    def characters(self, ch, start, length):
        """Method called by the parser when reading text value (between
        element start and end tags)"""
        if self._current_property != '':
            s = ch[start:start+length]
            s = clean_string(s)
            if self._current_element == "image" and self._current_property == "url":
                self.channel.image_url = self.channel.image_url + s
            elif self._current_element == "channel" and self._current_property == "title":
                self.channel.title = self.channel.title + s
            elif self._current_element == "channel" and self._current_property == "description":
                self.channel.description = self.channel.description + s
            elif self._current_element == "channel" and self._current_property == "link":
                self.channel.link = self.channel.link + s
            elif self._current_element == 'item':
                setattr(self._current_item, self._current_property,
                        getattr(self._current_item, self._current_property) + s)


###########################################################################
## Classes
###########################################################################

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Item Handling Class - Non Persistent (Transient)
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

class ZExternalNewsItem:

    __allow_access_to_unprotected_subobjects__ = 1  # So that everything is accessible from Zope

    # Just to keep a list of "public" properties, to ease enumeration
    __Properties__ = ['title', 'description', 'link']

    def __init__(self):
        # self.channel=channel
        self.title = ''
        self.description = ''
        self.link = ''

    def __str__(self):
        result = []
        # __Properties__ is a class attribute, so it must be reached through self
        for key in self.__Properties__:
            result.append(key + "=" + getattr(self, key))
        return string.join(result, ", ")

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Channel Handling Class
# Persistent
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

manage_addZExternalNewsChannelForm = HTMLFile('www/ZExternalNewsChannelAdd', globals())
manage_editZExternalNewsChannelForm = HTMLFile('www/ZExternalNewsChannelEdit', globals())
manage_viewZExternalNewsChannel = HTMLFile('www/ZExternalNewsChannelView', globals())

def manage_addZExternalNewsChannel(self, id, url, REQUEST=None):
    "Add an instance of the ZExternalNewsChannel class"
    self._setObject(id, ZExternalNewsChannel(id, url, parse_now=0))
    if REQUEST is not None:
        return self.manage_main(self, REQUEST)

class ZExternalNewsChannel(OFS.SimpleItem.Item,
                           Persistent,
                           Acquisition.Implicit,
                           AccessControl.Role.RoleManager):
    """
    ZExternalNewsChannel class.
    """

    meta_type = 'ZExternalNewsChannel'
    icon = 'misc_/ZExternalNews/ZExternalNewsChannelIcon'

    _properties = (
        {'id':'url', 'type':'string'},
        )

    manage_options = (
        {'label':'View', 'action':'manage_view'},
        {'label':'Edit', 'action':'manage_editForm'},
        {'label':'Refresh', 'action':'manage_refresh'},
        ) + OFS.SimpleItem.SimpleItem.manage_options

    __ac_permissions__ = (
        ('View management screens',
         ('manage_tabs', 'manage_main')),
        ('Change Permissions',
         ('manage_access',)),
        ('View ZExternalNewsChannels',
         ('',), ('Anonymous', 'Manager')),
        )

    manage_view = manage_viewZExternalNewsChannel
    manage_editForm = manage_editZExternalNewsChannelForm

    def __p_deactivate__(self):
        LOG('ZExternalNews', INFO, "NewsChannel[deactivate],", str(self.id))
        Persistent.__p_deactivate__(self)


    def __init__(self, id, url, name="", min_delay=10, parse_now=1):
        """Channel Constructor"""
        self.id = id
        self.url = url
        self.name = name

        # Protected variables
        self._min_delay = min_delay
        self._timestamp = 0
        self._v_items = []  # volatile list of news items, never stored in ZODB

        # Properties
        self.image_url = ''
        self.title = ''
        self.description = ''
        self.link = ''

        if parse_now == 1:
            self.download_and_parse()

    def __setstate__(self, state):
        Persistent.__setstate__(self, state)
        # Log only once the state is loaded, so that self.id is the real id
        LOG('ZExternalNews', INFO, "NewsChannel[setstate],", str(self.id))
        self._v_items = []  # list of news items, could be removed from here as it is volatile


    def manage_refresh(self, REQUEST=None):
        """Refresh and return a message dialog"""
        self.refresh(force=1)
        if REQUEST is not None:
            return MessageDialog(
                title='Updated',
                message="Channel %s has been refreshed (forced)." % self.name,
                action="./manage_main"
                )

    def refresh(self, force=0):
        """If our channel is out of date (more than min_delay minutes since
        the last update), the channel is destroyed, loaded and parsed again"""
        if (force == 1) or (time.time() - self._timestamp >= self._min_delay*60):
            self.download_and_parse()

    def download_and_parse(self):
        """Destroys, downloads and parses the channel"""

        self.image_url = ''
        self.title = ''
        self.description = ''
        self.link = ''

        xmlp = saxexts.make_parser()   # Prepare parser
        dh = MyRDFParserHandler(self)  # Prepare Document Handler
        xmlp.setDocumentHandler(dh)
        self.clear_items()             # Clear current items list

        xmlp.parse(self.url)           # Do the job (launches parsing + document handling)
        self._timestamp = time.time()  # Store TimeStamp for refresh


    def _new_item(self):
        """Make and return a new item. This news item is owned by the channel"""
        n = ZExternalNewsItem()
        self.getItems().append(n)
        return n

    def clear_items(self):
        """Removes all items of the channel"""
        self._v_items = []

    def asDirtyHTML(self):
        """Renders a channel in a dirty html way. Made to ease debug only."""
        html = []
        # First, a small table to display both image and title
        html.append('<TABLE BORDER="0"><TR>')
        if string.strip(self.image_url) != '':
            html.append('<TD><A HREF="%s"><IMG BORDER="0" SRC="%s"></A></TD>' % (self.link, self.image_url))
        html.append('<TD><B>%s</B><BR>' % (self.title))
        html.append('<I>%s</I></TD>' % (self.description))
        html.append('</TR></TABLE>')
        html.append('<HR>')
        # Then, the news items, simply listed with a link to the external provider site.
        html.append('<UL>')
        for i in self.getItems():
            html.append('<LI><A HREF="%s">%s</A>' % (i.link, i.title))
            if i.description != '':
                html.append(" <i>%s</i>" % (i.description))
            html.append('</LI>')
        html.append('</UL>')
        html.append('<SMALL>Last updated %s</SMALL>' % (FormatDate(self.getLastUpdateDateTime())))
        return string.join(html, '\n')

    def __call__(self):
        """Used when rendering object directly with dtml-var in Zope"""
        return self.asDirtyHTML()

    def getLastUpdateDateTime(self):
        """Returns last update date time as a string using YYYYMMDDHHMMSS"""
        t = time.localtime(self._timestamp)
        return time.strftime("%Y%m%d%H%M%S", t)

    def getItems(self):
        """As _v_items is volatile, access to it must go through this method.
        It makes sure the volatile attribute exists again when the instance
        comes back to life (after a Zope restart or a cache sweep, for
        example), or when __setstate__ has not run yet (freshly created
        instance)."""
        if not hasattr(self, '_v_items'):
            self._v_items = []
        return self._v_items

    def index_html(self):
        """Used when viewing the object through its url, directly."""
        return self.asDirtyHTML()

    def manage_edit(self, url, REQUEST=None):
        "Change properties for the class instance."
        self.url = url

        if REQUEST is not None:
            return MessageDialog(
                title='Edited',
                message="Properties for %s have been changed." % self.id,
                action="./manage_main"
                )

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Channels Manager Class - Persistent
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

manage_addZExternalNewsManagerForm = HTMLFile('www/ZExternalNewsManagerAdd', globals())

def manage_addZExternalNewsManager(self, id, title, delay, REQUEST=None):
    "Add an instance of the ZExternalNewsManager class"
    self._setObject(id, ZExternalNewsManager(id, title, delay))
    if REQUEST is not None:
        return self.manage_main(self, REQUEST)

class ZExternalNewsManager(OFS.Folder.Folder,
                           Persistent,
                           Acquisition.Implicit,
                           AccessControl.Role.RoleManager):
    """This class handles the management of a list of channels. It is
    responsible for automatically updating them."""

    meta_type = 'ZExternalNewsManager'
    icon = 'misc_/ZExternalNews/ZExternalNewsManagerIcon'

    _properties = (
        {'id':'title', 'type':'string'},
        {'id':'delay', 'type':'int'},
        )

    # These are the types of objects that can be contained inside the
    # container (in addition to std ones)
    meta_types = (
        {'name':'ZExternalNewsChannel',
         'action':'manage_addZExternalNewsChannelForm'},
        )

    manage_addZExternalNewsChannelForm = manage_addZExternalNewsChannelForm
    manage_addZExternalNewsChannel = manage_addZExternalNewsChannel

    manage_options = OFS.Folder.Folder.manage_options + (
        {'label':'Restart Auto Updater', 'action':'manage_restart'},
        {'label':'Refresh All Channels', 'action':'manage_refresh'},
        )

    __ac_permissions__ = (
        ('View management screens',
         ('manage_tabs', 'manage_main')),
        ('Change Permissions',
         ('manage_access',)),
        ('Add ZExternalNewsChannel',
         ('manage_addZExternalNewsChannel', 'manage_addZExternalNewsChannelForm')),
        ('View ZExternalNewsManagers',
         ('',), ('Anonymous', 'Manager')),
        )


    def __init__(self, id, title, delay):
        self.id = id
        self.title = title
        self.delay = delay

    def __p_deactivate__(self):
        LOG('ZExternalNews', INFO, "NewsManager[deactivate],", str(self.id))
        Persistent.__p_deactivate__(self)

    def __setstate__(self, state):
        Persistent.__setstate__(self, state)
        # Log only once the state is loaded, so that self.id is the real id
        LOG('ZExternalNews', DEBUG, "NewsManager[setstate],", str(self.id))
        self.auto_update()  # Auto start the auto-update thread

    def manage_restart(self, REQUEST=None):
        """Restart the autoupdate if needed"""
        if not hasattr(self, '_v_updater'):
            self.auto_update()
            message = "AutoUpdate is restarted."
        else:
            message = "AutoUpdate was already running: it was NOT restarted."

        if REQUEST is not None:
            return MessageDialog(
                title='Restarted',
                message=message,
                action="./manage_main"
                )

    def manage_refresh(self, REQUEST=None):
        """Refresh all the channels"""
        self.Refresh()
        if REQUEST is not None:
            return MessageDialog(
                title='Refresh',
                message="All channels were refreshed (errors, if any, are ignored here).",
                action="./manage_main"
                )

    def auto_update(self):
        if hasattr(self, '_v_updater'):
            LOG('ZExternalNews', WARNING,
                "Restarting ZExternalNewsManager AutoUpdate, even if object STILL exists,",
                str(self.id))
        self._v_updater = ExternalNewsManagerUpdater(self, self.delay*60)
        self._v_updater.setDaemon(1)  # So that this thread will die when this instance dies, too.
        self._v_updater.start()       # Let's rock
        LOG('ZExternalNews', DEBUG,
            "ZExternalNewsManager AutoUpdate started, will awake every %d minutes." % self.delay)

    def getChannelCount(self):
        """Returns number of managed channels"""
        return len(self.getChannels())

    def getChannels(self):
        """Returns the list of all channels"""
        # ZODB: list all ZExternalNewsChannel objects owned by the manager
        return self.objectValues(ZExternalNewsChannel.meta_type)

    def getChannel(self, name):
        """Return channel object given its name"""
        return self.getItem(name)

    def Refresh(self):
        """Refreshes all channels. In case of error on update (invalid
        url for example), the show must go on."""
        LOG('ZExternalNews', DEBUG, "Refreshing all", self.id)
        for channel in self.getChannels():
            try:
                # This is to avoid strange ZODB behaviour at Zope startup
                if channel.meta_type == ZExternalNewsChannel.meta_type:
                    channel.refresh()
            except:
                LOG('ZExternalNews', ERROR,
                    "Problem updating ZExternalNewsChannel '%s' (%s: %s). Ignored." %
                    (channel.id, sys.exc_info()[0], sys.exc_info()[1]))

    def __str__(self):
        return "<ExternalNewsManager with %d channel(s)>" % self.getChannelCount()

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Manager Updater Class - Threaded - Non Persistent
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

class ExternalNewsManagerUpdater(Thread):
    """This class is a THREAD responsible for updating in the background
    all the channels of a given channels manager every 'delay' seconds.
    This is not a user-level class."""

    def __init__(self, manager, delay):
        """Initialize thread. Delay is in seconds. Manager is the
        attached ZExternalNewsManager instance"""
        Thread.__init__(self)
        self._manager = manager
        self._delay = delay

    def run(self):
        while 1:
            # Running Update
            self._manager.Refresh()
            time.sleep(self._delay)


default__class_init__(ZExternalNewsChannel)
default__class_init__(ZExternalNewsManager)
