[Zope] XRON and simultaneous triggers
Loren Stafford
lstafford@morphics.com
Thu, 21 Jun 2001 15:54:43 -0700
The Xron Dispatcher executes ready events serially. This is due to the fact
that Dispatcher consists of a single thread and fires scheduled methods with
Client.py, so it has to wait until it gets a response from Client.py before
it can service the next ready event.
So I wouldn't expect to find that Xron contributes to concurrency problems
by itself. Of course it might trigger a method that accesses your database
at the same time that a real user is accessing the database, but that should
be a trivial problem.
The Dispatcher tries to avoid endless loops when something goes wrong in a
scheduled method. First it tries to disarm the scheduled method; failing
that, it deletes the corresponding entry from the Schedule catalog.
It disarms the scheduled method by calling (again via Client.py) disarm(), a
private method of the Xron DTML Method, and rescheduling the method for
"never". Its attempt to disarm may fail, and the entries you are seeing in
the log file show that it is failing. There are only a few conditions that
can cause a failure at this point: insufficient priviledges to execute the
disarm() method, incorrect URL of the method, failure of the socket used by
Client.py, failure to update the Schedule catalog when disarming the method.
In case the Dispatcher fails to disarm a Xron DTML Method, it still tries to
prevent looping by deleting the corresponding entry from the Schedule
catalog. I think the only failure that can occur at this point is a Schedule
catalog update failure. If it can't change or delete the schedule entry from
the catalog, it will find it there again the next time it wakes up (which
will be immediately) and try to trigger it again.
I don't know why it would not be possible to update the Schedule at this
point. But I have twice seen such a failure on my system (NT 4, Zope 2.3.2).
In fact, just yesterday I received 163 email messages telling me to take out
the garbage. 8-0
If anyone has any ideas about how the Dispatcher could fail to disarm a
failed event, I'd be happy to listen. In fact, someone who is familiar with
ZODB programming should look over the ZODB and transaction logic in
Dispatcher.py and check whether I am doing something wrong there. I really
don't understand everything I wrote. 8-)
In fact, why don't I just include it below.
-- Loren
#####################################################################
""" Dispatcher for Xron Events """
#The Dispatcher runs as a separate thread
# It uses the Schedule as its primary data structure
# It knows which event is next
# It sleeps until that event or a change in the Schedule
# It fires an event and logs its output
import Loggerr
loggerr=Loggerr.loggerr
#import pdb; pdb.set_trace()
from Globals import DateTime
import sys, string
maxwait=float(10) # max time to wait between wake-ups in seconds
def Timer(ScheduleID, ScheduleChange, rpc): # aka Dispatcher
#loggerr(0, 'Dispatcher thread started.')
# infinite loop
while 1:
# Good morning. We just woke up.
# The first thing we need is a new connection.
import Zope
app=Zope.app()
try:
Schedule = getattr(app, ScheduleID, None)
except:
loggerr(301, 'Cannot access catalog. Suspending operation.')
break
interval=maxwait # Default sleep time. May be recalculated below
try:
(atime, aurl)=Schedule.armed_event() # Get next armed event
except:
loggerr(302,'Cannot access catalog. Suspending operation.')
break # out of infinite loop
if atime is None:
#loggerr(0, 'No armed events.') # debug
pass # Sleep some more
elif atime.isFuture(): # The next armed event is not yet ready
# calculate how long we have to wait
ainterval=atime.timeTime() - DateTime().timeTime()
if ainterval < float(0): ainterval=float(0) # Is negative bad?
interval=ainterval # Comment out for debugging to limit to maxwait
else: # This event is ready now.
# Fire event, and log its output
emsg= '\nTrigger event: %s\nTrigger time: %s' % (aurl, atime)
furl=string.join((aurl, 'trigger'), '/')
try:
(headers,response)=rpc(furl) # Fire event
dmsg='%s\n' % response
loggerr(0, emsg, detail=dmsg) # Log the event and its output
except:
type, val, tb = sys.exc_info()
dmsg="Failed to trigger event.\nType=%s\nVal=%s\n" % (type, val)
loggerr(100, emsg, detail=dmsg)
del type, val, tb, dmsg
try:
rpc('%s/%s' % (aurl, 'disarm')) # Attempt to disarm
loggerr(100, 'Disarmed event', detail='')
except:
# aurl is probably pointing to an event that no longer exists
# or the url doesn't resolve correctly
loggerr(100, "Failed to disarm event", detail='')
# Let's just kick it out of the catalog
# Otherwise, this event will come back to haunt us
try:
Schedule.exterminate(aurl)
get_transaction().commit()
except:
pass
# Finished processing one event
app._p_jar.sync() # see ZODB/Connection.py
interval=float(0)
# We're going to sleep now; so, free the connection
app._p_jar.close()
del app
# Sleep for predetermined interval
#emsg= 'Going to sleep for %s seconds' % (interval)
ScheduleChange.wait(interval) # in seconds (float)
if ScheduleChange.isSet():
#loggerr(0, 'Awakened by set event.') # debug
ScheduleChange.clear()
# Schedule has changed, we woke up early
# Loop back to the top and check for
# an earlier event than we were waiting for
#else:
#loggerr(0, 'Awakened by timeout.') # debug
# We timed out, so there must be a ready event
# Loop back to the top and trigger the next ready event.
# That's probably the one we were waiting for.
#pass
# End of while 1 loop.
# Something bad happened, let's clean up before quitting for good
try:
app._p_jar.close()
del app
except:
pass
loggerr(100, 'Dispatcher thread is terminating.')
============== end ================