XRON and simultaneous triggers
How does XRON handle multiple events schedule for firing at the same time? If they are parallel tasks, does this present any problems with ODBC database access if say 20 tasks were all queued to fire at the same time? I added a ODBC SQL call to my Xron Method to tag a particular record in an MSAccess database with the current date for indicating the last time a scheduled job fired. This ODBC SQL call would've been executing from multiple threads simultaneously when the events triggered through the scheduler. I'm seeing some "connection already closed" errors from zmxodbc in the xron log. This is resulting in an endless loop until the server is restarted and full system logs. 2001-06-21T18:11:51 PROBLEM(100) Products.Xron.Loggerr Failed to disarm event ------ 2001-06-21T18:11:51 PROBLEM(100) Products.Xron.Loggerr Trigger event: http://192.168.200.1:8080/Event_IConnect_Alert Trigger time: 2001/06/21 12:59:00 US/Central Failed to trigger event. Type=bci.ServerError Val=connection already closed (File: D:\Zope230\lib\python\Products\ZmxODBCDA\db.py Line: 150) 500 Internal Server Error for http://192.168.200.1:8080/Event_IConnect_Alert/trigger ------ 2001-06-21T18:11:51 PROBLEM(100) Products.Xron.Loggerr Failed to disarm event ------ 2001-06-21T18:11:51 PROBLEM(100) Products.Xron.Loggerr Trigger event: http://192.168.200.1:8080/Event_IConnect_Alert Trigger time: 2001/06/21 12:59:00 US/Central Failed to trigger event. Type=bci.ServerError Val=connection already closed (File: D:\Zope230\lib\python\Products\ZmxODBCDA\db.py Line: 150) 500 Internal Server Error for http://192.168.200.1:8080/Event_IConnect_Alert/trigger Alan Capesius, MCSE/NTCIP+20 Technical Support Engineer Sysmex Corporation of America capesiusa@sysmex.com
The Xron Dispatcher executes ready events serially. This is due to the fact that Dispatcher consists of a single thread and fires scheduled methods with Client.py, so it has to wait until it gets a response from Client.py before it can service the next ready event. So I wouldn't expect to find that Xron contributes to concurrency problems by itself. Of course it might trigger a method that accesses your database at the same time that a real user is accessing the database, but that should be a trivial problem. The Dispatcher tries to avoid endless loops when something goes wrong in a scheduled method. First it tries to disarm the scheduled method; failing that, it deletes the corresponding entry from the Schedule catalog. It disarms the scheduled method by calling (again via Client.py) disarm(), a private method of the Xron DTML Method, and rescheduling the method for "never". Its attempt to disarm may fail, and the entries you are seeing in the log file show that it is failing. There are only a few conditions that can cause a failure at this point: insufficient priviledges to execute the disarm() method, incorrect URL of the method, failure of the socket used by Client.py, failure to update the Schedule catalog when disarming the method. In case the Dispatcher fails to disarm a Xron DTML Method, it still tries to prevent looping by deleting the corresponding entry from the Schedule catalog. I think the only failure that can occur at this point is a Schedule catalog update failure. If it can't change or delete the schedule entry from the catalog, it will find it there again the next time it wakes up (which will be immediately) and try to trigger it again. I don't know why it would not be possible to update the Schedule at this point. But I have twice seen such a failure on my system (NT 4, Zope 2.3.2). In fact, just yesterday I received 163 email messages telling me to take out the garbage. 8-0 If anyone has any ideas about how the Dispatcher could fail to disarm a failed event, I'd be happy to listen. In fact, someone who is familiar with ZODB programming should look over the ZODB and transaction logic in Dispatcher.py and check whether I am doing something wrong there. I really don't understand everything I wrote. 8-) In fact, why don't I just include it below. -- Loren ##################################################################### """ Dispatcher for Xron Events """ #The Dispatcher runs as a separate thread # It uses the Schedule as its primary data structure # It knows which event is next # It sleeps until that event or a change in the Schedule # It fires an event and logs its output import Loggerr loggerr=Loggerr.loggerr #import pdb; pdb.set_trace() from Globals import DateTime import sys, string maxwait=float(10) # max time to wait between wake-ups in seconds def Timer(ScheduleID, ScheduleChange, rpc): # aka Dispatcher #loggerr(0, 'Dispatcher thread started.') # infinite loop while 1: # Good morning. We just woke up. # The first thing we need is a new connection. import Zope app=Zope.app() try: Schedule = getattr(app, ScheduleID, None) except: loggerr(301, 'Cannot access catalog. Suspending operation.') break interval=maxwait # Default sleep time. May be recalculated below try: (atime, aurl)=Schedule.armed_event() # Get next armed event except: loggerr(302,'Cannot access catalog. Suspending operation.') break # out of infinite loop if atime is None: #loggerr(0, 'No armed events.') # debug pass # Sleep some more elif atime.isFuture(): # The next armed event is not yet ready # calculate how long we have to wait ainterval=atime.timeTime() - DateTime().timeTime() if ainterval < float(0): ainterval=float(0) # Is negative bad? interval=ainterval # Comment out for debugging to limit to maxwait else: # This event is ready now. # Fire event, and log its output emsg= '\nTrigger event: %s\nTrigger time: %s' % (aurl, atime) furl=string.join((aurl, 'trigger'), '/') try: (headers,response)=rpc(furl) # Fire event dmsg='%s\n' % response loggerr(0, emsg, detail=dmsg) # Log the event and its output except: type, val, tb = sys.exc_info() dmsg="Failed to trigger event.\nType=%s\nVal=%s\n" % (type, val) loggerr(100, emsg, detail=dmsg) del type, val, tb, dmsg try: rpc('%s/%s' % (aurl, 'disarm')) # Attempt to disarm loggerr(100, 'Disarmed event', detail='') except: # aurl is probably pointing to an event that no longer exists # or the url doesn't resolve correctly loggerr(100, "Failed to disarm event", detail='') # Let's just kick it out of the catalog # Otherwise, this event will come back to haunt us try: Schedule.exterminate(aurl) get_transaction().commit() except: pass # Finished processing one event app._p_jar.sync() # see ZODB/Connection.py interval=float(0) # We're going to sleep now; so, free the connection app._p_jar.close() del app # Sleep for predetermined interval #emsg= 'Going to sleep for %s seconds' % (interval) ScheduleChange.wait(interval) # in seconds (float) if ScheduleChange.isSet(): #loggerr(0, 'Awakened by set event.') # debug ScheduleChange.clear() # Schedule has changed, we woke up early # Loop back to the top and check for # an earlier event than we were waiting for #else: #loggerr(0, 'Awakened by timeout.') # debug # We timed out, so there must be a ready event # Loop back to the top and trigger the next ready event. # That's probably the one we were waiting for. #pass # End of while 1 loop. # Something bad happened, let's clean up before quitting for good try: app._p_jar.close() del app except: pass loggerr(100, 'Dispatcher thread is terminating.') ============== end ================
participants (2)
-
Capesius, Alan -
Loren Stafford