[Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....

Chris McDonough chrism@zope.com
29 May 2003 00:19:30 -0400


On Wed, 2003-05-28 at 21:33, Jeffrey P Shell wrote:
> 
> Something that has happened, and is causing a small amount of alarm, is 
> that a large method that interfaces to external non-transactional 
> systems seems to (on occasion) send their information off to that 
> external system twice, but there's only one matching set of Zope data.  
> As the two writes to the non-transactional system are very close to 
> each other and contain nearly identical data (except for one bit that 
> gets regenerated in the method), and there are conflict INFO reports in 
> the Event Log from around the same time, I'm assuming that a conflict 
> error is happening somewhere in this method and causing the transaction 
> to be retried (if I'm understanding how Conflict Errors work).  Zope 
> and the relational databases seem to do things fine with rolling back 
> the data, but the non-transactional systems now have duplicate data 
> that they **absolutely should not have**.

Within Zope, when a conflict error is raised, ZPublisher catches the
exception and retries the request up to 3 times.  This is why sometimes,
for example, you'll see "double" email notifications from Wiki
subscriptions on zope.org.
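
A minimal sketch of why the retry duplicates the external write, as a plain
Python simulation rather than real ZPublisher code.  The ConflictError class,
publish(), handler() and the two lists are all hypothetical stand-ins; only
the retry limit of 3 comes from the description above.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the real ConflictError."""


external_system = []   # non-transactional: writes can never be rolled back
zope_data = []         # transactional: only updated when an attempt succeeds


def publish(handler, max_retries=3):
    """Simplified model of a publisher that retries on ConflictError."""
    retries = 0
    while True:
        staged = []                    # changes made during this attempt
        try:
            handler(staged)
        except ConflictError:
            if retries >= max_retries:
                raise                  # give up after the last retry
            retries += 1
            continue                   # staged changes are simply discarded
        zope_data.extend(staged)       # "commit" the successful attempt
        return retries


calls = {"count": 0}


def handler(staged):
    calls["count"] += 1
    # The external write happens immediately and survives the rollback.
    external_system.append("order-42 (attempt %d)" % calls["count"])
    staged.append("order-42")
    if calls["count"] == 1:
        raise ConflictError("simulated conflict on the first attempt")


retries_used = publish(handler)
print(external_system)   # two entries, one per attempt
print(zope_data)         # one entry, from the attempt that succeeded
```

The external list ends up with two nearly identical entries while the
transactional data has one, which matches the symptom described in the
original report.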

> This doesn't happen often, but (as stated), this is a critical 
> operation that needs to be better protected.  All other exceptions and 
> bits and pieces in the block of code in question has been tested 
> thoroughly and we have not had any other problems that cause erroneous 
> writes.  Is there a way I can protect against Conflict Error retries as 
> well?  Is there some sort of Try/Except or Try/Finally I can wrap 
> around the code that won't interfere with the ZODB?  Is there any other 
> sort of best-practice here that could help me (and others) who might 
> unknowingly trigger this problem?

Not infallibly.  You can never really know where a ConflictError might
be raised.  Any concurrent access to a persistent object is a possible
candidate.

> I know there are some fixes likely to be in Zope 2.6.2 that may help 
> with the situation, but I'd like to put extra protections around this 
> code regardless of what may be coming in the future. 

It will only get worse with 2.6.2: the number of conflict errors caused
by the sessioning machinery in 2.6.2 is going to go up as compared to
2.6.1 and below.  This is because the strategy currently used to reduce
the number of conflict errors causes data desynchronization problems.

Some folks have created products that mimic transactional semantics
(like Jens' MailDropHost) to avoid this kind of problem.  You might want
to try the same...
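
A rough sketch of the general idea, assuming the external write can be
deferred: queue it during the request and send it only once an attempt has
succeeded.  This is not how MailDropHost is implemented; DeferredSender,
run_with_retries() and handler() are hypothetical names, and the simulation
does not model conflicts raised at commit time.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the real ConflictError."""


class DeferredSender:
    """Buffers messages for a non-transactional system until commit()."""

    def __init__(self, send):
        self._send = send          # callable that performs the real write
        self._pending = []

    def queue(self, message):
        self._pending.append(message)

    def commit(self):
        pending, self._pending = self._pending, []
        for message in pending:
            self._send(message)

    def abort(self):
        self._pending = []         # drop everything queued by a failed attempt


def run_with_retries(handler, sender, max_retries=3):
    """Retry handler on ConflictError; flush the sender only on success."""
    for attempt in range(max_retries + 1):
        try:
            result = handler(sender)
        except ConflictError:
            sender.abort()
            if attempt == max_retries:
                raise
            continue
        sender.commit()
        return result


delivered = []
attempts = {"count": 0}


def handler(sender):
    attempts["count"] += 1
    sender.queue("order-42")
    if attempts["count"] == 1:
        raise ConflictError("simulated conflict on the first attempt")
    return "ok"


result = run_with_retries(handler, DeferredSender(delivered.append))
print(delivered)   # a single entry despite the retry
```

In a real Zope setup the flush would have to be tied to the actual
transaction commit, and a failure while sending after the commit still needs
its own handling, so this narrows the window for duplicates rather than
closing it.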

- C