[Zope] zope operations atomic?

Thu, 01 Nov 2001 18:18:23 -0500

"ghost out" means "deactivate" the object, meaning essentially that we 
dump all of the object's state out of memory except what is necessary to 
load it back in from disk.
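
To make that concrete, here is a toy sketch of the idea. It is not the real ZODB implementation: ToyPersistent, ghost_out, is_ghost and the dictionary standing in for the on-disk storage are all made-up names for illustration. The object keeps only its id and a handle on the storage, and reloads the rest the first time an attribute is touched.

```python
class ToyPersistent:
    """Toy stand-in for a persistent object (hypothetical, not ZODB)."""

    # The only attributes a ghost keeps: enough to find its state again.
    _KEEP = ("_oid", "_storage")

    def __init__(self, oid, storage, **state):
        self._oid = oid
        self._storage = storage  # a plain dict standing in for the disk
        self.__dict__.update(state)
        self.save()

    def save(self):
        # Write everything except the bookkeeping attributes to "disk".
        self._storage[self._oid] = {
            k: v for k, v in self.__dict__.items() if k not in self._KEEP
        }

    def ghost_out(self):
        # Drop all in-memory state except what is needed to reload it.
        # This toy does not guard against discarding unsaved changes.
        for key in list(self.__dict__):
            if key not in self._KEEP:
                del self.__dict__[key]

    def is_ghost(self):
        return all(key in self._KEEP for key in self.__dict__)

    def __getattr__(self, name):
        # Only called when normal lookup fails, e.g. on a ghost.
        if name in ToyPersistent._KEEP:
            raise AttributeError(name)
        state = self._storage[self._oid]
        if name not in state:
            raise AttributeError(name)
        self.__dict__.update(state)  # reload the full state from "disk"
        return state[name]


storage = {}
folder = ToyPersistent("Folder-1000", storage, foo=1)
folder.ghost_out()
print(folder.is_ghost())  # True: only _oid and _storage remain in memory
print(folder.foo)         # 1: touching foo reloads the state
print(folder.is_ghost())  # False
```

The point for the conflict discussion is that a ghosted object has to be loaded again later in the same transaction, and by then another connection may have committed a newer version of it.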

Clark OBrien wrote:
> I will convert my example to code and post the results.
> I did not understand case b). In particular, what does "ghost out" mean?
> 
> -----Original Message-----
> From: Chris McDonough [mailto:chrism@digicool.com]
> Sent: Thursday, November 01, 2001 2:01 PM
> To: Clark OBrien
> Cc: 'k_vertigo@yahoo.com'; 'zope@zope.org'
> Subject: Re: [Zope] zope operations atomic?
> 
> 
> Clark OBrien wrote:
>  > It is what happens with this retrying that interests me. As the example
>  > given below shows, this retry may never succeed. Thus long-running
>  > operations may never complete and are effectively starved by
>  > faster-running operations.
>  >
>  >
>  >
>  > Assume in Zope I have a folder structure like this.
>  >
>  > Folder-1
>  >   -Folder-2
>  >       -Folder-3
>  >          ....
>  >             ..
>  >               Folder-1000
>  >
>  >
>  > Suppose I have a script traverseFolder(root) that starts at a given
>  > root and traverses sub-folders, adding the attribute foo.
>  >
>  > If I continuously call traverseFolder(Folder-1000), how could
>  > traverseFolder(Folder-1) ever complete? I mean, by the time
>  > traverseFolder(Folder-1) could complete, traverseFolder(Folder-1000) would
>  > have already committed several times. What would happen with the request
>  > that started traverseFolder(Folder-1)? Would it keep being retried ad
>  > infinitum?
>  > It is basically starved out of contention by faster-running operations.
> 
> It does get interesting when you consider that the longer-running 
> transaction might always tend to lose on read conflicts because:
> 
> a) read conflicts can't be resolved.
> b) the longer-running transaction *might* "ghost out" Folder-1000
>     to conserve RAM during the traversal after the first (aborted)
>     commit.
> 
> If b) is true in your example for every run of traverseFolder(Folder-1), 
> your contention that it will never commit might prove correct.  It'd be 
> slightly interesting to try it to see what the behavior actually is. 
> Would you be willing to do so?
> 
> In any case, we have an answer to this problem if you're willing to lose 
> a bit of consistency.  CoreSessionTracking's LowConflictConnection class 
> comes in handy here.  If your application is very write-intensive and 
> you've carefully coded a massive hotspot ala your example into it, you 
> can turn off read conflicts by using this class at the expense of some 
> consistency.
> 
> With read conflicts turned off, I'm positive your example will 
> eventually resolve itself.  Maybe it'll take a few minutes of retries, 
> but it will eventually finish. I say that because the longer running of 
> the two scripts will eventually be able to do the commit because it's 
> statistically as likely to "win" a commit as the shorter-running script 
> when there's a write conflict; it just has fewer opportunities to do so.
> 
> And actually if you used a special FooFolder instead of a Folder for 
> this demonstration, the only thing that changed was foo, and the value 
> of foo was often the same in both connections on the object upon which 
> the transaction conflicted, you could resolve most of the conflicts that 
> could potentially occur here (save for the read conflicts) by giving it 
> a _p_resolveConflict that looked something like this:
> 
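> class FooFolder:
>     def _p_resolveConflict(self, old, saved, new):
>         # old, saved and new are state dictionaries: the common
>         # ancestor, the already-committed state, and the state this
>         # transaction is trying to commit.
>         marker = []
>         # Give up if any key in saved differs from (or is missing in) new.
>         for k, v in saved.items():
>             if new.get(k, marker) != v:
>                 return None
>         # Likewise for any key in new that differs from saved.
>         for k, v in new.items():
>             if saved.get(k, marker) != v:
>                 return None
>         # The two states match, so either one resolves the conflict.
>         return new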
> 
> 
> However, the real answer is:  Don't design your application like this if 
> you can help it.  This is not a good pattern.  It's best to avoid 
> hotspots like Folder-1000 in your example.  It's no different than 
> continually beating the snot out of a field in a row in a relational 
> database table with writes from multiple threads, where one of the 
> threads is running an overnight transaction and the others are just 
> incrementing the field every second.  You have the same consistency, 
> contention, and timeliness issues there, AFAICT, except that it's 
> expressed in terms of pages, locks, and dirty reads.  (I'm sure there's 
> an Oracle person waiting around to "WRONG!" me to death, however.  ;-)
> 
> - C
> 
> 

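The _p_resolveConflict idea in the quoted message can be tried out on plain dictionaries, without Zope. This is only a sketch of the comparison logic: resolve_if_identical is a made-up standalone name, the three arguments are assumed to be state dictionaries, and returning None simply follows the quoted example as its way of saying "not resolved". It uses a unique sentinel object so that a missing key can never compare equal to a stored value such as an empty list.

```python
_MISSING = object()  # unique sentinel: compares equal only to itself


def resolve_if_identical(old, saved, new):
    """Return new if the saved and new states are identical, else None.

    old is accepted only to mirror the three-argument signature in the
    quoted message; this simple check does not need it.
    """
    # Every key in the committed state must have the same value in new.
    for key, value in saved.items():
        if new.get(key, _MISSING) != value:
            return None
    # And new must not carry any key or value that saved lacks.
    for key, value in new.items():
        if saved.get(key, _MISSING) != value:
            return None
    return new


# Both connections set foo to the same value: the conflict is harmless.
print(resolve_if_identical({"foo": 0}, {"foo": 1}, {"foo": 1}))

# The connections disagree about foo: not resolved.
print(resolve_if_identical({"foo": 0}, {"foo": 1}, {"foo": 2}))
```

This only helps with write conflicts where both sides ended up with the same state. It does nothing for read conflicts.
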
-- 
Chris McDonough                    Zope Corporation
http://www.zope.org             http://www.zope.com
"Killing hundreds of birds with thousands of stones"