[Zope-CMF] Re: cmfuid
Chris McDonough
chrism at plope.com
Tue Nov 23 16:40:57 EST 2004
On Tue, 2004-11-23 at 12:33, Gregoire Weber wrote:
> Hi Chris,
>
>
> >Here's what happens in the pathological case (broken down by "time
> >units"):
> >
> >time 0: counter is at 0
> >
> >time 1: thread 1 changes the counter during a new uid call,
> > the generated uid is 1
> >
> >time 1: thread 2 changes the counter during a new uid call
> > the generated uid is 1
> >
> >(note that no conflicts have happened yet, write conflicts are only
> >raised at commit time and read conflicts don't happen because the Length
> >object is _p_independent)
> >
> >time 2: thread 1 commits
> >
> >time 3: thread 2 commits, but commit generates a write conflict
> > due to thread 1 having committed beforehand. The write
> > conflict for the counter is resolved and it is set to 2.
> >
> >Note that the counter is indeed correct (it's now 2, the next uid handed
> >out will be 3) but we've handed out the uid "1" twice. We resolved the
> >write conflict on the Length object at commit time but it didn't help
> >us. Both threads committed after giving two different callers the same
> >uid, so we presumably now have two objects with the same "uid" value,
> >which is not desirable.
>
> Oh dear, now I see it!
>
> I hoped the BTrees.Length would manage everything for us ... (but I know by
> experience that when you think like this an alarm bell should ring).
>
> So the fast solution for now is to have a hot spot (counter) ...
A simple integer counter will probably work for Zope 2.7 (at great
expense for applications that have a lot of uid-generation concurrency;
it's arguable whether there are many of these, however).

But the more I think about it, the more I realize that the simple counter
will fail to work properly under 2.8 with MVCC. I *think* it will exhibit
the above symptom for a different reason, unless there is a facility
within ZODB 3.3 to override what happens when read conflicts occur (and
thus MVCC kicks in). I don't know that there isn't such a thing, but I've
not heard of it. AFAIK, the MVCC behavior when "resolving" a read
conflict is hardwired.
> >Making _p_independent of the Length object return false will cause a
> >read conflict to be generated at the time of uid generation (during
> >"change") if two threads ask for a uid simultaneously. This has the
> >effect at least under Zope 2.7 of causing actually unique ids to be
> >generated. This doesn't work under 2.8. Under 2.8, MVCC begins to kick
> >in and we have the same problem again even if we override
> >_p_independent.
>
> I have to think more about that later (when I have more time).
> But am I right that the current solution would be ok for Zope 2.8 with
> MVCC?
The current solution does not guarantee unique ids under either Zope 2.7
or 2.8+MVCC. The "plain integer counter" will likely generate unique ids
with 2.7 but not with 2.8+MVCC! :-(
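To make the failure of the current (Length-based) counter concrete, here is
a toy replay of the time 0 to time 3 sequence quoted earlier. It is only a
sketch and does not touch ZODB: ToyTransaction and resolve_length_conflict
are made-up names, and the only thing borrowed from the real Length object
is its conflict-resolution arithmetic (committed + new - old).

```python
def resolve_length_conflict(old, committed, new):
    # Same arithmetic a Length object uses to merge concurrent changes:
    # apply both deltas to the common starting value.
    return committed + new - old


class ToyTransaction:
    """Hypothetical stand-in for one thread's private view of the counter."""

    def __init__(self, snapshot):
        self.start = snapshot   # counter value seen when the transaction began
        self.value = snapshot   # private, uncommitted copy

    def new_uid(self):
        self.value += 1         # plays the role of counter.change(1)
        return self.value       # the uid handed to the caller


stored = 0                      # time 0: committed counter value

t1 = ToyTransaction(stored)
t2 = ToyTransaction(stored)
uid1 = t1.new_uid()             # time 1: thread 1 gets a uid
uid2 = t2.new_uid()             # time 1: thread 2 gets a uid

stored = t1.value               # time 2: thread 1 commits cleanly
# time 3: thread 2 commits, the write conflict is resolved
stored = resolve_length_conflict(t2.start, stored, t2.value)

print(uid1, uid2, stored)
```

The counter ends up correct, but both callers already walked away with the
same uid before any conflict could be noticed.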
> Anyway, before the whole MVCC discussion arose I thought the
> ZODB was already MVCC capable.
Only with 2.8, which includes ZODB 3.3.
> >I think it is a real problem, but I'm not sure of the best way to fix
> >it. I'd just use a probabilistic generator but Tres doesn't like them,
> >I think. I don't yet understand why, but I'm sure there's a good reason
> >(Tres is right about 99.3% of the time ;-).
>
> Hehe!
>
>
> >No, although presumably it would be pretty simple to make it do so, I'd
> >just steal the code from Python 2.4.
>
> There will be a UUID implementation in Python 2.4?
No, but there is an "os.urandom" function which tries hard to provide a
good source of entropy (by using /dev/urandom under UNIX and the Windows
equivalent). Creating random UUIDs from this is very easy. I'm not
sure whether there are systems "in the wild" that do not have good
entropy sources, but for those I suppose you could generate a different
type of UUID composed of time, ip address, and so forth. It's just 128
bits of *something*.
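As a rough illustration of how little code a random UUID takes, here is a
minimal sketch built on os.urandom. The function name random_uuid is made
up for this example, and the bit-twiddling assumes the usual "version 4"
(random) UUID layout: 128 random bits with the version and variant fields
overwritten.

```python
import binascii
import os


def random_uuid():
    """Return a random (version 4 style) UUID as a 36-character string."""
    raw = bytearray(os.urandom(16))        # 128 bits from the OS entropy source
    raw[6] = (raw[6] & 0x0F) | 0x40        # high nibble of byte 6: version 4
    raw[8] = (raw[8] & 0x3F) | 0x80        # top two bits of byte 8: variant 10
    h = binascii.hexlify(bytes(raw)).decode("ascii")
    # Conventional 8-4-4-4-12 grouping of the 32 hex digits.
    return "-".join((h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]))


print(random_uuid())
```

No shared state is touched, so nothing here can conflict at commit time;
uniqueness rests entirely on the quality of the entropy source.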
> >> I would propose here (after having corrected the counter issue) to add
> >> a new generator tool which users can use to replace the standard one.
> >
> >I'm not sure we need two default id generators, but I'll do whatever
> >anybody wants done.
>
> Just an idea:
>
> A possible solution may be to have uids and uuids in parallel. Just
> appending every new uuid to a registry (IOBTree) and then take the index
> as a "counter uid" (for Tres and me).
Maybe, I guess it depends how complex you wanted to make the tool. I
have no intrinsic love for UUIDs. I don't need or necessarily want the
default uid tool to generate UUIDs (I can subclass the tool if I want to
do this). It's fine if it just generates integer ids if that's simplest
for everybody else.
One thing that would probably work, if you wanted integer uids, were
willing to accept the expense of keeping around every uid you'd ever
generated, and were willing to accept non-contiguous uids, is this: you
could maybe use a BTrees object (IITreeSet, maybe?) as a "uid store".
When you wanted a new uid, you'd do something like this (where self.uids
is an IITreeSet):
    while 1:
        uid = random.randint(0, sys.maxint-1)
        # insert() returns a true value only if uid was not already stored
        if self.uids.insert(uid):
            return uid
You will probably run into problems due to concurrency doing the same
with contiguous serial uids (if, for example, you used "uid =
self.uids.maxKey()+1" instead of a random.randint-generated one), as it
may exhibit the same symptom as described above. It's easiest to not
use contiguous uids, AFAICT.
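For anyone who wants to play with the idea outside of Zope, here is a
self-contained sketch of the random "uid store". ToyTreeSet is a made-up
stand-in that only mimics the one IITreeSet behaviour the loop relies on
(insert() returns 1 when the key is new and 0 when it is already there);
UidStore, MAX_UID and the rng parameter are likewise names invented for
the example. The upper bound assumes 32-bit integer keys.

```python
import random


class ToyTreeSet:
    """Stand-in for an IITreeSet; only insert() is modelled."""

    def __init__(self):
        self._keys = set()

    def insert(self, key):
        if key in self._keys:
            return 0            # key already present, nothing added
        self._keys.add(key)
        return 1                # key was added


class UidStore:
    MAX_UID = 2**31 - 1         # assumed limit for 32-bit integer keys

    def __init__(self, uids=None, rng=random):
        self.uids = uids if uids is not None else ToyTreeSet()
        self.rng = rng          # injectable so collisions can be forced in tests

    def new_uid(self):
        while True:
            uid = self.rng.randint(0, self.MAX_UID)
            if self.uids.insert(uid):   # retry on the rare collision
                return uid


store = UidStore()
print([store.new_uid() for _ in range(3)])
```

In a real tool the stand-in would be replaced by a persistent IITreeSet, and
two threads picking the same random number would show up as an ordinary
write conflict at commit rather than as a silently duplicated uid.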
> May we run into problems with conflict errors this way?
Any way you go, if you use ZODB, you will get some number of conflict
errors, which is usually ok if you can minimize them. OTOH, dealing
with conflicts is hard and the best strategy is the simplest. I think
the above "IITreeSet" strategy is probably the simplest.
OTOH, calling a "uuidgen" function is simpler still and generates no
conflicts whatsoever but requires that you trust the algorithm of the
uuidgen function to actually generate probabilistically unique ids.
Some combination of the two might be in order. Shrug. ;-)
- C