[Zope-dev] RE: [ZODB-Dev] [Warning] Zope/ZEO clients: subprocesses can lead to non-deterministic message loss

Dieter Maurer dieter at handshake.de
Sun Jun 27 15:18:12 EDT 2004


Tim Peters wrote at 2004-6-27 04:46 -0400:
> ...
>[Dieter]
>>   When a process forks the complete state, including file descriptors,
>>   threads and memory state is copied and the new process
>>   executes in this copied state.
>>   We now have 2 "asyncore" threads waiting for the same events.
>
>A problem is that it's *not* the case that a POSIX fork() clones all
>threads.  Only the thread calling fork() exists in the child process.
>There's a brief but clear discussion of that here:
>
>    http://www.opengroup.org/onlinepubs/009695399/functions/fork.html
>
>POSIX doesn't even have a way to *ask* that all threads be duplicated, for
>reasons explained there.
>
>Last I heard, Dieter was running LinuxThreads, which fail to meet the POSIX
>thread spec in several respects.  But, AFAICT, fork() under LinuxThreads is
>the same as POSIX in this particular respect (since threads are distinct
>processes under LinuxThreads, it would be bizarre if a fork() cloned
>multiple processes!).  I believe native Solaris threads act as Dieter
>describes, though (fork() clones all native Solaris threads).
>
>Dieter, can you clarify which OS(es) and thread package(s) you're using
>here?  Do the things you're doing that call fork() (directly or indirectly)
>actually run from the thread running asyncore.loop()?

The problem occurred in a ZEO client that called "asyncore.poll"
in the forked subprocess. This "poll" deterministically
stole ZEO server invalidation messages from the parent.
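
The effect can be reproduced without ZEO. The following is a minimal
sketch, assuming a POSIX system with os.fork(); the function name and
the message text are made up for illustration, and a socket pair stands
in for the ZEO server connection. The child inherits the client socket
and reads from it, so the parent never sees the message.

```python
import os
import socket


def demo_stolen_message():
    """Return (bytes read by the child, bytes left for the parent)."""
    # Hypothetical stand-in for the ZEO connection: one end plays the
    # server, the other end is the client socket inherited across fork().
    server_end, client_end = socket.socketpair()
    report_r, report_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: reads from the inherited client socket, much like an
        # "asyncore.poll" call in a forked subprocess would.
        try:
            data = client_end.recv(1024)
            os.write(report_w, data)
        finally:
            os._exit(0)

    # Parent: the "server" sends one message, then the child is reaped.
    os.close(report_w)
    server_end.sendall(b"invalidate oid 42")
    os.waitpid(pid, 0)
    stolen = os.read(report_r, 1024)
    os.close(report_r)

    # The parent now looks for the message on its own copy of the socket.
    client_end.setblocking(False)
    try:
        left_for_parent = client_end.recv(1024)
    except BlockingIOError:
        left_for_parent = b""

    server_end.close()
    client_end.close()
    return stolen, left_for_parent
```

In this sketch the parent only reads after the child has exited, so
the child always wins. When both processes poll concurrently, which of
them receives a given message depends on scheduling.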

I read the Linux "fork" manual page and found:

  fork creates a child process that differs from the parent process
  only in its PID and PPID, and in the fact that resource utilizations
  are set to 0. File locks and pending signals are not inherited.

  ...

  The fork call conforms to SVr4, SVID, POSIX, X/OPEN, BSD 4.3


I concluded that if the only differences are the PID/PPID
and the resource utilizations,
then the threads do not differ between parent and child.
This would mean that
the widespread "asyncore.mainloop" threads could suffer
the same message loss and message duplication.

I did not observe a message loss/duplication in any
application with an "asyncore.mainloop" thread.


Maybe the Linux "fork" manual page is merely imprecise with respect
to threads, and the problem does not occur in applications
with a standard "asyncore.mainloop" thread.
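
Whether a background thread survives a fork can be checked directly.
The following is a minimal sketch, assuming a POSIX system with
os.fork(); the function and variable names are made up for
illustration. A worker thread increments a counter in the parent, and
the forked child reports how many threads Python sees there and
whether its copy of the counter still advances.

```python
import os
import threading
import time
import warnings


def demo_fork_threads():
    """Return (thread count in child, counter progress in child)."""
    ticks = [0]
    stop = threading.Event()

    def worker():
        # Stand-in for a background thread such as an asyncore loop.
        while not stop.is_set():
            ticks[0] += 1
            time.sleep(0.01)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while ticks[0] == 0:
        time.sleep(0.01)  # wait until the worker is really running

    report_r, report_w = os.pipe()
    with warnings.catch_warnings():
        # Python 3.12+ warns when fork() is called with threads running.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()

    if pid == 0:
        # Child: if the worker had been cloned, ticks would keep growing.
        before = ticks[0]
        time.sleep(0.2)
        delta = ticks[0] - before
        os.write(report_w, b"%d %d" % (threading.active_count(), delta))
        os._exit(0)

    os.close(report_w)
    os.waitpid(pid, 0)
    child_threads, child_delta = map(int, os.read(report_r, 64).split())
    os.close(report_r)

    stop.set()
    thread.join()
    return child_threads, child_delta
```

If fork() copies only the calling thread, the child should report a
single thread and a counter that does not move. That would be
consistent with not seeing message loss in applications whose asyncore
loop runs in a thread of its own and whose subprocesses never poll.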


-- 
Dieter

