[Zope-dev] RE: [ZODB-Dev] [Warning] Zope/ZEO clients:
subprocesses can lead to non-deterministic message loss
Tim Peters
tim.peters at gmail.com
Sun Jun 27 17:06:52 EDT 2004
[Dieter Maurer]
> The problem occurred in a ZEO client which called "asyncore.poll"
> in the forked subprocess. This "poll" deterministically
> stole ZEO server invalidation messages from the parent.
I'm sorry, but this is still too vague to guess what happened.
- Which operating system was in use?
- Which thread package?
- In the ZEO client that called fork(), did it call fork() directly, or
indirectly as the result of a system() or popen() call? Or what?
- In the ZEO client that called fork() (whether directly or indirectly),
was fork() called *from* the thread running ZEO's asyncore loop,
or from a different thread?
I'd like to understand a specific failure before rushing to
generalization.
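For readers trying to picture the failure mode under discussion, here is a minimal sketch of how a forked child can consume data meant for its parent when both hold the same socket. It is not ZEO code: the function name child_steals_message and the payload are hypothetical, a socketpair stands in for the ZEO server connection, and a bare select() call stands in for asyncore.poll (asyncore itself was removed from the standard library in Python 3.12). It assumes a Unix system where os.fork() is available.

```python
import os
import select
import socket


def child_steals_message(payload=b"invalidate oid 42"):
    """Return (bytes read by the child, whether the parent still sees data)."""
    # Hypothetical stand-in for the ZEO connection: server_end plays the
    # server, client_end plays the client socket that the parent polls.
    server_end, client_end = socket.socketpair()
    report_r, report_w = os.pipe()

    # The "server" sends one message before the fork happens.
    server_end.sendall(payload)

    pid = os.fork()
    if pid == 0:
        # Child: it inherited client_end, so polling it here reads from the
        # very same kernel socket the parent is waiting on.
        try:
            readable, _, _ = select.select([client_end], [], [], 1.0)
            data = client_end.recv(4096) if readable else b""
            os.write(report_w, data)
        finally:
            os._exit(0)

    # Parent: wait for the child, then poll the socket without blocking.
    os.close(report_w)
    os.waitpid(pid, 0)
    stolen = os.read(report_r, 4096)
    os.close(report_r)
    readable, _, _ = select.select([client_end], [], [], 0)
    left_for_parent = bool(readable)

    server_end.close()
    client_end.close()
    return stolen, left_for_parent
```

In this sketch the parent waits for the child before polling, so the outcome is fixed: the child gets the message and the parent finds nothing left to read. If both processes polled concurrently, which one received a given message would depend on scheduling.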
> I read the Linux "fork" manual page and found:
>
> fork creates a child process that differs from the parent process
> only in its PID and PPID, and in the fact that resource utilizations
> are set to 0. File locks and pending signals are not inherited.
>
> ...
>
> The fork call conforms to SVr4, SVID, POSIX, X/OPEN, BSD 4.3
If it conforms to POSIX (as it says it does), then fork() also has to
satisfy the huge list of requirements I referenced before:
http://www.opengroup.org/onlinepubs/009695399/functions/fork.html
That page is the current POSIX spec for fork().
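One of the requirements on that page is that the child process is created with a single thread, a replica of the thread that called fork(). A rough way to observe this from Python is sketched below; the helper name threads_seen_by_child is hypothetical, and the count it reports comes from the threading module's own bookkeeping, which Python resets in the child after a fork. It assumes a Unix system with os.fork().

```python
import os
import threading


def threads_seen_by_child():
    """Return (thread count in the parent, thread count reported by the child)."""
    # Start one extra thread that just idles until told to stop.
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()
    parent_count = threading.active_count()

    report_r, report_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() exists here.
        try:
            os.write(report_w, str(threading.active_count()).encode())
        finally:
            os._exit(0)

    # Parent: collect the child's report, then clean up the worker thread.
    os.close(report_w)
    child_count = int(os.read(report_r, 16))
    os.close(report_r)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return parent_count, child_count
```

The parent reports at least two threads while the child reports one. Recent Python versions also emit a DeprecationWarning when fork() is called in a multi-threaded process, which is in line with the cautions in this thread.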
> I concluded that if the only difference is in the PID/PPID
> and resource utilizations, there is no difference in the threads between parent
> and child.
Except that if you're running non-POSIX LinuxThreads, a thread *is* a
process (there's a one-to-one relationship under LinuxThreads, not the
many-to-one relationship in POSIX), in which case "no difference in
threads" is trivially true.
> This would mean that the widespread "asyncore.mainloop" threads could suffer
> the same message loss and message duplication.
That's why all sane <wink> threading implementations do what POSIX
does on a fork(): the child gets a replica of only the thread that
called fork(). fork() and threading don't really mix well under
POSIX either: the "fork+exec" model for starting a new process is
an historical burden that bristles with subtle problems in a
multithreaded world. POSIX introduced posix_spawn() and posix_spawnp()
for sane(r) process creation, ironically moving closer to what most
non-Unix systems have always done to create a new process.
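As an illustration of the posix_spawn() style of process creation, here is a small sketch using Python's os.posix_spawn wrapper (available on Unix in Python 3.8 and later). The helper name spawn_and_capture and the one-line child program are hypothetical. The point is that the new process is described up front, including which descriptors it should see, rather than being carved out of a copy of the parent.

```python
import os
import sys


def spawn_and_capture(code="print('spawned')"):
    """Run a fresh Python interpreter via posix_spawn and capture its stdout."""
    read_fd, write_fd = os.pipe()

    # File actions are applied in the new process before the program starts:
    # make the pipe's write end its stdout, and close the read end there.
    file_actions = [
        (os.POSIX_SPAWN_DUP2, write_fd, 1),
        (os.POSIX_SPAWN_CLOSE, read_fd),
    ]
    pid = os.posix_spawn(
        sys.executable,
        [sys.executable, "-c", code],
        os.environ,
        file_actions=file_actions,
    )

    # Parent: close our copy of the write end so the read loop sees EOF.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)

    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(status)
```

No Python code from the parent ever runs in the new process here, so there is no window in which the child could poll a socket it happened to inherit.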
> I did not observe a message loss/duplication in any
> application with an "asyncore.mainloop" thread.
I don't understand. You said that you *have* seen message
loss/duplication in a ZEO client, and I assume the ZEO client was
running an asyncore thread. If so, then you have seen
loss/duplication in an application with an asyncore thread.
Or are you saying that you haven't seen loss/duplication under the
specific Linux flavor whose man page you quoted, but have seen it
under some other (so far unidentified) system?
> Maybe the Linux "fork" manual page is just imprecise with respect
> to threads and the problem does not occur in applications
> with a standard "asyncore.mainloop" thread.
That "fork" manpage is clearly missing a mountain of crucial details
(or it's not telling the truth about being POSIX-compliant). fork()
is historically poorly documented, though.