[Zope-dev] RE: [ZODB-Dev] [Warning] Zope/ZEO clients: subprocesses
can lead to non-deterministic message loss
Tim Peters
tim at zope.com
Fri Jun 25 16:44:16 EDT 2004
[Dieter Maurer]
> ATTENTION: Crosspost -- Reply-To set to 'zope-dev at zope.org'
Which I've honored.
> Today, I hit a nasty error.
>
> The error affects applications under Unix (and maybe Windows) which
>
> * use an "asyncore" mainloop thread (and maybe other asyncore
> applications)
>
> Zope and many ZEO clients belong to this class
Note a possible complication: ZEO monkey-patches asyncore, replacing its
loop() function with one of its own. This is done in ZODB's
ThreadedAsync/LoopCallback.py.
> and
>
> * create subprocesses (via "fork" and "system", "popen" or friends if
> they use "fork" internally (they do under Unix but I think not
> under Windows)).
It may be an issue under Cygwin, but not under native Windows, which
supports no way to clone a process; file descriptors may get inherited by
child processes on Windows, but no code runs by magic.
> The error can cause non-deterministic loss of messages (HTTP requests,
> ZEO server responses, ...) destined for the parent process. It also can
> cause the same output to be sent several times over sockets.
>
> The error is explained as follows:
>
> "asyncore" maintains a map from file descriptors to handlers.
> The "asyncore" main loop waits for any file descriptor to
> become "active" and then calls the corresponding handler.
There's a key related point, though: asyncore.loop() terminates if it sees
that the map has become empty. This appears to have consequences for the
correctness of workarounds. For example, this is Python's current asyncore
loop (the monkey-patched one ZEO installs is similar in this respect):
    def loop(timeout=30.0, use_poll=False, map=None):
        if map is None:
            map = socket_map

        if use_poll and hasattr(select, 'poll'):
            poll_fun = poll2
        else:
            poll_fun = poll

        while map:
            poll_fun(timeout, map)

If map becomes empty, loop() exits.
> When a process forks the complete state, including file descriptors,
> threads and memory state is copied and the new process
> executes in this copied state.
> We now have 2 "asyncore" threads waiting for the same events.
Sam Rushing created asyncore as an alternative to threaded approaches;
mixing asyncore with threads is a nightmare; throwing forks into the pot too
is a good working definition of hell <wink>.
> File descriptors are shared between parent and child.
> When the child reads from a file descriptor from its parent,
> it steals the corresponding message: the message will
> not reach the parent.
>
> While file descriptors are shared, memory state is separate.
> Therefore, pending writes can be performed by both
> parent and child -- leading to duplicate writes to the same
> file descriptor.
>
>
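The stolen-message effect described above can be reproduced without asyncore at all. The following is only a minimal sketch, assuming a Unix system with os.fork() and Python 3; the function name demo_stolen_message, the message text, and the extra pipe used for reporting are all made up for illustration. A message is already pending on a socket when the process forks, the child reads it through its inherited descriptor, and the parent then finds nothing left to read.

```python
import os
import socket


def demo_stolen_message():
    """Fork with a message pending and let the child read it first."""
    sender, receiver = socket.socketpair()
    report_r, report_w = os.pipe()        # child -> parent report channel
    sender.sendall(b"request-1")          # message destined for the parent

    pid = os.fork()
    if pid == 0:
        # Child: the inherited descriptor refers to the very same socket.
        try:
            data = receiver.recv(1024)
            os.write(report_w, data)
        finally:
            os._exit(0)                   # never fall back into parent code

    # Parent: wait until the child is done, then look for the message.
    os.waitpid(pid, 0)
    os.close(report_w)
    child_saw = os.read(report_r, 1024)
    os.close(report_r)

    receiver.setblocking(False)
    try:
        parent_saw = receiver.recv(1024)
    except BlockingIOError:
        parent_saw = b""                  # nothing left: the child consumed it

    sender.close()
    receiver.close()
    return child_saw, parent_saw
```

Here the parent waits for the child on purpose so the outcome is repeatable. With two asyncore loops really racing for the same descriptor, which process gets the message depends on scheduling.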
> A workaround is to deactivate "asyncore" before forking (or "system",
> "popen", ...) and reactivate it afterwards: as exemplified in the
> following code:
>
> from asyncore import socket_map
> saved_socket_map = socket_map.copy()
> socket_map.clear() # deactivate "asyncore"
As noted above, this may (or may not) cause asyncore.loop() to plain stop,
in parent and/or in child process. If there aren't multiple threads, it's
safe, but presumably you have multiple threads in mind, in which case
behavior seems unpredictable (will the parent process's thread running
asyncore.loop() notice that the map has become empty before the code below
populates the map again? asyncore.loop() will or won't stop in the parent
depending on that timing accident).
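The unlucky side of that timing accident can be sketched with a stand-in loop. This does not use asyncore itself: loop() below only mimics the "while map:" shape shown earlier, and fake_map, clear_then_restore and the gap parameter are hypothetical names for illustration. If the looping thread wakes up while the map is empty, it returns, and repopulating the map afterwards does not bring it back.

```python
import threading
import time


def loop(map, timeout=0.01):
    # Same shape as asyncore.loop(): keep polling only while map is non-empty.
    while map:
        time.sleep(timeout)          # stand-in for poll_fun(timeout, map)


def clear_then_restore(gap):
    """Clear the map, wait `gap` seconds, restore it; report if loop() quit."""
    fake_map = {3: "handler"}        # stand-in for asyncore.socket_map
    worker = threading.Thread(target=loop, args=(fake_map,))
    worker.start()

    saved = fake_map.copy()
    fake_map.clear()                 # "deactivate"
    time.sleep(gap)                  # the fork()/system() call would go here
    fake_map.update(saved)           # "reactivate"

    worker.join(timeout=0.5)
    loop_stopped = not worker.is_alive()

    fake_map.clear()                 # let the worker finish if it survived
    worker.join()
    return loop_stopped
```

With a gap well above the poll timeout, e.g. clear_then_restore(0.2), the loop thread has time to see the empty map and stops. With a gap of 0 the result depends on where the thread happens to be, which is the unpredictability in question.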
> pid = None
> try:
>     pid = fork()
>     if (pid == 0):
>         # child
>         # ...
> finally:
>     if pid != 0:
>         socket_map.update(saved_socket_map) # reactivate "asyncore"
Another approach I've seen is to skip mucking with socket_map directly, and
call asyncore.close_all() first thing in the child process. Of course
that's vulnerable to vagaries of thread scheduling too, if asyncore is
running in a thread other than the one doing the fork() call.
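A rough sketch of that close-in-the-child pattern follows, again assuming Unix and Python 3, and again using a plain dict (channel_map, a made-up name) instead of the real asyncore.socket_map. The child closes every inherited channel and empties its copy of the map first thing, which is essentially what asyncore.close_all() does. Because memory is separate and closing a descriptor in the child does not close it in the parent, the parent's map and socket keep working.

```python
import os
import socket


def close_all_in_child():
    """Fork; the child drops its inherited channels; the parent keeps its own."""
    sender, receiver = socket.socketpair()
    channel_map = {receiver.fileno(): receiver}   # stand-in for socket_map

    pid = os.fork()
    if pid == 0:
        try:
            # Child: close every inherited channel before doing anything else.
            for sock in list(channel_map.values()):
                sock.close()
            channel_map.clear()
            # ... the real work of the child would go here ...
        finally:
            os._exit(0)

    # Parent: its copy of the map and its descriptors are untouched.
    os.waitpid(pid, 0)
    sender.sendall(b"response-1")
    data = receiver.recv(1024)
    still_registered = len(channel_map)

    sender.close()
    receiver.close()
    return data, still_registered
```

This sketch is single-threaded, so it says nothing about the scheduling hazard: if a copied asyncore thread were running in the child, it could still touch the descriptors before the close loop ran.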