Well, after more poking around and experimenting, I'm able to make an strace look like what my crashing server does by following these steps...

  start Zope (z2.py)
  wait till everybody starts
  attach an strace to child z2
  wait till strace output shows I'm in select
  send child z2 a SEGV

  strace output matches what I see in production
  event.log matches what I see in production

Now that I've figured this out, I managed to get ulimit set up so that I will get a core file the next time this happens. Let's hang on till I see the core.

I'm also going to play with getting gdb to work nicely. I'm not sure why this is failing yet.

-Jon

Michel Pelletier <michel@digicool.com> writes:
But fs/select.c (sys_select) in the Linux kernel DOES return ERESTARTNOHAND if do_select returns a negative value. This is, of course, not documented in the select man page... come to think of it, it looks like the Linux kernel returns ERESTARTNOHAND instead of EINTR because (quoting the source comments) "ERESTARTSYS [instead of EINTR] breaks at least the xview clock binary, so I'm trying ERESTARTNOHAND which restart only when you want to" [bracketed comments are mine].
Fishy.
-Michel
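
For anyone who wants to try the reproduction steps from the top of this message without a full Zope install, here is a rough sketch in Python. It is only a stand-in: the child below is a hypothetical placeholder that blocks in select() on a pipe, not z2.py, and the names CHILD_SOURCE and reproduce are made up for illustration. It assumes a POSIX system. To watch the system calls, attach `strace -p <pid>` to the child by hand before the signal is sent.

```python
import os
import signal
import subprocess
import sys
import time

# Hypothetical stand-in for the child z2 process: it announces that it is
# up, then blocks in select() on a pipe that never becomes readable.
CHILD_SOURCE = """
import os, select
r, w = os.pipe()
print("ready", flush=True)
select.select([r], [], [])
"""


def reproduce(delay=0.2):
    """Start the stand-in child, wait for it to block, send it SIGSEGV.

    Returns the child's return code as reported by subprocess.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
    )
    # "wait till everybody starts": the child prints a line once it is up.
    child.stdout.readline()
    # Give the child a moment to actually enter select(). This is the point
    # where an strace could be attached manually with `strace -p <pid>`.
    time.sleep(delay)
    # "send child z2 a SEGV"
    os.kill(child.pid, signal.SIGSEGV)
    child.wait()
    child.stdout.close()
    return child.returncode


if __name__ == "__main__":
    print("child return code:", reproduce())
```

On POSIX, subprocess reports a child killed by a signal as a negative return code, so a child that died from the SEGV should show up as minus the signal number. Whether a core file is written still depends on the ulimit settings of the shell that launched it.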