[Zope-dev] Possible Windows Service improvements.
Mark Hammond
mhammond at skippinet.com.au
Sun Aug 8 21:10:49 EDT 2004
[Me]
> > By adding a layer around run.py, I believe we could arrange
> > for these fatal errors to be handled with a special return code.
[Jim]
> I assume by "fatal", you mean errors that we should not try to restart
> from.
Correct.
> Let me see if I understand the use cases here:
>
> - Normal shutdown. (Should it be possible to shut down Zope
> through the web on Windows?)
I see no reason it makes less sense for Windows than it does for anywhere
else <wink>.
>
> - Start-up error. We want to log relevant information somewhere.
> We don't want to restart.
>
> - Run-time (after startup) error. We also want to log a problem,
> but we do want to restart Zope.
Yep, I think that covers it.
> Note that we also need to consider uncontrolled exits, like segfaults.
Yes - if the segfault is at startup, it should be considered
non-restartable. Once normal operations have started, a segfault should
cause a restart.
> Perhaps there should be a framework with calls that a program can
> make to indicate normal exit, fatal (non-restartable) exit,
> and non-fatal (restartable) exit.
That could be done with process exit codes if all the child needs to do is
report *exit* status - but that really doesn't cover enough bases. Given the
number of ways programs can fail, it may be hard to guarantee the right code
is always returned, and exit codes don't handle uncontrolled exits or
children going zombie.
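
To make the exit-code idea concrete, here is a minimal sketch of how a parent
might interpret a child's exit code. The constant names and numeric values are
hypothetical, chosen only for illustration; nothing here is an existing Zope
convention.

```python
# Hypothetical exit-code convention; names and values are illustrative only.
EXIT_NORMAL = 0        # clean shutdown: do not restart
EXIT_FATAL = 2         # e.g. start-up error: log it, do not restart
EXIT_RESTARTABLE = 3   # run-time error: log it and restart


def action_for_exit_code(code):
    """Map a child's exit code to what the parent should do next."""
    if code == EXIT_NORMAL:
        return "stop"
    if code == EXIT_FATAL:
        return "stop-and-log"
    if code == EXIT_RESTARTABLE:
        return "restart"
    # Any other code (a crash, a kill) was not chosen by the child, so the
    # exit code alone cannot say whether a restart is safe.
    return "unknown"
```

The "unknown" branch is the weak spot: an uncontrolled exit produces a code
the child never picked.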
What we need is something more authoritative - where the child process
actively signals its state to the parent - ie "starting", "running" or
"stopping". "pausing" and "paused" may also make sense. If the child never
reported 'running', it is non-restartable. If the child terminates without
reporting graceful shutdown, it is restartable.
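
The restart rule above could be sketched roughly as follows. This assumes the
parent keeps an ordered list of the states the child reported, and that a
final report of "stopping" counts as a graceful shutdown; the function name
and that interpretation are assumptions made for the example.

```python
def should_restart(reported_states):
    """Decide whether to restart a child process that has terminated.

    reported_states is the ordered list of states the child reported
    before it terminated, however it terminated.
    """
    if "running" not in reported_states:
        # The child never finished starting: treat as non-restartable.
        return False
    if reported_states[-1] == "stopping":
        # The child announced its shutdown: treat as graceful, no restart.
        return False
    # The child was running and went away without warning: restart it.
    return True
```

Note that the decision depends only on what the child said while alive, so it
gives the same answer for a segfault as for an unhandled exception.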
This still does not provide any way of handling the case when the child
process is running, but failing to transition between states. We still need
a timeout, but can make it more robust by having the child process report
the status *and* the timeout the parent should use.
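
One way to picture the status-plus-timeout report is the sketch below. The
class and method names are hypothetical, and the caller passes in the current
time (for instance from time.monotonic()) so the logic stays easy to test.

```python
class ChildStatus:
    """Parent-side record of the child's most recent status report."""

    def __init__(self, now, initial_timeout):
        # Until the first report arrives, allow a fixed start-up grace period.
        self.state = None
        self.deadline = now + initial_timeout

    def report(self, state, wait_hint, now):
        """Record a report from the child.

        wait_hint is the number of seconds the child says the parent should
        wait for the next report, or None when no transition is pending.
        """
        self.state = state
        self.deadline = None if wait_hint is None else now + wait_hint

    def is_overdue(self, now):
        """True if the child missed the deadline it set for itself."""
        return self.deadline is not None and now > self.deadline
```

Because the child supplies the timeout, a slow but healthy start-up can ask
for more time instead of being killed by a fixed limit in the parent.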
Which, coincidentally, sounds exactly like the Windows Services API <wink>.
Is that sounding reasonable, or moving into too-complicated/YAGNI territory?
Thanks,
Mark.