recursor: ensure on_recursor_stop is run when running under systemd
#14979
Comments
A likely building block is making |
With the PR and
I get:
|
I am a bit hesitant to add this in 5.2.0, otoh, if I checked with the 5.2.0 state of affairs systemd does not restart rec if it is stopped with |
If the
leaving only the second timeout. Edit: There's the discussion with the rationale. So:
Edit 2: Therefore even with asynchronous Either way, async or not, there should be Edit 3: |
I guess you mean SIGTERM and SIGKILL, respectively?
Mostly historical reasons. The graceful shutdown code is relatively recent. Initially I wrote it to be able to get a meaningful valgrind run (as global destructors are called). There also might be cases where it matters that SIGTERM is an immediate exit vs a graceful shutdown which could take time (especially with the stop hook being available now). Also, just calling the graceful stop code from a signal handler is not possible, as the code itself is not async-signal-safe. It would require some more code than just a call. I think having a synchronous |
Yes (the
It is. My point is - OP's assumption about "immediate SIGKILL when async" is wrong (this was a bug in the documentation a long time ago) and one way or another you can put Another matter is when to actually do the graceful exit. As it cleans up, it is probably slower than an immediate exit. So when I have nothing to clean up, and have no hooks attached, I'd choose to keep the current setup for faster restarts. And by faster I mean also empty What's missing more is
though I'm not sure what's the best order for this. |
OK. Having more structured reloading is one of the long term goals, but we're getting a bit off-topic now. |
I did a few tests with the PR code. With modifications to print when a signal is received.
So it looks like systemd is not waiting at all for the This is in contrast to what you said above ((as I understood it) it waits This suggests that having a synchronous (Note: and this is also how I read systemd/systemd#13284 (comment)) |
Yes (and I can't see saying something else - unless you read the version of my comment, as I did several edits to make that clearer, but might still fail at doing so). Anyway, glad you got this.
It waits
That's why I wrote it's "not worse" than SIGTERM alone.
So both the signals do the same now? I thought SIGTERM is equivalent to |
I probably understood your comment wrongly.
I do not agree. A stop hook being killed by SIGTERM while it's running might be worse than not running it at all. We rely on the default action for SIGTERM, which is to terminate the process, except when running under docker (but let's ignore docker for the moment). This means that the process is stopped by the OS without any of our own code involved. A parent that asks for the process's status will be told that the process was killed by a signal. For As we do not install a handler for SIGTERM, its default action is the same as for a SIGKILL (which cannot be caught). So to summarize:
|
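The difference between the default SIGTERM action and a handled SIGTERM, as seen by the parent process, can be illustrated with a small sketch. This is not the recursor's code (pdns_recursor is C++); it is a Python illustration under the assumption of a POSIX system, and the names `DEFAULT_CHILD`, `HANDLER_CHILD` and `exit_status_after_sigterm` are hypothetical.

```python
import signal
import subprocess
import sys

# Child that installs no SIGTERM handler: the default action terminates it.
DEFAULT_CHILD = """
import time
print("ready", flush=True)
time.sleep(30)
"""

# Child that handles SIGTERM and exits on its own terms with status 0.
HANDLER_CHILD = """
import signal, sys, time
def on_term(signum, frame):
    sys.exit(0)
signal.signal(signal.SIGTERM, on_term)
print("ready", flush=True)
time.sleep(30)
"""


def exit_status_after_sigterm(child_code: str) -> int:
    """Start a child, send it SIGTERM, and return what the parent observes.

    A negative return code -N means the child was terminated by signal N.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code], stdout=subprocess.PIPE
    )
    try:
        proc.stdout.readline()  # wait until the child reports it is ready
        proc.send_signal(signal.SIGTERM)
        return proc.wait(timeout=10)
    finally:
        proc.stdout.close()
        if proc.poll() is None:
            proc.kill()
            proc.wait()
```

With the first child the parent sees a negative status (killed by a signal, none of the child's own code ran); with the second it sees a normal exit, which is what a supervisor would observe after a handled, orderly stop.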
Thank you for verbose explanations. It all makes sense now. |
Short description
In the documentation, it is mentioned that
When running under systemd and stopping the service using systemctl stop, the process is sent a SIGTERM, which is not a "nice" shutdown. Hence, on_recursor_stop is not run.
Usecase
It would be great if systemctl stop would ensure a "nice" shutdown of the recursor to prevent surprises when using on_recursor_stop under systemd.
Description
I see two paths of accomplishing a "nice" shutdown.
Use ExecStop
Systemd supports the ExecStop directive under [Service]. One could call rec_control with the right parameters (--socket-dir=$RUNTIMEDIR) to stop the recursor. However, this command should block until the recursor process is stopped, otherwise systemd will send SIGKILL after rec_control exits. It was mentioned on IRC that this would require some finagling as this is currently not implemented in rec_control.
Use a different KillSignal
Another option would be to implement a different signal handler for e.g. SIGUSR2 in pdns_recursor to request a "nice" shutdown. Systemd can send that signal when KillSignal is set in the [Service] section of the unit file. This might be a less-hairy option than the ExecStop method.
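The SIGUSR2 idea can be sketched roughly as follows. This is an illustration in Python, not the recursor's implementation (pdns_recursor is C++), and every name in it (`ShutdownRequest`, `run_until_stopped`, `demo`) is hypothetical. The handler only records the request, and the main loop runs the stop hook later, which mirrors the constraint raised in the comments that the graceful stop code cannot be called directly from a signal handler.

```python
import os
import signal


class ShutdownRequest:
    """Records that a "nice" shutdown was requested; the handler does nothing else."""

    def __init__(self) -> None:
        self.requested = False

    def handle(self, signum, frame) -> None:
        # Keep the handler trivial: only set a flag for the main loop to see.
        self.requested = True


def run_until_stopped(request, do_work, on_stop, max_iterations=1000):
    """Hypothetical main loop: work until a stop is requested, then run the hook."""
    iterations = 0
    while not request.requested and iterations < max_iterations:
        do_work()
        iterations += 1
    if request.requested:
        on_stop()  # stands in for a stop hook such as on_recursor_stop
    return iterations


def demo():
    """Send SIGUSR2 to ourselves mid-run and return the recorded events."""
    events = []
    request = ShutdownRequest()
    previous = signal.signal(signal.SIGUSR2, request.handle)
    try:
        def do_work():
            events.append("work")
            if len(events) == 3:
                # Simulates what a supervisor would do when stopping the service.
                os.kill(os.getpid(), signal.SIGUSR2)

        run_until_stopped(request, do_work, lambda: events.append("stop hook"))
    finally:
        signal.signal(signal.SIGUSR2, previous)  # restore the old disposition
    return events
```

Calling `demo()` returns a list of events that ends with a single "stop hook" entry, showing that the hook runs in normal program context after the signal has been noted rather than inside the handler itself.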