Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recursor: ensure on_recursor_stop is run when running under systemd #14979

Open
pieterlexis opened this issue Dec 17, 2024 · 11 comments
Open

recursor: ensure on_recursor_stop is run when running under systemd #14979

pieterlexis opened this issue Dec 17, 2024 · 11 comments

Comments

@pieterlexis
Copy link
Contributor

  • Program: Recursor
  • Issue type: Feature request

Short description

In the documentation, it is mentioned that

... the Lua function on_recursor_stop on a nice shutdown (using rec_control quit-nicely of the Recursor process ...

When running under systemd and stopping the service, using systemctl stop, the process is sent a SIGTERM, which is not a "nice" shutdown. Hence, on_recursor_stop is not run.

Usecase

It would be great if systemctl stop would ensure a "nice" shutdown of the recursor to prevent surprises when using on_recursor_stop under systemd.

Description

I see two paths of accomplishing a "nice" shutdown.

Use ExecStop

Systemd supports the ExecStop directive under [Service]. One could call rec_control with the right parameters (--socket-dir=$RUNTIMEDIR) to stop the recursor. However, this command should block until the recursor process is stopped, otherwise systemd will send SIGKILL after rec_control exits. It was mentioned on IRC that this would require some finagling as this is currently not implemented in rec_control.

12:38:18      ottom │ lieter: look like the sync part can be solved but it's a bit hairy

Use a different KillSignal

Another option would be to implement a different signal handler for e.g. SIGUSR2 in pdns_recursor to request a "nice" shutdown. Systemd can send that signal when KillSignal is set in the [Service] section of the unit file. This might be a less-hairy option than the ExecStop method.

@omoerbeek
Copy link
Member

omoerbeek commented Dec 17, 2024

A likely building block is making rec_control quite-nicely wait synchronously until all recursor threads are shut down and the stop hook has run. #14976 has a Draft of that part.

@omoerbeek
Copy link
Member

omoerbeek commented Dec 18, 2024

With the PR and

[Service]
ExecStop=/usr/bin/rec_control --socket-dir=%t/pdns-recursor quit-nicely

I get:

Dec 18 13:21:33 otto-rocky9 systemd[1]: Started PowerDNS Recursor.
Dec 18 13:21:38 otto-rocky9 systemd[1]: Stopping PowerDNS Recursor...
Dec 18 13:21:38 otto-rocky9 pdns-recursor[95356]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1734528098.438" command="quit-nicely"
Dec 18 13:21:38 otto-rocky9 pdns-recursor[95356]: Exiting on user request
Dec 18 13:21:38 otto-rocky9 pdns-recursor[95368]: bye nicely
Dec 18 13:21:38 otto-rocky9 systemd[1]: pdns-recursor.service: Deactivated successfully.
Dec 18 13:21:38 otto-rocky9 systemd[1]: Stopped PowerDNS Recursor.

@omoerbeek
Copy link
Member

omoerbeek commented Dec 18, 2024

I am a bit hesitant to add this in 5.2.0, otoh, if quit-nicely does not work as expected systemd will kill the process, right?

I checked with the 5.2.0 state of affairs systemd does not restart rec if it is stopped with quit-nicely on the command line. Maybe it's best to use that for 5.2.0, and then move to the ExecStop method in 5.3?

@g0tar
Copy link
Contributor

g0tar commented Dec 18, 2024

If the ExecStop= command doesn't exit after TimeoutStopSec= (90 s by default) then systemd would send SIGTERM (or KillSignal= if set) on it's own.
Then, if any process is still alive after (another) TimeoutStopSec=, systemd would issue SIGKILL (or FinalKillSignal=).

man systemd.service says:

If no ExecStop= commands are specified, the service gets the SIGTERM immediately

leaving only the second timeout.

Edit:
The default value of TimeoutStopFailureMode= is terminate, not kill, so if rec_control is stuck, systemd will follow with TERM and then KILL. After changing to blocking, the maximum stop time would simply be doubled (first, 90 seconds for rec_control to exit and then another 90 seconds for remaining processes to exit).

There's the discussion with the rationale.

So:

  1. why isn't SIGTERM handled as "graceful" now?
  2. if rec_control will be synchronous, it will be safe to use it in ExecStop=.

Edit 2:
this is the source of misunderstanding being fixed - compare with OP's statement: "this command should block until the recursor process is stopped, otherwise systemd will send SIGKILL after rec_control exits" - this is not valid.

Therefore even with asynchronous rec_control quit-nicely there will be SIGTERM first, and SIGKILL after TimeoutStopSec=, so it will be not worse than no ExecStop= at all (directly sending SIGTERM).
The ExecStop= explanation simply means, that using asynchronous command doesn't guarantee clean stop (there's no extra waiting within systemd). But since SIGTERM isn't graceful anyway, at least the rec_control version gives recursor a chance to do something before SIGTERM.

Either way, async or not, there should be ExecStop=rec_control quit-nicely.

Edit 3:
About using different KillSignal= - the new one would replace SIGTERM and will be send immediately; then, after a TimeoutStopSec=, would be followed by SIGKILL, so there's no difference to asynchronous ExecStop=rec_control quit-nicely.

@omoerbeek
Copy link
Member

The default value of TimeoutStopFailureMode= is terminate, not kill,

I guess you mean SIGTERM and SIGKILL, respectively?

  1. why isn't SIGTERM handled as "graceful" now?

Mostly historical reasons. The graceful shutdown code is relatively recent. initially I wrote it to be able to get a meaningful valgrind run (as global destructrors are called). There also might be cases where it matters that SIGTERM is an immediate exit vs a graceful shutdown which could take time (especially with the stop hook being available now). Also, just calling the graceful stop code from a signal handler is not possible, as the code itself is not async signal safe. It would require some more code than just a call.

I think having a synchronousrec_control quit-nicely is a good thing. It should allow for typical race conditions managing rec processes (did it really stop already?) to no be an issue. Our regression test code stoppping recursors is an example that could be improved if we had a synchronous stop possibility.

@g0tar
Copy link
Contributor

g0tar commented Dec 20, 2024

I guess you mean SIGTERM and SIGKILL, respectively?

Yes (the terminate and kill are named options for that setting).

I think having a synchronousrec_control quit-nicely is a good thing.

It is. My point is - OP's assumption about "immediate SIGKILL when async" is wrong (this was bug in documentation long time ago) and one way or another you can put ExecStop=rec_control quit/-nicely into unit.
With blocking it will work better, but even non-blocking isn't worse than SIGTERM (immediate exit).

Another matter is when to actually do the graceful exit. As it cleans up it is probably slower than immediate. So when I got nothing to clean, and have no hooks attached, I'd choose to keep current setup for faster restarts. And by faster I mean also empty ExecStop= to save running rec_control binary and have systemd directly deliver SIGTERM.

What's missing more is rec_control command to reload entire config file. I've found reload-lua-config (aliased to reload-yaml in 5.2.0) didn't include my forward_zones_recurse, so probably something like:

ExecReload=rec_control reload-lua-config
ExecReload=rec_control reload-zones
ExecReload=rec_control reload-lua-script
ExecReload=rec_control acls

though I'm not sure what's the best order for this.

@omoerbeek
Copy link
Member

OK. Having more structured reloading is one of the long term goals, but we're getting a bit off-topic now.

@omoerbeek
Copy link
Member

omoerbeek commented Dec 20, 2024

I did a few tests with the PR code. With modifications to print when a signal is received.
If I modify the effect of rec_control quit-nicely to be a NOP, I see (note the reverse timeline produced by journalctl):

Dec 20 10:20:44 otto-rocky9 systemd[1]: Stopped PowerDNS Recursor.
Dec 20 10:20:44 otto-rocky9 systemd[1]: pdns-recursor.service: Failed with result 'exit-code'.
Dec 20 10:20:44 otto-rocky9 systemd[1]: pdns-recursor.service: Main process exited, code=exited, status=1/FAILURE
Dec 20 10:20:44 otto-rocky9 pdns-recursor[120460]: msg="Exiting on user request" subsystem="runtime" level="0" prio="Notice" tid="0" ts="1734690044.188"
Dec 20 10:20:44 otto-rocky9 pdns-recursor[120460]: msg="termIntHandler" subsystem="runtime" level="0" prio="Error" tid="0" ts="1734690044.188" sig="15"
Dec 20 10:20:44 otto-rocky9 pdns-recursor[120472]: Neutered bye nicely
Dec 20 10:20:44 otto-rocky9 pdns-recursor[120460]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1734690044.186" command="quit-nicely"
Dec 20 10:20:44 otto-rocky9 systemd[1]: Stopping PowerDNS Recursor...

So it looks like systemd is not waiting at all for the ExecStop to take effect, and continues by sending SIGTERM to the process (if it's still there) immediately after the exit of the ExecStop command.

This is in contrast of what you said above ((as I understood it) it waits TimeoutStopSec before sending a SIGTERM if the process is still running. Note that the SIGTERM will interrupt an ongoing stop action if quite-nicely is asynchronous.

This suggest that having a synchronous exit-nicely is needed, as @pieterlexis stated. Maybe he was wrong about the SIGKILL, but SIGTERM has the same effect as we do not install a handler for that in the systemd case.

(Note: and this is also as I read systemd/systemd#13284 (comment))

@g0tar
Copy link
Contributor

g0tar commented Dec 20, 2024

So it looks like systemd is not waiting at all for the ExecStop to take effect, and continues by sending SIGTERM to the process (if it's still there) immediately after the exit of the ExecStop command.

Yes (and I can't see saying something else - unless you read the version of my comment, as I did several edits to make that clearer, but might still fail at doing so). Anyway, glad you got this.

This is in contrast of what you said above ((as I understood it) it waits TimeoutStopSec before sending a SIGTERM if the process is still running.

It waits TimeoutStopSec if the ExecStop command is running (synchronous). After the command itself there's SIGTERM without any waiting, another TimeoutStopSec and then SIGKILL. There was the demo of this to disambiguate.

Note that the SIGTERM will interrupt an ongoing stop action if quite-nicely is asynchronous.

That's why I wrote it's "not worse" than SIGTERM alone.

This suggest that having a synchronous exit-nicely is needed, as @pieterlexis stated. Maybe he was wrong about the SIGKILL, but SIGTERM has the same effect as we do not install a handler for that in the systemd case.

So both the signals do the same now? I thought SIGTERM is equivalent to rec_control quit.
Is this a systemd exception (not installing the handler)?

@omoerbeek
Copy link
Member

omoerbeek commented Dec 20, 2024

I probably understood your comment wrongly.

That's why I wrote it's "not worse" than SIGTERM alone.

I do not agree. A stop hook being killed by SIGTERM while it's running might be worse than not running it at all.

We rely on the default action for SIGTERM which is to terminate the process, except when running under docker (but lets ignore docker for the moment). This means that the process is stopped by the OS without any of our own code involved. A parent of the process will be reported the process was killed by a signal when it asks for the process' status.

For rec_control quit we call _exit(1) which terminates the process as well, but with a different status as when being killed by a signal. A parent process will be able see the difference between the two ways.

As we do not install a handler fot SIGTERM, it's default action is the same as for a SIGKILL (which cannot be caught).

So to summarize:

  • SIGTERM and SIGKILL both terminate the process without our code being involved.
  • rec_control quit triggers _exit(1) in the recursor process.
  • rec_control quit-nicely triggers code in the recursor to request the threads to stop, wait until they do so, calls the stop hook and then the recursor exits 0. The PR changes things to make the rec_control process wait for the condition "stop hook has been run". In the current master, rec_control will exit before all the work in the recursor to stop it has been done.

@g0tar
Copy link
Contributor

g0tar commented Dec 20, 2024

Thank you for verbose explanations. It all makes sense now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants