Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/espanso: add required capabilities for wayland #339594

Closed
wants to merge 1 commit into from

Conversation

n8henrie
Copy link
Contributor

@n8henrie n8henrie commented Sep 4, 2024

On Wayland, Espanso depends on cap_dac_override+p for the EVDEV
backend. Specifically, this capability is required by the worker
thread, which is forked from the main espanso process when run by the
usual means (espanso start or espanso daemon).

Espanso (responsibly) drops capabilities as soon as possible, prior
to forking the worker process. Unfortunately, security.wrappers sets
the capabilities in such a way that the forked process does not pick
up these capabilities (due to /proc/self/exe pointing to the original
espanso binary, which does not have these capabilities).

By running worker directly from the capability-enabled wrapper,
the worker thread is able to run without issue, and Espanso runs as
expected on wayland.

Description of changes

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

On Wayland, Espanso depends on `cap_dac_override+p` for the EVDEV
backend. Specifically, this capability is required by the `worker`
thread, which is forked from the main espanso process when run by the
usual means (`espanso start` or `espanso daemon`).

Espanso (responsibly) drops capabilities as soon as possible, prior
to forking the worker process. Unfortunately, `security.wrappers` sets
the capabilities in such a way that the forked process does not pick
up these capabilities (due to `/proc/self/exe` pointing to the original
espanso binary, which does *not* have these capabilities).

By running `worker` directly from the capability-enabled wrapper,
the worker thread is able to run without issue, and Espanso runs as
expected on wayland.

- NixOS#249364
- NixOS#328890
- https://espanso.org/docs/install/linux

- fixes NixOS#249364
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Sep 4, 2024
@n8henrie n8henrie requested review from pitkling and numkem September 4, 2024 17:26
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Sep 4, 2024
@n8henrie
Copy link
Contributor Author

n8henrie commented Sep 4, 2024

@pitkling added some fair criticisms of this approach here

@pitkling
Copy link
Member

pitkling commented Sep 5, 2024

Continuing the discussion from #249364 (especially this comment) here, to avoid spamming that issue unnecessarily.

First of all, thanks for digging further into this approach! It's definitely worth pursuing and might be a good alternative to PR #328890.

From your issue description, I get it that the CAP_DAC_OVERRIDE requirement of espanso is a wayland-only thing? Is that documented somewhere? Just asking, since in that case I'll update my draft PR #328890 accordingly.

(quotes from #249364, this comment)

[…]

This means that the espanso team could remove/change/substitute/rename/… that subcommand any time

True, but the nix input won't change until we change it. I've been trying to help the espanso team here and there; there's been talk about a big-ish refactor of the GUI to remove the wx dependency, but I haven't heard of any immediate plans for a major refactoring of the backend. (EDIT: espanso roadmap)

True enough, there won't be problems until we update the derivation. I'm mostly worried about upstream changes in the backend that we are unaware of and that do not cause build problems. E.g., the espanso team might keep the worker subcommand but add additional features relying on some monitor process managing the worker. We would not spot those problems on a derivation update but they could cause problems much later that might be difficult to figure out.

automatically restarting the worker if the configuration changed

Good point, I need to double check this. Requiring manual reload to pick up config changes would be a bit of a bummer (since they've already done the work to make this smoother).

The systemd unit should be automatically restarting on failure, which might obviate the need for some of their checks.

Good point, that definitely helps. Still, there must be more to it, right? Otherwise, why would the espanso team not make use of this themselves in their reference systemd service?

Even if it works now, it might break future features if some internals change.

That's of course possible, though I'm not sure what approach could prevent upstream changes from breaking the always-unconventional nix way. […]

That's true, it's not possible to avoid such issues. I'm just worried that this approach is more prone to these problems since (a) it relies on the seemingly not-well documented workings of an internal subcommand and (b) we might not even spot problems until much later when updating the derivation (as I mentioned above).

It seems to me that the approach via the wrapper library from #328890 (which definitely has its own problems!) has the benefit on not relying on internals (at least not much; I don't think it's likely that the espanso team changes process forking in a way that is not captures by the wrapper library) and (b) if there are upstream changes breaking that approach, we should realize this very quickly (not necessarily at build time, but the process will most likely not work at all, as it is currently the case at least on wayland).


Btw., I will sometime soon (currently somewhat busy) create a issue for discussing whether such problems can be handled directly by security.wrappers, as you suggested here. But I think we should implement a fix for #249364 (whether this PR or #328890) anyways, since I think even if security.wrappers can handle this, it will take some time to implement such a change.

@n8henrie
Copy link
Contributor Author

n8henrie commented Sep 5, 2024

From your issue description, I get it that the CAP_DAC_OVERRIDE requirement of espanso is a wayland-only thing? Is that documented somewhere? Just asking, since in that case I'll update my draft PR #328890 accordingly.

I think so -- I linked to the linux installation instructions above, capabilities are only mentioned in the wayland section.

I'm mostly worried about upstream changes in the backend that we are unaware of and that do not cause build problems.

Certainly possible, but I think you might be over-worried here. I am involved in espanso development and intend to become more involved over time. Similarly, I'm listed as a maintainer of the espanso package and module; with any luck I'll be able to help watch for issues here going forward.

Can you elaborate on a worst-case scenario in your mind? I want to make sure we're speaking the same language; in my mind, you're suggesting that:

  1. an upstream minor or patch-level change could reasonably change the behavior of espanso worker, since it's not part of their public API
  2. if this were the case, we might update the src and do some basic testing* and find that espanso still seems to work and update the derivation
  3. at some point, someone would discover an edge case in which something wasn't working and file a bug report
  4. in that case, we would likely have to revert the PR updating src -- fairly trivial to do in nixos -- and then consider other options (perhaps security.wrappers will have an option to intercept /proc/self/exe calls by then).

Is that more-or-less the risks that concern you? As a hobbyist self-taught programmer, sometimes I am completely blind to the concerns that professionals assume I am taking into account.

* perhaps we could consider adding a nixos test for espanso?

Still, there must be more to it, right? Otherwise, why would the espanso team not make use of this themselves in their reference systemd service?

I might be misunderstanding -- their reference service does leverage systemd's automatic service restarts: Restart=on-failure, which is what I was referring to.

Also -- for context -- espanso's creator and primary dev @federico-terzi has done an amazing job but has openly noted that he is not going to be able to focus on espanso for the time being (espanso/espanso#1742); @AucaCoyan and others have stepped up, but issues like espanso/espanso#1953, espanso/espanso#1963, and many others show that there is still a lot of work to be done to even get development back on track. I only bring this up to point out that they are an extremely welcoming community that is hungry for additional contributors; I still think that an upstream patch should be considered if we can come up with a strategy that would ameliorate nix issues and be a win (or net even) for the espanso team. They've showed a lot of flexibility working with 3rd party package management in the past (espanso/espanso#615) and are generally very nice people.

@pitkling
Copy link
Member

pitkling commented Sep 6, 2024

From your issue description, I get it that the CAP_DAC_OVERRIDE requirement of espanso is a wayland-only thing? Is that documented somewhere? Just asking, since in that case I'll update my draft PR #328890 accordingly.

I think so -- I linked to the linux installation instructions above, capabilities are only mentioned in the wayland section.

Ahh, you're right. I'll update my PR draft accordingly.

I'm mostly worried about upstream changes in the backend that we are unaware of and that do not cause build problems.

[..] Can you elaborate on a worst-case scenario in your mind? I want to make sure we're speaking the same language; in my mind, you're suggesting that:

  1. an upstream minor or patch-level change could reasonably change the behavior of espanso worker, since it's not part of their public API
  2. if this were the case, we might update the src and do some basic testing* and find that espanso still seems to work and update the derivation
  3. at some point, someone would discover an edge case in which something wasn't working and file a bug report
  4. in that case, we would likely have to revert the PR updating src -- fairly trivial to do in nixos -- and then consider other options (perhaps security.wrappers will have an option to intercept /proc/self/exe calls by then).

Is that more-or-less the risks that concern you? […]

Yes, that's basically it. In general I think it's more robust to stick to openly documented interfaces, to keep maintenance low. That's also why I still think the approach from #328890 requires probably lower maintenance of the NixOS module code (major concern for #328890) than the approach here. But given that you're involved in espanso's development definitely weakens those arguments.

Still, there must be more to it, right? Otherwise, why would the espanso team not make use of this themselves in their reference systemd service?

I might be misunderstanding -- their reference service does leverage systemd's automatic service restarts: Restart=on-failure, which is what I was referring to. […]

Yes, but they're running espanso launcher, not espanso worker. I just wanted to point out that this is another indicator that the launcher does more than simply restarting espanso on failures.

@n8henrie
Copy link
Contributor Author

n8henrie commented Sep 6, 2024

Ahh, you're right. I'll update my PR draft accordingly.

Trying to test to confirm (on an Arch host running i3 over VNC, with espanso built from source), espanso runs and seems to try to expand without any additional capabilities. Unfortunately it succeeds at deleting the trigger text but the replacement text doesn't appear; I suspect this is an issue with the clipboard and VNC. It is definitely different than the "crash at launch" that I get on Wayland when trying to run without capabilities.

Also, regarding your PR -- IMO users installing espanso with package = pkgs.espanso-wayland should -- with documentation -- be opting into the wrapper by default. If espanso cannot work on wayland without these capabilities, and we're not going to provide a service that runs as root by default, I think that choice should be enough for the rest to configure itself to be usable. What are your thoughts on simplifying some of your PR to make the module "just work" based on the user's choice of package?

That's also why I still think the approach from #328890 requires probably lower maintenance of the NixOS module code (major concern for #328890) than the approach here.

Did you mean to put the same PR for both of those links? Assuming so -- that's interesting, I still worry that the complexity of that approach would make it far more difficult (for me! though perhaps not for you or hypothetical maintainers) to debug / maintain when something eventually goes wrong. Even if it isn't the ultimate cause of a bug, when a bug arises this code should be considered as part of the machinery that is under examination -- and I don't think I could do that well. I am reasonably comfortable with nix and rust, which is why I thought I could help on the respective teams, but I never learned the C family!

That said, your explanatory comments in the code are wonderful and give me a fighting chance at understanding the approach you've taken.

Yes, but they're running espanso launcher, not espanso worker. I just wanted to point out that this is another indicator that the launcher does more than simply restarting espanso on failures.

Correct -- and this is part of my point. Both NixOS and Home-Manager modules are currently using espanso daemon -- which is also undocumented but seems to have been working well so far. The documented / blessed path for Linux is to imperatively run espanso service register and then espanso start thereafter -- it doesn't mention launcher AFAICT. It even does runtime checks to make sure the systemd file points to the expected binary and expects to find it at /usr/local/bin/espanso. For MacOS, the only documented install processes are downloading the .app to Applications or homebrew, and espanso installs its own plist to manage the service, which also uses espanso launcher (not espanso start).

So whichever way we go, just by using nix, I think we're already pretty far off the well documented path.

It seems that @KenMacD has been using the espanso worker approach for months with generally satisfactory results while the rest of us wayland suckers languish :)

@pitkling
Copy link
Member

pitkling commented Sep 7, 2024

[…] Also, regarding your PR -- IMO users installing espanso with package = pkgs.espanso-wayland should -- with documentation -- be opting into the wrapper by default. If espanso cannot work on wayland without these capabilities, and we're not going to provide a service that runs as root by default, I think that choice should be enough for the rest to configure itself to be usable. What are your thoughts on simplifying some of your PR to make the module "just work" based on the user's choice of package?

I just did update #328890. It is now its own small module programs.espanso.capdacoverride that, if enabled (right now the default, but eventually it should default to disabled and be activated either manually or by the NixOS Espanso module) makes sure that everything works out of the box. To do so it creates the security wrapper and overlays pkgs.espanso-wayland with the capability-enabled version. I'm not sure whether automatic overlays by modules are a good idea, but at least makes sure that pkgs.espanso-wayland just works, no matter where it is used.

(BTW: I also just tested myself that running espanso worker directly does not pick up configuration changes, while that works flawlessly with #328890).

That's also why I still think the approach from #328890 requires probably lower maintenance of the NixOS module code (major concern for #328890) than the approach here.

Did you mean to put the same PR for both of those links? Assuming so -- that's interesting, I still worry that the complexity of that approach would make it far more difficult (for me! though perhaps not for you or hypothetical maintainers) to debug / maintain when something eventually goes wrong. Even if it isn't the ultimate cause of a bug, when a bug arises this code should be considered as part of the machinery that is under examination -- and I don't think I could do that well. I am reasonably comfortable with nix and rust, which is why I thought I could help on the respective teams, but I never learned the C family!

That said, your explanatory comments in the code are wonderful and give me a fighting chance at understanding the approach you've taken.

I absolutely get your reservation, and if you don't feel those changes are worth the hassle, that's fine :). If it helps, I'd be willing to maintain that part if wanted. BUT I should say that I have never written a single line of Rust code, so not sure how helpful I can be with issues related to Espanso internals.

But yes, I think it's lower maintenance (I meant PR #328890 in both cases). I think it is pretty orthogonal to any changes to the espanso package and module. I cannot see many upstream changes that would cause hiccups with the wrapper; the most likely breaking change would be that Espanso no longer uses the libc-based fork, but then the wrapper just behaves as the original. Also, one can easily disable the wrapper completely via programs.espanso.capdacoverride.enable = false; with this, there is actually no difference at all to the status quo (so one can easily check whether the wrapper is causing problems or not).

Note that the way #328890 is implemented now does neither touch the Espanso package definition nor the Espanso module code directly. The way I wrap the Espanso package in nixos/modules/programs/espanso-capdacoverride/espanso-capdacoverride.nix is quite orthogonal to the original package definition. And by overlaying pkgs.espanso-wayland (again, not sure this is a good approach) it requires no changes to the espanso NixOS module at all and immediately works for the corresponding home-manager module (or any other place using pkgs.espanso-wayland) without any changes.

Yes, but they're running espanso launcher, not espanso worker. I just wanted to point out that this is another indicator that the launcher does more than simply restarting espanso on failures.

Correct -- and this is part of my point. Both NixOS and Home-Manager modules are currently using espanso daemon -- which is also undocumented but seems to have been working well so far. […]

So whichever way we go, just by using nix, I think we're already pretty far off the well documented path. […]

Ah, I wasn't aware that daemon and launcher are also not documented. Very good point! That definitely eases my concerns with respect to the stability of worker. Still, we would have at least to figure out what problems may occur (like not reloading the configuration) without the monitoring code from launcher and whether one can fix them somehow (or live with them).

};
config =
let
wayland = cfg.package == pkgs.espanso-wayland;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to do something like

Suggested change
wayland = cfg.package == pkgs.espanso-wayland;
wayland = builtins.elem "wayland" cfg.package.buildFeatures;

Otherwise this will probably not work if one pulls an espanso-wayland package from another nixpkgs branch (e.g., if one runs nixos-2405 but pulls espanso-wayland from nixpkgs-unstable).

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/espanso-daemon-problem/35309/27

@n8henrie
Copy link
Contributor Author

Closing in favor of #328890

@n8henrie n8henrie closed this Nov 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

espanso-wayland: service start failure
3 participants