systemd: 255.9 -> 256.2 #307068

Merged
merged 7 commits into from
Jul 21, 2024

Conversation

nikstur
Contributor

@nikstur nikstur commented Apr 26, 2024

Description of changes

Closes #319328

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@lukaslihotzki
Contributor

With this PR, the generated initrd contains a libsystemd-shared-256.so that contains the string /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-kmod-31-lib/lib/libkmod.so.2. The cause is probably that systemd uses dlopen instead of dynamic linking for libkmod.so.2.

This is an invalid path, so libkmod cannot be loaded, so the initrd fails to load any kernel module.
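
To make this easier to check on other builds, here is a minimal sketch that lists the store paths embedded in a binary and reports the ones that do not exist. The function name `find_dangling_store_paths` is hypothetical, and the regular expression is only an approximation of what a store path looks like; it is a quick diagnostic, not part of this PR.

```shell
# Hypothetical helper: print every /nix/store path embedded in a file
# that does not exist on the current system.
find_dangling_store_paths() {
  file=$1
  # -a treats the binary as text, -o prints only the matched part.
  # LC_ALL=C avoids encoding errors on arbitrary binary data.
  LC_ALL=C grep -aoE '/nix/store/[0-9a-z]{32}-[A-Za-z0-9+._?=/-]+' "$file" |
    sort -u |
    while IFS= read -r path; do
      if [ ! -e "$path" ]; then
        printf 'missing: %s\n' "$path"
      fi
    done
}
```

Running it against the `libsystemd-shared-256.so` unpacked from the initrd should report the all-`e` libkmod path quoted above, assuming the string is stored uncompressed in the library.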

@lukaslihotzki
Contributor

This PR works for me with a kernel that has everything needed to boot built in, so no modules need to be loaded in stage 1. kmod therefore seems to be the only major thing that is currently broken.

@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Apr 29, 2024
@nikstur
Contributor Author

nikstur commented Apr 29, 2024

Should be fixed now for the systemd initrd. We needed to explicitly include libkmod in the initrd.

The scripted initrd, however, is not working with the updated systemd and I have no idea how to fix it since I never use the scripted initrd.

@MinerSebas
Contributor

The changelog contains this entry:

    * The behavior of systemd-sleep and systemd-homed has been updated to
      freeze user sessions when entering the various sleep modes or when
      locking a homed-managed home area. This is known to cause issues with
      the proprietary NVIDIA drivers. Packagers of the NVIDIA proprietary
      drivers may want to add drop-in configuration files that set
      SYSTEMD_SLEEP_FREEZE_USER_SESSION=false for systemd-suspend.service
      and related services, and SYSTEMD_HOME_LOCK_FREEZE_SESSION=false for
      systemd-homed.service.

but this PR does not set the mentioned configuration values when using the proprietary NVIDIA drivers.
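
As a rough illustration of what such drop-ins could contain, here is a plain-shell sketch that writes them into a target directory. Several things here are assumptions: the function name, the drop-in file name `10-nvidia-no-freeze.conf`, and the list of "related services" (the changelog only names `systemd-suspend.service`). On NixOS this would more likely be expressed through module options than by writing files directly.

```shell
# Hypothetical helper: write drop-ins that disable session freezing.
# $1 is the directory that holds unit files, e.g. a package's
# lib/systemd/system output directory (an assumption for this sketch).
write_nvidia_sleep_dropins() {
  root=$1
  # Assumed set of sleep-related units; only systemd-suspend is named
  # explicitly in the changelog entry.
  for unit in systemd-suspend systemd-hibernate systemd-hybrid-sleep \
              systemd-suspend-then-hibernate; do
    dir="$root/$unit.service.d"
    mkdir -p "$dir"
    printf '[Service]\nEnvironment=SYSTEMD_SLEEP_FREEZE_USER_SESSION=false\n' \
      > "$dir/10-nvidia-no-freeze.conf"
  done
  dir="$root/systemd-homed.service.d"
  mkdir -p "$dir"
  printf '[Service]\nEnvironment=SYSTEMD_HOME_LOCK_FREEZE_SESSION=false\n' \
    > "$dir/10-nvidia-no-freeze.conf"
}
```

The two environment variable names are taken verbatim from the changelog entry; everything else is illustrative.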

@lukaslihotzki
Contributor

Indeed, this PR works with mandatory kernel modules when setting boot.initrd.systemd.enable = true;. 🎉

For merging this, either the scripted initrd needs to be fixed, or #287308 needs to be completed. Both options seem to be difficult. For fixing scripted initrd, this is a start:

diff --git a/nixos/modules/system/boot/stage-1.nix b/nixos/modules/system/boot/stage-1.nix
index ae05bc5ae88c..20630e686580 100644
--- a/nixos/modules/system/boot/stage-1.nix
+++ b/nixos/modules/system/boot/stage-1.nix
@@ -171,6 +171,9 @@ let
       # Copy ld manually since it isn't detected correctly
       cp -pv ${pkgs.stdenv.cc.libc.out}/lib/ld*.so.? $out/lib

+      # Copy libkmod because it is dlopened by systemd >=256
+      cp -pv ${pkgs.kmod.lib}/lib/libkmod.so.? $out/lib
+
       # Copy all of the needed libraries in a consistent order so
       # duplicates are resolved the same way.
       find $out/bin $out/lib -type f | sort | while read BIN; do

This includes libkmod.so.2 in the initrd, but still fails at runtime.

@nikstur
Contributor Author

nikstur commented May 9, 2024

The way the scripted initrd patchelfs binaries is incompatible with the way we replace the .so references with absolute paths in our systemd derivation. This didn't matter in the past because libkmod (needed by udevadm) wasn't a dlopen dependency in previous systemd versions.

@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label May 22, 2024
@ElvishJerricco ElvishJerricco mentioned this pull request May 28, 2024
13 tasks
@flokli
Contributor

flokli commented Jun 10, 2024

@nikstur can you rebase this?

@tulilirockz
Contributor

systemd 256 has officially been released in systemd-stable, so we should be able to use that now.

@jmbaur
Contributor

jmbaur commented Jun 12, 2024

Looks like stable releases will be made from one repo now: https://github.com/systemd/systemd/blob/2af17b5e4c1aa67ed5bcaa105a2a36d4fac9061a/NEWS#L117

@nikstur nikstur changed the title from "systemd: 255.4 -> 256-rc1" to "systemd: 255.6 -> 256" Jun 14, 2024
@nikstur nikstur marked this pull request as ready for review June 14, 2024 20:50
@nikstur nikstur requested a review from a team as a code owner June 14, 2024 20:50
@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Jun 14, 2024
@ofborg ofborg bot added 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux and removed 10.rebuild-darwin: 101-500 10.rebuild-linux: 501+ 10.rebuild-linux: 5001+ labels Jul 21, 2024
@vcunat vcunat changed the title from "systemd: 255.6 -> 256.2" to "systemd: 255.9 -> 256.2" Jul 21, 2024
@wahjava
Contributor

wahjava commented Jul 21, 2024

w00t!

@tulilirockz
Contributor

oh my god! great job everyone that contributed to this!

@Mic92
Member

Mic92 commented Aug 2, 2024

This broke the netboot image. Fix in: #331712
This is relevant for nixos-anywhere.

@squalus
Member

squalus commented Aug 6, 2024

Commit 80be926 broke the autoPatchelf step in osquery.toolchain. Reversing the order of steps 2 and 3 in the new for candidate in dep loop (thus restoring the previous behavior) fixes the problem.

@squalus squalus mentioned this pull request Aug 6, 2024
13 tasks
@SuperSandro2000
Member

I think plymouth has stopped working for me since this update, but I haven't been able to get a useful log yet, other than that it exited.

@arianvp
Member

arianvp commented Aug 6, 2024

Can you create a new issue for this? We can start collecting logs there. I hope it's not too cursed :(

squalus added a commit to squalus/nixpkgs that referenced this pull request Aug 6, 2024
- Add keep_libc flag to disable the default libc handling. Intended
  to be used by systemd.
- Add autoPatchelfFlags to autoPatchelfHook for passing arguments to
  the autoPatchelf script

This reverts part of the change made in NixOS#307068 / 80be926.

Fixes NixOS#332533
@SuperSandro2000
Member

> Can you create a new issue for this? We can start collecting logs there. I hope it's not too cursed :(

#332812

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/suspend-and-unsuspend-issues/50239/1

@JohnRTitor
Contributor

I am seeing an issue where systemd-oomd activates in the initrd and randomly kills processes such as systemd-journald and plymouth.
The system boots successfully most of the time but sometimes fails. I have 16 GB of RAM, and this started occurring after this change reached unstable.

$  sudo dmesg | grep oom
[  102.330084] kswapd0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[  102.330104]  oom_kill_process+0x203/0x2f0
[  102.330297] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[  102.330309] [    963]   993   963     3645     1395        0     1395         0    73728      192          -900 systemd-oomd
[  102.330327] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/plymouth-start.service,task=plymouthd,pid=265,uid=0
[  102.330337] Out of memory: Killed process 265 (plymouthd) total-vm:115232kB, anon-rss:520kB, file-rss:11700kB, shmem-rss:0kB, UID:0 pgtables:128kB oom_score_adj:0
[  102.333497] kswapd0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[  102.333512]  oom_kill_process+0x203/0x2f0
[  102.333674] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[  102.333685] [    963]   993   963     3645     1395        0     1395         0    73728      192          -900 systemd-oomd
[  102.333702] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/NetworkManager.service,task=NetworkManager,pid=980,uid=0
[  102.333723] Out of memory: Killed process 980 (NetworkManager) total-vm:329680kB, anon-rss:0kB, file-rss:11084kB, shmem-rss:0kB, UID:0 pgtables:152kB oom_score_adj:0
[  102.335620] kswapd0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[  102.335631]  oom_kill_process+0x203/0x2f0
[  102.335784] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[  102.335794] [    963]   993   963     3645     1395        0     1395         0    73728      192          -900 systemd-oomd
[  102.335810] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/systemd-resolved.service,task=systemd-resolve,pid=964,uid=153
[  102.335818] Out of memory: Killed process 964 (systemd-resolve) total-vm:21596kB, anon-rss:3328kB, file-rss:9240kB, shmem-rss:0kB, UID:153 pgtables:84kB oom_score_adj:0
[  102.355076] kswapd0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[  102.355093]  oom_kill_process+0x203/0x2f0
[  102.355285] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[  102.355296] [    963]   993   963     3645     1235        0     1235         0    73728      192          -900 systemd-oomd
[  102.355310] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/home-manager-masum.service,task=nix-build,pid=1005,uid=1001
[  102.355319] Out of memory: Killed process 1005 (nix-build) total-vm:449524kB, anon-rss:0kB, file-rss:4740kB, shmem-rss:0kB, UID:1001 pgtables:136kB oom_score_adj:0
[  104.467223] oom_reaper: reaped process 1005 (nix-build), now anon-rss:0kB, file-rss:304kB, shmem-rss:0kB

@arianvp
Member

arianvp commented Aug 9, 2024

Those logs are from the kernel OOM killer, not systemd-oomd.

@arianvp
Member

arianvp commented Aug 9, 2024

Which is even weirder with 16 GB of RAM...
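
For anyone triaging similar reports, a small sketch that pulls the victims of the kernel OOM killer out of dmesg-style text may help. It keys on the `Out of memory: Killed process` lines seen in the log above; the function name `kernel_oom_victims` is hypothetical. Anything systemd-oomd does would not show up here; presumably its own messages are in the journal (for example via `journalctl -u systemd-oomd`).

```shell
# Hypothetical helper: read dmesg-style text on stdin and print the
# command name of each process the kernel OOM killer reports killing.
kernel_oom_victims() {
  # Keep only "Out of memory: Killed process <pid> (<name>)" lines
  # and print the <name> part.
  sed -n 's/.*Out of memory: Killed process [0-9][0-9]* (\([^)]*\)).*/\1/p'
}
```

Usage would be along the lines of `sudo dmesg | kernel_oom_victims`.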

ElvishJerricco pushed a commit that referenced this pull request Aug 14, 2024
* autoPatchelfHook: add keep_libc flag

- Add keep_libc flag to disable the default libc handling. Intended
  to be used by systemd.
- Add autoPatchelfFlags to autoPatchelfHook for passing arguments to
  the autoPatchelf script

This reverts part of the change made in #307068 / 80be926.

Fixes #332533
Zocker1999NET added a commit to Zocker1999NET/nixpkgs that referenced this pull request Aug 21, 2024
Those options were also added with systemd 256, but sadly were missed out in NixOS#307068.

These options are documented in:
- [systemd 256 changelog](https://github.com/systemd/systemd/releases/tag/v256) (search for `UseDomains=`)
- [networkd.conf(5)](https://www.freedesktop.org/software/systemd/man/256/networkd.conf.html#UseDomains=)
- [systemd.network(5)](https://www.freedesktop.org/software/systemd/man/256/systemd.network.html#UseDomains=)
greg-hellings pushed a commit to greg-hellings/nixpkgs that referenced this pull request Aug 24, 2024
Those options were also added with systemd 256, but sadly were missed out in NixOS#307068.

These options are documented in:
- [systemd 256 changelog](https://github.com/systemd/systemd/releases/tag/v256) (search for `UseDomains=`)
- [networkd.conf(5)](https://www.freedesktop.org/software/systemd/man/256/networkd.conf.html#UseDomains=)
- [systemd.network(5)](https://www.freedesktop.org/software/systemd/man/256/systemd.network.html#UseDomains=)
Labels
  • 6.topic: kernel (The Linux kernel)
  • 6.topic: nixos (Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS)
  • 6.topic: systemd
  • 8.has: module (update) (This PR changes an existing module in `nixos/`)
  • 10.rebuild-darwin: 0 (This PR does not cause any packages to rebuild on Darwin)
  • 10.rebuild-linux: 0 (This PR does not cause any packages to rebuild on Linux)
  • 12.approvals: 1 (This PR was reviewed and approved by one reputable person)
Development

Successfully merging this pull request may close these issues.

Update request: systemd 255.6 → 256.2