Cannot fork after attaching in ruby >= 2.6 #44

Open
JimScadden opened this issue Jun 12, 2020 · 2 comments

@JimScadden

Example code:

require 'lxc'

old_sync = $stdout.sync
$stdout.sync = true

ct = LXC::Container.new('container')
puts "#{Process.pid} Attaching to container"
exitcode = ct.attach({wait: true}) do
  puts "#{Process.pid} Inside container. Forking"
  fork do
    puts "#{Process.pid} Forked :)"
  end
end

This used to work fine in ruby 2.5:

# ruby --version
ruby 2.5.8p224 (2020-03-31 revision 67882) [x86_64-linux]
# ruby test.rb
138201 Attaching to container
26532 Inside container. Forking
26533 Forked :)

However it seems to trigger an internal ruby error (at https://github.com/ruby/ruby/blob/510df47f5f7f83918d3aa00316c8a5b959d80d7c/thread_pthread.c#L1695) in ruby 2.6 / 2.7:

# ruby --version
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-linux]
# ruby test.rb 2>&1 | head
138686 Attaching to container 
26536 Inside container. Forking
test.rb:10: [BUG] timer_posix was not dead: 0
           
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0005 p:---- s:0021 e:000020 CFUNC  :fork
c:0004 p:0035 s:0017 e:000016 BLOCK  test.rb:10 [FINISH]
c:0003 p:---- s:0014 e:000013 CFUNC  :attach

# ruby --version
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
# ruby test.rb 2>&1 | head
138889 Attaching to container
26538 Inside container. Forking
test.rb:10: [BUG] timer_posix was not dead: 0

ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0005 p:---- s:0021 e:000020 CFUNC  :fork
c:0004 p:0033 s:0017 e:000016 BLOCK  test.rb:10 [FINISH]
c:0003 p:---- s:0014 e:000013 CFUNC  :attach

I've tested on CentOS 7.7 and Debian buster (and sid).
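
For anyone who needs to guard against this in the meantime, here is a minimal sketch of a version check. It assumes only what is reported above (2.5.8 works, 2.6.6 and 2.7.1 crash); other versions are untested. The helper name `fork_after_attach_affected?` is made up for illustration and is not part of ruby-lxc.

```ruby
require 'rubygems'

# Hypothetical helper, not part of ruby-lxc.
# Assumption: every Ruby from 2.6 onwards behaves like the 2.6.6 and 2.7.1
# builds tested in this issue; nothing beyond those builds was verified.
def fork_after_attach_affected?(version = RUBY_VERSION)
  Gem::Version.new(version) >= Gem::Version.new('2.6')
end

if fork_after_attach_affected?
  warn 'fork inside LXC::Container#attach may abort with "[BUG] timer_posix was not dead"'
end
```

The check compares parsed versions rather than strings, so a version such as 2.10 would still be ordered after 2.6.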

@sitano

sitano commented Nov 16, 2021

Nice catch. I haven't checked this, but my guess is that lxc_spawn forks the Ruby VM incorrectly. The invalid state of the timer thread suggests it was not shut down properly before the fork. The Ruby VM's Process.fork does more than the clone() syscall: it also cleans up scheduler and thread data. In 2.4 it shuts down the timer before the fork.
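
To illustrate the point about Ruby's own fork doing thread bookkeeping, here is a rough sketch that needs no container and does not reproduce the bug. It only shows the Ruby-level effect: a child created with Kernel#fork carries over just the calling thread. The VM's internal timer thread is not visible through Thread.list, so this is an analogy for the cleanup, not a view of it. The method name `thread_count_in_forked_child` is made up for the example.

```ruby
# Returns the number of Ruby threads seen inside a child created by
# Kernel#fork while the parent has an extra thread running.
def thread_count_in_forked_child
  worker = Thread.new { sleep }   # extra thread in the parent
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # Only the thread that called fork is carried over into the child.
    writer.puts Thread.list.size
    writer.close
    exit!(0)                      # skip at_exit handlers in the child
  end

  writer.close
  count = reader.read.to_i        # read what the child reported
  Process.wait(pid)
  count
ensure
  worker.kill if worker
end

puts "threads in forked child: #{thread_count_in_forked_child}"
```

A process cloned behind the VM's back, as the attach path appears to do, skips this kind of pre-fork and post-fork handling, which would be consistent with the timer thread state check failing on the next fork.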

@sitano

sitano commented Nov 18, 2021

Just by replacing fork+clone3(CLONE_PARENT) with a simple call to rb_fork_ruby, everything suddenly starts working. However, it requires a patch to both lxc and ruby-lxc. The downside is that the parent process either loses the child or has to use the CHILD_REAPER flag instead of the second call to clone. Anyway, the prototype works.
