SSHRemoteIO on Windows hangs #68
Comments
#69 narrows this down to |
Debugging log on windows: Plain ssh works fine
With
With
and it can be repeatedly hung:
What seems important is that the function returns (the return value is there), and there is a new prompt, but afterwards it "hangs", as if the interpreter is not accepting new input. Once I kill the Python process, I can reuse the terminal session with a new Python session without issues, and I can make it hang at that exact same spot. I can also make it hang if I run that exact same command directly via the runner:
and also without any DataLad at all
But it needs the Python session. Outside of it (but in the same terminal session), it runs fine, any number of times:
If close_fds is true, all file descriptors except 0, 1 and 2 will be closed before the child process is executed. Otherwise when close_fds is false, file descriptors obey their inheritable flag as described in Inheritance of File Descriptors. On Windows, if close_fds is true then no handles will be inherited by the child process unless explicitly passed in the handle_list element of STARTUPINFO.lpAttributeList, or by standard handle redirection. Changed in version 3.2: The default for close_fds was changed from False to what is described above. Changed in version 3.7: On Windows the default for close_fds was changed from False to True when redirecting the standard handles. It’s now possible to set close_fds to True when redirecting the standard handles. |
I tried a few more things: versions
Maybe some of these Windows specific subprocess configurations could be relevant: https://docs.python.org/3/library/subprocess.html#windows-popen-helpers |
I found a Gist that made things work: https://gist.github.com/josephcoombe/3a234721fc5a6885ca4f91e3a27860f4. (it needs an additional |
I don't understand subprocess well, so I'm just sharing a few observations.
Edit: Edit: Edit: I have tested that the call
works (in CMD) if:
I am still not sure what the root cause of the problem is. On Windows 10, in

    >>> import subprocess
    >>> subprocess.run(['ssh', 'unix-machine', 'uname'])
    >>> print('after run')

(Weirdly enough, no other Python interpreter started in the same PowerShell session (after killing the hanging Python process) is accepting interactive input.)

Interpretation

Without overriding

    import subprocess
    subprocess.run(['ssh', 'unix-machine', 'uname'])
    print('after run')
    print(input('enter something> '))

But this program will print out individual characters read from

    import sys
    import subprocess
    subprocess.run(['ssh', 'unix-machine', 'uname'])
    print('after run')
    while True:
        # read one character at a time from the interpreter's stdin
        print(sys.stdin.read(1))

Not sure about the conclusion yet. [Edit] [Edit 2]
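To make the interpretation above easier to poke at, here is a minimal sketch that runs a child process with its stdin attached to the null device instead of the interpreter's stdin. It uses only plain `subprocess`, not the DataLad runner. The names `CHILD` and `run_detached` are made up for illustration, and the child is a small Python one-liner standing in for `['ssh', 'unix-machine', 'uname']` so it runs without a remote host. Whether this avoids the hang on Windows is an assumption that would still need testing there.

```python
import subprocess
import sys

# Hypothetical stand-in for ['ssh', 'unix-machine', 'uname'].
# The child reports how many bytes it was able to read from its own stdin.
CHILD = [
    sys.executable,
    "-c",
    "import sys; print(len(sys.stdin.buffer.read()))",
]


def run_detached(cmd):
    """Run cmd with stdin connected to the null device.

    The child never receives the parent's stdin handle, so it cannot
    consume input from it or reconfigure it.
    """
    result = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,  # do not inherit the interpreter's stdin
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode().strip()


if __name__ == "__main__":
    # The child sees end-of-file immediately and reports zero bytes read.
    print(run_detached(CHILD))
```

Replacing `CHILD` with the real ssh command would be the actual experiment. The stand-in only shows that the child gets no data from the parent's stdin.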
This all sounds like we understand how to avoid the problem. What I do not understand is what a fix could look like. We need a fix that is SSH-specific. But the changes in handling described here seem only possible deep in the Runner code.
Looking further,
That seems to imply this can only be fixed by supporting an additional argument type/value for Ping @christian-monch |
I think it can and should be fixed by calling the runner differently when executing
Although this might not be required for the solution of this problem, it might be a good idea to support I am on it. |
Isolating the interpreter's stdin descriptor from
Will not hang if the line con('uname') is replaced with con('uname', stdin=b''). I am looking into the code paths that lead to ssh touching the interpreter's stdin stream, to see whether we can prevent that in the new ria-remote implementation without any changes in datalad-core.
Before this is closed, we must file a companion issue in datalad-core. |
This patch aims to fix hanging Python sessions after the execution of an SSH remote command call with no particular stdin input.

Interpretation from #68: Without overriding stdin, the subprocess, i.e. ssh, and Python share the same file pointer. It seems that stdin is then configured in a way that is unexpected by the interpreter and interferes with Python's way of reading from sys.stdin.

This patch passes an explicit `b''` as `stdin` to the SSH client execution process to effectively achieve a separate file descriptor for that client process. This patch should not interfere with the implementation of the `sshrun` command in datalad-core. It uses a dedicated not-None value for any execution. However, the compatibility and interference of this patch should be subject to a thorough investigation and widespread testing before this changeset is proposed for datalad-core.

Closes #68
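For readers who want to see the idea of the patch outside of DataLad, the following is a rough sketch using plain `subprocess`. It is not the actual patch and not the DataLad runner API. `run_with_private_stdin` and `ECHO_STDIN` are hypothetical names, and the mapping of `None` to `b''` is my reading of the "dedicated not-None value" described above.

```python
import subprocess
import sys


def run_with_private_stdin(cmd, stdin=None):
    """Run cmd, never letting it inherit the caller's stdin.

    A stdin of None would normally mean "inherit from the parent".
    Here it is replaced by empty bytes, which makes subprocess create
    a separate pipe for the child and close it right away.
    """
    data = b"" if stdin is None else stdin
    result = subprocess.run(
        cmd,
        input=data,  # implies stdin=subprocess.PIPE for the child
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode, result.stdout, result.stderr


# Hypothetical stand-in for an ssh call: echoes its stdin back to stdout.
ECHO_STDIN = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

if __name__ == "__main__":
    print(run_with_private_stdin(ECHO_STDIN))
    print(run_with_private_stdin(ECHO_STDIN, stdin=b"hello"))
```

Callers that do want to send data still can, since any non-None value is passed through unchanged. Only the default case is changed from "inherit" to "empty private pipe".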
Reopening: Some breakage is fixed, but |
This is an interim conclusion from #58
A test tries to run `init_remote()` for an ora remote, not even expecting it to work, but to error out due to insufficiently met preconditions. But it just hangs. This may be due to the fact that the target connection is via SSH and the target port is ignored by the implementation of `SSHRemoteIO` from core. But the same code passes on Mac and Linux. This could be investigated by deploying a matching SSH config that declares the port.
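A matching SSH config for such an investigation could be generated along these lines. This is only a sketch: the function name `render_ssh_config` and the alias, hostname, port, and user values are placeholders, not the values used by the test setup.

```python
def render_ssh_config(alias, hostname, port, user=None):
    """Return an ssh_config stanza that pins the port for a host alias."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",
        f"    Port {int(port)}",
    ]
    if user:
        lines.append(f"    User {user}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values, to be replaced by the real test server details.
    print(render_ssh_config("test-ssh-target", "localhost", 2222, user="tester"))
```

The resulting text could be appended to the user's SSH config, or written to a separate file and passed to `ssh -F <file>`, so that connecting to the alias uses the declared port even if the calling code drops it.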