Add support for SSH connection via aliases from ~/.ssh/config #790

Merged
mpenkov merged 7 commits into piskvorky:develop on Feb 20, 2024


wbeardall
Contributor

Add support for SSH connection via aliases from ~/.ssh/config

Motivation

For commonly-used connections, machines with pre-existing SSH configurations, and general convenience, we find it useful to be able to infer (or interpolate, as you please) a full connection configuration from the host alias in the standard SSH config file.

We suggest that configuration read this way be used only for default values: if an alternative username or port is provided as part of the file URI, it should override the pre-existing value. For example, ssh://new-user@host-alias:<new-port>/... should override the port and username predefined in the SSH configuration file.
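The precedence rule above can be sketched with the standard library alone. This is an illustration, not the code in this PR: `resolve_target` and the `alias_defaults` dict are hypothetical names standing in for whatever was read from the SSH configuration file for the alias.

```python
from urllib.parse import urlsplit


def resolve_target(uri, alias_defaults):
    """Overlay values given explicitly in the URI on top of config defaults.

    `alias_defaults` is a hypothetical dict of values looked up for the
    host alias; it is an assumption of this sketch, not a smart_open API.
    """
    parts = urlsplit(uri)
    resolved = dict(alias_defaults)  # start from the config file values
    # Only values explicitly present in the URI override the defaults.
    if parts.username:
        resolved["username"] = parts.username
    if parts.port:
        resolved["port"] = parts.port
    return resolved


defaults = {
    "hostname": "another-host-domain.com",
    "username": "another-user",
    "port": 2345,
}

# Alias only: every value comes from the config defaults.
print(resolve_target("ssh://another-host/path/to/file", defaults))

# User and port in the URI: both override the config defaults.
print(resolve_target("ssh://new-user@another-host:2222/path/to/file", defaults))
```

The hostname is left untouched here because the alias in the URI is only a lookup key; the real hostname comes from the configuration file.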

Implementation Details

As currently implemented, the connection configuration interpolation supports the following configuration options, which are passed to paramiko.client.SSHClient.connect:

  • hostname
  • port
  • username
  • key_filename
  • timeout
  • compress
  • gss_auth
  • gss_kex
  • gss_deleg_creds
  • gss_trust_dns

As an example, the following configuration would be fully used when calling smart_open.open("ssh://another-host/..."):

Host another-host
  HostName another-host-domain.com
  User another-user
  Port 2345
  IdentityFile /path/to/key/file
  ConnectTimeout 20
  Compression yes
  GSSAPIAuthentication no
  GSSAPIKeyExchange no
  GSSAPIDelegateCredentials no
  GSSAPITrustDns no
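To make the mapping between the config options above and the connect keywords listed earlier concrete, here is a minimal sketch. It assumes the options have already been parsed into a dict with lower-cased option names and string values; `OPTION_MAP` and `to_connect_kwargs` are hypothetical names for illustration and are not claimed to match the implementation in this PR.

```python
def _yes_no(value):
    """Convert an ssh_config style yes/no string to a bool."""
    return value.strip().lower() == "yes"


# (ssh_config option, connect() keyword, converter)
OPTION_MAP = [
    ("hostname", "hostname", str),
    ("user", "username", str),
    ("port", "port", int),
    ("identityfile", "key_filename", str),
    ("connecttimeout", "timeout", float),
    ("compression", "compress", _yes_no),
    ("gssapiauthentication", "gss_auth", _yes_no),
    ("gssapikeyexchange", "gss_kex", _yes_no),
    ("gssapidelegatecredentials", "gss_deleg_creds", _yes_no),
    ("gssapitrustdns", "gss_trust_dns", _yes_no),
]


def to_connect_kwargs(options):
    """Translate parsed config options into keyword arguments.

    Options missing from the config are simply left out, so the caller's
    own defaults still apply.
    """
    kwargs = {}
    for option, keyword, convert in OPTION_MAP:
        if option in options:
            kwargs[keyword] = convert(options[option])
    return kwargs


# The example configuration for `another-host`, as an already-parsed dict.
parsed = {
    "hostname": "another-host-domain.com",
    "user": "another-user",
    "port": "2345",
    "identityfile": "/path/to/key/file",
    "connecttimeout": "20",
    "compression": "yes",
    "gssapiauthentication": "no",
    "gssapikeyexchange": "no",
    "gssapidelegatecredentials": "no",
    "gssapitrustdns": "no",
}
print(to_connect_kwargs(parsed))
```

Keeping the mapping in a table makes it easy to see which options are honoured and how each string value is converted.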

Tests

Additional unit tests have been added in smart_open/tests/test_ssh.py, following the existing test pattern.

Checklist

Before you create the PR, please make sure you have:

  • Picked a concise, informative and complete title
  • Clearly explained the motivation behind the PR
  • Included tests for any new functionality
  • Checked that all unit tests pass

@@ -65,7 +79,7 @@ def parse_uri(uri_as_string):
 uri_path=_unquote(split_uri.path),
 user=_unquote(split_uri.username),
 host=split_uri.hostname,
-port=int(split_uri.port or DEFAULT_PORT),
+port=int(split_uri.port) if split_uri.port else None,
Collaborator

Why are we setting port to None instead of the default here?

Contributor Author

If I recall correctly, this is to ensure that non-default ports from configuration files are loaded correctly. We think parse_uri should not inject additional information, such as a default port, into a URI, since that is the job of the configuration parser. This matters when a non-default port is specified in the config file for a connection but not in the URI as provided: if parse_uri injects the default here, the non-default port from the config is ignored, which is not what we want. We only want to override a port specified in the config if the user explicitly provides one as part of the URI.
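The difference between the two lines in the diff can be shown in isolation. The sketch below is an assumption-laden illustration: `parse_port_eager`, `parse_port_deferred`, and `pick_port` are hypothetical helpers, and `DEFAULT_PORT` is set to 22, the standard SSH port.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 22  # standard SSH port, assumed here for illustration


def parse_port_eager(uri):
    """Old behaviour: fill in the default while parsing the URI."""
    return int(urlsplit(uri).port or DEFAULT_PORT)


def parse_port_deferred(uri):
    """New behaviour: report None when the URI gives no port."""
    port = urlsplit(uri).port
    return int(port) if port else None


def pick_port(uri_port, config_port):
    """Hypothetical resolution step: URI, then config file, then default."""
    if uri_port is not None:
        return uri_port
    if config_port is not None:
        return config_port
    return DEFAULT_PORT


uri = "ssh://another-host/path/to/file"
config_port = 2345  # Port from the config file entry for the alias

# Eager parsing cannot tell "no port given" from "port 22 requested",
# so the config file's port is never consulted.
print(pick_port(parse_port_eager(uri), config_port))

# Deferred parsing leaves the decision to the resolution step.
print(pick_port(parse_port_deferred(uri), config_port))
```

With the eager variant the resolution step sees 22 and treats it as an explicit choice; with the deferred variant it sees None and falls back to the config file's port.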

Collaborator

@mpenkov mpenkov left a comment

Thank you!

@mpenkov mpenkov merged commit 269c3a2 into piskvorky:develop Feb 20, 2024
21 checks passed
ddelange added a commit to ddelange/smart_open that referenced this pull request Feb 21, 2024
…open into patch-2

* 'develop' of https://github.com/RaRe-Technologies/smart_open:
  Propagate __exit__ call to underlying filestream (piskvorky#786)
  Retry finalizing multipart s3 upload (piskvorky#785)
  Fix `KeyError: 'ContentRange'` when received full content from S3 (piskvorky#789)
  Add support for SSH connection via aliases from `~/.ssh/config` (piskvorky#790)
  Make calls to smart_open.open() for GCS 1000x faster by avoiding unnecessary GCS API call (piskvorky#788)
  Add zstandard compression feature (piskvorky#801)
  Support moto 4 & 5 (piskvorky#802)
  Secure the connection using SSL when connecting to the FTPS server (piskvorky#793)
  upgrade dev status classifier to stable (piskvorky#798)
  Fix formatting of python code (piskvorky#795)