Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the mechanism for escaping tunable and discoverable & send raw ^C's directly #297

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

lifning
Copy link

@lifning lifning commented Jan 12, 2023

(edited, for context this used to just be a commit that added a warning message the first time you hit a raw Ctrl+C, now it changes the behavior of the escape handling altogether as described below)

This changes the method of entering an escape sequence:

  • raw Ctrl+C gets sent to the VM unimpeded.
  • by default, the sequence Ctrl+], Ctrl+C is used to quit the program (^]^C)
  • this can be customized or removed via CLI flags, allowing the string be of arbitrary length.
    • i.e. if you propolis-cli serial -e "beans" and then type "bea", nothing gets sent to the VM after the "b" yet. and then if you type:
      1. "k", the VM gets sent "beak"
      2. '"ns", the VM doesn't get sent anything else, and the client exits.
  • the client can be configured to pass through an arbitrary prefix length of the escape string before it starts suppressing inputs, such that you can, for example, mimic ssh's Enter-tilde-dot sequence without temporarily suppressing Enter presses not intended to start an escape sequence, which would interfere with function:
    -e '^M~.' --escape-prefix-length=1
    (this also includes a workaround for ANSI escape sequences being sent by xterm-like emulators when Enter is pressed in a shell that sends a request for such)

Hopefully no more accidentally killing the client!

@jclulow
Copy link
Contributor

jclulow commented Jan 12, 2023

I think it would be good to use an escape sequence like ssh (and tip and sercons and zlogin -C do for breaking the connection (and potentially taking other actions).

For ssh the default escape character is tilde (~) so I would pick # by default for this to avoid the most common conflict. Then, pressing ENTER followed by the character (hash) followed by a period would terminate the session. You could also add a help menu if one presses ENTER # ? like ssh has as well. Eventually it we can support sending serial breaks we could do that through ENTER # b as well, etc.

I'd also add an -e flag to the program to allow the escape character to be overridden.

@dancrossnyc
Copy link
Contributor

In general I agree that control characters should be ignored locally in favor of an escape character.

# is a pretty common character, including as a comment indicator for config files and like, though. I can imagine someone editing a config file via a serial console and cursing us greatly if we choose it. Then again, it was also the erase character on early Unix, so...maybe it's not so bad.

Perhaps consider one of the lesser-used control characters? Ones I've seen used include ^E, ^, and ^^.

@jordanhendricks
Copy link
Contributor

jordanhendricks commented Jan 12, 2023

One concern I have is that mixing messages from the tool with output on the console might be confusing. Perhaps the instructions on how to escape the console could be in the CLI command help message? (I think this is unlikely to come up for the web console version, but if it did that could perhaps be help text somewhere on the page outside of the console itself.)

@pfmooney
Copy link
Collaborator

I think it would be good to use an escape sequence like ssh (and tip and sercons and zlogin -C do for breaking the connection (and potentially taking other actions).

For ssh the default escape character is tilde (~) so I would pick # by default for this to avoid the most common conflict. Then, pressing ENTER followed by the character (hash) followed by a period would terminate the session. You could also add a help menu if one presses ENTER # ? like ssh has as well. Eventually it we can support sending serial breaks we could do that through ENTER # b as well, etc.

I second this.

Many moons ago, I'd drafted a change to propolis-cli to implement this (<cr>#. as exit, like sercons), but it was stymied by the fact that the guest shell prompt was emitting some escape sequence which was causing the local terminal to respond with a similarly escaped sequence. This was not visible to the user, but since it occurred after the <cr>, but before a subsequent attempt at the #., the functionality was ruined. My effort stalled before teaching that logic to ignore escaped terminal sequences. I'm not sure if there's an approach which makes this problem easier to tackle.

# is a pretty common character, including as a comment indicator for config files and like, though. I can imagine someone editing a config file via a serial console and cursing us greatly if we choose it. Then again, it was also the erase character on early Unix, so...maybe it's not so bad.

It works fine in sercons, since a strict <cr>#. is required.

One concern I have is that mixing messages from the tool with output on the console might be confusing.

Agreed.

@jclulow
Copy link
Contributor

jclulow commented Jan 12, 2023

Yeah mixing console output is definitely confusing. In sercons we print a message prior to attaching about what sequence to press to detach, and then don't otherwise intermingle output. SSH only emits stuff when you use something explicit like the help screen.

sercons is https://github.com/jclulow/vmware-sercons FWIW and I use it to attach to the serial console in propolis standalone which is just a UNIX socket. It's been pretty good.

@dancrossnyc
Copy link
Contributor

I think it would be good to use an escape sequence like ssh (and tip and sercons and zlogin -C do for breaking the connection (and potentially taking other actions).
For ssh the default escape character is tilde (~) so I would pick # by default for this to avoid the most common conflict. Then, pressing ENTER followed by the character (hash) followed by a period would terminate the session. You could also add a help menu if one presses ENTER # ? like ssh has as well. Eventually it we can support sending serial breaks we could do that through ENTER # b as well, etc.

I second this.

Many moons ago, I'd drafted a change to propolis-cli to implement this (<cr>#. as exit, like sercons), but it was stymied by the fact that the guest shell prompt was emitting some escape sequence which was causing the local terminal to respond with a similarly escaped sequence. This was not visible to the user, but since it occurred after the <cr>, but before a subsequent attempt at the #., the functionality was ruined. My effort stalled before teaching that logic to ignore escaped terminal sequences. I'm not sure if there's an approach which makes this problem easier to tackle.

Wouldn't a way around that be to use an uncommon control character by itself? There's precedent for this; e.g. telnet used ^].

# is a pretty common character, including as a comment indicator for config files and like, though. I can imagine someone editing a config file via a serial console and cursing us greatly if we choose it. Then again, it was also the erase character on early Unix, so...maybe it's not so bad.

It works fine in sercons, since a strict <cr>#. is required.

That's literally what you're going to type when adding a multiline comment at the top of, say, /etc/hosts, isn't it?

# Added to work around a DNS bug. It's always DNS.
# Remove on 2023-12-31

etc.

One concern I have is that mixing messages from the tool with output on the console might be confusing.

Agreed.

There's precedent here we can look at; for example telnet.

@jclulow
Copy link
Contributor

jclulow commented Jan 13, 2023

Wouldn't a way around that be to use an uncommon control character by itself? There's precedent for this; e.g. telnet used ^].

A single byte escape sequence is not great, because there is no way to escape the original byte when you need to generate it -- especially when the escape sequence is just terminating the session, not opening a prompt like telnet does.

It works fine in sercons, since a strict <cr>#. is required.

That's literally what you're going to type when adding a multiline comment at the top of, say, /etc/hosts, isn't it?

# Added to work around a DNS bug. It's always DNS.
# Remove on 2023-12-31

I don't believe so. Note that it is a three byte sequence: <CR>, #, . (period). Unless you are starting a comment line with #. I don't believe this would affect you.

In some software like ssh, the third byte can also be ? (for help), B to send a break, or the escape character itself to actually pass through the escape character literally, etc.

If you do want to start a comment line with #. I believe the sequence you would type would be: <CR>, #, #, ..

There's precedent here we can look at; for example telnet.

There's also precedent in tip, ssh, sercons, mlogin, the Sun LOMlite2 and ALOM console stuff, etc.

@dancrossnyc
Copy link
Contributor

Wouldn't a way around that be to use an uncommon control character by itself? There's precedent for this; e.g. telnet used ^].

A single byte escape sequence is not great, because there is no way to escape the original byte when you need to generate it -- especially when the escape sequence is just terminating the session, not opening a prompt like telnet does.

Dunno. It worked fine for 15 or so years until people stopped regularly using telnet. Even now, it's not so onerous when one does use telnet for something (that's not usually the telnet protocol). Even then, it wasn't the control character that was the problem.

But let me be clear here: I'm not saying you shouldn't precede the escape character by a or whatever, but I am suggesting it's a bit surprising that it's a fairly common printing character.

It works fine in sercons, since a strict <cr>#. is required.

That's literally what you're going to type when adding a multiline comment at the top of, say, /etc/hosts, isn't it?

# Added to work around a DNS bug. It's always DNS.
# Remove on 2023-12-31

I don't believe so. Note that it is a three byte sequence: <CR>, #, . (period). Unless you are starting a comment line with #. I don't believe this would affect you.

This is moving the goal posts a tad. The original suggestion was, "to use an escape sequence like ssh (and tip and sercons and zlogin -C do for breaking the connection (and potentially taking other actions)." It's true that sercons seems to only support that single escape sequence, but cu and the other commands support others actions; all I'm saying is that using a fairly common printing character could, perhaps, be improved upon.

In some software like ssh, the third byte can also be ? (for help), B to send a break, or the escape character itself to actually pass through the escape character literally, etc.

If you do want to start a comment line with #. I believe the sequence you would type would be: <CR>, #, #, ..

Wouldn't that start the line with ##. in sercons today?

There's precedent here we can look at; for example telnet.

There's also precedent in tip, ssh, sercons, mlogin, the Sun LOMlite2 and ALOM console stuff, etc.

I was referring specifically to differentiating between tool output and console output here.

@lifning
Copy link
Author

lifning commented Jan 13, 2023

well these all sound like improvements of some sort over the current state of affairs -- i can certainly say the existing scheme of ^A^C is a pain in the neck to input when i run propolis-cli in (tmux with the muscle-memory bindings of) GNU screen, for instance. definitely agree on adding -e escape_seq (and -E for no escape) as an option.

the <CR>~. scheme that ssh et al. have kinda rubs me the wrong way personally, largely because it requires you to send that <CR> to the thing you're connected to first -- forcing you to mutate its state in the process, when all you wanted to do was disconnect.

and evidently the choice of escape sequence in this paradigm (as ourselves choosing a default or as the user passing one to -e) will inevitably be cause for worrying about some number of caveats, since there's reasons you might start a line with any given printable character, but if it's configurable by a well-documented command-line switch, i suppose something like using ^]^C itself as a default seems relatively unlikely to bite us -- as in, ^]^C kills the client, ^] followed by anything else sends ^] followed by whatever else, raw ^C goes to the VM.

making it a cli switch also neatly solves the problem of making it discoverable, which means we certainly don't have to worry about asking if the user wanted to SIGINT us or send it across on stderr like this; they can hopefully think to --help.

(edit: fwiw, the change i pushed just now can be made to behave like ssh if you pass -e '^M~.', but with the downside of the newline not being sent until you type a non-~ character afterward. i might add a special behavior for this if your string begins with ^M specifically)

@lifning lifning force-pushed the discoverable-ctrlc-cli branch 2 times, most recently from ba3d76a to 0f16274 Compare January 14, 2023 08:52
@lifning
Copy link
Author

lifning commented Jan 14, 2023

Many moons ago, I'd drafted a change to propolis-cli to implement this (<cr>#. as exit, like sercons), but it was stymied by the fact that the guest shell prompt was emitting some escape sequence which was causing the local terminal to respond with a similarly escaped sequence. This was not visible to the user, but since it occurred after the <cr>, but before a subsequent attempt at the #., the functionality was ruined. My effort stalled before teaching that logic to ignore escaped terminal sequences. I'm not sure if there's an approach which makes this problem easier to tackle.

well, i just now threw a regex at the problem and somehow didn't end up with two problems. :)

@lifning lifning force-pushed the discoverable-ctrlc-cli branch 2 times, most recently from 93a3aa4 to 1f3ee82 Compare January 14, 2023 09:02
@lifning lifning changed the title Make the mechanism for escaping ^C more discoverable Make the mechanism for escaping tunable and discoverable & send raw ^C's directly Jan 14, 2023
@rcgoodfellow
Copy link
Contributor

Thanks for doing this @lifning, this will allow Falcon drop some copied code and use the propolis client serial implementation directly (issue linked above).

Something that would be really nice as a follow on, would be to expose this functionality through a library for direct consumption in other programs rather than having to shell out.

@lifning lifning force-pushed the discoverable-ctrlc-cli branch 2 times, most recently from 30b5bb5 to ec2a487 Compare January 19, 2023 21:06
Copy link
Contributor

@jordanhendricks jordanhendricks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started to review this in more detail, but I have some high level questions/feedback that I think I want to hash out first.

I like the spirit of being able to specify custom escape sequences, but I have a few concerns with the approach. Specifically:

  • I'm worried that that prefix length being in bytes and the escape sequence allowing multibyte characters (I think) means that you could construct a string that ends in the middle of a character and send that to the guest. That seems like it could have some security implications or just generally cause surprising behavior.
  • On that note, I think I would want to be a bit more up front about what characters are allowed in the escape sequence. Is it UTF-8? Should we restrict the available set of characters somehow?
  • This might be my own ignorance here, but I'm not sure why a regex is needed to check for the escape string when the previous implementation didn't use one (and was much clearer to me what was going on).
  • I think the logic that checks for the escape sequence could benefit from some "helper" functions that pull out the checks into named functions to help explain what's going on. And a block comment of some kind to explain why we are doing this would help.

I left a couple other comments from when I had started reviewing, but they're minor and I wanted to raise these things first.

bin/propolis-cli/src/main.rs Outdated Show resolved Hide resolved
bin/propolis-cli/src/main.rs Outdated Show resolved Hide resolved
bin/propolis-cli/src/main.rs Outdated Show resolved Hide resolved
bin/propolis-cli/src/main.rs Outdated Show resolved Hide resolved
bin/propolis-cli/src/main.rs Outdated Show resolved Hide resolved
@lifning lifning force-pushed the discoverable-ctrlc-cli branch from ec2a487 to b6e8f63 Compare February 8, 2023 07:45
@lifning
Copy link
Author

lifning commented Feb 8, 2023

  • I'm worried that that prefix length being in bytes and the escape sequence allowing multibyte characters (I think) means that you could construct a string that ends in the middle of a character and send that to the guest. That seems like it could have some security implications or just generally cause surprising behavior.
  • On that note, I think I would want to be a bit more up front about what characters are allowed in the escape sequence. Is it UTF-8? Should we restrict the available set of characters somehow?

the escape sequence does allow multibyte characters -- which clap does happen to require to be valid UTF-8, despite being carried in an OsString. because of this restriction, it should mean that any partial versions of multibyte characters shouldn't be able to be specified as an escape sequence1. but even if they could, the worst that would happen from an operator typing a multibyte unicode character whose prefix is the escape sequence would be that the cli closes itself (and their input was already consumed on its stdin before being fed to the escape sequence parser, so the remaining bytes won't get sent to the cli's parent process).

i hesitated to restrict things here because i don't/can't know what manner of obscure byte sequence someone might hypothetically want to use someday, particularly operators using more interesting keyboards (various international ones, or soft keyboards on the smartphone an on-call operator getting paged while away from a PC might be remoting in from -- if they wanna make like a tree and --escape-string=🍃, who am i to judge). though i suppose we don't have to support everything so long as we give folks enough options to avoid accidentally bailing while still being able to at all.

  • This might be my own ignorance here, but I'm not sure why a regex is needed to check for the escape string when the previous implementation didn't use one (and was much clearer to me what was going on).

the use of a regex here is specifically to work around the issue Patrick described encountering in a comment above, which turned out to be: when you send a newline to most shells, they'll output an ANSI escape sequence for cursor position request, to which xterm(-like) terminal emulators will respond with the ANSI escape sequence CSI row ; column R. if your escape sequence is anything like the "newline, tilde, dot" from ssh, then you'll never be able to enter it from an xterm, because your terminal emulator will interject with that escape sequence between your typing the newline and tilde! we didn't need this before, because we didn't do anything involving a newline being sent, but supporting such a scheme is evidently very desirable given the discussion here. (this could've been implemented without a regex of course, but inventing an entire bespoke DFA for this complex enough to handle this sequence seemed like overkill)

  • I think the logic that checks for the escape sequence could benefit from some "helper" functions that pull out the checks into named functions to help explain what's going on. And a block comment of some kind to explain why we are doing this would help.

will do!

Footnotes

  1. at least, if i'm understanding the structure of UTF-8's codepoints correctly. it should be impossible for a string to be specified that starts in the middle of a character (no valid encoding for byte 1 matching 10xxxxxx), or ends in the middle of it (once you've committed to byte 1's specification of how many additional bytes to expect, you can't cut it short without it being invalid)

@lifning lifning force-pushed the discoverable-ctrlc-cli branch 2 times, most recently from bdc0d97 to 4ee7324 Compare February 15, 2023 00:49
@lifning lifning force-pushed the discoverable-ctrlc-cli branch from 4ee7324 to 69f2578 Compare March 21, 2023 08:15
@lifning lifning force-pushed the discoverable-ctrlc-cli branch from 69f2578 to 5802f52 Compare May 6, 2023 08:50
This changes the method of entering an escape sequence:
- raw Ctrl+C gets sent to the VM unimpeded.
- by default, the sequence Ctrl+], Ctrl+C is used to quit the program
  (`^]^C`)
- this can be customized or removed via CLI flags, allowing the string
  be of arbitrary length.
  - i.e. if you `propolis-cli serial -e "beans"` and then type "bea",
    nothing gets sent to the VM after the "b" yet. and then if you type:
    1. "k", the VM gets sent "beak"
    2. '"ns", the VM doesn't get sent anything else, and the client
       exits.
- the client can be configured to pass through an arbitrary prefix
  length of the escape string before it starts suppressing inputs, such
  that you can, for example, mimic ssh's Enter-tilde-dot sequence
  without temporarily suppressing Enter presses not intended to
  start an escape sequence, which would interfere with function:
  `-e '^M~.' --escape-prefix-length=1` (this also works around ANSI
  escape sequences being sent by xterm-like emulators when Enter is
  pressed in a shell that sends a request for such)

Much of this logic, including RawTermiosGuard, has now been factored
out into the https://github.com/oxidecomputer/thouart crate.
@lifning lifning force-pushed the discoverable-ctrlc-cli branch from 5802f52 to 2633a39 Compare May 17, 2023 00:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants