Single Thread for KEX Handling Causes Increased SSH Latency with High Concurrent Connections #647

Open
raajeive opened this issue Dec 20, 2024 · 4 comments

Comments

@raajeive

Version

2.12.1

Bug description

The issue is related to Apache MINA SSHD #458.

In version 2.12.1, a singleton thread pool was introduced for the Key Exchange (KEX) message handler flushing. This change has led to a significant increase in the time taken to establish SSH connections when a single client frequently connects to multiple servers. With only one thread available for KEX, this architecture becomes a bottleneck.

Our application frequently SSHs to over 10,000 servers to periodically poll specific information. Due to the single-threaded KEX, all 10,000 SSH connections now rely on the same thread, causing the average time to SSH into a server to increase. This directly impacts the application's polling frequency, leading to performance degradation.

Actual behavior

A single thread is used for KEX, leading to increased SSH time when the client handles several thousand SSH connections concurrently.

Expected behavior

Provide a configurable option to set the number of threads for KEX.

Relevant log output

No response

Other information

No response

@tomaswolf
Member

I don't understand. This is not a single thread; it's a thread pool. It sizes dynamically between zero and Integer.MAX_VALUE threads; idle threads are kept for one minute. So why are you seeing one thread only?
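The sizing described above (zero to Integer.MAX_VALUE threads, idle threads kept for one minute) can be illustrated with a plain JDK executor. This is only a sketch under that assumption, not code taken from the MINA SSHD sources; the class name `SharedPoolDemo` and its methods are hypothetical. It submits tasks that all block until every task has started, so no thread can be reused, and then counts how many distinct threads ran them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Apache MINA SSHD.
public class SharedPoolDemo {

    // One shared pool: 0 core threads, up to Integer.MAX_VALUE threads,
    // idle threads retired after 60 seconds, direct hand-off of tasks.
    static ExecutorService newSharedPool() {
        return new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<>());
    }

    // Runs `tasks` tasks that overlap in time and returns the number of
    // distinct pool threads that executed them.
    static int distinctThreadsUsed(int tasks) throws InterruptedException {
        ExecutorService pool = newSharedPool();
        Set<String> names = ConcurrentHashMap.newKeySet();
        CountDownLatch allStarted = new CountDownLatch(tasks);
        CountDownLatch done = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    names.add(Thread.currentThread().getName());
                    allStarted.countDown();
                    try {
                        // Wait until every task is running, so threads cannot be reused.
                        allStarted.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await();
        } finally {
            pool.shutdown();
        }
        return names.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("distinct threads: " + distinctThreadsUsed(8));
    }
}
```

With a pool sized this way, eight overlapping tasks run on eight threads, so a single shared pool does not by itself imply a single thread.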

@raajeive
Author

OK. Since upgrading from 2.9.3 to 2.12.1, we are seeing an increase in the time taken to SSH into remote systems and execute commands. We SSH to several thousand servers continuously via several hundred threads; with this upgrade, we see a decrease in the number of threads and an increase in the time taken to SSH and execute commands. We merely suspect change #458 because it says it will use a singleton class and a single thread. Are there any changes that went in between these versions that could cause this kind of behaviour?

@tomaswolf
Member

#458 doesn't say anything about a "single thread". It uses a single thread pool for all sessions instead of creating a thread for each session.

If you suspect that change was the cause, why don't you check out the sources, undo that change, and test again?

Did you profile the application to check where the hot spots are?

The only way I can imagine this single thread pool to cause such trouble might be if we submit a lot of flush tasks that actually don't have to flush anything. If it can be ascertained that this is the cause, we could improve this special case such that we don't submit a flush task if there's nothing to flush.

@tomaswolf
Member

> The only way I can imagine this single thread pool to cause such trouble might be if we submit a lot of flush tasks that actually don't have to flush anything.

Actually I see now that this is already taken care of. So that can't be the source of your problems.
