Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redo wire protocol race condition fix. #2164

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

majestrate
Copy link
Contributor

In commit a76acd4 we attempted to fix issues on service nodes that related to spamming service nodes with traffc, but this caused a regression where clients could not connect to the network.

However, we also did something idiotic with the protocol handshake. The handshake logic was modified and caused client connections to fail. This was due to a change in how we decided what a timed out session is. We redo the close bug and revert the change to the protocol handshake.

In commit a76acd4 we attempted to fix
issues on service nodes that related to spamming service nodes with
traffc, but this caused a regression where clients could not connect
to the network.

However, we also did something idiotic with the protocol handshake.
The handshake logic was modified and caused client connections to
fail. This was due to a change in how we decided what a timed out
session is. We redo the close bug and revert the change to the
protocol handshake.
@majestrate majestrate requested review from jagerman and dr7ana April 27, 2023 23:30
@jagerman
Copy link
Member

jagerman commented Apr 28, 2023

I'm flipping testnet to this branch for testing. Will flip it tomorrow.

@dr7ana - can you git cherry-pick b0b6e2b198ba32e8c5641e99494d2874b2243b8f into your liblokinet-dev branch (which testnet currently follows)?

dr7ana added a commit to dr7ana/lokinet that referenced this pull request Apr 28, 2023
Merging cherry-picks back to testnet branch:
-  oxen-io#2164
-  oxen-io#2134
Copy link
Collaborator

@dr7ana dr7ana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cherry-picked and merged to liblokinet-dev

@@ -352,7 +351,7 @@ namespace llarp
bool
Session::TimedOut(llarp_time_t now) const
{
if (m_State == State::Ready)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line here is the most important part of the change.

State::LinkIntro means we are finalizing the protocol handshake, we are in a kind of half established state were we are an encrypted but unauthenticated session. this is where we are sending the other end a "link intro" which identifies who we are (by snode address), including the ephemeral pubkey that we were challenged with for establishing the encryption. it's stupid af and i hate it.

State::Ready means we are authenticated and ready to exchange frames for inter node communication.

the state we hit here in this regression that one side thinks the other side is timed out when they really aren't, and are trying to finish the handshake. they can still send/recv but they just have no idea who it is they are talking to in terms of their .snode addr but the other end disagrees about where we are in that process and it gets stuck. this is because snode and client timeouts are different (because of course they are). someone wanted to add that requirement onto the pile and so it was added, even though it was something i did object to for complexity reasons.
as usual i gave up fighting on that because people have strong opinions here and i just want stuff to get done, rocking the boat is a waste of time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants