Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify Trusted Cluster role_map #48330

Closed
bernardjkim opened this issue Nov 1, 2024 · 9 comments · Fixed by #48481
Closed

Modify Trusted Cluster role_map #48330

bernardjkim opened this issue Nov 1, 2024 · 9 comments · Fixed by #48481
Assignees
Labels
bug test-plan-problem Issues which have been surfaced by running the manual release test plan trusted-cluster

Comments

@bernardjkim
Copy link
Contributor

Expected behavior:

Teleport Trusted Cluster role mappings should be modifiable multiple times.

Current behavior:

The Trusted Cluster resource can only be modified once. Additional attempts to modify the resource result in the following error:

$ tctl create --force trusted_cluster.yaml
ERROR: trusted cluster "..." and/or one or more of its cert authorities have been modified, please retry

Bug details:

  • Teleport version: 17.0.0-alpha.2
  • Recreation steps: Modify the trusted_cluster resource and change the spec.role_map.local value. Then try to change the value back to what is was before.
  • Workaround: Restarting the leaf cluster Teleport service allows me to modify the trusted_cluster resource once, again.
  • Related Issue: This feature was previously addressed in Allow Trusted Cluster role mappings to be updated #10568.
@bernardjkim bernardjkim added bug test-plan-problem Issues which have been surfaced by running the manual release test plan labels Nov 1, 2024
@bernardjkim bernardjkim self-assigned this Nov 1, 2024
@zmb3
Copy link
Collaborator

zmb3 commented Nov 2, 2024

Duplicate of #48309?

@bernardjkim
Copy link
Contributor Author

It sounds like the issues may have the same root cause, but I'm not familiar enough with the implementation yet to say. @hugoShaka what do you think?

@hugoShaka
Copy link
Contributor

hugoShaka commented Nov 4, 2024

Have you tried re-fretching the resource to get the up-to-date revision ID?

This error happens when the conditional update fails, likely because of the revision ID mismatch. I think this is the expected behaviour for any resource implementing the conditional update via revision ID properly.

The error message might not be actionable enough though. Also, maybe --force should do an unconditional update? It would be less safe though.

@hugoShaka
Copy link
Contributor

@espadolini and @rosstimothy would you expect tctl create --force to do an unconditional update? to fetch the last revision ID and try to do an update? To honour the revision ID form the YAML manifest?

@rosstimothy
Copy link
Contributor

@espadolini and @rosstimothy would you expect tctl create --force to do an unconditional update? to fetch the last revision ID and try to do an update? To honour the revision ID form the YAML manifest?

I'd expect that to do an upsert and overwrite any existing state.

@bernardjkim
Copy link
Contributor Author

bernardjkim commented Nov 4, 2024

Have you tried re-fretching the resource to get the up-to-date revision ID?

Ah, I think this was my mistake. Let me double check.

Edit: Hmm, actually I don't think this was the issue. I remember seeing this error in the WebUI before too, and it looks like the WebUI should automatically refresh the revision ID.

However, now that I'm testing it today, I'm not able to repro the issue. I'm not sure how I ended in that state. If I can't repro the issue I'll go ahead and close the issue.

@bernardjkim
Copy link
Contributor Author

bernardjkim commented Nov 4, 2024

Okay, so I wasn't able to repro this issue on 17.0.0-alpha.2. I was testing this before on a slightly newer dev build that I built: v17.0.0-alpha.2...v17.0.0-dev.bernard.10. Suspect #48009

I'll test again when 17.0.0-alpha.3 is released to check if we've introduced a regression since 17.0.0-alpha.2.

Edit: Yeah, I'm able to repro the issue now on 17.0.0-alpha.4

@bernardjkim
Copy link
Contributor Author

Hey @fspmarshall I think #48009 introduced a regression.

@bernardjkim
Copy link
Contributor Author

bernardjkim commented Nov 5, 2024

We did some investigating and suspect that this issue may be due to a preexisting issue causing revisions between the Teleport cache and backend to fall out of sync. We found that disabling the cache would fix the issue. We also verified that at some point the sqlite backend and cached values become out of sync.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug test-plan-problem Issues which have been surfaced by running the manual release test plan trusted-cluster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants