Allow assigning resources to nodes with finite negative scores #3802

Open
wants to merge 10 commits into base: main

kgaillot
Contributor

No description provided.

... in pcmk__node_available(), to make it easier to add new conditions.

I went through the callers to see if any others should reject guest
nodes with unrunnable guests, and it appears not.

... now that the "alive" and "usable" checks are separated
This allows resources to be active in cases where they would previously be
stopped.

Fixes T335
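
To make the separated checks concrete, here is a minimal sketch of an availability test that takes the conditions as flag bits. Everything in it is hypothetical: the `sketch_` names, the struct fields, the flag-based signature, and the reading of "alive" as online and clean and "usable" as not in standby or maintenance are all assumptions, not the actual `pcmk__node_available()` code. The score constant mirrors Pacemaker's convention of treating 1000000 as infinity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler's "infinity" score */
#define SKETCH_SCORE_INFINITY 1000000

/* Hypothetical condition flags; a new condition is one more bit */
enum sketch_avail_flags {
    sketch_avail_none        = 0,
    sketch_avail_alive       = 1 << 0, /* require online and clean */
    sketch_avail_usable      = 1 << 1, /* require not standby/maintenance */
    sketch_avail_no_negative = 1 << 2, /* old behavior: reject score < 0 */
};

/* Hypothetical, heavily simplified node record */
struct sketch_node {
    bool online;
    bool unclean;
    bool standby;
    bool maintenance;
    int score;          /* this resource's score on the node */
};

/* Return true if the node passes every condition requested in flags */
bool
sketch_node_available(const struct sketch_node *node, uint32_t flags)
{
    assert(node != NULL); /* passing NULL is a caller bug in this sketch */

    if ((flags & sketch_avail_alive) && (!node->online || node->unclean)) {
        return false;
    }
    if ((flags & sketch_avail_usable)
        && (node->standby || node->maintenance)) {
        return false;
    }
    /* A score of -INFINITY is always a hard ban */
    if (node->score <= -SKETCH_SCORE_INFINITY) {
        return false;
    }
    /* Finite negative scores are rejected only on request */
    if ((flags & sketch_avail_no_negative) && (node->score < 0)) {
        return false;
    }
    return true;
}
```

With this shape, a caller that needs the previous behavior would pass `sketch_avail_no_negative`, while assignment could omit it so that a node with a score such as -100 remains a candidate.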

Now that resources may be assigned to nodes with finite negative scores,
the results of two scheduler regression tests improve:

In systemhealthp2, the stonith resource may now start, whereas before it was
left stopped. There are two nodes, hs21c (with health status yellow, equivalent
to -100 preference) and hs21d (which is unseen, so unclean and offline).
There was no good reason to leave stonith stopped.

In node-maintenance-1, rsc1 stays active where it is, whereas before it was
stopped. rsc1 is started on node1 (where it has a -1 location preference), and
rsc2 is started on node2 (where it has a -1 location preference), and node2 is
unmanaged. Since node2 is unmanaged, rsc2 can't be moved away from it, and rsc1
has nowhere to move to even though it has a negative preference for its current
node. Previously, rsc1 would be stopped even though it couldn't be recovered
elsewhere.
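
The node-maintenance-1 outcome can be walked through with a small sketch of node selection. This is not the scheduler's actual assignment code: the `sketch_` names, the array-based candidate list, and the `allow_finite_negative` switch are hypothetical, and the score of rsc1 on node2 is not given above, so any value used for it is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler's "infinity" score */
#define SKETCH_SCORE_INFINITY 1000000

/* Hypothetical candidate: one node as seen by one resource */
struct sketch_candidate {
    const char *name;
    int score;      /* the resource's preference for this node */
    bool usable;    /* false if, for example, the node is unmanaged */
};

/*
 * Return the highest-scoring eligible candidate, or NULL if there is
 * none (meaning the resource would be left stopped). Ties go to the
 * earliest candidate in the array.
 */
const struct sketch_candidate *
sketch_choose_node(const struct sketch_candidate *nodes, size_t n_nodes,
                   bool allow_finite_negative)
{
    const struct sketch_candidate *best = NULL;

    assert((nodes != NULL) || (n_nodes == 0));

    for (size_t i = 0; i < n_nodes; i++) {
        const struct sketch_candidate *node = &nodes[i];

        if (!node->usable) {
            continue;   /* nothing can be started here */
        }
        if (node->score <= -SKETCH_SCORE_INFINITY) {
            continue;   /* hard ban */
        }
        if (!allow_finite_negative && (node->score < 0)) {
            continue;   /* previous behavior */
        }
        if ((best == NULL) || (node->score > best->score)) {
            best = node;
        }
    }
    return best;
}
```

For rsc1, the candidates would be node1 with a score of -1 and usable, and node2 marked not usable because it is unmanaged. With `allow_finite_negative` set to false the function returns NULL, which corresponds to rsc1 being stopped; with it set to true the function returns node1, which corresponds to rsc1 staying where it is.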

All existing callers retain the same behavior.
@kgaillot
Contributor Author

@nrwahl2, could you review this? Thanks.

This was something I had started but had to put aside for higher-priority work. I went back and finished it, but it needs careful consideration and review. It's a pretty big behavioral change, but it can be considered a fix.

Previously, clone children were sorted in pcmk__clone_create_probe().
However, the sort was only necessary when probing anonymous clones on
nodes that wouldn't have an active instance, so now they are sorted only
in that case, for a slight efficiency gain.
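
The idea of sorting only when needed can be sketched as follows. This is an illustration under stated assumptions, not the code in `pcmk__clone_create_probe()`: the `sketch_` names are hypothetical, the children are held in a plain array rather than a list, and "would have an active instance" is reduced to a per-instance flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified clone instance */
struct sketch_instance {
    int id;               /* instance number */
    bool active_on_node;  /* already active on the node being probed */
};

/* qsort() comparator: ascending instance number */
static int
sketch_cmp_instances(const void *a, const void *b)
{
    const struct sketch_instance *ia = a;
    const struct sketch_instance *ib = b;

    return (ia->id > ib->id) - (ia->id < ib->id);
}

/*
 * Pick the instance to probe on a node. If some instance is already
 * active there, use it and skip the sort. Otherwise sort the children
 * and use the first one. *sorted reports whether the sort happened.
 */
const struct sketch_instance *
sketch_pick_probe_instance(struct sketch_instance *children, size_t n,
                           bool *sorted)
{
    assert((children != NULL) || (n == 0));
    assert(sorted != NULL);

    *sorted = false;
    for (size_t i = 0; i < n; i++) {
        if (children[i].active_on_node) {
            return &children[i];    /* no sort needed */
        }
    }
    if (n == 0) {
        return NULL;
    }
    qsort(children, n, sizeof(children[0]), sketch_cmp_instances);
    *sorted = true;
    return &children[0];
}
```

The cost of the sort is then paid only on the path that depends on the order, which is the efficiency gain described above. A side effect is that the children stay in their original order on the other path.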
@kgaillot
Contributor Author

The last commit is unrelated. I only glanced over the regression test changes, but they appear to be just the expected sorting differences. Since the clone children are in a list, not a hash table, I expect the unsorted order to remain consistent so that regression test results don't vary; if results do start to vary, this commit is the likely culprit.
