Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plugin: enforce max resource limits per-association #562

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

cmoussa1
Copy link
Member

@cmoussa1 cmoussa1 commented Jan 15, 2025

Problem

The priority plugin tracks resource usage across an association's set of running jobs, but it does not enforce any limits
on this resource usage in the event where an association goes over their limit.


This PR looks to add limit enforcement on an association's set of running jobs according to their resource usage, specifically with ncores and nnodes. In job.state.depend, the jobspec for a submitted job is unpacked and its job size is calculated. If the job would put the association over either their max_cores or max_nodes limit, a dependency is added to the job and it is held until a currently running job completes and frees the appropriate amount of resources.

Since the priority plugin is now enforcing two different kinds of limits (max-running-jobs and max-resources), it needs a way to keep track of which dependencies a job has since now it can have one or both. A map is added to the plugin where the key is a the ID of a held job and the value is a list of dependencies associated with the job. A number of helper functions are added to search for held jobs' dependencies and remove them from the internal map when the job no longer has any dependencies on it.

In job.state.inactive, the process of releasing a held job is reworked to check if the job satisfies all requirements in order to be released. It is first checked to see that the association is under their max-running-jobs limit and then checked to see that the held job would not put the association over either their max_cores or max_nodes limit. The dependencies for the job are removed as they are satisfied, i.e if a job would satisfy the max-running-jobs dependency but not the max-resources-dependency, only the max-running-jobs dependency is removed and the job continues to be held. Once all requirements are met, the rest of the remaining dependencies are removed from the job and it can be released to run.

To avoid becoming its own mini scheduler, the priority plugin will only look at the first held job for the association. So, if an association has two held jobs, only the first held job will be considered for release.

I've expanded the test file that originally just tracked resources across an association's set of jobs to also test checking adding and removing dependencies. Different scenarios are walked through to make sure an association's set of jobs have the correct dependencies added to it and are removed correctly, such as having a job only have a max-running-jobs or max-resources dependency added to it, have both dependencies added to it at the same time, only have one of the two dependencies removed from a held job, etc.

@cmoussa1
Copy link
Member Author

@ryanday36 when you get the chance, could you give this PR a high-level look and let me know if this looks good from your side? I think I'd mostly be curious if the way that the plugin is enforcing the resource limits looks good to you.

@cmoussa1 cmoussa1 force-pushed the enforce.resource.limits branch from ece2f0f to 2864dd6 Compare January 17, 2025 00:23
@ryanday36
Copy link
Contributor

This looks like a good approach to me @cmoussa1. I'm not sure I understand where b->curr_nodes and b->curr_cores get updated. Is that already happening when jobs start / end? or is that still to come?

@cmoussa1
Copy link
Member Author

Thanks @ryanday36. The cur_nodes and cur_cores for a job will get updated when the job enters RUN state and INACTIVE state. Sorry that wasn't made immediately clear - I added that support in #561.

@cmoussa1 cmoussa1 force-pushed the enforce.resource.limits branch 2 times, most recently from d3dccb4 to 265229c Compare January 17, 2025 19:28
@cmoussa1 cmoussa1 changed the title [WIP] plugin: enforce max resource limits per-association plugin: enforce max resource limits per-association Jan 18, 2025
@cmoussa1 cmoussa1 marked this pull request as ready for review January 18, 2025 00:51
Problem: The priority plugin tracks resource usage across an
association's set of running jobs, but it does not enforce any limits
on this resource usage in the event where an association goes over
their limit.

Add a check in job.state.depend to see if the job's resources would put
the association over *either* their max_nodes or max_cores limit. If
so, add a dependency on the job and hold it.

In job.state.inactive, rework the check to see if there are any held
jobs for the association to also ensure the held job would satisfy
all three conditions:

1) the association is under their max-running-jobs limit
2) the job would not put the association over their max nodes limit
3) the job would not put the association over their max cores limit

before fully releasing the job to be scheduled to run. Keep track of
which held jobs have which dependencies in a map within the plugin,
where the key of the map is the ID of a held job and the value is a
list of any flux-accounting-specific dependencies added to it.

The dependencies added to the held job are removed as the above
conditions are satisfied. Once all dependencies are removed from the
job, it can be released by the plugin to be scheduled to run, and the
key-value pair is removed from the internal map that keeps track of
held jobs.
Problem: t1005 and t1012 create JSON payloads to be sent to the plugin,
but these payloads don't contain max_cores or max_nodes information
per-association.

Add this information to the payloads sent to the plugin in t1005 and
t1012.
Problem: The flux-accounting testsuite doesn't have any tests for
checking dependencies on jobs when an association has hit their max
resource limits.

Add some basic tests.
@cmoussa1 cmoussa1 force-pushed the enforce.resource.limits branch from 265229c to 594c2e2 Compare January 21, 2025 17:25
@cmoussa1 cmoussa1 requested a review from grondo January 21, 2025 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new feature new feature plugin related to the multi-factor priority plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants