Skip to content

Commit

Permalink
bugfix/flux-nodes (#484)
Browse files Browse the repository at this point in the history
* remove a merge conflict statement that was missed

* fix flux node allocation issue

* allow for vars to be used with nodes settings of workers/batch

* add tests for var usage with nodes

* update CHANGELOG

* run fix-style
  • Loading branch information
bgunnar5 authored Jun 10, 2024
1 parent 35c2039 commit 0f6bebf
Show file tree
Hide file tree
Showing 5 changed files with 25 additions and 2 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,12 +26,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Applying filters for `merlin detailed-status` will now log debug statements instead of warnings
- Modified the unit tests for the `merlin status` command to use pytest rather than unittest
- Added fixtures for `merlin status` tests that copy the workspace to a temporary directory so you can see exactly what's run in a test
- Batch block and workers now allow for variables to be used in node settings

### Fixed
- Bugfix for output of `merlin example openfoam_wf_singularity`
- A bug with the CHANGELOG detection test when the target branch isn't in the ci runner history
- Link to Merlin banner in readme
- Issue with escape sequences in ascii art (caught by python 3.12)
- Bug where Flux wasn't identifying total number of nodes on an allocation


## [1.12.1]
Expand Down
10 changes: 8 additions & 2 deletions merlin/spec/merlinspec.json
Original file line number Diff line number Diff line change
Expand Up @@ -221,7 +221,8 @@
"nodes": {
"anyOf": [
{"type": "null"},
{"type": "integer", "minimum": 1}
{"type": "integer", "minimum": 1},
{"type": "string", "pattern": "^\\$\\(\\w+\\)$"}
]
},
"batch": {
Expand Down Expand Up @@ -279,7 +280,12 @@
"launch_pre": {"type": "string", "minLength": 1},
"launch_args": {"type": "string", "minLength": 1},
"worker_launch": {"type": "string", "minLength": 1},
"nodes": {"type": "integer", "minimum": 1},
"nodes": {
"anyOf": [
{"type": "integer", "minimum": 1},
{"type": "string","pattern": "^\\$\\(\\w+\\)$"}
]
},
"walltime": {
"anyOf": [
{"type": "string", "minLength": 1},
Expand Down
8 changes: 8 additions & 0 deletions merlin/study/batch.py
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,14 @@ def get_node_count(default=1):
the environment cannot be determined.
:param returns: (int) The number of nodes to use.
"""

# If flux is the scheduler, we can get the size of the allocation with this
try:
get_size_proc = subprocess.run("flux getattr size", shell=True, capture_output=True, text=True)
return int(get_size_proc.stdout)
except Exception:
pass

if "SLURM_JOB_NUM_NODES" in os.environ:
return int(os.environ["SLURM_JOB_NUM_NODES"])

Expand Down
5 changes: 5 additions & 0 deletions tests/integration/definitions.py
Original file line number Diff line number Diff line change
Expand Up @@ -295,6 +295,11 @@ def define_tests(): # pylint: disable=R0914,R0915
"conditions": [HasReturnCode(), HasRegex(r"default_worker", negate=True)],
"run type": "local",
},
"run-workers echo variable for worker nodes": {
"cmds": f"{workers_flux} {flux_native} --echo",
"conditions": [HasReturnCode(), HasRegex(r"-N 4")],
"run type": "local",
},
}
wf_format_tests = {
"local minimum_format": {
Expand Down
2 changes: 2 additions & 0 deletions tests/integration/test_specs/flux_par_native_test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ env:
OUTPUT_PATH: ./studies
N_SAMPLES: 10
SCRIPTS: $(SPECROOT)/../../../merlin/examples/workflows/flux/scripts
WORKER_NODES: 4

study:
- description: Build the code
Expand Down Expand Up @@ -71,6 +72,7 @@ merlin:
simworkers:
args: -l INFO --concurrency 1 --prefetch-multiplier 1 -Ofair
steps: [runs, data]
nodes: $(WORKER_NODES)
samples:
column_labels: [V1, V2]
file: $(MERLIN_INFO)/samples.npy
Expand Down

0 comments on commit 0f6bebf

Please sign in to comment.