Default GPU node replicas to 1 avoiding 0 nodes in SNO clusters #2167

Open
wants to merge 2 commits into
base: master

@bdattoma bdattoma commented Jan 10, 2025

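When provisioning a GPU worker node in addition to the single node of a SNO OpenShift cluster, the new MachineSet's replica count defaults to 0 because the MachineSet is copied from the existing worker MachineSet. This PR defaults the GPU node replicas to 1 so that a node is actually created.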

(Azure kustomization already has the patch)
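For readers unfamiliar with the failure mode, here is a minimal sketch of the idea in Python. It is not the patch from this PR (the actual change is not shown in this conversation); `derive_gpu_machineset` and the manifest contents are hypothetical, and only `metadata.name` and `spec.replicas` are modelled.

```python
import copy


def derive_gpu_machineset(worker_ms, gpu_name, replicas=1):
    """Copy a worker MachineSet manifest and set an explicit replica count.

    Hypothetical helper for illustration only: without the explicit
    override, the copy inherits whatever replica count the source has.
    """
    gpu_ms = copy.deepcopy(worker_ms)  # leave the source manifest untouched
    gpu_ms.setdefault("metadata", {})["name"] = gpu_name
    # Override the inherited value instead of trusting the copied one.
    gpu_ms.setdefault("spec", {})["replicas"] = replicas
    return gpu_ms


# Assumed example input: a worker MachineSet with 0 replicas, as described
# for SNO clusters in this PR. The name is made up.
sno_worker = {
    "apiVersion": "machine.openshift.io/v1beta1",
    "kind": "MachineSet",
    "metadata": {"name": "sno-worker"},
    "spec": {"replicas": 0},
}

gpu = derive_gpu_machineset(sno_worker, "sno-gpu-worker")
print(gpu["spec"]["replicas"])
```

A plain copy of `sno_worker` would keep `replicas: 0` and no GPU node would be created; setting the field explicitly avoids depending on the source MachineSet's value.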

PR validation:

  1. SNO 4.17 on AWS with 1 NVIDIA GPU: rhoai-test-flow/2282 OK. The tests in the job failed because RHOAI wasn't installed (misconfig, my bad).
  2. SNO 4.17 on GCP with 1 NVIDIA GPU: rhoai-test-flow/2286 FAIL. Found an issue and fixed it. Retry: rhoai-test-flow/2289 OK
  3. SNO 4.16 on IBM Cloud with 1 NVIDIA GPU: soon

openshift-ci bot commented Jan 10, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bdattoma

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bdattoma bdattoma self-assigned this Jan 10, 2025
@bdattoma bdattoma added the needs testing (Needs to be tested in Jenkins) and enhancements (Bugfixes, enhancements, refactoring, ... in tests or libraries; PR will be listed in release-notes) labels Jan 10, 2025

Robot Results

| ✅ Passed | ❌ Failed | ⏭️ Skipped | Total | Pass % |
|---|---|---|---|---|
| 594 | 0 | 0 | 594 | 100 |
