[develop] [bug fix] Adding missing Intel variable for PW Azure #1167

Merged


EdwardSnyder-NOAA
Collaborator

@EdwardSnyder-NOAA EdwardSnyder-NOAA commented Dec 10, 2024

DESCRIPTION OF CHANGES:

The SRW App still fails on the PW Azure instance. It appears that the compute node needs to be in the same zone as the controller node. To achieve this, the compute node instance type needs to change, and that change currently fails because of a missing Intel variable. This PR adds the missing Intel variable for runs on Azure.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • derecho.intel
  • gaea.intel
  • hera.gnu
  • hera.intel
  • hercules.intel
  • jet.intel
  • orion.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Azure; log file for this PR passing on the nightly Jenkins build
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

DEPENDENCIES:

DOCUMENTATION:

ISSUE:

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

LABELS (optional):

A Code Manager needs to add the following labels to this PR:

  • Work In Progress
  • bug
  • enhancement
  • documentation
  • release
  • high priority
  • run_ci
  • run_we2e_fundamental_tests
  • run_we2e_comprehensive_tests
  • Needs Cheyenne test
  • Needs Jet test
  • Needs Hera test
  • Needs Orion test
  • help wanted

CONTRIBUTORS (optional):

@MichaelLueken MichaelLueken added the bug (Something isn't working) label Dec 10, 2024
@EdwardSnyder-NOAA EdwardSnyder-NOAA marked this pull request as ready for review December 10, 2024 23:36
@MichaelLueken MichaelLueken changed the title [bug fix] Adding missing Intel variable for PW Azure [develop] [bug fix] Adding missing Intel variable for PW Azure Dec 11, 2024
Collaborator

@MichaelLueken MichaelLueken left a comment

@EdwardSnyder-NOAA -

These changes look good to me! The nightly build shows that both the Functional WorkflowTaskTests and Test phases passed.

I ran the fundamental tests on Hera and they all passed:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2  COMPLETE              18.91
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2_20241  COMPLETE              11.41
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot  COMPLETE              30.72
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024121  COMPLETE              56.96
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0_20241211145  COMPLETE              32.22
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024121114570  COMPLETE              57.89
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE             208.11

The Functional WorkflowTaskTests script was also run on Hera and passed:

# Try hera with the first few simple SRW tasks ...
run_make_grid: COMPLETE
run_get_ics: COMPLETE
run_get_lbcs: COMPLETE
run_make_orog: COMPLETE
run_make_sfc_climo: COMPLETE
run_make_ics: COMPLETE
run_make_lbcs: COMPLETE
run_fcst: COMPLETE
run_post: COMPLETE

Approving this PR now.

Collaborator

@rickgrubin-noaa rickgrubin-noaa left a comment

LGTM, approved.

@MichaelLueken
Collaborator

@EdwardSnyder-NOAA -

There are currently issues with the documentation that have nothing to do with the changes you have made. One is a site redirect, and I have opened PR #2 in your fork to address it. The other is that the NCEP site is currently down. I'll wait until the issue with the NCEP site has been addressed, then I will merge this work.

[pw-azure-bug-fix] Add redirect for https://mrms.ncep.noaa.gov in doc/conf.py
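
For readers unfamiliar with this kind of fix, here is a minimal sketch of what a redirect entry in a Sphinx `doc/conf.py` could look like. It assumes the redirect is handled through Sphinx's `linkcheck_allowed_redirects` setting; the thread does not show the actual change, so the target pattern is a placeholder and `redirect_is_allowed` is a hypothetical helper that only approximates how such an entry is matched.

```python
import re

# Assumption: excerpt in the style of a Sphinx doc/conf.py.
# linkcheck_allowed_redirects is a real Sphinx option (Sphinx >= 4.1) that maps
# a source-URL regex to a regex the redirected URL must match.
# The target pattern below is a PLACEHOLDER; the real target is not given
# in this thread.
linkcheck_allowed_redirects = {
    r"https://mrms\.ncep\.noaa\.gov/?": r"https://.*\.noaa\.gov/.*",
}


def redirect_is_allowed(source_url: str, final_url: str) -> bool:
    """Hypothetical helper: True if some entry matches both the source and final URL."""
    return any(
        bool(re.match(src, source_url)) and bool(re.match(dst, final_url))
        for src, dst in linkcheck_allowed_redirects.items()
    )
```

With an entry like this, a link check would treat a redirect from the MRMS URL to a matching target as working rather than reporting it as a redirect. An unreachable site is a separate failure that no redirect entry can fix.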
@MichaelLueken
Collaborator

After discussing with @gspetro-NOAA, I will merge this work to develop now and check in the morning whether https://www.ncep.noaa.gov loads properly.

This will lead to a red X on develop, but again, it should clear up by the morning (hopefully).

@MichaelLueken MichaelLueken merged commit 1ee3c94 into ufs-community:develop Dec 11, 2024
2 of 3 checks passed
Labels
bug Something isn't working

4 participants