Enable ufs-weather-model on Gaea-C6 #2407

Closed
RatkoVasic-NOAA opened this issue Aug 23, 2024 · 29 comments · Fixed by #2448

@RatkoVasic-NOAA
Collaborator

Description

Gaea C6 nodes are now available. We should enable ufs-weather-model on them.

Solution

Enable ufs-weather-model for Gaea-C6

Alternatives

None

Related to

Depends on spack-stack installation, which is ready on C6.

@GeorgeVandenberghe-NOAA
Collaborator

I have built and run it with my own software stack on Gaea C6. I have since been kicked off and can no longer run there by policy, but this is a data point.

@jkbk2004
Collaborator

@BrianCurtis-NOAA Sounds like a version is available at /ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0

@BrianCurtis-NOAA
Collaborator

> @BrianCurtis-NOAA Sounds like a version is available at /ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0

Thanks! I'll add that path. Do you know if there will be modulefiles associated with that C6 stack?

@RatkoVasic-NOAA
Collaborator Author

@BrianCurtis-NOAA the only current problem is that we don't have disk space on the C6 machine (f6 filesystem) for staged data (regression tests...). So, yes, you can test the /ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0 installation, but I don't know where to stage the RT data.

@RatkoVasic-NOAA
Collaborator Author

RatkoVasic-NOAA commented Sep 12, 2024

spack-stack 1.6.0 is ready on Gaea-C6:

/ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/unified-env/install/modulefiles/Core
/ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/gsi-addon/install/modulefiles/Core
/ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/upp-addon-env/install/modulefiles/Core
/ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/fms-2024.01/install/modulefiles/Core

Note the changes compared to Gaea-C5:

C5:
stack-intel/2023.1.0
stack-cray-mpich/8.1.25
C6:
stack-intel/2023.2.0
stack-cray-mpich/8.1.29
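
For anyone trying the stack by hand, a minimal sketch of pointing the module system at one of the environments above and loading the C6 compiler and MPI meta-modules. The path and versions are copied from this comment; the choice of `unified-env` and the guard around the `module` command are illustrative assumptions, not a tested recipe.

```shell
#!/usr/bin/env bash
# Path components copied from the list above; unified-env is picked only as an example.
STACK_ROOT=/ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0
ENV_NAME=unified-env
MODULE_DIR="$STACK_ROOT/envs/$ENV_NAME/install/modulefiles/Core"

if command -v module >/dev/null 2>&1; then
    # Make the environment's modulefiles visible, then load the C6 versions.
    module use "$MODULE_DIR"
    module load stack-intel/2023.2.0
    module load stack-cray-mpich/8.1.29
    module list
else
    # Outside a module-enabled shell, just report what would be used.
    echo "module command not found; would use $MODULE_DIR"
fi
```

Swapping `ENV_NAME` for `gsi-addon`, `upp-addon-env` or `fms-2024.01` selects one of the other environments listed above.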

@AnnaSmoot-NOAA

I'm trying to build SRW app on Gaea C6. I've made the changes to my build_gaea_intel.lua file as shown in the previous post for spack, intel, and cray-mpich, but I still get an error for intel. I see there is also:

/ncrc/proj/epic/spack-stack/latest-ue-intel/stack-intel/2023.1.0.lua

Should I be using this path? Or do you know of some other way to fix stack-intel/2023.2.0 being unknown?

I also have some other modules that are unknown:
"jasper/2.0.32" "netcdf-c/4.9.2" "wgrib2/2.0.8" "libpng/1.6.37" "cmake/3.23.1"

Does anyone have recommendations for these?

@RatkoVasic-NOAA
Collaborator Author

@AnnaSmoot-NOAA can you please share your modulefile directory with me, and tell me which modulefile you are using? I can take a look.

@AnnaSmoot-NOAA

AnnaSmoot-NOAA commented Sep 16, 2024

@RatkoVasic-NOAA Thank you for your help! I'm using this .lua at this path:

/gpfs/f6/drsa-fire2/proj-shared/Anna.Smoot/srw_app_develop/ufs-srweather-app/modulefiles/build_gaea_intel.lua

@RatkoVasic-NOAA
Collaborator Author

I have no access to /gpfs/f6/drsa-fire2/proj-shared/
Could you please copy your modulefile/ directory to /gpfs/f6/drsa-fire2/world-shared/ instead?
Thanks.

@AnnaSmoot-NOAA

@RatkoVasic-NOAA That makes sense. It's here now:

/gpfs/f6/drsa-fire2/world-shared/Anna.Smoot/build_gaea_intel.lua

@RatkoVasic-NOAA
Collaborator Author

@AnnaSmoot-NOAA I need the whole modulefile/ directory (or at least the srw_common.lua file).

@RatkoVasic-NOAA
Collaborator Author

@AnnaSmoot-NOAA, there are two changes you need to make in build_gaea_intel.lua:

< stack_intel_ver=os.getenv("stack_intel_ver") or "2023.2.0"
---
> stack_intel_ver=os.getenv("stack_intel_ver") or "2023.1.0"
12c12
< stack_mpich_ver=os.getenv("stack_mpich_ver") or "8.1.29"
---
> stack_mpich_ver=os.getenv("stack_mpich_ver") or "8.1.25"

Try it now, and if it doesn't work, send me your srw_common.lua file.
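
As a side note on the `os.getenv("stack_intel_ver") or "..."` pattern in those lines: the modulefile reads the version from the environment and only falls back to the hard-coded string when the variable is unset, so a version can in principle be tried by exporting the variable before loading the modulefile, without editing the file. A rough shell analogue of that fallback, as a sketch only; the version numbers are the two from this thread and are used purely for illustration.

```shell
#!/usr/bin/env bash
# Shell analogue of the Lua idiom:  os.getenv("stack_intel_ver") or "2023.1.0"
# ${var-default} (without a colon) falls back only when the variable is unset,
# which matches Lua, where an empty string still counts as a value.

# Case 1: variable unset, so the hard-coded default is used.
unset stack_intel_ver
intel_default="${stack_intel_ver-2023.1.0}"

# Case 2: variable exported, so the environment value wins over the default.
export stack_intel_ver=2023.2.0
intel_override="${stack_intel_ver-2023.1.0}"

echo "default=$intel_default override=$intel_override"
```

The same applies to `stack_mpich_ver`. Whether overriding from the environment is preferable to editing the file depends on how the build scripts set up their environment, which this sketch does not cover.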

@AnnaSmoot-NOAA

@RatkoVasic-NOAA, I updated the build_gaea_intel.lua file and the errors changed a little:
The following module(s) are unknown: "PrgEnv-intel/8.3.3" "intel-classic/2023.1.0" "craype/2.7.20"

The updated build_gaea_intel.lua and srw_common.lua files are at: /gpfs/f6/drsa-fire2/world-shared/Anna.Smoot/
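
One way to narrow down "unknown module" errors like these before a full build is to ask the module system about each name individually. A minimal sketch, assuming the site's `module` command supports the `is-avail` sub-command (Lmod documents one, but that it behaves this way on C6 is an assumption); `report_unknown_modules` is a hypothetical helper name, not part of any existing script.

```shell
#!/usr/bin/env bash
# Print every module name from the argument list that cannot be loaded
# with the module path that is currently set up.
# Assumes `module is-avail NAME` returns a non-zero status for unknown modules.
report_unknown_modules() {
    local m
    for m in "$@"; do
        if ! module is-avail "$m" 2>/dev/null; then
            echo "unknown: $m"
        fi
    done
}
```

It could be called after the relevant `module use` lines, for example `report_unknown_modules "PrgEnv-intel/8.3.3" "intel-classic/2023.1.0" "craype/2.7.20"`, to see which names the current module path cannot resolve.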

@RatkoVasic-NOAA
Collaborator Author

@AnnaSmoot-NOAA
you can take the corrected build_gaea_intel.lua from here:
/ncrc/home2/Ratko.Vasic/Anna/modulefiles/build_gaea_intel.lua
That one uses the old unified-env environment (should work).

If you want to use the new environment (upp-addon-env), then take both files:
/ncrc/home2/Ratko.Vasic/Anna/test/modulefiles/build_gaea_intel.lua
/ncrc/home2/Ratko.Vasic/Anna/test/modulefiles/srw_common.lua

@AnnaSmoot-NOAA

@RatkoVasic-NOAA Thanks so much! I used the new environment and successfully built the SRW app.

@junwang-noaa
Collaborator

@jkbk2004 May I ask if you have a timeline for when the UFS weather model will be ported to C6?

@jkbk2004
Collaborator

> @jkbk2004 May I ask if you have a timeline for when the UFS weather model will be ported to C6?

@junwang-noaa The internal C6 on-boarding session continues toward the end of September. I think it's reasonable to set a target migration date around the second week of October.

@GeorgeVandenberghe-NOAA
Collaborator

@RatkoVasic-NOAA
Collaborator Author

@junwang-noaa, what is missing is common disk space on C6 (for ICs and baseline RT data). We don't yet have space on the /f6 file system (like we had on /f5). @ulmononian is working with the admins to provide us with that space. As soon as we get it, we can enable ufs-WM on that machine. The libraries are already up to date there.

@sanAkel

sanAkel commented Oct 4, 2024

Hi @RatkoVasic-NOAA is there any update on ⬆️?

@sanAkel

sanAkel commented Oct 4, 2024

Can anyone tell me whether ⬇️ can be ignored?

No supported cpu target is set, CRAY_CPU_TARGET=x86-64 will be used.
Load a valid targeting module or set CRAY_CPU_TARGET

I used these modules:

> you can take the corrected build_gaea_intel.lua from here:
> /ncrc/home2/Ratko.Vasic/Anna/modulefiles/build_gaea_intel.lua

and built:

export CMAKE_FLAGS="-DAPP=NG-GODAS"
./build.sh

My log file is at: /ncrc/home1/Santha.Akella/build.log

Thanks for any suggestions.

@RatkoVasic-NOAA
Collaborator Author

> Hi @RatkoVasic-NOAA is there any update on ⬆️?

We still have no access to C6 storage and we don't have CPU time. Once we get that, we'll start porting to this new machine.

@jkbk2004
Collaborator

jkbk2004 commented Oct 4, 2024

@sanAkel We can follow up in #2448.

@RatkoVasic-NOAA
Collaborator Author

@sanAkel

Gaea6>ll -d /ncrc/home1/Santha.Akella/
drwx------ 5 Santha.Akella ncep 4096 Oct 04 2024 00:41:45 /ncrc/home1/Santha.Akella/

@sanAkel

sanAkel commented Oct 4, 2024

> @sanAkel
>
> Gaea6>ll -d /ncrc/home1/Santha.Akella/
> drwx------ 5 Santha.Akella ncep 4096 Oct 04 2024 00:41:45 /ncrc/home1/Santha.Akella/

Sorry about that! Please try again. 🙏

Santha.Akella@gaea61:/ncrc/home1> pwd
/ncrc/home1
Santha.Akella@gaea61:/ncrc/home1>
Santha.Akella@gaea61:/ncrc/home1> ls -al | grep 'Santha'
drwxr-xr-x 5 Santha.Akella ncep 4096 Oct 3 20:41 Santha.Akella
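
For reference, the change between the two listings (`drwx------` to `drwxr-xr-x`) corresponds to granting read and execute permission on the directory to group and others. A minimal sketch on a throwaway directory rather than a real home directory; it assumes GNU `stat` (for `-c '%a'`), which is typical on Linux systems.

```shell
#!/usr/bin/env bash
# Work on a temporary directory so nothing real is modified.
demo_dir=$(mktemp -d)

chmod 700 "$demo_dir"               # drwx------ : only the owner can enter or list
before=$(stat -c '%a' "$demo_dir")

chmod go+rx "$demo_dir"             # drwxr-xr-x : group and others can enter and list
after=$(stat -c '%a' "$demo_dir")

echo "before=$before after=$after"
rmdir "$demo_dir"
```

Opening up the directory alone is not always enough: the file being shared (here, the build log) also has to be readable by the other user.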

@GeorgeVandenberghe-NOAA
Collaborator

GeorgeVandenberghe-NOAA commented Oct 4, 2024 via email

@sanAkel

sanAkel commented Oct 4, 2024

> Are there any RDHPCS C6 projects that require ufs-weather-model, or its supporting applications, to run?
>
> If so, are these funded and directed projects stalled indefinitely until and unless disk space for the port and regression tests is provided?

Yes to both!

@GeorgeVandenberghe-NOAA
Collaborator

GeorgeVandenberghe-NOAA commented Oct 4, 2024 via email

@RatkoVasic-NOAA
Collaborator Author

#2448 ready for review.
