nrspre for large cases #261

tonyzahtila · 2021-03-29T05:52:41Z

Hi,

I'm interested in using nekRS for quite large cases, about 200,000 spectral elements with a 7th order polynomial.
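For a sense of scale, here is a rough sketch of the problem size. It assumes hexahedral elements with (N + 1)^3 grid points each and 8-byte double-precision values. How many fields nekRS actually allocates is not stated in this thread, so only the cost of one scalar field is shown.

```python
# Assumptions (not from the thread): hexahedral elements with (N + 1)^3
# grid points each, and 8 bytes per value (double precision).
num_elements = 200_000  # "about 200,000 spectral elements"
poly_order = 7          # "7th order polynomial"

points_per_element = (poly_order + 1) ** 3
total_points = num_elements * points_per_element
bytes_per_field = total_points * 8

print(points_per_element)     # 512
print(total_points)           # 102400000
print(bytes_per_field / 1e9)  # 0.8192 (GB for one scalar field)
```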

The issue that I am having is that the case does not seem to fit on a single GPU.

The nodes I am using have 111GB of memory. The best-case scenario for nrspre is thus that I run with 1 CPU and 1 GPU and request 11GB of memory, but my large case creates an overflow issue.

The documentation for our cluster is available here, https://dashboard.hpc.unimelb.edu.au/status_specs/

Any advice appreciated.

stgeke · 2021-03-29T07:57:57Z

Maintainer

What do you mean by "it creates an overflow issue"?
nrspre should work - it just precompiles all the jit code using a (small) dummy mesh.

2 replies

tonyzahtila Mar 29, 2021
Author

All I am changing in my case is the .re2 file.

When I change to a larger case, I get the following error:

Warning: Type mismatch in argument ‘buf’ at (1); passed REAL(8) to REAL(4) [-Wargument-mismatch]
building obj/libnek5000.a ... done
‘turbPipe.usr’ -> ‘turbPipe.f’
/usr/local/easybuild-2019/easybuild/software/compiler/gcc-cuda/8.3.0-10.1.243/openmpi/3.1.4/bin/mpif77  -O2 -cpp -fdefault-real-8 -fdefault-double-8 -std=legacy -O2 -g -DNDEBUG -DUSE_OCCA_MEM_BYTE_ALIGN=64 -mcmodel=medium -fPIC -fcray-pointer -I../../  -DPARRSB -DDPROCMAP -DMPI -DUNDERSCORE -DGLOBAL_LONG_LONG -DTIMER -I/data/gpfs/projects/punim0524/big_pipe/retau_1000_pipe_4p/.cache/nek5000 -I/home/tzahtila/.local/nekrs/nek5000/core -I./ -I /home/tzahtila/.local/nekrs/nek5000/core/experimental  -c /data/gpfs/projects/punim0524/big_pipe/retau_1000_pipe_4p/.cache/nek5000/turbPipe.f -o obj/turbPipe.o
/usr/local/easybuild-2019/easybuild/software/compiler/gcc-cuda/8.3.0-10.1.243/openmpi/3.1.4/bin/mpif77 -c  -O2 -cpp -fdefault-real-8 -fdefault-double-8 -std=legacy -O2 -g -DNDEBUG -DUSE_OCCA_MEM_BYTE_ALIGN=64 -mcmodel=medium -fPIC -fcray-pointer -I../../  -DPARRSB -DDPROCMAP -DMPI -DUNDERSCORE -DGLOBAL_LONG_LONG -DTIMER -I/data/gpfs/projects/punim0524/big_pipe/retau_1000_pipe_4p/.cache/nek5000 -I/home/tzahtila/.local/nekrs/nek5000/core -I./ -I /home/tzahtila/.local/nekrs/nek5000/core/experimental /home/tzahtila/.local/nekrs/nekInterface/nekInterface.f -o obj/nekInterface.o
/usr/local/easybuild-2019/easybuild/software/compiler/gcc-cuda/8.3.0-10.1.243/openmpi/3.1.4/bin/mpif77  -O2 -cpp -fdefault-real-8 -fdefault-double-8 -std=legacy -O2 -g -DNDEBUG -DUSE_OCCA_MEM_BYTE_ALIGN=64 -mcmodel=medium -fPIC -fcray-pointer -I../../  -DPARRSB -DDPROCMAP -DMPI -DUNDERSCORE -DGLOBAL_LONG_LONG -DTIMER -I/data/gpfs/projects/punim0524/big_pipe/retau_1000_pipe_4p/.cache/nek5000 -I/home/tzahtila/.local/nekrs/nek5000/core -I./ -I /home/tzahtila/.local/nekrs/nek5000/core/experimental -shared -o libturbPipe.so obj/turbPipe.o obj/nekInterface.o -Wl,--allow-multiple-definition -Wl,-rpath,. -Lobj -lnek5000 -ldl -Wl,-rpath,/home/tzahtila/.local/nekrs/nek5000/3rd_party/parRSB/lib -L/home/tzahtila/.local/nekrs/nek5000/3rd_party/parRSB/lib -lparRSB -Wl,-rpath,/home/tzahtila/.local/nekrs/nek5000/3rd_party/blasLapack -L/home/tzahtila/.local/nekrs/nek5000/3rd_party/blasLapack -lblasLapack -Wl,-rpath,/home/tzahtila/.local/nekrs/nek5000/3rd_party/gslib/lib -L/home/tzahtila/.local/nekrs/nek5000/3rd_party/gslib/lib -lgs
/home/tzahtila/.local/nekrs/nek5000/core/multimesh.f:798: error: relocation overflow: reference to local symbol 4 in obj/libnek5000.a(multimesh.o)
/home/tzahtila/.local/nekrs/nek5000/core/multimesh.f:799: error: relocation overflow: reference to local symbol 4 in obj/libnek5000.a(multimesh.o)

I would naively say this is similar to the error you get in Nek5000 when you have a large problem and haven't specified enough MPI ranks.

tonyzahtila Mar 30, 2021
Author

Hi stgeke,

I have found my issue: I can run, say, nrspre turbPipe 32, but not nrspre turbPipe 4, so there is a minimum number of MPI ranks for the job. However, I only ever use 1 CPU/1 GPU to actually run nrspre.

Tony

emerzari · 2021-03-30T03:28:18Z

Nrspre should really be run as: nrspre [case name] [target number of MPI ranks]. It pre-compiles nek and the kernels (JIT) for the target application and MPI ranks. In my understanding, what you are seeing may be expected behavior, as the case will not be able to run on 4 cores due to memory overflow in nek. It is actually compiling at the size you need for the target run you are specifying. In other words, if you plan to run this job on 32 GPUs, you should use 32. Nrspre actually only runs on 1 rank, but if you read the log it says “dummy run for ## MPI ranks”.
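To illustrate why the target rank count changes the compiled size, the sketch below compares the per-rank share of the mesh for the two rank counts mentioned in this thread. It assumes an even split of elements across ranks and (N + 1)^3 points per element; the real partitioner may distribute elements slightly differently, and `per_rank_load` is a hypothetical helper, not part of nekRS.

```python
import math


def per_rank_load(num_elements, poly_order, target_ranks):
    """Return (elements, grid points) on the most loaded rank.

    Assumes an even split of elements across ranks (an idealisation).
    """
    elements = math.ceil(num_elements / target_ranks)
    points = elements * (poly_order + 1) ** 3
    return elements, points


for ranks in (4, 32):
    elements, points = per_rank_load(200_000, 7, ranks)
    print(f"{ranks:>2} ranks: {elements} elements, {points} points per rank")
```

With 4 target ranks each rank would hold 50,000 elements (25.6 million points), against 6,250 elements (3.2 million points) with 32, so the per-rank arrays sized for 4 ranks are 8 times larger.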

stgeke · 2021-03-30T06:20:46Z

Maintainer

This is expected. What you specify when running nrspre is the number of target MPI tasks that you are going to use later for your simulation. This number has to be large enough; otherwise there is not enough memory on the host or GPU.
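One way to pick a starting value for the target rank count is to work backwards from a memory budget per rank. This is only a sketch: the number of fields and the usable memory per GPU below are placeholder values, not numbers from nekRS or from this thread, and `min_target_ranks` is a hypothetical helper. The actual minimum has to be found by trying the case, as above.

```python
import math


def min_target_ranks(num_elements, poly_order, fields, budget_bytes):
    """Smallest rank count whose per-rank field storage fits the budget.

    Counts only `fields` double-precision arrays of (N + 1)^3 values per
    element and ignores all other allocations (a deliberate simplification).
    """
    bytes_per_element = (poly_order + 1) ** 3 * 8 * fields
    max_elements_per_rank = budget_bytes // bytes_per_element
    if max_elements_per_rank == 0:
        raise ValueError("budget is too small for even one element")
    return math.ceil(num_elements / max_elements_per_rank)


# Placeholder inputs: 100 fields and 16 GB usable per GPU are assumptions.
ranks = min_target_ranks(200_000, 7, fields=100, budget_bytes=16 * 10**9)
print(ranks)
```

The result is a lower bound under these assumptions only. A count that passes this estimate can still fail in practice, so it is best treated as a first guess for nrspre.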