nrspre for large cases #261
tonyzahtila started this conversation in General
Replies: 3 comments 2 replies
-
What do you mean by "it creates an overflow issue"?
-
nrspre should really be run as:
nrspre [case name] [target number of MPI ranks]
It pre-compiles nek and the kernels (JIT) for the target application and the target number of MPI ranks. As I understand it, what you are seeing may be expected behavior: the case cannot run on 4 MPI ranks because of a memory overflow in nek. nrspre compiles for the problem size of the target run you specify.
In other words, if you plan to run this job on 32 GPUs, you should use 32.
nrspre itself runs on only 1 rank, but if you read the log it says “dummy run for ## MPI ranks”.
On Mar 29, 2021, at 10:35 PM, tonyzahtila wrote:
Hi stgeke,
I have found what my issue is: I can run, say, nrspre turbPipe 32, but not nrspre turbPipe 4, so there is a minimum number of MPI ranks for the job. However, I only ever use 1 CPU/1 GPU to actually run nrspre.
Tony
-
This is expected. What you specify when running nrspre is the number of target MPI tasks that you are going to use later for your simulation. This number has to be large enough; otherwise there is not enough memory on the host or GPU.
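To make the scaling concrete, here is a rough sketch of how the per-rank problem size shrinks as the target rank count grows, using the numbers from the original post (200,000 elements, 7th-order polynomial). It assumes that order 7 means 8 points per direction, so (7+1)^3 = 512 points per element, and that elements are split evenly across ranks. The helper name `points_per_rank` is made up for illustration and is not part of nekRS. The actual memory footprint depends on how many fields and operators the solver stores per point, which this thread does not state, so only the size of a single double-precision field is shown.

```python
def points_per_rank(num_elements: int, poly_order: int, num_ranks: int) -> int:
    """Grid points held by the most loaded rank, assuming an even element split."""
    # Assumption: order N means (N + 1) points per direction in 3D.
    points_per_element = (poly_order + 1) ** 3
    # Ceiling division: the busiest rank gets the rounded-up share of elements.
    elements_per_rank = -(-num_elements // num_ranks)
    return elements_per_rank * points_per_element


NUM_ELEMENTS = 200_000   # from the original post
POLY_ORDER = 7           # from the original post
BYTES_PER_DOUBLE = 8

if __name__ == "__main__":
    for ranks in (1, 4, 32):
        pts = points_per_rank(NUM_ELEMENTS, POLY_ORDER, ranks)
        gib = pts * BYTES_PER_DOUBLE / 2**30
        print(f"{ranks:>3} ranks: {pts:,} points per rank, "
              f"{gib:.3f} GiB per double-precision field")
```

Multiplying the per-field figure by the (unknown here) number of stored fields gives a first guess at whether a given target rank count can fit in host or GPU memory.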
-
Hi,
I'm interested in using nekRS for quite large cases: about 200,000 spectral elements with a 7th-order polynomial.
The issue that I am having is that the case does not seem to fit on a single GPU.
The nodes I am using have 111GB of memory, so the best-case scenario for nrspre is that I run with 1 CPU and 1 GPU and request 11GB of memory, but my large case creates an overflow issue.
The documentation for our cluster is available here: https://dashboard.hpc.unimelb.edu.au/status_specs/
Any advice appreciated.