Running on multiple nodes with GPUs #568

makrandak · 2024-06-22T00:33:26Z

Hello,

I am trying to run a multinode interactive job with nekrs. I am on the latest commit of the next branch.

In consultation with local HPC support, I am using prun instead of mpirun, with number of nodes = number of GPUs = number of MPI tasks, to run GPU-aware MPI tasks. prun works out the correct srun command for the setup from the number of GPUs, MPI tasks, and nodes.

The modules I am loading are

Currently Loaded Modules:
  1) gnu12/12.3.0   2) mpi/2021.12   3) impi/2021.12   4) cuda/12.4   5) cmake/3.24.2   6) prun/2.2   7) ucx/1.15.0

When I run the case as prun <executable> --setup hit.par, I get the following initial output:

[prun] Master compute host = yellowstone-gpu-1-1
[prun] Resource manager = slurm
[prun] Launch cmd = srun --mpi=pmi2 /home/mani/khanwale/.local/nekrs_next/bin/nekrs --setup hit.par (family=impi)
reading hit.par

A lot of setup happens, and then the code crashes at:

generating mesh ...
loading mesh from nek ... Nelements: 8000, NboundaryIDs: 0, NboundaryFaces: 0 done (0.000663064s)
polynomial order N: 7, over-integration order cubN: 9
meshParallelGatherScatterSetup N=7
autotuning gs for wordSize=8 nFields=1
local: 1.4393e-04s (188.9GB/s)
pack/unpack host + hostBuffer MPI using pw: 4.8415e-04s
pack/unpack device + hostBuffer MPI using pw: 4.2688e-04s
pack/unpack device + hostBuffer MPI using nbc: 4.1799e-04s
pack/unpack device + deviceBuffer MPI using pw:srun: error: yellowstone-gpu-1-1: task 0: Segmentation fault (core dumped)
srun: error: yellowstone-gpu-1-2: task 1: Segmentation fault (core dumped)

I am not sure whether I am missing some environment variables. The same setup works on 1 node (1 GPU) with the same command.

I would really appreciate any insight or help. If there is more insight into the nature of this error, and if it is a setup problem, it will also help me communicate with local HPC support.

Thank you very much for your help.

With sincere regards,
Makrand

yslan · 2024-06-22T04:25:42Z

Based on where it crashes, it seems to fail during the gather-scatter (gs) setup, while NekRS is testing GPU-aware MPI.

You may need to ask your HPC support which MPI installation supports GPU-aware MPI and how to enable it.
For example, MPICH might need MPICH_GPU_SUPPORT_ENABLED=1

Given that you have only a single GPU per node, the NIC may be connected via the CPU, in which case GPU-aware MPI won't be needed.
You can turn it off in NekRS with export NEKRS_GPU_MPI=0
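
As a minimal sketch of the workaround above: export the variable in the same shell that launches the job and check that it is really in the environment before starting. This assumes the launcher passes the caller's environment on to the MPI ranks, which is the usual Slurm default but worth confirming with your site. The launch line is the one from the original post and is left commented out because the executable path is site-specific.

```shell
# Workaround from the reply above: make NekRS skip GPU-aware MPI.
export NEKRS_GPU_MPI=0

# Confirm the variable is exported, so processes started from this
# shell inherit it (assumption: the launcher forwards the environment).
env | grep '^NEKRS_GPU_MPI='

# Launch line from the original post; uncomment and fill in the executable.
# prun <executable> --setup hit.par
```

If the grep prints nothing, the variable was set without export and the ranks will not see it.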

4 replies

makrandak Jun 22, 2024
Author

export NEKRS_GPU_MPI=0 worked. Thanks for your help mate.

With sincere regards,
Makrand

stgeke Jun 22, 2024
Maintainer

Note that disabling GPU-aware MPI might degrade performance.

makrandak Jun 22, 2024
Author

I learnt that the correct setting for Intel MPI is export I_MPI_OFFLOAD=1. With that set, there is no need to disable GPU-aware MPI through NEKRS_GPU_MPI=0.

Thanks for the warning and help.

With sincere regards,
Makrand

yslan Jun 23, 2024

As a reference for others, I also recommend checking the MPI documentation on GPU-aware MPI. For example, OpenMPI has this and this, and from those one can work out that on an NVIDIA DGX the following might be needed:

# Suppress fork warning
# https://www.open-mpi.org/faq/?category=openfabrics#ofa-fork
export OMPI_MCA_mpi_warn_on_fork=0

# Use CUDA buffer in PSM2 Omni-Path networking library
export PSM2_CUDA=1

# GPU direct
export PSM2_GPUDIRECT=1
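
To collect the settings mentioned in this thread in one place, here is a rough sketch of a helper that exports them by MPI family. The function name set_gpu_mpi_env and the family labels (impi, mpich, openmpi-psm2) are made up for illustration and are not part of NekRS or prun. Which variables a given cluster actually needs is an assumption to check against the MPI documentation and local HPC support.

```shell
#!/bin/sh
# Hypothetical helper (not part of NekRS or prun): export the GPU-aware MPI
# variables named in this thread for a given MPI family label.
set_gpu_mpi_env() {
  case "$1" in
    impi)
      # Intel MPI: setting reported by the original poster
      export I_MPI_OFFLOAD=1
      ;;
    mpich)
      # MPICH: setting suggested as a possibility in the answer
      export MPICH_GPU_SUPPORT_ENABLED=1
      ;;
    openmpi-psm2)
      # OpenMPI over PSM2 (the NVIDIA DGX example above)
      export OMPI_MCA_mpi_warn_on_fork=0
      export PSM2_CUDA=1
      export PSM2_GPUDIRECT=1
      ;;
    *)
      # Unknown MPI: fall back to disabling GPU-aware MPI in NekRS,
      # which may cost performance
      export NEKRS_GPU_MPI=0
      ;;
  esac
}

# Example: the Intel MPI case from this thread
set_gpu_mpi_env impi
echo "I_MPI_OFFLOAD=${I_MPI_OFFLOAD}"
```

The fallback branch mirrors the advice in the thread: disabling GPU-aware MPI gets the job running, but the MPI-specific setting is preferable where it exists.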
