
enable hwloc and HeFFTe support in GROMACS easyblock #3531

Open · wants to merge 6 commits into develop
Conversation

@bedroge (Contributor) commented Dec 13, 2024

In EESSI we noticed that GROMACS builds currently show the following with `gmx -version`:

Multi-GPU FFT:       none
Hwloc support:       disabled

Hwloc is part of the foss toolchain and can be easily enabled.

For Multi-GPU FFT support, either cuFFTMp (https://manual.gromacs.org/documentation/current/install-guide/index.html#using-cufftmp) or HeFFTe (https://manual.gromacs.org/documentation/current/install-guide/index.html#using-heffte) is required. I was trying to add support for both, but cuFFTMp is part of NVHPC, and simply adding that as a dependency makes GROMACS pick up other things from that installation (e.g. OpenMP libraries). Since cuFFTMp also imposes some additional requirements (see https://docs.nvidia.com/hpc-sdk/cufftmp/usage/requirements.html), I've only added HeFFTe support for now. I've also just opened an easyconfigs PR for HeFFTe with CUDA support: easybuilders/easybuild-easyconfigs#22024. Once that's merged, I'll open another PR to add HeFFTe as a dependency to the CUDA versions of GROMACS.
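Concretely, the changes boil down to passing two extra CMake flags from the easyblock. A minimal sketch (not the actual easyblock code; `gromacs_extra_configopts` and the `deps` mapping are hypothetical stand-ins for EasyBuild's `get_software_root()` machinery):

```python
def gromacs_extra_configopts(deps):
    """Build extra CMake flags for hwloc and HeFFTe support.

    `deps` maps dependency names to their install prefixes;
    it stands in for EasyBuild's get_software_root() lookups.
    """
    opts = []
    if 'hwloc' in deps:
        # hwloc ships with the foss toolchain, so this can be enabled by default
        opts.append('-DGMX_HWLOC=ON')
    if 'HeFFTe' in deps:
        # HeFFTe provides the multi-GPU (PME-decomposed) FFT backend
        opts.append('-DGMX_USE_HEFFTE=ON')
        opts.append('-DHeffte_ROOT=%s' % deps['HeFFTe'])
    return ' '.join(opts)
```

The flag names (`GMX_HWLOC`, `GMX_USE_HEFFTE`, `Heffte_ROOT`) follow the GROMACS install guide linked above.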

@ocaisa (Member) left a comment

This looks good, matches what I can find in the docs. My only concern is that there is no version checking for the new options. I think the hwloc one has been there since 2016, but the HeFFTe one is more recent (I can't quite figure it out, but I think it is 2023, see https://gitlab.com/gromacs/gromacs/-/issues/4090). For our own use case we could set the check to a more recent version than the one in which the option first appeared.

EDIT: Indeed, HeFFTe seems to first appear in 2023.1: https://manual.gromacs.org/2023.1/install-guide/index.html

EDIT: The option for hwloc is first documented in 2016.4: https://manual.gromacs.org/2016.4/install-guide/index.html

@bedroge (Contributor, Author) commented Dec 16, 2024

> This looks good, matches what I can find in the docs. My only concern is that there is no version checking for the new options. I think the hwloc one has been there since 2016, but the HeFFTe one is more recent (I can't quite figure it out, but I think it is 2023, see https://gitlab.com/gromacs/gromacs/-/issues/4090). For our own use case we could set the check to a more recent version than the one in which the option first appeared.
>
> EDIT: Indeed, HeFFTe seems to first appear in 2023.1: https://manual.gromacs.org/2023.1/install-guide/index.html
>
> EDIT: The option for hwloc is first documented in 2016.4: https://manual.gromacs.org/2016.4/install-guide/index.html

Thanks! I've added the version checks now; I hadn't seen your edits yet. But I also see hwloc being mentioned in the 2016.1 docs, and it's in the code as well:
https://gitlab.com/gromacs/gromacs/-/blob/v2016.1/CMakeLists.txt?ref_type=tags#L506

HeFFTe is being mentioned in the 2023 docs (https://manual.gromacs.org/current/release-notes/2023/major/performance.html#pme-decomposition-support-with-cuda-and-sycl-backends), and also in the CMake file for 2023 (https://gitlab.com/gromacs/gromacs/-/blob/v2023/CMakeLists.txt?ref_type=tags#L741) and 2023.1 (https://gitlab.com/gromacs/gromacs/-/blob/v2023.1/CMakeLists.txt?ref_type=tags#L749).
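For context, the gating amounts to a plain version comparison against these cut-offs. A sketch only (the easyblock itself uses EasyBuild's `LooseVersion`; `parse_version` and `supports_option` are illustrative names):

```python
def parse_version(ver):
    """Turn a GROMACS version string like '2023.1' into a comparable tuple."""
    return tuple(int(part) for part in ver.split('.'))

# first GROMACS version in which each CMake option appears
MIN_VERSION = {
    'GMX_HWLOC': '2016.1',     # present in the v2016.1 CMakeLists.txt
    'GMX_USE_HEFFTE': '2023',  # PME decomposition landed in GROMACS 2023
}

def supports_option(gromacs_version, option):
    """Return True if this GROMACS version supports the given CMake option."""
    return parse_version(gromacs_version) >= parse_version(MIN_VERSION[option])
```

This works because GROMACS version strings are year-based and compare cleanly as integer tuples.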

@bedroge (Contributor, Author) commented Dec 16, 2024

One thing I was a little worried about is that the HeFFTe installation requires a GPU (for the tests), so simply installing GROMACS and its dependencies will also require a GPU if we enable this by default. Should we make it optional in some way (e.g. by commenting out the HeFFTe dependency or disabling its tests)? For EESSI it would already cause an issue right now, as we build on nodes without GPUs.
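One way to make it opt-in would be to ship the dependency commented out in the CUDA easyconfigs, e.g. (illustrative fragment only; the version numbers are placeholders):

```python
# HeFFTe's test suite needs a GPU at install time, so keep the
# dependency commented out by default and uncomment on GPU nodes:
#dependencies += [('HeFFTe', '2.4.0', versionsuffix)]
```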

@al42and commented Jan 3, 2025

Hi!

Don't want to derail the discussion here, but, while I don't have any recent numbers, the situation has not changed much from what NVIDIA reports in their blog:

> We find cuFFTMp to be up to 2x faster [than HeFFTe]

HeFFTe has the benefit of supporting AMD and Intel GPUs, but it's not the best choice for CUDA installations. cuFFTMp has its own share of issues, as @bedroge outlined in the PR description, but I think the performance difference is relevant for evaluating which effort is more worthwhile.

Regarding versioning, I can confirm that HeFFTe (and cuFFTMp) were added in 2023, and hwloc was added in 2016.

@bedroge (Contributor, Author) commented Jan 3, 2025

> Hi!
>
> Don't want to derail the discussion here, but, while I don't have any recent numbers, the situation has not changed much from what NVIDIA reports in their blog:
>
> > We find cuFFTMp to be up to 2x faster [than HeFFTe]
>
> HeFFTe has the benefit of supporting AMD and Intel GPUs, but it's not the best choice for CUDA installations. cuFFTMp has its own share of issues, as @bedroge outlined in the PR description, but I think the performance difference is relevant for evaluating which effort is more worthwhile.
>
> Regarding versioning, I can confirm that HeFFTe (and cuFFTMp) were added in 2023, and hwloc was added in 2016.

Thanks for your input, it's definitely a fair point. I initially added only HeFFTe support in this PR, as it seemed like the more logical default option (e.g. no additional hardware requirements like with cuFFTMp), and a first attempt at adding cuFFTMp support failed miserably 😅 But I can have another look at it; ultimately it would be nice if the easyblock supported both, so that people can choose between the two.
