compile error on "ucs_memory_type_t" #13017

Closed
ahaichen opened this issue Jan 4, 2025 · 8 comments · May be fixed by #13028

Comments

@ahaichen

ahaichen commented Jan 4, 2025

Please submit all the information below so that we can understand the working environment that is the context for your question.

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

v4.1.7

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

from a source

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

download from official website

Please describe the system on which you are running

Linux version 3.10.0-1160.el7.x86_64
Intel(R) Xeon(R) Gold 5122 CPU


Details of the problem

When compiling with gcc 12, configured as follows:

../configure CC=gcc CXX=g++ --prefix=...

I get the following errors:

../../../../../oshmem/mca/sshmem/ucx/sshmem_ucx_module.c:99:41: error: unknown type name 'ucs_memory_type_t'
99 | unsigned flags, ucs_memory_type_t mem_type, int err_level)
| ^~~~~~~~~~~~~~~~~
../../../../../oshmem/mca/sshmem/ucx/sshmem_ucx_module.c: In function 'segment_create':
../../../../../oshmem/mca/sshmem/ucx/sshmem_ucx_module.c:205:12: warning: implicit declaration of function 'segment_create_internal' [-Wimplicit-function-declaration]
205 | return segment_create_internal(ds_buf, mca_sshmem_base_start_address, size,
| ^~~~~~~~~~~~~~~~~~~~~~~
../../../../../oshmem/mca/sshmem/ucx/sshmem_ucx_module.c:206:43: error: 'UCS_MEMORY_TYPE_HOST' undeclared (first use in this function)
206 | flags, UCS_MEMORY_TYPE_HOST, 0);

Thanks in advance ~

@bosilca
Member

bosilca commented Jan 4, 2025

UCS_MEMORY_TYPE_HOST has been around since August 2019. This means that either a very old version of UCX was found, or we are missing an include header in the shmem part. The second option seems unlikely, since we build shmem very regularly.

What version of UCX did configure find?
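
One way to answer that question is to ask the compiler directly which UCX headers it sees. The following is only a minimal sketch: it assumes a compiler that supports `__has_include` (recent gcc and clang do), and the helper name `format_ucx_header_version` and the fallback macro `UCX_HEADERS_FOUND` are made up for illustration. Only `UCP_API_MAJOR` and `UCP_API_MINOR` from `ucp/api/ucp_version.h` come from UCX itself.

```c
#include <assert.h>
#include <stdio.h>

/* Pull in the UCX version header only if the compiler can find it. */
#if defined(__has_include)
#  if __has_include(<ucp/api/ucp_version.h>)
#    include <ucp/api/ucp_version.h>
#  endif
#endif

/* Hypothetical fallback so the sketch still compiles without UCX headers. */
#ifndef UCP_API_MAJOR
#  define UCP_API_MAJOR 0
#  define UCP_API_MINOR 0
#  define UCX_HEADERS_FOUND 0
#else
#  define UCX_HEADERS_FOUND 1
#endif

/* Writes a one-line description of the UCX headers seen at compile time.
 * Returns the number of characters snprintf wanted to write. */
int format_ucx_header_version(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (!UCX_HEADERS_FOUND) {
        return snprintf(buf, len, "no UCX headers visible");
    }
    return snprintf(buf, len, "UCX headers: %d.%d",
                    UCP_API_MAJOR, UCP_API_MINOR);
}
```

Calling this from a small `main` and compiling it with the same `-I` flags that the Open MPI build uses should show whether an old `ucp_version.h` is shadowing the intended one. It reports the headers only, not the library that gets linked.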

@ahaichen
Author

ahaichen commented Jan 5, 2025

The configure log has the following records:

configure:97712: checking for UCX version compatibility
configure:97726: gcc -c -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -mcx16 -pthread -I/public/home/chenah1/Download/openmpi-4.1.7/build/opal/mca/hwloc/hwloc201/hwloc/include -I/public/home/chenah1/Download/openmpi-4.1.7/opal/mca/hwloc/hwloc201/hwloc/include conftest.c >&5
configure:97726: $? = 0
configure:97733: result: yes
configure:99382: checking UCX version
configure:99398: gcc -E -I/public/home/chenah1/Download/openmpi-4.1.7/build/opal/mca/hwloc/hwloc201/hwloc/include -I/public/home/chenah1/Download/openmpi-4.1.7/opal/mca/hwloc/hwloc201/hwloc/include conftest.c
configure:99398: $? = 0
configure:99399: result: ok (not 1.8.x)

By the way, I also got a similar issue when compiling Open MPI 5.x.

Thanks a lot ~

@ahaichen
Author

ahaichen commented Jan 5, 2025

It was resolved by using a newer version of UCX. Thanks a lot!

@jsquyres jsquyres closed this as completed Jan 5, 2025
@bosilca
Member

bosilca commented Jan 6, 2025

If you can find out which version of UCX you had before, I will fix configure to avoid such issues in the future.

@bosilca
Member

bosilca commented Jan 6, 2025

According to the OMPI setup m4 script, we require at least UCX 1.9 to accept building with UCX. I just checked, and 1.9 contains the ucs_memory_type_t type and UCS_MEMORY_TYPE_HOST. Thus, as far as I can see, the older version should have been detected and not taken into account. I wonder what happened in your case and how we failed to correctly identify an older version.

@bosilca
Member

bosilca commented Jan 7, 2025

I think I found an issue with the UCX version check code. Can you please try the following patch with the old library (assuming you still have it around)?

diff --git a/config/ompi_check_ucx.m4 b/config/ompi_check_ucx.m4
index 01e39aaf96..8d336c7028 100644
--- a/config/ompi_check_ucx.m4
+++ b/config/ompi_check_ucx.m4
@@ -75,14 +75,16 @@ AC_DEFUN([OMPI_CHECK_UCX],[
            AS_IF([test "${ompi_check_ucx_cv_have_version_1_8}" = "yes"],
                  [AC_MSG_WARN([UCX support skipped because version 1.8.x was found, which has a known catastrophic issue.])
                   ompi_check_ucx_happy=no])])
-           AC_PREPROC_IFELSE([AC_LANG_PROGRAM([[
+    AC_PREPROC_IFELSE([AC_LANG_PROGRAM([[
 #include <ucp/api/ucp_version.h>
                              ]], [[
 #if (UCP_API_MAJOR < 1) || ((UCP_API_MAJOR == 1) && (UCP_API_MINOR < 9))
 #error "Version too low"
 #endif
                              ]])],
-                             [], [AC_MSG_WARN([UCX version is too old, please upgrade to 1.9 or higher.])])
+                             [],
+                             [AC_MSG_WARN([UCX version is too old, please upgrade to 1.9 or higher.])
+                              ompi_check_ucx_happy=no])
 
     AS_IF([test "$ompi_check_ucx_happy" = yes],
           [AC_CHECK_DECLS([ucp_tag_send_nbr],
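
To make the effect of the patch easier to see, here is a rough C model of the decision logic, not the actual configure code. The function names are hypothetical, and the `happy` flag stands in for `ompi_check_ucx_happy`. The version condition is copied from the preprocessor test in the patch. The separate 1.8.x check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Same condition as the #if in the patch: true when UCX is older than 1.9. */
static bool ucx_too_old(int major, int minor)
{
    return (major < 1) || ((major == 1) && (minor < 9));
}

/* Model of the old behavior: warn about an old UCX but keep using it. */
bool ucx_accepted_before_patch(int major, int minor)
{
    bool happy = true;
    if (ucx_too_old(major, minor)) {
        fprintf(stderr, "WARNING: UCX version is too old, "
                        "please upgrade to 1.9 or higher.\n");
        /* happy is left untouched, so the build goes on with UCX enabled */
    }
    return happy;
}

/* Model of the patched behavior: warn and also disable UCX support. */
bool ucx_accepted_after_patch(int major, int minor)
{
    bool happy = true;
    if (ucx_too_old(major, minor)) {
        fprintf(stderr, "WARNING: UCX version is too old, "
                        "please upgrade to 1.9 or higher.\n");
        happy = false; /* corresponds to ompi_check_ucx_happy=no */
    }
    return happy;
}
```

In this model, UCX 1.7 is accepted before the patch and rejected after it, while 1.9 and 1.17 are accepted either way.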

@bosilca bosilca reopened this Jan 7, 2025
@ahaichen
Author

ahaichen commented Jan 7, 2025

The UCX version I used was ucx-1.17.0, so it looks like the version is fine. Once I configure with the option "--with-cma", it works fine. I am wondering whether this option makes any difference. Thanks a lot ~

@bosilca
Member

bosilca commented Jan 9, 2025

I tried to see whether --with-cma has any impact, but without success. It looks as if some old headers were pulled in, but as I can't reproduce it, I will assume it was a fluke. I will therefore close this issue and create a PR to prevent accepting UCX < 1.9.

@bosilca bosilca closed this as completed Jan 9, 2025
bosilca added a commit to bosilca/ompi that referenced this issue Jan 9, 2025
Don't just print a warning if the UCX version is too old, bail out and
demand a newer version (at least 1.9, now more than 4 years old).

Fixes open-mpi#13017.

Signed-off-by: George Bosilca <[email protected]>