Automate compilation in BYU supercomputer #15

Open
EdoAlvarezR opened this issue May 1, 2024 · 6 comments

@EdoAlvarezR
Collaborator

EdoAlvarezR commented May 1, 2024

I ran build.sh on the login node with the supercomputer lines uncommented, and the FLOWVPM tests passed on the login node. However, the tests failed on the compute node.

I then tried compiling directly inside the node and ran into the same error (something along the lines of C++ violating bounds, Illegal instruction). After much testing, I got it to work by compiling on the login node as follows:

  1. Load dependencies
module load julia/1.6
module load gcc/10
module load openmpi/4.1
  2. Manually copy/paste all lines in build.sh except for the make command.
  3. Manually compile:
MPI_LIB=$(dirname $(which mpicxx))/../lib

mpicxx -DHAVE_CONFIG_H -DJULIA_ENABLE_THREADING -Dhello_EXPORTS -I/home/edoalvar/.julia/artifacts/61238de4948b5e57c436ccdfea5e7ffcda1913b2/include -I/zapps7/julia/1.9.3/include/julia -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG -fPIC -march=broadwell  -I. -I..  -DEXAFMM_WITH_OPENMP  -msse3 -mavx -mavx2 -DNDEBUG -DEXAFMM_EAGER  -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2  -MT fmm-fmm.o -MD -MP -MF .deps/fmm-fmm.Tpo -c -o fmm-fmm.o `test -f 'fmm.cxx' || echo './'`fmm.cxx

mpicxx -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2    -o fmm fmm-fmm.o   -L/home/edoalvar/.julia/artifacts/61238de4948b5e57c436ccdfea5e7ffcda1913b2/include/../lib -lcxxwrap_julia -fPIC -march=native -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG  -shared -Wl,-rpath,$MPI_LIB: -L/home/edoalvar/.julia/artifacts/61238de4948b5e57c436ccdfea5e7ffcda1913b2/include/../lib -lcxxwrap_julia  -L/zapps7/julia/1.9.3/include/julia/../../lib -ljulia
  4. Finish running the rest of build.sh

I think what made the difference was using -march=broadwell in the first step (compiling fmm-fmm.o) but -march=native when linking the shared library.
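
Since -march=native targets whatever CPU runs the compiler, one quick way to compare the login node against a compute node is to list the SIMD flags each one reports. The following is a minimal sketch, assuming a Linux node that exposes /proc/cpuinfo; `has_flag` and `cpu_flags` are hypothetical helper names and are not part of build.sh.

```shell
#!/bin/sh
# Sketch: report which SIMD extensions the current node advertises.
# Helper names are illustrative; nothing here comes from build.sh.

# has_flag FLAGS FLAG: succeed if FLAG appears as a whole word in FLAGS
has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# cpu_flags: print the flag list of the first CPU (empty if unavailable)
cpu_flags() {
    if [ -r /proc/cpuinfo ]; then
        grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 | tr '\t' ' '
    fi
}

flags=$(cpu_flags)
for f in avx avx2 avx512f; do
    if has_flag "$flags" "$f"; then
        echo "$(uname -n): $f yes"
    else
        echo "$(uname -n): $f no"
    fi
done
```

Running it once on the login node and once inside a job on the compute node shows whether the two machines differ in the extensions that -march=native would enable.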

@EdoAlvarezR
Collaborator Author

I got the same Illegal instruction error when trying to import GeoIO on the node, so the problem might not be specific to FLOWExaFMM.
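
To narrow down which packages trigger the crash on a given node, each import can be tried in its own Julia process. This is only a sketch: `check_imports` is a hypothetical helper, and the package list in the usage comment is an example. The environment can be selected with the JULIA_PROJECT environment variable before calling it.

```shell
#!/bin/sh
# check_imports JULIA_BIN PKG...: try importing each package in a fresh process
check_imports() {
    julia_bin=$1
    shift
    for pkg in "$@"; do
        if "$julia_bin" -e "import $pkg" >/dev/null 2>&1; then
            echo "$pkg: ok"
        else
            # A process killed by SIGILL is usually reported by the shell
            # as exit status 132 (128 + signal 4)
            echo "$pkg: FAILED (exit $?)"
        fi
    done
}

# Example usage (binary and package names are placeholders):
#   check_imports julia GeoIO CxxWrap FLOWExaFMM
```

If packages unrelated to FLOWExaFMM also fail, the cause is more likely in code that Julia itself compiled or loaded than in fmm.so.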

@EdoAlvarezR
Collaborator Author

EdoAlvarezR commented May 2, 2024

Attempted to install Julia v1.10.3 and compile from the login node:

module load gcc/10
module load openmpi/4.1

JULIA_DETAILS=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --version)
CXXWRAP_DETAILS=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print 'import Pkg; Pkg.status("CxxWrap")')

# --------------- USER INPUTS --------------------------------------------------
# JULIA_H must point to the directory that contains julia.h
# NOTE: You can find this by typing `abspath(Sys.BINDIR, Base.INCLUDEDIR)` in
#       the Julia REPL
JULIA_H=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print "abspath(Sys.BINDIR, Base.INCLUDEDIR)")
JULIA_H=${JULIA_H%\"}; JULIA_H=${JULIA_H#\"}; JULIA_H=$JULIA_H/julia

# JLCXX_H must point to the directory that contains jlcxx/jlcxx.hpp from CxxWrap
# NOTE: You can find this by typing `CxxWrap.prefix_path()` in the Julia REPL
JLCXX_H=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print "import CxxWrap; CxxWrap.prefix_path()")
JLCXX_H=${JLCXX_H%\"}; JLCXX_H=${JLCXX_H#\"}; JLCXX_H=$JLCXX_H/include

# JULIA_LIB must point to the directory that contains libjulia.so.x
JULIA_LIB=$JULIA_H/../../lib

# JLCXX_LIB must point to the directory that contains libcxxwrap_julia.so.0.x.x
JLCXX_LIB=$JLCXX_H/../lib

# --------------- COMPILE CODE -------------------------------------------------
THIS_DIR=$(pwd)
SRC_DIR=deps
COMPILE_DIR=build
SAVE_DIR=src

echo "Removing existing build"
rm -rf $COMPILE_DIR
rm -f $SAVE_DIR/fmm.so

echo "Copying files"
mkdir $COMPILE_DIR
cp -r $SRC_DIR/* $COMPILE_DIR/

echo "Configuring build"
cd $COMPILE_DIR/
./configure
# ./configure --enable-single

echo "Compiling 3d"
cd 3d
# make JULIA_H=$JULIA_H JLCXX_H=$JLCXX_H JULIA_LIB=$JULIA_LIB JLCXX_LIB=$JLCXX_LIB

# -ffast-math might be faster, but not safe in some architectures
make JULIA_H=$JULIA_H JLCXX_H=$JLCXX_H JULIA_LIB=$JULIA_LIB JLCXX_LIB=$JLCXX_LIB EXTRAOBJFLAGS=-ffast-math

cd $THIS_DIR
cp $COMPILE_DIR/3d/fmm $SAVE_DIR/fmm.so

echo -e "\nDone!"

echo -e "\nCompile Summary:"
echo $JULIA_DETAILS
echo $CXXWRAP_DETAILS | sed 's/.*C/C/'

# ---      UNCOMMENT THIS SECTION FOR        ---
# --- ADDITIONAL STEPS FOR BYU SUPERCOMPUTER ---
rm -f $COMPILE_DIR/3d/fmm $SAVE_DIR/fmm.so

cd $COMPILE_DIR/3d/

# libmpi.so often causes trouble and has to be included with fmm
MPI_LIB=$(dirname $(which mpicxx))/../lib

# Compiler flags with debugging
# mpicxx -ffast-math -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp -g -O2 -o fmm fmm-fmm.o -L$JLCXX_LIB -lcxxwrap_julia -fPIC -march=broadwell -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG -shared -Wl,-rpath,$MPI_LIB: -L$JLCXX_LIB -lcxxwrap_julia -L$JULIA_LIB -ljulia

# Compiler flags without debugging
mpicxx -ffast-math -funroll-loops -fabi-version=6 -fopenmp -O2 -o fmm fmm-fmm.o -L$JLCXX_LIB -lcxxwrap_julia -fPIC -march=broadwell -std=gnu++1z -O3 -shared -Wl,-rpath,$MPI_LIB: -L$JLCXX_LIB -lcxxwrap_julia -L$JULIA_LIB -ljulia

cd $THIS_DIR
cp $COMPILE_DIR/3d/fmm $SAVE_DIR/fmm.so

# Testing FLOWExaFMM import
~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 -e "import FLOWExaFMM" && echo "FLOWExaFMM installation successful!"
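
The script above strips the surrounding double quotes from the output of `julia --print`, which shows strings in quoted form. The same idea can be factored into a small function, sketched below. `strip_quotes` and the sample path are hypothetical, used only for illustration.

```shell
#!/bin/sh
# strip_quotes STRING: remove one leading and one trailing double quote, if present
strip_quotes() {
    s=$1
    s=${s%\"}
    s=${s#\"}
    printf '%s\n' "$s"
}

# Placeholder standing in for the quoted string that `julia --print` returns
raw='"/opt/julia/include"'
JULIA_INC=$(strip_quotes "$raw")/julia
echo "$JULIA_INC"
```

An alternative that may avoid the stripping altogether is to call `julia -e 'print(...)'` with the same expression, since `print` writes a string without quotes.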

@EdoAlvarezR
Collaborator Author

EdoAlvarezR commented May 2, 2024

That didn't work, so I'm running this instead:

module load gcc/10
module load openmpi/4.1

JULIA_DETAILS=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --version)
CXXWRAP_DETAILS=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print 'import Pkg; Pkg.status("CxxWrap")')

# --------------- USER INPUTS --------------------------------------------------
# JULIA_H must point to the directory that contains julia.h
# NOTE: You can find this by typing `abspath(Sys.BINDIR, Base.INCLUDEDIR)` in
#       the Julia REPL
JULIA_H=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print "abspath(Sys.BINDIR, Base.INCLUDEDIR)")
JULIA_H=${JULIA_H%\"}; JULIA_H=${JULIA_H#\"}; JULIA_H=$JULIA_H/julia

# JLCXX_H must point to the directory that contains jlcxx/jlcxx.hpp from CxxWrap
# NOTE: You can find this by typing `CxxWrap.prefix_path()` in the Julia REPL
JLCXX_H=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print "import CxxWrap; CxxWrap.prefix_path()")
JLCXX_H=${JLCXX_H%\"}; JLCXX_H=${JLCXX_H#\"}; JLCXX_H=$JLCXX_H/include

# JULIA_LIB must point to the directory that contains libjulia.so.x
JULIA_LIB=$JULIA_H/../../lib

# JLCXX_LIB must point to the directory that contains libcxxwrap_julia.so.0.x.x
JLCXX_LIB=$JLCXX_H/../lib

# --------------- COMPILE CODE -------------------------------------------------
THIS_DIR=$(pwd)
SRC_DIR=deps
COMPILE_DIR=build
SAVE_DIR=src

echo "Removing existing build"
rm -rf $COMPILE_DIR
rm -f $SAVE_DIR/fmm.so

echo "Copying files"
mkdir $COMPILE_DIR
cp -r $SRC_DIR/* $COMPILE_DIR/

echo "Configuring build"
cd $COMPILE_DIR/
./configure
# ./configure --enable-single

echo "Compiling 3d"
cd 3d

MPI_LIB=$(dirname $(which mpicxx))/../lib

mpicxx -DHAVE_CONFIG_H -DJULIA_ENABLE_THREADING -Dhello_EXPORTS -I/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include -I/home/edoalvar/Programs/julia-1.10.3/include/julia -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG -fPIC -march=broadwell -ffast-math  -I. -I..  -DEXAFMM_WITH_OPENMP  -msse3 -mavx -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -DNDEBUG -DEXAFMM_EAGER  -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2  -MT fmm-fmm.o -MD -MP -MF .deps/fmm-fmm.Tpo -c -o fmm-fmm.o `test -f 'fmm.cxx' || echo './'`fmm.cxx

mpicxx -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2    -o fmm fmm-fmm.o   -L/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include/../lib -lcxxwrap_julia -fPIC -march=native -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG  -shared -Wl,-rpath,$MPI_LIB: -L/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include/../lib -lcxxwrap_julia  -L/home/edoalvar/Programs/julia-1.10.3/include/julia/../../lib -ljulia


cd $THIS_DIR
cp $COMPILE_DIR/3d/fmm $SAVE_DIR/fmm.so

echo -e "\nDone!"

Launch Julia:

~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2

and then test like this:

import FLOWVPM as vpm
this_is_a_test=true
include(joinpath(vpm.examples_path, "vortexrings", "run_leapfrog.jl"))
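
The compile command in this attempt adds several -mavx512* flags. One rough way to check whether a built library actually contains AVX-512 code is to scan its disassembly for 512-bit registers or mask registers. The sketch below assumes GNU objdump output in AT&T syntax. `count_avx512_lines` is a hypothetical helper, and the heuristic undercounts because AVX-512 instructions that use only xmm/ymm registers without masks are not matched.

```shell
#!/bin/sh
# count_avx512_lines: read disassembly on stdin and count lines that mention
# zmm registers or opmask registers k1..k7 (a heuristic, not a full decoder)
count_avx512_lines() {
    grep -cE 'zmm[0-9]|%k[1-7]' || true
}

# Example usage (path taken from the build script's SAVE_DIR):
#   objdump -d src/fmm.so | count_avx512_lines
```

A nonzero count on a library meant for a node without AVX-512 would be consistent with the Illegal instruction error, though it does not prove that those instructions are reached at run time.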

@EdoAlvarezR
Collaborator Author

EdoAlvarezR commented May 2, 2024

Trying to compile inside an m12-4-34 node:

MPI_LIB=$(dirname $(which mpicxx))/../lib

mpicxx -DHAVE_CONFIG_H -DJULIA_ENABLE_THREADING -Dhello_EXPORTS -I/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include -I/home/edoalvar/Programs/julia-1.10.3/include/julia -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG -fPIC -march=broadwell -I. -I..  -DEXAFMM_WITH_OPENMP  -msse3 -mavx -mavx2 -DNDEBUG -DEXAFMM_EAGER -ffast-math -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2  -MT fmm-fmm.o -MD -MP -MF .deps/fmm-fmm.Tpo -c -o fmm-fmm.o `test -f 'fmm.cxx' || echo './'`fmm.cxx

mpicxx -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2    -o fmm fmm-fmm.o   -L/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include/../lib -lcxxwrap_julia -fPIC -march=native  -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG  -shared -Wl,-rpath,$MPI_LIB: -L/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include/../lib -lcxxwrap_julia  -L/home/edoalvar/Programs/julia-1.10.3/include/julia/../../lib -ljulia

@EdoAlvarezR
Collaborator Author

EdoAlvarezR commented May 2, 2024

Bingo! That seems to have worked. In summary, the key was to compile inside the node. Here is the script I ran:

module load gcc/10
module load openmpi/4.1

JULIA_DETAILS=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --version)
CXXWRAP_DETAILS=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print 'import Pkg; Pkg.status("CxxWrap")')

# --------------- USER INPUTS --------------------------------------------------
# JULIA_H must point to the directory that contains julia.h
# NOTE: You can find this by typing `abspath(Sys.BINDIR, Base.INCLUDEDIR)` in
#       the Julia REPL
JULIA_H=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print "abspath(Sys.BINDIR, Base.INCLUDEDIR)")
JULIA_H=${JULIA_H%\"}; JULIA_H=${JULIA_H#\"}; JULIA_H=$JULIA_H/julia

# JLCXX_H must point to the directory that contains jlcxx/jlcxx.hpp from CxxWrap
# NOTE: You can find this by typing `CxxWrap.prefix_path()` in the Julia REPL
JLCXX_H=$(~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2 --print "import CxxWrap; CxxWrap.prefix_path()")
JLCXX_H=${JLCXX_H%\"}; JLCXX_H=${JLCXX_H#\"}; JLCXX_H=$JLCXX_H/include

# JULIA_LIB must point to the directory that contains libjulia.so.x
JULIA_LIB=$JULIA_H/../../lib

# JLCXX_LIB must point to the directory that contains libcxxwrap_julia.so.0.x.x
JLCXX_LIB=$JLCXX_H/../lib

# --------------- COMPILE CODE -------------------------------------------------
THIS_DIR=$(pwd)
SRC_DIR=deps
COMPILE_DIR=build
SAVE_DIR=src

echo "Removing existing build"
rm -rf $COMPILE_DIR
rm -f $SAVE_DIR/fmm.so

echo "Copying files"
mkdir $COMPILE_DIR
cp -r $SRC_DIR/* $COMPILE_DIR/

echo "Configuring build"
cd $COMPILE_DIR/
./configure
# ./configure --enable-single

echo "Compiling 3d"
cd 3d

MPI_LIB=$(dirname $(which mpicxx))/../lib

mpicxx -DHAVE_CONFIG_H -DJULIA_ENABLE_THREADING -Dhello_EXPORTS -I/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include -I/home/edoalvar/Programs/julia-1.10.3/include/julia -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG -fPIC -march=broadwell -I. -I..  -DEXAFMM_WITH_OPENMP  -msse3 -mavx -mavx2 -DNDEBUG -DEXAFMM_EAGER -ffast-math -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2  -MT fmm-fmm.o -MD -MP -MF .deps/fmm-fmm.Tpo -c -o fmm-fmm.o `test -f 'fmm.cxx' || echo './'`fmm.cxx

mpicxx -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2    -o fmm fmm-fmm.o   -L/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include/../lib -lcxxwrap_julia -fPIC -march=native  -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG  -shared -Wl,-rpath,$MPI_LIB: -L/home/edoalvar/.julia/artifacts/c129d84767ca7fe64514b3789c623e1203355949/include/../lib -lcxxwrap_julia  -L/home/edoalvar/Programs/julia-1.10.3/include/julia/../../lib -ljulia


cd $THIS_DIR
cp $COMPILE_DIR/3d/fmm $SAVE_DIR/fmm.so

echo -e "\nDone!"

Launch Julia:

~/Programs/julia-1.10.3/bin/julia --project=~/environments/flowvpm_actuatordisk202405-2

and then test like this:

import FLOWVPM as vpm
this_is_a_test=true
include(joinpath(vpm.examples_path, "vortexrings", "run_leapfrog.jl"))
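
Because the working build came from compiling inside the node, an automated build script could refuse to run, or at least warn, when it is started outside a job. The sketch below assumes the cluster uses Slurm, which sets SLURM_JOB_ID inside a job. That is an assumption about the scheduler and is not stated in this thread. `job_context` is a hypothetical helper name.

```shell
#!/bin/sh
# job_context: print "batch-job" when run inside a Slurm job, "no-job" otherwise
# (assumes Slurm; other schedulers export different variables)
job_context() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "batch-job"
    else
        echo "no-job"
    fi
}

if [ "$(job_context)" = "no-job" ]; then
    echo "Warning: building on $(uname -n), which does not look like a compute node" >&2
fi
```

Printing the host name at build time also leaves a record of which node produced a given fmm.so.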

@EdoAlvarezR
Collaborator Author

EdoAlvarezR commented Nov 20, 2024

I am trying to compile with Julia v1.11.1 inside node m12-2-13 as follows:

module load gcc/10
module load openmpi/4.1

# alias julia='~/Programs/julia-1.10.3/bin/julia --project=/home/edoalvar/environments/flowvpm_actuatordisk202411'
alias julia='~/Programs/julia-1.11.1/bin/julia --project=/home/edoalvar/environments/flowvpm_actuatordisk202411'

JULIA_DETAILS=$(julia --version)
CXXWRAP_DETAILS=$(julia --print 'import Pkg; Pkg.status("CxxWrap")')

# --------------- USER INPUTS --------------------------------------------------
# JULIA_H must point to the directory that contains julia.h
# NOTE: You can find this by typing `abspath(Sys.BINDIR, Base.INCLUDEDIR)` in
#       the Julia REPL
JULIA_H=$(julia --print "abspath(Sys.BINDIR, Base.INCLUDEDIR)")
JULIA_H=${JULIA_H%\"}; JULIA_H=${JULIA_H#\"}; JULIA_H=$JULIA_H/julia

# JLCXX_H must point to the directory that contains jlcxx/jlcxx.hpp from CxxWrap
# NOTE: You can find this by typing `CxxWrap.prefix_path()` in the Julia REPL
JLCXX_H=$(julia --print "import CxxWrap; CxxWrap.prefix_path()")
JLCXX_H=${JLCXX_H%\"}; JLCXX_H=${JLCXX_H#\"}; JLCXX_H=$JLCXX_H/include

# JULIA_LIB must point to the directory that contains libjulia.so.x
JULIA_LIB=$JULIA_H/../../lib

# JLCXX_LIB must point to the directory that contains libcxxwrap_julia.so.0.x.x
JLCXX_LIB=$JLCXX_H/../lib

# --------------- COMPILE CODE -------------------------------------------------
THIS_DIR=$(pwd)
SRC_DIR=deps
COMPILE_DIR=build
SAVE_DIR=src

echo "Removing existing build"
rm -rf $COMPILE_DIR
rm -f $SAVE_DIR/fmm.so

echo "Copying files"
mkdir $COMPILE_DIR
cp -r $SRC_DIR/* $COMPILE_DIR/

echo "Configuring build"
cd $COMPILE_DIR/
./configure
# ./configure --enable-single

echo "Compiling 3d"
cd 3d

MPI_LIB=$(dirname $(which mpicxx))/../lib

mpicxx -DHAVE_CONFIG_H -DJULIA_ENABLE_THREADING -Dhello_EXPORTS -I${JLCXX_H} -I${JULIA_H} -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG -fPIC -march=broadwell -I. -I..  -DEXAFMM_WITH_OPENMP  -msse3 -mavx -mavx2 -DNDEBUG -DEXAFMM_EAGER -ffast-math -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2  -MT fmm-fmm.o -MD -MP -MF .deps/fmm-fmm.Tpo -c -o fmm-fmm.o `test -f 'fmm.cxx' || echo './'`fmm.cxx

mpicxx -funroll-loops -fabi-version=6 -Wfatal-errors -fopenmp  -g -O2    -o fmm fmm-fmm.o   -L${JLCXX_H}/../lib -lcxxwrap_julia -fPIC -march=native  -Wunused-parameter -Wextra -Wreorder -std=gnu++1z -O3 -DNDEBUG  -shared -Wl,-rpath,${MPI_LIB}: -L${JLCXX_H}/../lib -lcxxwrap_julia  -L${JULIA_LIB} -ljulia


cd $THIS_DIR
cp $COMPILE_DIR/3d/fmm $SAVE_DIR/fmm.so

echo -e "\nDone!"

Launch Julia:

~/Programs/julia-1.11.1/bin/julia --project=~/environments/flowvpm_actuatordisk202411

but then I get Caught signal 4 (Illegal instruction: illegal operand) when I import FLOWExaFMM:

Failed to precompile FLOWExaFMM [a07d1f4e-0e34-4d8b-bfef-e5b961477d34] to "/home/edoalvar/.julia/compiled/v1.11/FLOWExaFMM/jl_zNtHkm".
[m12-2-13:3676230:0:3676230] Caught signal 4 (Illegal instruction: illegal operand)
==== backtrace (tid:3676230) ====
 0  /apps/spack/root/opt/spack/linux-rhel9-haswell/gcc-13.2.0/ucx-1.16.0-fnew7ji6a45kbbjzpl4kjs3jdrkfpndf/lib/libucs.so.0(ucs_handle_error+0x2b4) [0x7f25a9f25d54]
 1  /apps/spack/root/opt/spack/linux-rhel9-haswell/gcc-13.2.0/ucx-1.16.0-fnew7ji6a45kbbjzpl4kjs3jdrkfpndf/lib/libucs.so.0(+0x33f14) [0x7f25a9f25f14]
 2  /apps/spack/root/opt/spack/linux-rhel9-haswell/gcc-13.2.0/ucx-1.16.0-fnew7ji6a45kbbjzpl4kjs3jdrkfpndf/lib/libucs.so.0(+0x342ca) [0x7f25a9f262ca]
 3  /home/edoalvar/.julia/compiled/v1.11/CxxWrap/WGIJU_zCByi.so(+0x4cf84) [0x7f25aa7e2f84]
=================================
Invalid instruction at 0x7f25aa7e2f84: 0x62, 0xf1, 0xfd, 0x18, 0xd4, 0x04, 0x06, 0xc5, 0xfa, 0x7f, 0x43, 0x20, 0x48, 0x39, 0x4b
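
Two details in this log may help locate the problem. The faulting address 0x7f25aa7e2f84 matches frame 3 of the backtrace, which is inside the CxxWrap cache under ~/.julia/compiled/v1.11 rather than inside fmm.so. Also, the dump starts with byte 0x62, which in 64-bit x86 code is the EVEX prefix used by AVX-512 encodings, while 0xc4 and 0xc5 are the VEX prefixes used by AVX/AVX2. The sketch below classifies a leading byte that way. `classify_prefix` is a hypothetical helper, and it looks only at the first byte rather than decoding the instruction.

```shell
#!/bin/sh
# classify_prefix BYTE: name the SIMD encoding prefix that a leading byte
# denotes in 64-bit x86 code (BYTE is written like 0x62)
classify_prefix() {
    case $(printf '%s' "$1" | tr 'A-F' 'a-f') in
        0x62) echo "EVEX" ;;
        0xc4|0xc5) echo "VEX" ;;
        *) echo "other" ;;
    esac
}

# First byte of the instruction dump in the log
first_byte=0x62
echo "$first_byte -> $(classify_prefix "$first_byte")"
```

If the node does not support AVX-512, an EVEX-encoded instruction in a cache file would be consistent with signal 4, and would suggest that the cache was generated for a different CPU than the one loading it.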

It is worth noting that this compile script worked just fine with Julia v1.10.3 but fails with v1.11.1.
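
Since the backtrace places the crash in a file under ~/.julia/compiled/v1.11, one thing that may be worth trying is to make Julia generate its precompile caches for a conservative CPU target through the JULIA_CPU_TARGET environment variable. This is an untested sketch and not a confirmed fix: `with_cpu_target` is a hypothetical wrapper, "generic" is chosen here only as the most conservative value, and existing caches may need to be regenerated for the setting to have any effect.

```shell
#!/bin/sh
# with_cpu_target CMD...: run CMD with JULIA_CPU_TARGET set, defaulting to
# "generic" unless the caller already exported another value
with_cpu_target() {
    JULIA_CPU_TARGET="${JULIA_CPU_TARGET:-generic}" "$@"
}

# Show what a wrapped command would see
with_cpu_target sh -c 'echo "JULIA_CPU_TARGET=$JULIA_CPU_TARGET"'

# Example usage with the Julia binary and project from this thread:
#   with_cpu_target ~/Programs/julia-1.11.1/bin/julia \
#       --project=~/environments/flowvpm_actuatordisk202411 -e 'import FLOWExaFMM'
```

A generic target can cost some performance in Julia-compiled code, so if this removes the crash, a more specific target matching the compute nodes could be tried afterwards.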
