This repository contains material for the programming tasks of MasterMath course "parallel algorithms" 24/25. It is a dynamically extended tutorial that will take you from first steps (installing the Fortran compiler and libraries) to being able to write parallel programs and run them on a supercomputer yourself.
If you have never worked with git
, it may be helpful to do this basic tutorial.
In a terminal (either on your own computer or a cluster like DelftBlue), first clone the repository to get a local copy:
git clone https://github.com/jthies/par_algo2024
Whenever we announce an update via the elo course page, you need to update your local copy:
git pull
If you have local changes to files, this may result in a message:
error: cannot pull with rebase: You have unstaged changes.
error: Please commit or stash them.
The git status
command shows you what files have local changes.
You can run git stash
to move these changes out of the way for the pull.
If you then want to re-do them (after the pull
), simply type git stash pop
.
Note: You are free to commit your work to the local repository or a fork on github,
but you cannot push to the repo itself. Committing your work locally makes sure you have
the history and a backup in case you break your programs and is therefore highly recommended.
Having local commits will typically not cause conflicts with the git pull
as we only
add new files for new tasks.
We recommend that you set up a working environment on your own computer. During the course, you will also get access to the DelftBlue supercomputer, where you do not need to install additional software.
The programming language used throughout the course is Fortran 2018. It has native support for
parallel programming, so in principle all you will need is a Fortran compiler.
We recommend using the GNU compiler gfortran
. To enable actual parallel execution of your code,
it relies on the OpenCoarrays library, which you have to install separately.
To install all you need in an Ubuntu terminal, you can use:
sudo apt install build-essential libopencoarrays-openmpi-dev
On Windows, you can configure the "Windows Subsystem for Linux" WSL to use Ubuntu as Linux distribution (and this should be the default).
On other Linux systems, the package manager may be something other than apt
, on MacOS the command is called brew
and requires the Homebrew package manager to be installed.
Package names may also differ between distributions. Make sure the opencoarrays
package you install has MPI support,
here indicated by the -openmpi
in the name.
See this page for general instructions for various systems.
On the DelftBlue supercomputer, you do not need to install anything. Instead, you should load the required modules and you're good to go:
module load 2024r1 openmpi opencoarrays
Compiling a program (especially if it consists of multiple source files)
can require (a series of) commands that are hard to memorize and specific
to the programming environment. The Linux command make
allows you to put the
rules for "building" your program into a file called Makefile
, and then uses
these rules to build the program from the source code.
In this repository you will find a fully functional Makefile
geared towards
use on DelftBlue. In principle, it should also work on your own computer, though.
You find our first program in main_hello.f90
.
You can compile it using:
make main_hello.x
To run it with 4 processes, use:
mpirun -np 4 ./main_hello.x
Look at the program and output it produces. Do you notice anything unexpected?
Can you adapt the program so that all "Hello" lines are printed before the "Goodbye" lines?
The second program is slightly more complicated. It consists of two source files: dotprod.f90
and main_dotprod.f90
. The former is a Fortran module that provides several implementations
of a function to compute the inner product of two distributed vectors (coarrays). The latter contains
a program that will run these variants and print timing results.
- Look at both source files and understand (at least roughly) how this program works, and what the different variants are.
- compile and run the program "as is" using
make main_dotprod.x
mpirun -np 4 ./main_dotprod.x 1000000
You can adapt the number of processes if you have more than 4 cores available. In this example we use one million vector elements in total, you can change this as well.
You may have noticed that the gatherbcast
variant is not implemented yet.
Do this in dotprod.f90
and run the program again to test and time it.
The way the variant works is sketched in the comments of that function.
Implement at least one additional variant of your own. You can also try to generalize the "butterfly"
variant, which currently only works if the number of processes is a power of 2.
One idea for your own variant is to replace the sync all
statements (partly) by syncrhonization between two processes (sync images(...)
in Fortran).
Run the main program for the maximum number of processes available on your computer and different values of the vector length N. Which variant is the best?
The paralllel sorting algorithm (described in Section 1.8 of the book by Rob Bisseling) is implemented as sort_coarray
in the file sorting.f90
.
A corresponding driver routine is available in main_sorting.f90
. It takes a single integer as command-line argument: The (approximate)
number of elements to be assigned to each process. Your input will be rounded up to a multiple of P^2, where P is the number of processes (images)
running the program.
This is a slightly more complex program than we saw before: sort_coarray
calls other functions like quicksort
and mergeparts
. To assure that all
components work properly, it is useful to employ a test framework. Here we will use fortuno.
This is a fortran library that allows you to easily write and execute small test programs. On DelftBlue we have installed it for you, and the Makefile
has been extended with targets main_test.x
and test
. If you type make test
, the test driver is built and executed for a varying number of
processes.
Unfortunately, the co-array extension of fortuno did not work on my Ubuntu laptop. I would therefore presently not recommend installing it but trying it out on DelftBlue instead.
- Use the dot product and sorting examples to familiarize yourself with Fortran, DelftBlue job submission, make files (e.g., try switching to debugging options for your build).
- There may be bugs in the sorting program (every program always has them...). If you encounter one, you may try to add unit tests that isolate it and then fix the problem. A patch or pull request with a bug fix to this repository may put your teachers in a good mood during the final oral exam...
- Use the structure of the sorting program as inspiration for your first bigger programming task: the Sieve of Eratosthenes
Note: To run this (or any) program on DelftBlue compute nodes (rather than the login node),
use srun
instead of mpirun
. This requires a number of flags (in particular the --mem-per-cpu
flag),
that are most conveniently set in a job script. See benchmarks.slurm
for an example of a job script, and
the DelftBlue documentation for many more. For quick tests, do not use the --exclusive
flag as it will request a full node, which may make your waiting times long.