Parallel Algorithms 2024/2025

This repository contains material for the programming tasks of MasterMath course "parallel algorithms" 24/25. It is a dynamically extended tutorial that will take you from first steps (installing the Fortran compiler and libraries) to being able to write parallel programs and run them on a supercomputer yourself.

If you have never worked with git, it may be helpful to do this basic tutorial.

Working with the repository

In a terminal (either on your own computer or a cluster like DelftBlue), first clone the repository to get a local copy:

git clone https://github.com/jthies/par_algo2024

Whenever we announce an update via the elo course page, you need to update your local copy:

git pull

If you have local changes to files, this may result in a message:

error: cannot pull with rebase: You have unstaged changes.
error: Please commit or stash them.

The git status command shows you what files have local changes. You can run git stash to move these changes out of the way for the pull. If you then want to re-do them (after the pull), simply type git stash pop.

Note: You are free to commit your work to the local repository or a fork on github, but you cannot push to the repo itself. Committing your work locally makes sure you have the history and a backup in case you break your programs and is therefore highly recommended. Having local commits will typically not cause conflicts with the git pull as we only add new files for new tasks.

0. Basic programming environment

We recommend that you set up a working environment on your own computer. During the course, you will also get access to the DelftBlue supercomputer, where you do not need to install additional software.

The programming language used throughout the course is Fortran 2018. It has native support for parallel programming, so in principle all you will need is a Fortran compiler. We recommend using the GNU compiler gfortran. To enable actual parallel execution of your code, it relies on the OpenCoarrays library, which you have to install separately.

To install all you need in an Ubuntu terminal, you can use:

sudo apt install build-essential libopencoarrays-openmpi-dev

On Windows, you can configure the "Windows Subsystem for Linux" WSL to use Ubuntu as Linux distribution (and this should be the default). On other Linux systems, the package manager may be something other than apt, on MacOS the command is called brew and requires the Homebrew package manager to be installed. Package names may also differ between distributions. Make sure the opencoarrays package you install has MPI support, here indicated by the -openmpi in the name.

See this page for general instructions for various systems.

On the DelftBlue supercomputer, you do not need to install anything. Instead, you should load the required modules and you're good to go:

module load 2024r1 openmpi opencoarrays

Makefiles

Compiling a program (especially if it consists of multiple source files) can require (a series of) commands that are hard to memorize and specific to the programming environment. The Linux command make allows you to put the rules for "building" your program into a file called Makefile, and then uses these rules to build the program from the source code.

In this repository you will find a fully functional Makefile geared towards use on DelftBlue. In principle, it should also work on your own computer, though.

1. Hello World!

You find our first program in main_hello.f90. You can compile it using:

make main_hello.x

To run it with 4 processes, use:

mpirun -np 4 ./main_hello.x

Look at the program and output it produces. Do you notice anything unexpected?

Task: re-order the output

Can you adapt the program so that all "Hello" lines are printed before the "Goodbye" lines?

2. Dot Product

The second program is slightly more complicated. It consists of two source files: dotprod.f90 and main_dotprod.f90. The former is a Fortran module that provides several implementations of a function to compute the inner product of two distributed vectors (coarrays). The latter contains a program that will run these variants and print timing results.

Look at both source files and understand (at least roughly) how this program works, and what the different variants are.
compile and run the program "as is" using

make main_dotprod.x
mpirun -np 4 ./main_dotprod.x 1000000

You can adapt the number of processes if you have more than 4 cores available. In this example we use one million vector elements in total, you can change this as well.

Task: Implement the `gatherbcast` variant

You may have noticed that the gatherbcast variant is not implemented yet. Do this in dotprod.f90 and run the program again to test and time it. The way the variant works is sketched in the comments of that function.

Task: Implement your own variant

Implement at least one additional variant of your own. You can also try to generalize the "butterfly" variant, which currently only works if the number of processes is a power of 2. One idea for your own variant is to replace the sync all statements (partly) by syncrhonization between two processes (sync images(...) in Fortran).

Run the main program for the maximum number of processes available on your computer and different values of the vector length N. Which variant is the best?

3. Parallel Sample Sort

The paralllel sorting algorithm (described in Section 1.8 of the book by Rob Bisseling) is implemented as sort_coarray in the file sorting.f90. A corresponding driver routine is available in main_sorting.f90. It takes a single integer as command-line argument: The (approximate) number of elements to be assigned to each process. Your input will be rounded up to a multiple of P^2, where P is the number of processes (images) running the program.

Unit testing

This is a slightly more complex program than we saw before: sort_coarray calls other functions like quicksort and mergeparts. To assure that all components work properly, it is useful to employ a test framework. Here we will use fortuno. This is a fortran library that allows you to easily write and execute small test programs. On DelftBlue we have installed it for you, and the Makefile has been extended with targets main_test.x and test. If you type make test, the test driver is built and executed for a varying number of processes.

Unfortunately, the co-array extension of fortuno did not work on my Ubuntu laptop. I would therefore presently not recommend installing it but trying it out on DelftBlue instead.

"Tasks"

Use the dot product and sorting examples to familiarize yourself with Fortran, DelftBlue job submission, make files (e.g., try switching to debugging options for your build).
There may be bugs in the sorting program (every program always has them...). If you encounter one, you may try to add unit tests that isolate it and then fix the problem. A patch or pull request with a bug fix to this repository may put your teachers in a good mood during the final oral exam...
Use the structure of the sorting program as inspiration for your first bigger programming task: the Sieve of Eratosthenes

Note: To run this (or any) program on DelftBlue compute nodes (rather than the login node), use srun instead of mpirun. This requires a number of flags (in particular the --mem-per-cpu flag), that are most conveniently set in a job script. See benchmarks.slurm for an example of a job script, and the DelftBlue documentation for many more. For quick tests, do not use the --exclusive flag as it will request a full node, which may make your waiting times long.

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
benchmarks.f90		benchmarks.f90
benchmarks.slurm		benchmarks.slurm
dotprod.f90		dotprod.f90
main_benchmarks.f90		main_benchmarks.f90
main_dotprod.f90		main_dotprod.f90
main_hello.f90		main_hello.f90
main_primes.f90		main_primes.f90
main_sorting.f90		main_sorting.f90
main_tests.f90		main_tests.f90
prime_sieve.slurm		prime_sieve.slurm
primes.f90		primes.f90
sorting.f90		sorting.f90
sorting.slurm		sorting.slurm
unit_tests.f90		unit_tests.f90

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Parallel Algorithms 2024/2025

Working with the repository

0. Basic programming environment

Makefiles

1. Hello World!

Task: re-order the output

2. Dot Product

Task: Implement the `gatherbcast` variant

Task: Implement your own variant

3. Parallel Sample Sort

Unit testing

"Tasks"

About

Releases

Packages

Contributors 2

Languages

License

jthies/par_algo2024

Folders and files

Latest commit

History

Repository files navigation

Parallel Algorithms 2024/2025

Working with the repository

0. Basic programming environment

Makefiles

1. Hello World!

Task: re-order the output

2. Dot Product

Task: Implement the gatherbcast variant

Task: Implement your own variant

3. Parallel Sample Sort

Unit testing

"Tasks"

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Task: Implement the `gatherbcast` variant

Packages