Merge branch 'master' into machine-setup-julia-mpi-option
johnomotani committed Nov 8, 2024
2 parents f4c554b + 57e0d1d commit 12d3f7f
Showing 48 changed files with 2,230 additions and 813 deletions.
38 changes: 38 additions & 0 deletions docs/src/developing.md
@@ -66,6 +66,44 @@ typed, which could impact performance by creating code that is not 'type
stable' (i.e. all concrete types are known at compile time).


## Timings

Timing different parts of the code helps to verify that changes do not
introduce performance problems. Excessive allocations can also be a sign of
type instability (or other problems) that could impact performance. To monitor
both, `moment_kinetics` uses a `TimerOutput` object,
[`moment_kinetics.timer_utils.global_timer`](@ref).

The timings and allocation counts from the rank-0 MPI process are printed to
the terminal at the end of a run. The same information is also saved to the
output file as a string for quick reference. One way to view it is:
```bash
$ h5dump -d /timing_data/global_timer_string my_output_file.moments.h5
```

More detailed timing information is saved for each MPI rank into subgroups
`rank<i>` of the `timing_data` group in the output file. This information can
be plotted using [`makie_post_processing.timing_data`](@ref). The plots contain
many curves. Filtering out the ones you are not interested in (using the
`include_patterns`, `exclude_patterns`, and/or `ranks` arguments) can help, but
it may still be useful to have interactive plots that show the label and MPI
rank when you hover over a curve. For example:
```julia
julia> using makie_post_processing, GLMakie
julia> ri = get_run_info("runs/my_example_run/")
julia> timing_data(ri; interactive_figs=:times);
```
Here `using GLMakie` selects the `Makie` backend that provides interactive
plots, and the `interactive_figs` argument specifies that `timing_data()`
should make an interactive plot (in this case for the execution times).

Lower-level timing data, for example timings of MPI and linear-algebra calls,
can be enabled by activating 'debug timing'. This is done by re-defining the
function [`moment_kinetics.timer_utils.timeit_debug_enabled`](@ref) to return
`true`. This is not the most user-friendly interface (!), but the feature is
probably only needed while developing, profiling, or debugging.
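
The re-definition mechanism can be illustrated with a small, self-contained
sketch. Everything in it is hypothetical: `ToyTimerUtils`, `debug_timings`, and
`debug_timed` are stand-ins invented for illustration and are not part of
`moment_kinetics`. Only the name `timeit_debug_enabled` is taken from the text
above, and it is assumed here to take no arguments.

```julia
# Toy stand-in for a timer-utilities module (hypothetical, for illustration only).
module ToyTimerUtils

# Recorded debug timings: label => elapsed time in seconds.
const debug_timings = Dict{String,Float64}()

# Debug timing is off by default. Re-define this function to return `true`
# in order to switch it on.
timeit_debug_enabled() = false

# Run `f()`, recording how long it took only when debug timing is enabled.
function debug_timed(f, label::String)
    if timeit_debug_enabled()
        t0 = time_ns()
        result = f()
        debug_timings[label] = (time_ns() - t0) / 1.0e9
        return result
    end
    return f()
end

end # module ToyTimerUtils

# With the default definition nothing is recorded.
ToyTimerUtils.debug_timed("sum") do
    sum(1:1000)
end
println("recorded before enabling: ", haskey(ToyTimerUtils.debug_timings, "sum"))

# Re-define the function inside its own module so that it returns `true`.
@eval ToyTimerUtils timeit_debug_enabled() = true

# The same call now records a timing.
ToyTimerUtils.debug_timed("sum") do
    sum(1:1000)
end
println("recorded after enabling: ", haskey(ToyTimerUtils.debug_timings, "sum"))
```

If the real function has the same zero-argument form, the analogous call would
presumably be `@eval moment_kinetics.timer_utils timeit_debug_enabled() = true`,
issued before the code to be timed is run; check the `timer_utils` docstrings
before relying on this.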


## Parallelization

The code is currently parallelized using MPI and shared-memory arrays. Arrays representing the pdf, moments, etc. are shared between all processes. Using shared memory means, for example, that we can take derivatives along any one dimension while parallelising over the others, without having to communicate to re-distribute the arrays. Using shared memory instead of (or, in future, as well as) distributed-memory parallelism has the advantage that it is easier to split up the points within each element between processors, giving a finer-grained parallelism which should let the code use larger numbers of processors efficiently.
5 changes: 5 additions & 0 deletions docs/src/post_processing_notes.md
@@ -19,6 +19,11 @@ julia> makie_post_process("runs/example-run1/", "runs/example-run2/", "runs/exam
What this function does is controlled by the settings in an input file, by
default `post_processing_input.toml`.

!!! note "Example input file"
    You can generate an example input file, with all the options shown (with
    their default values) but commented out, by running
    `makie_post_processing.generate_example_input_file()`.

To run from the command line
```julia
julia --project run_makie_post_processing.jl dir1 [dir2 [dir3 ...]]
6 changes: 6 additions & 0 deletions docs/src/zz_timer_utils.md
@@ -0,0 +1,6 @@
`timer_utils`
=============

```@autodocs
Modules = [moment_kinetics.timer_utils]
```
20 changes: 14 additions & 6 deletions machines/shared/add_dependencies_to_project.jl
@@ -154,6 +154,17 @@ end
# HDF5 setup
############

function get_hdf5_lib_names(dirname)
    if Sys.isapple()
        libhdf5_name = joinpath(dirname, "libhdf5.dylib")
        libhdf5_hl_name = joinpath(dirname, "libhdf5_hl.dylib")
    else
        libhdf5_name = joinpath(dirname, "libhdf5.so")
        libhdf5_hl_name = joinpath(dirname, "libhdf5_hl.so")
    end
    return libhdf5_name, libhdf5_hl_name
end

if mk_preferences["use_system_mpi"] == "y"
# Only need to do this if using 'system MPI'. If we are using the Julia-provided MPI,
# then the Julia-provided HDF5 is already MPI-enabled
@@ -162,14 +173,12 @@ if mk_preferences["use_system_mpi"] == "y"
if machine_settings["hdf5_library_setting"] == "system"
hdf5_dir = joinpath(ENV["HDF5_DIR"], "lib") # system hdf5
using HDF5
- HDF5.API.set_libraries!(joinpath(hdf5_dir, "libhdf5.so"),
-                         joinpath(hdf5_dir, "libhdf5_hl.so"))
+ HDF5.API.set_libraries!(get_hdf5_lib_names(hdf5_dir)...)
elseif machine_settings["hdf5_library_setting"] == "download"
artifact_dir = joinpath(repo_dir, "machines", "artifacts")
hdf5_dir = joinpath(artifact_dir, "hdf5-build", "lib")
using HDF5
- HDF5.API.set_libraries!(joinpath(hdf5_dir, "libhdf5.so"),
-                         joinpath(hdf5_dir, "libhdf5_hl.so"))
+ HDF5.API.set_libraries!(get_hdf5_lib_names(hdf5_dir)...)
elseif machine_settings["hdf5_library_setting"] == "prompt"
# Prompt user to select what HDF5 to use
if mk_preferences["build_hdf5"] == "y"
@@ -182,8 +191,7 @@ if mk_preferences["use_system_mpi"] == "y"
elseif !prompt_for_lib_paths
hdf5_dir = mk_preferences["hdf5_dir"]
if hdf5_dir != "default"
- hdf5_lib = joinpath(hdf5_dir, "libhdf5.so")
- hdf5_lib_hl = joinpath(hdf5_dir, "libhdf5_hl.so")
+ hdf5_lib, hdf5_lib_hl = get_hdf5_lib_names(hdf5_dir)
end
else
println("\n** Setting up to use system HDF5\n")
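
The `get_hdf5_lib_names` helper added in this file selects the shared-library
extension with an explicit `Sys.isapple()` check. As a rough sketch of an
alternative (not what this commit does), the extension could be taken from
Julia's standard-library constant `Libdl.dlext`. The function name
`hdf5_lib_names_sketch` and the example directory are hypothetical. Note one
behavioural difference: on Windows this sketch would return `.dll` names,
whereas the function in the commit returns `.so` names on every non-Apple
platform.

```julia
using Libdl

# Hypothetical alternative to `get_hdf5_lib_names`: build the library paths
# using the platform's shared-library extension ("so", "dylib" or "dll").
function hdf5_lib_names_sketch(dirname)
    ext = Libdl.dlext
    libhdf5_name = joinpath(dirname, "libhdf5.$ext")
    libhdf5_hl_name = joinpath(dirname, "libhdf5_hl.$ext")
    return libhdf5_name, libhdf5_hl_name
end

# Example with a made-up directory; the result is a 2-tuple of paths, so it
# can be splatted into a two-argument call in the same way as in the diff.
lib, lib_hl = hdf5_lib_names_sketch(joinpath("opt", "hdf5", "lib"))
println(lib)
println(lib_hl)
```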
2 changes: 1 addition & 1 deletion machines/shared/machine_setup_stage_two.jl
@@ -11,7 +11,7 @@ batch_system = mk_preferences["batch_system"]
if mk_preferences["use_plots"] == "y"
python_venv_path = joinpath(repo_dir, "machines", "artifacts", "mk_venv")
activate_path = joinpath(python_venv_path, "bin", "activate")
- run(`bash -c "python -m venv --system-site-packages $python_venv_path; source $activate_path; PYTHONNOUSERSITE=1 pip install matplotlib"`)
+ run(`bash -c "/usr/bin/env python3 -m venv --system-site-packages $python_venv_path; source $activate_path; PYTHONNOUSERSITE=1 pip install matplotlib"`)
if batch_system
open("julia.env", "a") do io
println(io, "source $activate_path")
2 changes: 2 additions & 0 deletions makie_post_processing/makie_post_processing/Project.toml
@@ -9,7 +9,9 @@ Combinatorics = "861a8166-3701-5b0c-9a16-15d98fcdc6aa"
LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
LsqFit = "2fda8390-95c7-5789-9bda-21331edee243"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
NaNMath = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3"
OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
TOML = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
moment_kinetics = "b5ff72cc-06fc-4161-ad14-dba1c22ed34e"