Merge master branch into merge_fkpl_collisions #162

Merged

johnomotani merged 156 commits into merge_fkpl_collisions from merge_fkpl_collisions-merge-master on Nov 26, 2023

Conversation

johnomotani
Collaborator

No description provided.

When reloading data for a restart, it is necessary to create `coordinate` objects using the parallelisation of the new simulation, not the parallelisation that was used to write the data. Only the coordinate ranges actually differ between the parallel_io=true and parallel_io=false branches, so a lot of code can be moved outside the `if parallel_io` block.

This allows a simulation to restart from a run with different resolution in any or all dimensions, and also to restart from a run with different moment-kinetic settings (`evolve_density`, `evolve_upar` and `evolve_ppar`).

This is needed to get the global_io_range correct, which is required when reloading.

This will allow them to be reused in other tests.
`discretization_info` is the abstract type inherited by all
implementations of a discretization (i.e. `chebyshev_info` and
`finite_difference_info`).

Also adds a `finite_difference_info` that is used to dispatch to the
finite difference methods, instead of using `Bool` for that.
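
As a rough illustration of the dispatch pattern this describes (the type names follow the commit text, but the real types carry fields, and the `scheme_name` function here is invented for the example):

```julia
# Illustrative sketch only: methods dispatch on a discretization type rather
# than on a Bool flag.
abstract type discretization_info end
struct finite_difference_info <: discretization_info end
struct chebyshev_info <: discretization_info end

# Hypothetical function, just to show the dispatch pattern
scheme_name(::finite_difference_info) = "finite difference"
scheme_name(::chebyshev_info) = "Chebyshev pseudospectral"
```
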
Length-1 dimensions have to be treated a bit specially. Derivatives in those directions should be zero. 'Interpolation' to a grid with more than one point: for spatial dimensions, assume the variable is constant; for velocity dimensions, use a Maxwellian whose peak value is the value on the length-1 grid, and whose width is 1 (in units of the reference speed). This means that the integral of the variable over velocity space, after accounting for factors of pi^0.5 or pi^1.5 in the normalization of distribution functions, is the same (up to discretization error) after 'interpolating' as it was before.
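
A minimal sketch of this rule, with invented function names and a 1V velocity grid assumed:

```julia
# Sketch of the rule described above. For a spatial dimension the single
# stored value is just repeated; for a (1V) velocity dimension a Maxwellian
# with peak value f0 and width 1 (reference-speed units) is used.
spatial_fill(f0::Real, n::Int) = fill(float(f0), n)
maxwellian_fill(f0::Real, v::AbstractVector{<:Real}) = f0 .* exp.(-v .^ 2)

# With the 1/pi^0.5 factor in the normalization of a 1V distribution function,
# the velocity integral of maxwellian_fill(f0, v) is f0 (up to discretization
# error), i.e. unchanged by the 'interpolation'.
```
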
The special "chebyshev_pseudospectral_vperp" option is now handled by having a special case for "vperp" in the "chebyshev_pseudospectral" setup, so the default for "vperp_discretization" should just be "chebyshev_pseudospectral".
The case when vpa and vz are not the same direction needs some special handling (for the neutrals) to convert between 1V and 3V. This is not implemented yet, so an error is raised in this case when bzeta != 0.
Have added tests for the Krook collision operator and restart interpolation.

When not using parallel I/O, the irank/nrank loaded from the original output files should not be changed.

Also check whether current distributed parallelisation is the same as
the original distributed parallelisation (changing this is only
supported when using parallel I/O).
Now that the `discretization_info` abstract type is defined, we can define the specific types for each implementation in the same module as the implementation. This makes more sense and simplifies the dependencies between different modules.
The method for setting up MPI-capable HDF5 has changed in recent
versions of HDF5.jl. Update the docs to reflect the new method.
This makes it easier to read and to refer to, as the amount of
information in the file has increased.
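
Regarding the MPI-capable HDF5 setup mentioned above, a hedged sketch of the kind of Preferences-based configuration used by recent HDF5.jl/MPIPreferences.jl versions (the paths are placeholders and the exact calls should be checked against the versions in use and the updated docs):

```julia
# Point Julia at a system MPI and an MPI-enabled HDF5 library via Preferences.
# Paths below are placeholders for the system libraries.
using MPIPreferences
MPIPreferences.use_system_binary()

using HDF5
HDF5.API.set_libraries!("/path/to/libhdf5.so", "/path/to/libhdf5_hl.so")

# After restarting Julia, HDF5.has_parallel() should return true.
```
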
Allow interpolation between grids when restarting
Add an input option `recycling_fraction` that can be set to a value
between 0 and 1 to control the fraction of the incoming ion flux at the
wall boundaries that is recycled as neutrals.
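
In other words (a minimal sketch with an invented function name):

```julia
# The neutral flux injected at a wall boundary is the recycled fraction of the
# incoming ion flux at that boundary; recycling_fraction lies between 0 and 1.
recycled_neutral_flux(ion_flux_to_wall, recycling_fraction) =
    recycling_fraction * ion_flux_to_wall
```
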
Add source terms with a Maxwellian velocity profile (whose temperature is an input option) and various options for the spatial profile (currently constant, or Gaussian with a tunable width, in each of the r- and z-directions).
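
A hedged sketch of the kind of separable source shape this describes (names, defaults and normalisation are invented for illustration):

```julia
# Separable source: a constant or Gaussian spatial profile (tunable width)
# times a Maxwellian velocity profile whose temperature is an input option.
function source_shape(z, vpa; amplitude=1.0, z_width=0.25, T_source=1.0,
                      spatial_profile="gaussian")
    spatial = spatial_profile == "gaussian" ? exp(-(z / z_width)^2) : 1.0
    return amplitude * spatial * exp(-vpa^2 / T_source)
end
```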

Includes an option for a PI controller for density: the controller sets the amplitude of the external source term to push the density (either the full profile or the midpoint value) to a fixed target value.
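
A generic PI-controller sketch of the idea (not the actual implementation; the type, function names and gains are placeholders):

```julia
# Adjust the source amplitude to push a measured density (midpoint value or
# full profile) towards a fixed target.
mutable struct DensityPIController
    target::Float64
    Kp::Float64        # proportional gain
    Ki::Float64        # integral gain
    integral::Float64  # accumulated integral term
end

function source_amplitude!(c::DensityPIController, measured_density, dt)
    err = c.target - measured_density
    c.integral += err * dt
    # keep the source amplitude non-negative
    return max(0.0, c.Kp * err + c.Ki * c.integral)
end
```
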
Used when `controller_type = "recycling"` in the `neutral_source`
section.
Otherwise, if the initial temperature was decreased, the notch could be
about as wide as the distribution function, which would make things
weird.
...instead of constant_ionization_source.

Also add moment-kinetic versions of the test.
The earlier-initialised values are only used during initialisation to calculate some advection speeds whose sign is then checked to set boundary conditions, so the differences caused by this error were small. Previously, however (for evolving-upar and evolving-ppar cases), uninitialised arrays were used, which could change randomly.
@johnomotani
Collaborator Author

I guess the bug shouldn't matter much in well-resolved examples, because the derivative at both vperp=0 and vperp=L should be (approximately?) zero, so replacing each with the average of the two shouldn't make much difference?
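
For reference, the 'average of the two' treatment referred to here is roughly the following (a sketch of the periodic end-point handling only, with an invented function name, not the actual routine):

```julia
# With coord.bc == "periodic" the derivative values at the two domain ends are
# replaced by their average; for vperp the two ends (vperp = 0 and vperp = L)
# are not the same physical point, so this treatment is not appropriate.
function average_endpoints!(df::AbstractVector)
    avg = 0.5 * (df[1] + df[end])
    df[1] = avg
    df[end] = avg
    return df
end
```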

@johnomotani
Collaborator Author

@mrhardman should I just update the expected data for the fokker_planck_time_evolution_tests.jl case?

@mrhardman
Collaborator

mrhardman commented Nov 24, 2023

@mrhardman should I just update the expected data for the fokker_planck_time_evolution_tests.jl case?

I don't think so yet. To fully convince ourselves that all is well we should run the example case fokker_planck_relaxation.toml to a time t = 1000 / nuii and see that the Maxwellian is stable.

However, reconcile_element_boundaries_centered!() is called from derivative_elements_to_full_grid!(), from derivative!(), etc., and reconcile_element_boundaries_centered!() treats the end points differently depending on whether coord.bc is "periodic" or something else. Previously vperp.bc had been set to "periodic", and I think this was incorrect.

The only place these routines were used for the vperp dimension would have been in the calculation of the boundary data in the Rosenbluth potentials. It is conceivable that the mistake could have made a small difference. The only way to check is to change the boundary condition in my PR #149 and see if this breaks the test. I suspect it will not. Let me report back.

@johnomotani
Collaborator Author

Ah sorry, should have added: using the enforce_vperp_boundary_condition!() implementation from before bf8eda6 (which ignores vperp.bc), and setting vperp.bc = "periodic", the fokker_planck_time_evolution_tests.jl test passes with the code otherwise at the latest version of this PR (7d5d1c5).

@mrhardman
Collaborator

Ah sorry, should have added: using the enforce_vperp_boundary_condition!() implementation from before bf8eda6 (which ignores vperp.bc), and setting vperp.bc = "periodic", the fokker_planck_time_evolution_tests.jl test passes with the code otherwise at the latest version of this PR (7d5d1c5).

Then I am confused: I thought you were arguing that the boundary conditions were imposed correctly with vperp.bc = "zero" in the current revision and that the error came from the application of the derivative! function? It sounds like you have localised the error to the enforce_vperp_boundary_condition!() function?

@johnomotani
Collaborator Author

johnomotani commented Nov 24, 2023

No, I'm saying that the updated vperp boundary condition is fine (I only reverted to the old one because the new one would throw an error saying vperp.bc = "periodic" is not supported when vperp.bc = "periodic" is set), and that setting vperp.bc = "periodic" 'passes' the test because it reproduces the behaviour of the code that the expected data was generated with. But the old behaviour when vperp.bc = "periodic" is incorrect. I think if you set vperp.bc = "zero" (or anything except "periodic") on the #149 branch, then the fokker_planck_time_evolution_tests.jl test would fail [EDIT: indeed it does fail, with the exact same errors (at least I checked the phi error) as on this branch].

@mrhardman
Collaborator

Trying to carry out this investigation, I hit a post-processing error:

julia --project -O3 run_post_processing.jl runs/fokker-planck-relaxation
Activating project at ~/excalibur/moment_kinetics_merge_collisions
┌ Warning: No working GUI backend found for matplotlib
└ @ PyPlot ~/.julia/packages/PyPlot/2MlrT/src/init.jl:153
ERROR: LoadError: Unsupported number of dimensions: 4
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] read_distributed_zwallr_data!(var::Array{Float64, 4}, var_name::String, run_names::Tuple{String}, file_key::String, nblocks::Tuple{Int64}, nr_local::Int64, iskip::Int64, wallopt::String)
@ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:257
[3] #10
@ ./none:0 [inlined]
[4] iterate
@ ./generator.jl:47 [inlined]
[5] collect
@ ./array.jl:782 [inlined]
[6] _totuple
@ ./tuple.jl:401 [inlined]
[7] Tuple(itr::Base.Generator{Tuple{Tuple{Array{Float64, 4}, String, Tuple{String}, String, Tuple{Int64}, Int64, Int64, String}}, moment_kinetics.post_processing.var"#10#20"{typeof(moment_kinetics.post_processing.read_distributed_zwallr_data!)}})
@ Base ./tuple.jl:369
[8] get_tuple_of_return_values(::Function, ::Tuple{Array{Float64, 4}}, ::Vararg{Any})
@ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:435
[9] analyze_and_plot_data(prefix::String; run_index::Nothing)
@ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:747
[10] analyze_and_plot_data(prefix::String)
@ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:470
[11] top-level scope
@ ~/excalibur/moment_kinetics_merge_collisions/run_post_processing.jl:7

@johnomotani
Collaborator Author

Trying to carry out this investigation, I hit a post-processing error:

Fixed now.

@mrhardman
Collaborator

@mrhardman should I just update the expected data for the fokker_planck_time_evolution_tests.jl case?

I have done the checks described above and I am happy for you to revise the test with the new vperp.bc = "zero" default boundary condition. I think this should permit the merge with PR #149.

Previous output was affected by a default `vperp_bc = "periodic"` which had unintended effects in `reconcile_element_boundaries_centered!()`, which is used by `derivative!()`, etc.

Only useful for debugging.
@johnomotani
Collaborator Author

I've added a case with the collision operator to the automated debug checks. Hopefully this should help catch any shared-memory errors that we introduce in future!

This debug function is meant to emulate `get_best_ranges()`, but only actually splits one region type. Previously, it treated all the other region types as serial regions, but that meant that only the rank-0 process in each shared-memory block actually entered any `@loop_*` macro. However, in `get_best_ranges()` only the parallelised dimensions get an empty range (of `1:0`) on processes that have no work to do; for the other (non-parallelised) dimensions every process should loop over all the points. Treating other region types as serial regions meant that some code that was correct in non-debug mode failed in debug mode. This commit fixes this problem by making every process loop over all points in the non-parallelised dimensions.
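
A hedged sketch of the range-assignment rule described above (the function and argument names are invented for illustration):

```julia
# Split only `split_dim` between the `nranks` ranks of a shared-memory block;
# ranks with no work get the empty range 1:0 in that dimension. Every other
# dimension gets the full range on every rank.
function debug_loop_ranges(dim_sizes::Dict{Symbol,Int}, split_dim::Symbol,
                           rank::Int, nranks::Int)
    ranges = Dict{Symbol,UnitRange{Int}}()
    for (dim, n) in dim_sizes
        if dim == split_dim
            chunk = cld(n, nranks)
            start = rank * chunk + 1
            stop = min(n, (rank + 1) * chunk)
            ranges[dim] = start <= stop ? (start:stop) : (1:0)
        else
            ranges[dim] = 1:n
        end
    end
    return ranges
end
```
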
...in explicit_fokker_planck_collisions_weak_form!(). Apparently it is
necessary to have looping.loop_ranges in the correct state when wrapping
around the end of the @loop_s_r_z.
The Fokker-Planck collisions do not do any spatially-dependent
operations, so having just nelement=1, ngrid=2 in the r- and z-directions
should be enough to check for shared-memory errors. Reducing the number
of spatial grid points from 5*5=25 to 2*2=4 massively speeds up the
debug test, making it more practical to run (especially on the CI
servers).
If no neutral species are present, then the debug checks for combinations
of neutral dimensions (anything including `:sn`) would be no different
to running in serial, so they can be safely skipped to save time.
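
Roughly, the skip rule amounts to (illustrative names only):

```julia
# Skip a debug-check dimension combination if it includes the neutral-species
# dimension :sn and there are no neutral species to loop over.
skip_combination(dims::Tuple{Vararg{Symbol}}, n_neutral_species::Int) =
    n_neutral_species == 0 && :sn in dims
```
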
Github.com's Ubuntu CI servers seem to be slower than they used to be,
and are making the debug checks job time out.
@johnomotani johnomotani force-pushed the merge_fkpl_collisions-merge-master branch from 09b8e84 to c1e6c9a on November 26, 2023 at 13:27
@mrhardman
Collaborator

I've added a case with the collision operator to the automated debug checks. Hopefully this should help catch any shared-memory errors that we introduce in future!

Nice work! This would be super useful for looking at improving the parallelisation.

It looks like this test simulates a "sheath scenario". Do we want to develop a cheap check-in test along these lines (there's an example in the discussion of PR #149)?

My experiments running the sheath simulations suggest that the "low hanging fruit" of parallelising over z r s in the collision operator loop (#140) would give a speedup of up to z.ngrid * r.ngrid compared to a distributed memory run with z_nelement_local = 1 and r_nelement_local = 1, depending on the number of cores available.

@johnomotani
Collaborator Author

It looks like this test simulates a "sheath scenario". Do we want to develop a cheap check-in test along these lines (there's an example in the discussion of PR #149)?

The input was a copy of examples/fokker-planck/fokker-planck-relaxation.toml with the resolution cut way down.

I'm always in favour of more tests; is the 1D2V simulation quick enough to run as an automated test?

@mrhardman
Collaborator

mrhardman commented Nov 26, 2023

I'm always in favour of more tests; is the 1D2V simulation quick enough to run as an automated test?

I think yes, if we can cut down the velocity resolution and still get a physical-looking solution. If we can use the same velocity resolutions as the current relaxation test, but just add a z domain, the runtime will scale roughly with the number of z points compared to the runtime of the current test. I'll make an experiment and post the plots here.

EDIT: It looks like using very reduced resolutions, comparable to those in the relaxation test, in sheath simulations leads to unstable simulations at the present commit (and probably earlier ones too). Further work required.

ngrid=2 results in an out-of-bounds index error in Chebyshev
derivatives.
@johnomotani
Collaborator Author

The debug checks are taking a super long time now. I've got a 'fix' for that (it at least gets them back down to about 2 hrs...), but I'll add that in a separate PR. I think this might as well merge into PR #149 now. @mrhardman if there are any last tweaks to add, we can put them on that branch.

@johnomotani johnomotani merged commit c9a3766 into merge_fkpl_collisions Nov 26, 2023
12 of 15 checks passed
@johnomotani johnomotani deleted the merge_fkpl_collisions-merge-master branch November 26, 2023 23:13