Merge master branch into merge_fkpl_collisions #162
Conversation
When reloading data for a restart, it is necessary to create `coordinate` objects using the parallelisation of the new simulation, not the parallelisation that was used to write the data.
Only the coordinate ranges actually differ between the `parallel_io=true` and `parallel_io=false` branches, so a lot of code can be moved outside the `if parallel_io` block.
This allows a simulation to restart from a run with different resolution in any or all dimensions, and also to restart from a run with different moment-kinetic settings (`evolve_density`, `evolve_upar` and `evolve_ppar`).
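A minimal sketch of the reload-and-interpolate idea, using Interpolations.jl as a stand-in for the code's own interpolation routines (the function name and call pattern here are illustrative assumptions, not the actual implementation):

```julia
using Interpolations

# Reloaded data lives on the old run's grid; the new run's coordinate objects
# define the grid (and parallelisation) actually wanted, so each variable is
# mapped from the old grid onto the new one.
function reinterpolate_reloaded_variable(z_old::Vector{Float64},
                                         f_old::Vector{Float64},
                                         z_new::Vector{Float64})
    itp = linear_interpolation(z_old, f_old; extrapolation_bc=Flat())
    return itp.(z_new)
end
```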
This is needed to get the `global_io_range` correct, which is required when reloading.
This will allow them to be reused in other tests.
`discretization_info` is the abstract type inherited by all implementations of a discretization (i.e. `chebyshev_info` and `finite_difference_info`). Also adds a `finite_difference_info` that is used to dispatch to the finite difference methods, instead of using `Bool` for that.
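A minimal sketch of this dispatch pattern (the field contents are illustrative only; the real types carry each discretization's setup data):

```julia
# Abstract supertype for all discretization implementations.
abstract type discretization_info end

# Each implementation defines its concrete subtype alongside its methods.
struct chebyshev_info <: discretization_info
    # ...precomputed Chebyshev transform data would live here...
end

struct finite_difference_info <: discretization_info end

# Dispatch on the concrete type replaces checking a Bool flag.
describe(::chebyshev_info) = "Chebyshev pseudospectral methods"
describe(::finite_difference_info) = "finite difference methods"
```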
Length-1 dimensions have to be treated a bit specially. Derivatives in those directions should be zero. 'Interpolation' to a grid with more than one point: for spatial dimensions, assume the variable is constant; for velocity dimensions, use a Maxwellian whose peak value is the value on the length-1 grid, and whose width is 1 (in units of the reference speed) - this means that the integral of the variable over velocity space, after accounting for factors of pi^0.5 or pi^1.5 in the normalization of distribution functions, is the same (up to discretization error) after 'interpolating' as it was before.
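As a concrete illustration of the velocity-dimension case (a sketch with an assumed function name, not the code in this PR): a length-1 velocity grid holding the single value `f0` is expanded onto a new grid as a unit-width Maxwellian peaking at `f0`, so the normalised velocity integral is unchanged.

```julia
# 'Interpolate' a variable from a length-1 velocity grid onto a new grid `v`
# (velocities in units of the reference speed). The peak equals the single stored
# value and the width is 1, so ∫ f dv / sqrt(pi) ≈ f0 up to discretization error.
expand_length1_velocity_dim(f0, v::AbstractVector) = f0 .* exp.(-v .^ 2)
```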
The special "chebyshev_pseudospectral_vperp" is now handled by having a special case for "vperp" in the "chebyshev_pseudospectral" setup, so the default for "vperp_discretization" should just be "chebyshev_pseudospectral".
The case when vpa and vz are not the same direction needs some special handling (for the neutrals) to convert between 1V and 3V. This is not implemented yet, so an error is raised in this case when `bzeta != 0`.
Have added tests for Krook collision operator and restart interpolation.
When not using parallel I/O, the `irank`/`nrank` loaded from the original output files should not be changed. Also check whether the current distributed parallelisation is the same as the original distributed parallelisation (changing this is only supported when using parallel I/O).
Now that the `discretization_info` abstract type is defined, the specific types for each implementation can be defined in the same module as the implementation. This makes more sense and simplifies the dependencies between different modules.
The method for setting up MPI-capable HDF5 has changed in recent versions of HDF5.jl. Update the docs to reflect the new method.
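For context, a sketch of the kind of setup recent HDF5.jl versions use (paths are placeholders and the exact steps depend on the installed versions; see the HDF5.jl and MPI.jl documentation for the authoritative instructions):

```julia
# Point MPI.jl and HDF5.jl at system libraries built with MPI support
# (run once; the choices are stored as package preferences).
using MPIPreferences
MPIPreferences.use_system_binary()

using HDF5
HDF5.API.set_libraries!("/path/to/libhdf5.so", "/path/to/libhdf5_hl.so")
```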
This makes it easier to read and to refer to, as the amount of information in the file has increased.
Allow interpolation between grids when restarting
Add an input option `recycling_fraction` that can be set to a value between 0 and 1 to control the fraction of the incoming ion flux at the wall boundaries that is recycled as neutrals.
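A sketch of the relation this option controls (names are illustrative, not code from this PR):

```julia
# Flux of neutrals re-injected at a wall, given the incident ion particle flux
# and the recycling fraction (0 <= recycling_fraction <= 1).
recycled_neutral_flux(ion_flux, recycling_fraction) = recycling_fraction * ion_flux
```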
Add source terms with a Maxwellian velocity profile (whose temperature is an input option) and various options for the spatial profile (currently constant, or Gaussian with a tunable width, in each of the r- and z-directions). Includes an option for a PI controller for the density: the controller sets the amplitude of the external source term to push the density (either the full profile or the midpoint value) towards a fixed value (a sketch of the controller idea is given below).
Used when `controller_type = "recycling"` in the `neutral_source` section.
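A minimal sketch of the density PI-controller step described above (names, signs, and the clipping are assumptions, not the implementation in this PR):

```julia
# One controller step: adjust the external source amplitude so the measured
# density is pushed towards the target. `P` and `I` are the proportional and
# integral gains; `integral` is the accumulated integral term carried between steps.
function pi_controller_step(density, density_target, integral, P, I, dt)
    err = density_target - density
    integral += I * err * dt
    amplitude = max(P * err + integral, 0.0)  # keep the source non-negative
    return amplitude, integral
end
```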
Otherwise, if the initial temperature was decreased, the notch could be about as wide as the distribution function, which would significantly distort the initial distribution.
...instead of `constant_ionization_source`. Also add moment-kinetic versions of the test.
The earlier-initialised values are only used during initialisation to calculate some advection speeds whose sign is then checked to set boundary conditions, so the differences caused by this error were small. Previously, however (for evolving-upar and evolving-ppar cases), uninitialised arrays were used, which could change randomly.
I guess the bug shouldn't make much difference in well-resolved examples, because the derivative both at …
@mrhardman should I just update the expected data for the …
I don't think so yet. To fully convince ourselves that all is well we should run the example case fokker_planck_relaxation.toml to a time t = 1000 / nuii and see that the Maxwellian is stable.
The only place these routines were used for the vperp dimension would have been in the calculation of the boundary data in the Rosenbluth potentials. It is conceivable that the mistake could have made a small difference. The only way to check is to change the boundary condition in my PR #149 and see if this breaks the test. I suspect it will not. Let me report back.
Then I am confused: I thought you were arguing that the boundary conditions were imposed correctly with vperp.bc = "zero" in the current revision, and that the error came from the application of the derivative! function. It sounds like you have localised the error to the enforce_vperp_boundary_condition!() function?
No, I'm saying that the updated vperp boundary condition is fine (I only reverted to the old one because the new one would throw an error saying …
While trying to do this investigation, I hit a post-processing error: `julia --project -O3 run_post_processing.jl runs/fokker-planck-relaxation`
Fixed now. |
I have done the checks described above and I am happy for you to revise the test with the new vperp.bc = "zero" default boundary condition. I think this should permit the merge with PR #149. |
Previous output was affected by a default `vperp_bc = "periodic"` which had unintended effects in `reconcile_element_boundaries_centered!()`, which is used by `derivative!()`, etc.
Only useful for debugging.
I've added a case with the collision operator to the automated debug checks. Hopefully this should help catch any shared-memory errors that we introduce in future!
This debug function is meant to emulate `get_best_ranges()`, but it only actually splits one region type. Previously, it treated all the other region types as serial regions, which meant that only the rank-0 process in each shared-memory block actually entered any `@loop_*` macro. In `get_best_ranges()`, however, only the parallelised dimensions get an empty range (`1:0`) on processes that have no work to do; for the other (non-parallelised) dimensions every process should loop over all the points. Treating the other region types as serial regions therefore meant that some code that was correct in non-debug mode failed in debug mode. This commit fixes the problem by making every process loop over all points in the non-parallelised dimensions.
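A sketch of the corrected range logic (names are illustrative): only the dimension actually being split gets the empty `1:0` range on processes with no work, while non-parallelised dimensions always cover all points.

```julia
# Return the loop range for one dimension on one process in the debug splitting.
function debug_loop_range(is_split_dim::Bool, this_proc_has_work::Bool, n::Int)
    if is_split_dim && !this_proc_has_work
        return 1:0   # empty range: another process handles these points
    else
        return 1:n   # non-parallelised dimension: every process loops over all points
    end
end
```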
...in `explicit_fokker_planck_collisions_weak_form!()`. Apparently it is necessary to have `looping.loop_ranges` in the correct state when wrapping around the end of the `@loop_s_r_z`.
The Fokker-Planck collisions do not do any spatially-dependent operations, so having just `nelement=1`, `ngrid=2` in the r- and z-directions should be enough to check for shared-memory errors. Reducing the number of spatial grid points from 5*5=25 to 2*2=4 massively speeds up the debug test, making it more practical to run (especially on the CI servers).
If no neutral species is present, then the debug checks for combinations of neutral dimensions (anything including `:sn`) would be no different to running in serial, so they can be safely skipped to save time.
Github.com's Ubuntu CI servers seem to be slower than they used to be, and are making the debug checks job time out.
09b8e84 to c1e6c9a
Nice work! This would be super useful for looking at improving the parallelisation. It looks like this test simulates a "sheath scenario". Do we want to develop a cheap check-in test along these lines (there's an example in the discussion of PR #149)? My experiments running the sheath simulations suggest that the "low hanging fruit" of parallelising over z, r, s in the collision operator loop (#140) would give a speedup of up to z.ngrid * r.ngrid compared to a distributed-memory run with z_nelement_local = 1 and r_nelement_local = 1, depending on the number of cores available.
The input was a copy of … I'm always in favour of more tests; is the 1D2V simulation quick enough to run as an automated test?
I think yes, if we can cut down the velocity resolution and still get a physical-looking solution. If we can use the same velocity resolutions as the current relaxation test, but just add a z domain, the runtime will scale roughly with the number of z points compared to the runtime of the current test. I'll run an experiment and post the plots here. EDIT: It looks like using very reduced resolutions, comparable to those in the relaxation test, leads to unstable sheath simulations in the present commit (and probably earlier ones too). Further work required.
ngrid=2 results in an out-of-bounds index error in Chebyshev derivatives.
The debug checks are taking a super long time now. I've got a 'fix' for that (it at least gets them back down to about 2 hrs...), but I'll add that in a separate PR. I think this might as well merge into PR #149 now. @mrhardman if there are any last tweaks to add, we can put them on that branch.