Make all processes use identical FFTW plans #203
Merged
When using `flags=FFTW.MEASURE`, FFTW does some run-time tests to choose the fastest version of its algorithm. This can lead to different processes using slightly different algorithms - these should only differ by machine-precision-level rounding errors, but (surprisingly!) this may be important. This PR updates the code so that the FFTW plans are generated on the root process, then written out to an 'FFTW wisdom' file which is read by all other processes, ensuring that all processes use exactly the same algorithm.

While working on the kinetic electrons, I came across a very strange bug. A simulation would run correctly on a small number of processes (e.g. 8, with the z-direction split into 8 blocks using distributed memory, so each shared-memory block only has one process), but would fail on a large number of processes (64 or 128, so each shared-memory block has 8 or 16 processes). While trying to debug, I found out that FFTW can give slightly different results between runs when using `FFTW.MEASURE`. To aid comparing results between runs, I switched to `FFTW.ESTIMATE` (which is the same every time) and the bug went away!

I don't understand why this fixed the bug. The different FFTW algorithms should only differ by machine-precision-level rounding errors, so I don't understand how this can cause a numerical instability. My only guesses are that either the slight inconsistency does somehow make a difference, or that the run-time testing somehow lets multiple instances of FFTW conflict with each other and subtly corrupt something, eventually leading (after several thousand pseudo-timesteps!) to a failure to converge.
Anyway, after making this 'fix' - so that the run-time testing is done on only one process and the result is then passed to all the others, meaning they all use the exact same algorithm - that bug has gone away, so I think this fix is useful.
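For reference, a minimal sketch of the wisdom-sharing scheme described above, assuming an MPI.jl setup; the filename, array size, and overall structure here are illustrative, not the actual moment_kinetics implementation:

```julia
using MPI, FFTW

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

wisdom_file = "fftw_wisdom.dat"  # hypothetical filename
buffer = zeros(ComplexF64, 64)   # illustrative transform size

if rank == 0
    # Root does the run-time measurement once and saves the chosen
    # algorithm as 'wisdom'.
    plan_fft(buffer; flags=FFTW.MEASURE)
    FFTW.export_wisdom(wisdom_file)
end
MPI.Barrier(comm)  # ensure the wisdom file exists before others read it
if rank != 0
    # Other processes load the root's wisdom; planning with MEASURE now
    # looks up the stored algorithm instead of re-measuring, so every
    # process ends up with an identical plan.
    FFTW.import_wisdom(wisdom_file)
end
plan = plan_fft(buffer; flags=FFTW.MEASURE)

MPI.Finalize()
```

The barrier between export and import is the key ordering constraint: without it, a non-root process could attempt to read the wisdom file before the root has written it.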
I'm not 100% sure the problem is entirely fixed yet, because even with this fix the simulation I'm testing fails to converge, although at a much later time...