Imitating key features of MC/DC, this code repository was created to test implementations of the selected metaprogramming libraries (Numba) that are investigated in MC/DC-TNT. This helps achieve a smooth integration of the proposed abstraction ideas into MC/DC.
A particular goal of this repo is to demonstrate a working Python-based implementation that supports abstraction over the running mode (pure Python or Numba), the MC algorithm (history-based or event-based), and the kernel threading target (CPUs or GPUs). This is achieved through Python decorators and metaclasses, which adapt pure Python, scalar, history-based kernels to the desired running mode, MC algorithm, and threading target.
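To illustrate the decorator idea, here is a minimal sketch, not the repo's actual adapter. The names `adapt`, `move`, and `run_over_bank` are hypothetical, and Numba is treated as optional so the sketch also runs in pure Python. It shows a scalar, per-particle kernel being wrapped so that the same source can be compiled or left as plain Python, and then applied over a particle bank.

```python
try:
    import numba  # optional; without it the sketch stays in pure Python
except ImportError:
    numba = None


def adapt(mode="python"):
    """Hypothetical adapter factory: choose a running mode for a scalar kernel."""

    def decorator(kernel):
        scalar = kernel
        if mode == "numba" and numba is not None:
            # Compile the scalar kernel; the kernel source is left untouched
            scalar = numba.njit(kernel)

        def run_over_bank(positions, directions, distance):
            # Apply the scalar kernel to every particle in the bank
            return [scalar(x, ux, distance) for x, ux in zip(positions, directions)]

        run_over_bank.scalar = scalar  # keep the scalar version reachable
        return run_over_bank

    return decorator


@adapt(mode="python")
def move(x, ux, distance):
    """Scalar kernel: move one particle along direction ux."""
    return x + ux * distance


new_positions = move([0.0, 1.0], [1.0, -1.0], 2.0)
```

Switching `mode="python"` to `mode="numba"` is the only change needed to compile the kernel in this sketch; a GPU target would need a different wrapper around the same scalar kernel.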
Achievements so far:
- Pure Python (history-based and event-based; only on CPU; useful for algorithm debugging)
- Numba history-based and event-based on CPU (serial)
- Numba event-based on GPU (not yet performant)
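The difference between the two MC algorithms listed above can be sketched with a toy problem. This is an illustrative assumption, not the repo's code: particles stream through a 1D slab in fixed steps, and leakage is counted through the right face only. The function names and the `(x, ux)` bank layout are hypothetical.

```python
def move(x, ux, step):
    """Scalar kernel shared by both algorithms."""
    return x + ux * step


def run_history_based(bank, thickness, step):
    """Follow each particle from birth to death before starting the next one."""
    leaked = 0
    for x, ux in bank:
        while 0.0 <= x <= thickness:
            x = move(x, ux, step)
        if x > thickness:
            leaked += 1
    return leaked


def run_event_based(bank, thickness, step):
    """Apply one event to all active particles, then rebuild the active bank."""
    leaked = 0
    active = list(bank)
    while active:
        moved = [(move(x, ux, step), ux) for x, ux in active]
        leaked += sum(1 for x, _ in moved if x > thickness)
        active = [(x, ux) for x, ux in moved if 0.0 <= x <= thickness]
    return leaked


# Directions must be nonzero, otherwise a particle never leaves the slab
bank = [(0.5, 1.0), (0.5, -1.0), (1.5, 0.5)]
history_leakage = run_history_based(bank, thickness=2.0, step=1.0)
event_leakage = run_event_based(bank, thickness=2.0, step=1.0)
```

Both loops call the same scalar kernel and give the same tally; only the loop ordering differs, which is why a single kernel can in principle be adapted to either algorithm.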
TODO list:
- GPU reduction for global/small tallies (in this test code, neutron leakage). This may require designing a new adapter type.
- Mesh tally, to exercise the use of GPU atomics.
- GPU exclusive scan for thread syncing and reproducibility. This completes the branching-event adapter and allows running Numba event-based on GPU (but only with branchless collision).
- GPU sorting for efficient particle bank memory access.
- GPU adapter for multiplying events (such as fission and weight windows). This allows running Numba event-based on GPU without branchless collision.
- Consolidate the different types of adapters.
- Others: Run on multiple GPUs and nodes via mpi4py. Introduce PyOMP for CPU threading. Implement particle consolidation in history-based for GPU runs. ...
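As a rough illustration of why an exclusive scan helps with reproducibility in the branching-event item above, the sketch below runs serially on the CPU in pure Python. It is not a GPU implementation, and `exclusive_scan` and `compact` are hypothetical names. The scan gives every surviving particle a fixed write offset into the next bank, so the output order does not depend on which thread finishes first.

```python
def exclusive_scan(counts):
    """Return (offsets, total), where offsets[i] == sum(counts[:i])."""
    offsets = []
    running = 0
    for c in counts:
        offsets.append(running)
        running += c
    return offsets, running


def compact(particles, alive):
    """Copy surviving particles into a dense bank at scan-assigned slots."""
    counts = [1 if a else 0 for a in alive]
    offsets, total = exclusive_scan(counts)
    out = [None] * total
    # Each iteration writes to its own slot, so iterations are independent,
    # which is what a GPU thread per particle would need
    for i, particle in enumerate(particles):
        if alive[i]:
            out[offsets[i]] = particle
    return out


next_bank = compact(["a", "b", "c", "d"], [True, False, True, True])
```

With per-particle counts larger than one, the same offsets could reserve contiguous slots for the secondaries of a multiplying event such as fission, though that case would need its own adapter as noted in the list.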