Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add global Celeritas input definition #1562

Draft
wants to merge 28 commits into
base: develop
Choose a base branch
from

Conversation

sethrj
Copy link
Member

@sethrj sethrj commented Jan 6, 2025

Related to #1556 , I'd like to specify a unified mechanism for creating core params, states, etc. before continuing work on that front. Simultaneously (in light of the Optical work) we've been discussing a "refactored" IO where we specify Celeritas inputs to classes separately from the classes themselves. This ties in with #1204 and #1263 and the unification of RunInput/RunnerInput/SetupOptions.

As a first step, I've defined a new directory and namespace inp where we can move all the "input"/"option" classes into. For now we can hand-roll these in C++ (and adapt them from existing input classes), but I'd like to keep working on celeritas-project/celerpy#4 . We should get some working C++ structs before writing the python code and adapters, but ChatGPT seems able to help here, since it was able to generate this from the new inp::Tuning C++ class:

class Tuning(BaseModel):
    """Set up system/tuning parameters that don't affect physics.

    Defaults:
        - `track_order`: `init_charge` on GPU, `none` on CPU.
    """

    capacity: StateCapacity
    "Per-process state sizes."

    optical_capacity: Optional[StateCapacity] = None
    "Per-process state sizes for *optical* tracking loop."

    num_streams: PositiveInt = 0
    "REMOVE: number of streams."

    device: Optional[Device] = None
    "Optional: activate GPU."

    track_order: Optional[str] = None
    "Track sorting and initialization."

    warm_up: bool = False
    "Perform a no-op step at the beginning to improve timing measurements."

    seed: PositiveInt
    "Random number generator seed."

    environ: Dict[str, str]
    "Environment variables used for program setup/diagnostic."

I used ChatGPT to help with the initial translation of SetupOptions, RunInput, and RunnerInput to the new unified inp::Input class.

Note that for now we will need to restrict the along-step field choices to our built-in ones. I'd like to extend so that we can use more type-safe attributes (e.g. ParticleId) along @hhollenb 's line of thinking.

If this sounds good, I'll add documentation to the manual about how I expect this will be used. Follow-on PRs will:

  • create core params from inp data
  • support JSON->inp conversion (and output?? maybe instead we should have output of the lower-level data)
  • move FooParams::Input, FooParams::Option, FooOptions, etc. into inp structures
  • and then continue with refactoring.

Talking this through with @hhollenb I think we'll want to separate "standalone" driver functionality (primary generator definition) from the rest of it.

Goals

There are three categories of input types:

  • Exclusive Celeritas inputs such as state size and other diagnostic/tuning parameters. Also some parameters we cannot deduce from Geant4: e.g., sensitive detector attributes.
  • Parameters that we want to use to drive Geant4 for celer-g4, or pull from Geant4 if we're not. This would include geometry definition (GDML), EM parameters, active processes, scoring meshes.
  • Problem setup options that we cannot directly understand from Geant4 but must be provided directly to lower-level Celeritas objects. In particular, we need to be able to allow users to add custom processes, magnetic fields, etc.

Reduce duplication of CoreParams construction and front end input

Currently celer-sim, celer-g4, and direct framework coupling all have slightly different interfaces. They also all independently construct CoreParams. I want to be able to use all the built-in functionality of Celeritas from any front end.

Reduce duplication of input parameters

Currently for celer-g4, we have (using linear_loss_limit as an example):

JSON/macro → GeantPhysicsOptions → CelerEmPhysicsList → Geant4 → GeantImporter → ImportEmParameters → PhysicsParamsOptions::linear_loss_limit → PhysicsParamsScalars

This is a crazy amount of duplication and hard to keep track of where parameters come from and go. The duplication makes it harder to extend: to add support for region/particle-dependent limits we'd have to change a lot of code.

Unify and refactor input to classes

A data-oriented approach will let us restructure e.g. Sim/TrackInit params, as well as Physics, since the input can be pulled from multiple input structs. It also provides a single location where input structs are defined, rather than scattered throughout the codebase. That will ease the transition to a JSON front end and remove a lot of the assignment of foo.x = bar.x between input classes.

Enable well-defined extension points

The current generic SetupOptions only gives one way to extend the actions: make_along_step. Instead we want to be able to extend actions, along-step, physics processes, etc.

@sethrj sethrj added enhancement New feature or request app Application front ends labels Jan 6, 2025
Copy link

github-actions bot commented Jan 6, 2025

Test summary

1 712 files  2 704 suites   50s ⏱️
1 108 tests 1 097 ✅ 11 💤 0 ❌
8 928 runs  8 904 ✅ 24 💤 0 ❌

Results for commit c1c852f.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@hhollenb hhollenb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused about how this input fits into building celeritas. It seems like there's quite a bit of logic between the inp structs and what would get passed directly to Params.

My personal imagining is that there's a layer of purely input / user configuration that gets canonicalized into a single format that is used to initialize celeritas. Inputs from prebuilt binaries like celer-sim and celer-g4 would have default and easy ways to build the canonical input. If users link celeritas as a library and want to hook in to alter the defaults and force their desired configurations, then they can do so in between the "build input" level and the "initialize celeritas" level. This could very much be me misunderstanding the purpose / existing state of the initialization as well.

*
* TODO: refactor ignore_processes so it ties in the with IO classes.
*/
struct Physics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct feels a little weird to me. Is it meant to be deciding which IO method to use for importing physics, or is it meant to be what is directly passed into PhysicsParams? It feels like it's in charge of handling different importing routines that require different parameters (from Geant4, from files, etc) while also managing further levels of logical filtering and modification of the imported data (ignoring processes, selecting particles / processes, etc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking some classes will get passed in the whole Input struct, others (e.g. sim params) might get one or two. Maybe this structure will be shared by the geant "setup" (if we're driving Geant4), and the geant "exporter", and the physics.

Maybe physics_file should be a variant switching between "import physics from file" and "import physics from Geant4"? Although that ends up pulling in a bunch of EM options as well 🤔 maybe I should sketch this part out a bit more.


//---------------------------------------------------------------------------//
//! Field type
using Field = std::variant<NoField, UniformField, RZMapField>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having everything be structs + variants which will have their respective functions to dispatch to does make me think of just a bog-standard OOP approach. Not necessarily the best idea but might be good to keep in mind why OOP wouldn't fit and what the final goal of this input format is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand your comment. I'm thinking of this as a DOP approach, so we can serialize all the options as JSON and communicate them back and forth; and also allow code restructuring by passing all the data into the Params constructors if we need to.

I primarily envision the structs as a driver for built-in Celeritas objects. I also will be adding callback functions for parts that we want to be arbitrarily extensible (i.e., adding actions). I think that such a compartmentalized approach is more flexible and would work better than the Geant4/dd4hep "override a class to one thing or another" method, if that's what you mean by OOP.

@sethrj
Copy link
Member Author

sethrj commented Jan 8, 2025

I've been thinking about this a bit more too. Let's say the goal is to have all the input data, aside from the in-memory geometry definition (which we do have an issue to create a standalone 'model' from). We have I think three kinds of inputs:

  • Exclusive Celeritas inputs: state size, other diagnostic and tuning parameters
  • Parameters that we want to use to drive Geant4 for celer-g4, or pull from Geant4 if we're not. This would include magnetic field, EM parameters, active processes, and maybe someday the scoring setup.
  • Problem setup options that we cannot directly understand but must be provided directly to lower-level Celeritas objects. In particular, we need to be able to allow users to add custom processes, magnetic fields, etc.

I'll be thinking more about this before our meeting this afternoon...

@sethrj
Copy link
Member Author

sethrj commented Jan 9, 2025

@hhollenb Based on our discussions: thoughts?

options

@sethrj
Copy link
Member Author

sethrj commented Jan 9, 2025

options

Updated. I don't think it's feasible to define an "updater" that merges two structs (that would be super tedious, potentially confusing, etc.). So what I think we want is that the input is defined once (or in the case of framework/user application, not at all), and then updated sequentially by different helper functions:

  • Geant4 importer: fulfills the same role as the current I/O, but the input (and updated input) will pass through Celeritas-specific options too.
  • Data file reader: given G4 data file paths and the materials in the input, import SB tables and such
  • JSON reader: can be used to update tuning/diagnostic options from an existing export file
  • Application-defined input adjuster: will set up field, user process callbacks, etc. by directly modifying the input structs.
  • (from Soon) User-friendly physics list constructors

Then the input gets passed into the params construction and Celeritas takes it from there (using callback functions to add user actions, processes, etc.).

The input should ideally only contain nonderivative data: e.g., microscopic cross sections, not macroscopic. We could eventually have the inputs contain other currently hardcoded data, like the element parameters in RayleighModel, so that in the future they could be migrated to different formats.

@hhollenb
Copy link
Contributor

hhollenb commented Jan 9, 2025

I think it looks good! Addresses my primary concern about having a separate interface for input and initializing, and looks extensible enough for different application types. I'm sure actually implementing and refactoring will have its own slew of issues, but I think this is a solid starting point.

@hhollenb
Copy link
Contributor

hhollenb commented Jan 9, 2025

A brief thought I had while looking at it during the meeting: the driving Geant4 flow looks pretty much identical to the framework flow. If we treat driving Geant4 as a kind of "trivial framework", it would reduce our use cases and act as an example framework for us to develop around or a starting point for users.

@sethrj
Copy link
Member Author

sethrj commented Jan 9, 2025

The reason I broke out the "driving geant4" was to ensure there was a good place to put the "standalone input" code for setting up Geant4. In practice, yes the Geant4 importer should be doing most of the work.

@sethrj
Copy link
Member Author

sethrj commented Jan 10, 2025

One gotcha is that we want some "system" stuff (device, MPI, environment, logger) to be configured before running any celeritas code, including the importers (which use profiling and logger).

I think I'll make System a separate configuration that has to be set first... 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app Application front ends enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants