PAHFIT 2022

Towards an updated PAHFIT Model/Interface

Based on discussions of the April 4th PAHFIT all-hands workshop.

Instrument/Science Packs

Separating the concerns of instrumental artifacts and scientific model details is important. Hence we have proposed separate “instrument” and “science” packs. Internally, information from these two types of packs is combined into a pahfit.parameters.Table object, which is used both for setting up the models, and inspecting (possibly combined) model results.

Instrument Packs

The primary information an instrument pack can convey to the model is:

The wavelength range of validity of a given spectral segment/channel/order.
The Gaussian full-width of the line spread function at any valid wavelength, and its uncertainty (if any).

Other instrumental details could be implemented, which are “uninteresting” scientifically. Since instrument packs are “fixed”, and implicitly determine which instruments PAHFIT supports, and since different instruments may have different degrees of complexity in their underlying calculations of line widths, a simple class structure for each fully distinct instrument or configuration will suffice. Example:

pahfit.instruments.spitzer.irs.sl1
pahfit.instruments.spitzer.irs.lh
pahfit.instruments.akari.irc.nirgrism
pahfit.instruments.iso.sws
pahfit.instruments.jwst.nirspec.R2700
pahfit.instruments.jwst.miri.mrs.ch1
pahfit.instruments.jwst.miri.lrs

Each such class inherits from the pahfit.instruments.Instruments class, and defines two simple methods:

fwhm(wavelength) returns a tuple of the full-width-half maximum of an unresolved line at that rest wavelength, and the width uncertainty (fwhm, fwhm_unc). fwhm_unc is ideally None.
wavelength_range returns a (wave_low, wave_high) range of validity, which may not encompass the entire spectral range on the array (trimming the bad ends).
All wavelengths are in microns.

Science Packs

Science packs contain all the scientific model configuration details for building a PAHFIT model. These include the list of lines and features present, and the range of values their underlying parameters take. PAHFIT will distribute a few carefully vetted science packs, but anticipate users may make small alterations to these, or create their own. The heart of PAHFIT is its science packs. To offers a flexible format for defining packs, including grouping and identifying components for easier use, PAHFIT employs a simple YAML heirarchical plain-text format, similar to many configuration file formats. Note that internally, science packs are converted into a pahfit.parameters.Table tabular format.

An outline of the syntax of a PAHFIT’s YAML science packs:

Top level: a named set of individual features (like a single line), or groups of features (like a set of lines, continuum components, etc.)
Elements: Each group or feature element consists of a name key, and associated attribute info. The name is just a handy means of selecting and working with a feature or a group, and is never interpreted. All top-level names in a model must be distinct, but otherwise, can be anything.

Each feature or group can have several other associated keys:

kind: The kind of a feature. Current kinds, with description and the feature’s format:

kind	Description	Underlying Model
`starlight_continuum`	continuum from starlight emission	Blackbody
`dust_continuum`	underlying dust emission continuum	Modified Blackbody
`line`	individual molecular/ionic emission lines	Gaussian
`dust_feature`	resolved dust emission features	Drude (possibly modified)
`attenuation`	bulk attenuation, including silicate bands	named model
`absorption`	individual absorption bands, such as H20	Drude

Other attributes, specific to each feature kind (see below).

In addition, groups can have the following keys:
- kind: When set on a group, kind is inherited by all features in the group.
- features: A named set of features in a group, each with their own attributes.
- ‘attribute’s (note plural): A list or named set of a particular attribute for a group (e.g. wavelengths). If names are not included, they are generated from the group name. If multiple attribute lists are included in a single group, they must have the same length, and names (if any) are taken from the first named set mentioned.

Feature attribute format

Feature attributes can be included in one of several related formats, here illustrated with the “wavelength” parameter:

A simple value (fixed, with bounds allowed):
```
wavelength: 9.7
    
```
A list of simple fixed values, in a group (note plural wavelengths):
```
cool_group:
    wavelengths: [9.7, 10.2, 15.8]
    
```

A named set of values, inside a group (note plural):

cool_group:
    wavelengths:
        line1: 9.7
        other_line: 10.2

With explicit parameter attributes:

explicit_group:
    features:
        KrVII:
            wavelength:
                value:  12.813
                bounds: [-10%, 1%]
        PuVII:
            wavelength: 1.44444

The latter is the only means of setting all attributes for a parameter.

Attributes by kind

Attributes available depend on the feature’s kind:

All features have an implicit name (from the key atop the feature).
All features have a kind from a fixed set (see above), which can be inherited from their group.
Resolved features (line, dust_feature) have:
- wavelength - wavelength, in microns
- fwhm - full-width at half-maximum, in microns
- power - total integrated power or intensity (normally unspecified, by default non-negative bounds). Unit depends on units of input spectra (Hz * [unit])
Continua (starlight and dust_continuum) have:
- temperature (in Kelvin degrees)
- tau - the emission depth or strength of the continuum (normally unspecified, by default non-negative)
Dust features have:
- wavelength (in microns)
- fwhm (in microns)
- mod_asym - modified Drude asymmetry parameter (unitless)
- power
Attenuation can have:
- tau (normally unspecified, by default non-negative)
- model - a name of the attenuation model to use
- geometry - name, currently mixed or screen, of the attenuation geometry to use. If unspecified, mixed is assumed.
Absorption features have:
- tau
- wavelength
- fwhm
- geometry

Each feature attribute can be set to a value directly, or have the following associated information:

value - the (single) value of the parameter.
bounds:
- a list of bounds [low, high]
- each of low or high can be:
  - a number
  - null for no bound
  - A percentage, interpreted as a percent offset from value. E.g. [-10%, 10%] for 10% range on either side of value, or [-5%, 0%] for a lower bound up to the value. As a convenience, if the either bound is a percentage, a plain 0 in the other bound will be interpreted as a hard stop at the value itself, i.e. the [-5%, 0] is equivalent to the above.
- By default there are no bounds, i.e. the parameter is fixed at its given value.
- Including the bounds [null, null] means the parameter will freely vary.
- bounds can also be set at a group level for groups specifying only a single attribute, to set the common bounds for that attribute for all features in the group.

Any attribute which is unmentioned is allowed to vary, with a default lower bound of 0, and no upper bound ([0, null]).

Example science pack

An example of the format, with several groups, and all parameter kinds:

# Starlight continuum [attributes: tau, temperature]
starlight:  # Blackbody
    kind: starlight_continuum
    temperature: 5000

# Dust continuum [attributes: tau, temperature]
dust_cont: # Modified Blackbodies
    kind: dust_continuum
    # shorthand for a group of dust_continuum features.  Naming is automatic.
    temperatures: [300, 200, 135, 90, 65, 50, 40, 35]

# Unresolved Lines [attributes: wavelength, power]
H2:  # simple line group, with common bounds for all lines
    kind: line # Gaussians
    bounds: [-0.1%, 0%] # bounds can be set for the entire group, here leftward only
    # Shorthand for a group of fixed lines.  Names specified.
    wavelengths: 
        H2_S(7): 5.5115 
        H2_S(6): 6.1088

# Another group of unresolved lines
ionic_lines: 
    kind: line 
    # Full form of lines [attributes: wavelength, power]
    features:
        NeII:
            wavelength:
                value:  12.813
                bounds: [-10%, 1%] # fractional bounds for wavelength
        NeIII:
            wavelength: 15.555
            power:
                tied: NeIII        # experimental/unimplemented
                ratio: [3, 5]

# Dust features [attributes: wavelength, power, fwhm, mod_asym]
PAH77:
    kind: dust_feature # Drude profiles
    features:
        PAH77a:
            wavelength: 7.42
            fwhm: 0.935
        PAH77b:
            wavelength: 7.60
            fwhm: 0.3344
        PAH77c:
            wavelength: 7.85
            fwhm:
                value: 0.416
                bounds: [-5%, 15%]
            mod_asym: 0.25 # indicates a modified/enhanced-asymmetry Drude profile.

# Attenuation [attributes: tau, extinction_model]
silicate:
    kind: attenuation
    extinction_model: kvt

# Absorption [attributes: tau, wavelength, fwhm
absorption_features:
    kind: absorption
    features:
        H2O_3:
            wavelength: 3.05
            fwhm:
                value: .406
                bounds: [-1%, 10%]

Example Usage

Some example usage for preparing and interacting with a model in this UI.

Use Spectrum1D for inputs:

PAHFIT uses raw Spectrum1D objects, as long as they are tagged with meta information identifying the instrument pack, e.g. instrument.jwst.nirspec.R1000. Such instrument names key directly into pahfit.instruments, which exposes instrument packs via a class heirarchy.

Once the joint-fitter is working, a single model created and updated with multiple Spectrum1D objects from different instruments (see advanced examples, below).

An example of creating a simple Spectrum1D with tagged instrument:

from specutils import Spectrum1D
from astropy.nddata import StdDevUncertainty
import astropy.units as u
import pahfit

# Get yerself a Spectrum1D object somehow
N=100
redshift = 0.05
flux = np.random.rand(N)*10 * u.mJy
flux_unc = StdDevUncertainty(flux/(5+10*np.random.rand(100)))
wav = np.linspace(5,28,N)*u.micron
spec = Spectrum1D(spectral_axis=wav, flux=flux, uncertainty=flux_unc,
                  redshift = redshift, meta={"instrument": "jwst.miri.MRS.segment1"))
spec = Spectrum1D(spectral_axis=wav, flux=flux, uncertainty=flux_unc,
                  redshift = redshift, meta={"instrument": "jwst.miri.MRS.segment1"))

Building a model:

A model is created by combining one or more instrument packs and a science pack. The instrument pack(s) are inferred directly from the input data: a Spectrum1D (or list of Spectrum1Ds), whose metadata knows what instrument/mode the spectrum pertains to.

# Pick desired sci-pack. Creates an underlying params pahfit.parameters.table
scipack  = 'extended_exgal.yaml'

# Create a model, from the instrument-tagged spectra and science packs.
mymodel  = pahfit.model(spec, scipack)

Estimating parameter starting guess

Once a model is created, its starting point can be estimated (guessed). An advantage of separating this step is it’s easy to override, e.g. for cubes (where “borrowing” an average starting position will speed convergence).

mymodel.guess(spec)
mymodel.plot(spec) # Plot the "guess", with the data

Performing the fit

Can provide updated model results during the fit.

mymodel.fit(spec, plot_progress=True)

Inspecting/altering parameter information

As noted above, creating a PAHFITModel internally creates an astropy.parameters.Table object, a lightly extended astropy.Table. After any fit, it also includes the model-estimated covariance information (if any), and can combine arbitrary combinations of features (with uncertainties, of the covariance is available). Since it is also an astropy.Table, all the group_by etc. functionality of that class can be employed. It also offers its own simple convenience selection functionality for group and kind.

# Get parameter Table
pm = om.params # all parameters, incuding name, kind, group, and other details
om.get_params('NeIII') # By name (same as pm[pm['name'] == 'NeIII'])
om.get_params(group='H2') # same as pm[pm['group'] == H2']

pah77 = om.get_params(group='PAH77').power # integrated intensity and estimated uncertainty, from covariance
total_lines = om.get_params(kind='line').power

Plotting models

Any model can plot itself, with or without underlying data. Each model knows the (possibly disjoint) range of wavelengths to which it applies, and can sample the model with samples points over that range (for smoother plots)_.

The plot will contain a simple widget for toggling on/off by kind and group.

nsamp = 2000
mymodel.plot(samples = nsamp)  # plot, without data!
mymodel.plot(spec, samples = None) # Plot with data at same wavelength positions

mymodel.plot(include = ("continuum","lines"))

Saving/restoring models from disk

Models can be saved in any format astropy.Table supports, though the covariance can only be saved to formats that support metadata (like asdf and ecsv).

# Save to an ASDF (aka FITS TNG) file
mymodel.save(file='my_PAHFIT_model.asdf')

# Or save the table representation to a plain text file
mymodel.save(file='my_PAHFIT_model.ecsv')

# Restore a model from disk and plot it
om = pahfit.model.restore('another_older_model.ipac')
om.plot()

Inspecting model results

Separating out components of of the model spectrum is easy. This can be done at any time, before or after the model has initial starting parameters estimated, or is fit to data.

Note that all returned model spectra are masked arrays, over relevant parameter bounds only. These model results could be Spectrum1D objects (or a lightly overloaded version that uses microns for the wavelength by default!).

Same samples keyword as for plotting.

# Model "spectrum"
model_fit = om.model # full model spectrum
pahh11 = om.get_model('PAH11.0', samples = nsamp) # Model of individual feature, by name

# Grouped features are specified in the sci-pack.  
pah7 = om.get_fit(group='PAH77') # w/ default sampling
all_lines = om.get_fit(kind='line')
h2lines = om.get_fit(group='H2') # All H2 lines (by group)
neIII = om.get_fit('NeIII', samples=nsamp) # Just one line, by name

# Refer to individual components by name
cont = om.get_fit(spec, kind='continuum') # at same sampling as the spectrum
starlight = om.get_fit('starlight') # default sampling

# Directly use the lower-level plot engine for plotting any arbitrary model combinations
pahfit.plot(spec, cont+starlight+neIII)

More advanced examples

Fitting spectra from multiple instruments at the same time (may include multiple segments/channels of the same instrument).

pars = mymodel.params
print('Model table: ', pars.info())
pars.show_in_notebook()

# Advanced: do stuff to the underlying astropy.Table if you are
# feeling cheeky
long_lines = pars['kind'] == 'line' & pars['lam_r'] > 15.
pars['amp'][long_lines] *= 1.1

# Create a model to simultaneous fit with multiple different instruments, use SpectrumCollect as the "grouping"
full_spec=SpectrumCollect(...,...)

my_combo_model = pahfit.model((specNS, specMRS), scipack)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly