PAHFIT 2022
Based on discussions of the April 4th PAHFIT all-hands workshop. Updated May, 2022
Separating the concerns of instrumental artifacts and scientific model details is important. Hence we have proposed separate “instrument” and “science” packs. Internally, information from these two types of packs is combined into a `pahfit.features` table, which is used for setting up the models, inspecting model results, and altering details of the fit.
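For orientation, the tabular structure this implies might look roughly like the following (a sketch only; the column names and rows are assumptions, not the final `pahfit.features` schema):

```python
# A rough, illustrative sketch of the kind of table pahfit.features implies;
# column names and example rows here are assumptions, not the final schema.
from astropy.table import QTable
import astropy.units as u

features = QTable(
    {
        "name": ["NeII", "H2_S(7)", "PAH77a"],
        "group": ["ionic_lines", "H2", "PAH77"],
        "kind": ["line", "line", "dust_feature"],
        "wavelength": [12.813, 5.5115, 7.42] * u.micron,
    }
)
print(features)
```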
- **Feature(s)**: A feature is one component of the PAHFIT model, such as a single emission line like `[NeII]`, or a single dust feature.
- **Feature Group(s)**: A group of features that are conceptually related, for example a group of sub-features for a composite dust feature (e.g. `PAH77`).
- **Parameter**: Some value used to define a feature, e.g. `wavelength` for a line.
- **Attribute**: Information about a parameter, such as its `value` or `bounds`.
The primary information an instrument pack can convey to the model is:
- The wavelength range of validity of a given spectral segment/channel/order.
- The Gaussian full-width of the line spread function at any valid wavelength, and its uncertainty (if any).
Other instrumental details could be implemented that are scientifically “uninteresting”. Since instrument packs are “fixed”, and implicitly determine which instruments PAHFIT supports, and since different instruments may have different degrees of complexity in their underlying line-width calculations, a simple class structure for each fully distinct instrument or configuration will suffice. Examples:

```
pahfit.instruments.spitzer.irs.sl1
pahfit.instruments.spitzer.irs.lh
pahfit.instruments.akari.irc.nirgrism
pahfit.instruments.iso.sws
pahfit.instruments.jwst.nirspec.R2700
pahfit.instruments.jwst.miri.mrs.ch1
pahfit.instruments.jwst.miri.lrs
```
Each such class inherits from the `pahfit.instruments.Instruments` class, and defines two simple methods:

- `fwhm(wavelength)`: returns a tuple (`fwhm`, `fwhm_unc`) giving the full-width at half-maximum of an unresolved line at that wavelength, and the width uncertainty (if any). `fwhm_unc` is ideally `None`.
- `wavelength_range()`: returns a (`wave_low`, `wave_high`) range of validity, which may not encompass the entire spectral range on the array (trimming the bad ends).

All wavelengths are in microns.
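For concreteness, a minimal sketch of what one such instrument class might look like, following the two-method interface described above (the class names, FWHM coefficients, and wavelength range below are placeholders, not real calibration values):

```python
# Hypothetical sketch of an instrument pack class (illustrative only).
import numpy as np


class Instruments:
    """Base class for all PAHFIT instrument packs (per the design above)."""

    def fwhm(self, wavelength):
        raise NotImplementedError

    def wavelength_range(self):
        raise NotImplementedError


class MiriMrsCh1(Instruments):
    """e.g. pahfit.instruments.jwst.miri.mrs.ch1 (placeholder numbers)."""

    _range = (4.9, 7.65)             # microns; trimmed range of validity
    _fwhm_coeff = (1.0e-3, 2.0e-4)   # fake linear model: fwhm = a + b * wavelength

    def fwhm(self, wavelength):
        a, b = self._fwhm_coeff
        fwhm = a + b * np.asarray(wavelength)
        return fwhm, None            # fwhm_unc is ideally None

    def wavelength_range(self):
        return self._range
```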
Science packs contain all the scientific model configuration details for building a PAHFIT model. These include the list of features present, and the range of values their underlying parameters can take. PAHFIT will distribute a few carefully vetted science packs, but we anticipate users may make small alterations to these, or create their own. The heart of PAHFIT is its science packs. To offer a flexible format for defining packs, including grouping and identifying components for easier use, PAHFIT employs a simple hierarchical plain-text YAML format, similar to many configuration file formats. Note that internally, science packs are converted into a `pahfit.features` tabular format.
- Top level: a named set of individual features (like a single line), or groups of features (like a set of lines, continuum components, etc.)
- Elements: Each group or feature element consists of a `name` key, and associated attribute info. The `name` is just a handy means of selecting and working with a feature or a group, and is never interpreted. All top-level names in a model must be distinct, but otherwise can be anything.
- Each feature or group can have several other associated keys:
  - `kind` (required): The kind of a feature. Current kinds, with description and the feature’s underlying model type:

    | kind | Description | Underlying Model |
    |------|-------------|------------------|
    | `starlight_continuum` | continuum from starlight emission | Blackbody |
    | `dust_continuum` | underlying dust emission continuum | Modified Blackbody |
    | `line` | individual molecular/ionic emission lines | Gaussian |
    | `dust_feature` | resolved dust emission features | Drude (possibly modified) |
    | `attenuation` | bulk attenuation, including silicate bands | named model |
    | `absorption` | individual absorption bands, such as H2O | Drude |

  - Other attributes, specific to each feature kind (see below).
- In addition, groups can have the following keys:
  - `kind`: When set on a group, `kind` is inherited by all features in the group. Every group or feature has a `kind`.
  - `features`: A named set of features in a group, each with their own attributes.
  - `attribute`s (note plural): A list or named set of a particular attribute for a group (e.g. `wavelengths`). If names are not included, they are generated from the group name. If multiple attribute lists are included in a single group, they must have the same length, and their names (if any) are taken from the first named set which appears.

Note: Any parameter which is left unspecified in a science pack is allowed to vary, with a default lower bound of 0, and no upper bound (`[0, null]`). This would typically include line and feature power, continuum and attenuation τ, etc. (though these can of course be limited).
The parameters available depend on the feature or group’s `kind`. All features have an implicit name (from the key atop the feature). All features have a `kind`, taken from a fixed set (see above), or inherited from their group.
- Continua (`starlight_continuum` and `dust_continuum`) have:
  - `temperature`: blackbody temperature, in Kelvin
  - `tau`: the emission strength of the continuum component (f_ν units, normally unspecified)
- Resolved features (`line`, `dust_feature`) have:
  - `wavelength`: wavelength, in microns
  - `fwhm`: full-width at half-maximum, in microns
  - `power`: total integrated power or intensity (normally unspecified). Unit depends on the units of the input spectra (Hz × [unit])
- Attenuation can have:
  - `tau` (normally unspecified)
  - `model`: the name of the attenuation model to use
  - `geometry`: the name, currently `mixed` or `screen`, of the attenuation geometry to use. If unspecified, `mixed` is assumed.
- Absorption features have:
  - `tau` (unitless, maximum optical depth at feature center)
  - `wavelength`
  - `fwhm`
  - `geometry`

A proposed but not yet implemented parameter for dust features:

- `mod_asym`: modified Drude asymmetry parameter (unitless, proposed)
Feature attributes can be included in one of several related formats, either individually or in a group. Here feature attributes are illustrated with the “wavelength” parameter:
- A simple value (fixed, with bounds allowed):

  ```yaml
  wavelength: 9.7
  ```

- A list of simple fixed values, in a group:

  ```yaml
  cool_group:
    wavelength: [9.7, 10.2, 15.8]
  ```

- With group-wide single-parameter `bounds`:

  ```yaml
  cool_group:
    bounds: [-0.1#, 1%]
    wavelength: [9.7, 10.2, 15.8]
  ```

- A named set of values, inside a group (note plural):

  ```yaml
  cool_group:
    wavelength:
      line1: 9.7
      other_line: 10.2
  ```

- With explicit parameter attributes:

  ```yaml
  explicit_group:
    features:
      KrVII:
        wavelength:
          value: 12.813
          bounds: [-10%, 1%]
      PuVII:
        wavelength: 1.44444
  ```
The latter is the only means of setting all attributes for a parameter. See below for more information on available parameter attributes.
Each feature parameter can be set to a numerical value directly, or have the following associated attributes specified:

- `value`: the (single) numerical value of the parameter.
- `bounds`:
  - a 2-element list of bounds, `[low, high]`. Each of `low` or `high` can be:
    - a number
    - `null` for no bound
    - an offset value, created by appending a `#` to the value. E.g. `[-0.1#, 0.1#]` specifies a fixed offset of plus or minus `0.1`.
    - a percentage, interpreted as a percent offset from `value`. E.g. `[-10%, 10%]` for a 10% range on either side of `value`, or `[-5%, 0%]` for a lower bound up to the value itself.
  - By default there are no bounds, i.e. the parameter is fixed at its given value.
  - Including the bounds `[null, null]` means the parameter will freely vary (as is the default).
  - As a special case, `bounds` can also be set at the group level:
    - For groups specifying only a single parameter, `bounds` can be a simple 2-element array as above. This sets the common bounds for that parameter for all features in the group.
    - For groups specifying multiple parameters, `bounds` can be a dictionary over `parameter: bounds`. Any parameter which is not included has no bounds.
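For illustration only, the offset and percentage forms above might be resolved into absolute numeric bounds along these lines (a sketch; the helper name and behavior are assumptions, not PAHFIT's actual parser):

```python
# Hypothetical sketch of resolving a science-pack bound spec into absolute
# numeric bounds; not the actual PAHFIT implementation.
def resolve_bound(spec, value):
    """Resolve one bound entry (number, None/null, '0.1#' offset, or '10%')."""
    if spec is None:                      # null => no bound
        return None
    if isinstance(spec, str):
        if spec.endswith('#'):            # fixed offset from value
            return value + float(spec[:-1])
        if spec.endswith('%'):            # percent offset from value
            return value * (1.0 + float(spec[:-1]) / 100.0)
    return float(spec)                    # plain number: absolute bound

# Example: wavelength = 12.813 with bounds [-10%, 1%]
low, high = (resolve_bound(b, 12.813) for b in ('-10%', '1%'))
print(low, high)   # ~11.53, ~12.94
```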
Proposed, but as yet unimplemented parameter attribute information:

- `tied`: (proposed, unimplemented) the name of another parameter to which this parameter is tied.
- `ratio`: (proposed, unimplemented) if `tied` is set, the ratio (single value, fixed) or ratio bounds (two-element array, varying; see `bounds`) of this parameter’s value to the tied parameter’s value. Use this to tie two lines or features together with a fixed or modestly varying ratio, typically of power or wavelength.
An example of the science pack format, with several groups, and all parameter kinds:
```yaml
# Starlight continuum [parameters: tau, temperature]
starlight:                  # Blackbody
  kind: starlight_continuum
  temperature: 5000

# Dust continuum [parameters: tau, temperature]
dust_cont:                  # Modified Blackbodies
  kind: dust_continuum
  # shorthand for a group of unnamed dust_continuum features. Naming is automatic.
  temperature: [300, 200, 135, 90, 65, 50, 40, 35]

# Unresolved Lines [parameters: wavelength, power]
H2:                         # simple line group, with common bounds for all lines
  kind: line                # Gaussians
  bounds: [-0.1%, 0%]       # bounds can be set for the entire group, here leftward only
  # Shorthand for a group of named lines.
  wavelength:
    H2_S(7): 5.5115
    H2_S(6): 6.1088

# An explicit group of unresolved lines
ionic_lines:
  kind: line
  # Full form of lines [parameters: wavelength, power]
  features:
    NeII:
      wavelength:
        value: 12.813
        bounds: [-10%, 1%]  # fractional bounds for wavelength
    NeIII:
      wavelength: 15.555
      power:
        tied: NeII          # experimental/unimplemented
        ratio: [3, 5]

# A group of dust (sub-)features [parameters: wavelength, power, fwhm, mod_asym]
PAH77:
  kind: dust_feature        # Drude profiles
  features:
    PAH77a:
      wavelength: 7.42
      fwhm: 0.935
    PAH77b:
      wavelength: 7.60
      fwhm: 0.3344
    PAH77c:
      wavelength: 7.85
      fwhm:
        value: 0.416
        bounds: [-5%, 15%]
      mod_asym: 0.25        # indicates a modified/enhanced-asymmetry Drude profile

# Attenuation [parameters: tau, extinction_model]
silicate:
  kind: attenuation
  extinction_model: kvt

# Absorption [parameters: tau, wavelength, fwhm]
absorption_features:
  kind: absorption
  features:
    H2O_3:
      wavelength: 3.05
      fwhm:
        value: .406
        bounds: [-1%, 10%]
```
Some example usage for preparing and interacting with a model through this interface.
PAHFIT can operate on raw `Spectrum1D` objects (or iterables of such objects), as long as each is tagged with `meta` information identifying the instrument, e.g. `instrument: jwst.nirspec.R1000`. Such instrument names key directly into `pahfit.instruments`, which provides the relevant instrument information to include in the model.
Once the joint-fitter is working, a single model can be created and updated with multiple `Spectrum1D` objects, from different instruments (see advanced examples, below).

An example of creating a simple random `Spectrum1D` with a tagged instrument:
```python
from specutils import Spectrum1D
from astropy.nddata import StdDevUncertainty
import astropy.units as u
import numpy as np
import pahfit

# Get yerself a Spectrum1D object somehow
N = 100
redshift = 0.05
flux = np.random.rand(N) * 10 * u.mJy
flux_unc = StdDevUncertainty(flux / (5 + 10 * np.random.rand(N)))
wav = np.linspace(5, 28, N) * u.micron
spec = Spectrum1D(spectral_axis=wav, flux=flux, uncertainty=flux_unc,
                  redshift=redshift, meta={"instrument": "jwst.miri.MRS.segment1"})
```
N.B. “redshift” may not work as anticipated here.
A model is created by combining one or more instrument packs and a science pack. The instrument pack(s) are inferred directly from the input data: a `Spectrum1D` (or list of `Spectrum1D`s), whose metadata knows what instrument/mode the spectrum pertains to.
```python
# Pick desired sci-pack. Creates an underlying pahfit.features parameter table.
scipack = 'extended_exgal.yaml'

# Create a model, from the instrument-tagged spectra and science pack.
mymodel = pahfit.model(spec, scipack, instruments=(pahfit.instruments.jwst.mrs.ch4,))
```
Note: a given scipack should
Once a model is created, its starting point can be estimated (guessed). An advantage of separating this step is that it is easy to override, e.g. for cubes (where “borrowing” an average starting position will speed convergence).
```python
mymodel.guess(spec)
mymodel.plot(spec)   # Plot the "guess", with the data
```
The fit can provide updated model results as it progresses:

```python
mymodel.fit(spec, plot_progress=True)
```
As noted above, creating a `PAHFITModel` internally creates a `pahfit.features` object, which extends `astropy.Table`. After any fit, it also includes the model-estimated covariance information (if any), and can combine arbitrary combinations of features (with uncertainties, if the covariance is available). Since it is also an `astropy.Table`, all the `group_by` etc. functionality of that class can be employed. It also offers its own simple convenience selection functionality for `group` and `kind`.
```python
# Get the feature Table
fts = om.features                   # all features as a table
om.get_features('NeIII')            # By name (same as fts[fts['name'] == 'NeIII'])
om.get_features(group='H2')         # same as fts[fts['group'] == 'H2']
pah77 = om.get_features(group='PAH77').power   # integrated intensity and estimated uncertainty, from covariance
total_lines = om.get_features(kind='line').power
```
Any model can plot itself, with or without the underlying data. Each model knows the (possibly disjoint) range of wavelengths to which it applies, and can sample the model with `samples` points over that range (for smoother plots). The plot will contain a simple widget for toggling features on/off by `kind` and `group`.
```python
nsamp = 2000
mymodel.plot(samples=nsamp)         # plot, without data!
mymodel.plot(spec, samples=None)    # Plot with data at the same wavelength positions
mymodel.plot(include=("continuum", "lines"))
```
Models can be saved in any format `astropy.Table` supports, though the covariance can only be saved to formats that support metadata (like ASDF and ECSV).
```python
# Save to an ASDF (aka FITS TNG) file
mymodel.save(file='my_PAHFIT_model.asdf')

# Or save the table representation to a plain text file
mymodel.save(file='my_PAHFIT_model.ecsv')

# Restore a model from disk and plot it
om = pahfit.model.restore('another_older_model.ipac')
om.plot()
```
Separating out components of the model spectrum is easy. This can be done at any time, before or after the model has its initial starting feature parameters estimated, or is fit to data. Note that all returned model spectra are masked arrays, over relevant parameter bounds only. These model results could be `Spectrum1D` objects (or a lightly overloaded version that uses microns for the wavelength by default!). The same `samples` keyword as for plotting applies.
# Model "spectrum"
model_fit = om.model # full model spectrum
pahh11 = om.get_model('PAH11.0', samples = nsamp) # Model of individual feature, by name
# Grouped features are specified in the sci-pack.
pah7 = om.get_fit(group='PAH77') # w/ default sampling
all_lines = om.get_fit(kind='line')
h2lines = om.get_fit(group='H2') # All H2 lines (by group)
neIII = om.get_fit('NeIII', samples=nsamp) # Just one line, by name
# Refer to individual components by name
cont = om.get_fit(spec, kind='continuum') # at same sampling as the spectrum
starlight = om.get_fit('starlight') # default sampling
# Directly use the lower-level plot engine for plotting any arbitrary model combinations
pahfit.plot(spec, cont+starlight+neIII)
Spectra from multiple instruments can be fit at the same time (this may include multiple segments/channels of the same instrument).
```python
fts = mymodel.features
print('Model table: ', fts.info())
fts.show_in_notebook()

# Advanced: do stuff to the underlying astropy.Table if you are
# feeling cheeky
long_lines = (fts['kind'] == 'line') & (fts['lam_r'] > 15.)
fts['amp'][long_lines] *= 1.1

# Create a model to simultaneously fit multiple different instruments
my_combo_model = pahfit.model((specNS, specMRS), scipack)
```
Python PAHFIT is somewhat slow, a bit slower (on my machine) than IDL PAHFIT was. Once typical spectra have 5-10x as many wavelength samples, and the model is expanded to have 3-5x as many parameters, this will really start to hurt. Fit times of more than several minutes on decent hardware are quite likely. Once the model has settled a bit more, striving for performance gains will be important.
Example: `M101_Nucleus_irs.ipac` (from `/data/`)

With IDL PAHFIT’s default settings (`xtol=ftol=gtol=1D-10`), the fit converges in:
- 71 “iterations”, using 6754 function evals
- 3.9 seconds
- 1732 FEVs/second
With Python PAHFIT’s default `acc=eps=1D-10`, the fit converges in:
- 6.6s
- 814 function evaluations
- 123 FEVs/second
So Python PAHFIT (including overhead) is about 14x slower per model function call, but makes up some of this by converging in fewer calls.
Fitting the Akari + Spitzer SL/LL template spectrum is quite slow at present (v2.1)…
- 58.7s
- 4566 FEVs
- 77.8 FEVs/sec (oy!)
Much slower with the new science pack. A profiler report indicates much of this time is spent…
Anecdotal evidence is that these spectra take minutes to hours to converge.
Allowing all lines and features to vary in position and width is very expensive. And hopefully (for unresolved lines), unnecessary.
A good instrument and science pack model should nail the position and widths of lines precisely. Any mismatch in the residuals should be considered an “opportunity to improve” either the rest line wavelength in the sci-pack, or the fwhm estimation at each wavelength in the instrument pack. There is some concern about NIRSPEC wavelength calibration repeatability, but this should be corrected prior to PAHFIT’ing.
`astropy.modeling` includes support for bounding boxes (bounds of significance). The classic case is a 2D stellar PSF on a large image; evaluating one star’s PSF across the entire image is needlessly wasteful. Some savings may occur for lines and features: a window of `5 * max(fwhm)` should do it (or perhaps 3x for Gaussians and 5x for Drudes; can investigate).
Should profile slow fits to see where the time is being spent. Mostly in the model function(s)?
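One straightforward way to do that, using the standard-library profiler on the `mymodel`/`spec` objects from the usage examples above (the output file name is arbitrary):

```python
# Profile a slow fit and show the 20 most expensive calls by cumulative time.
import cProfile
import pstats

cProfile.run("mymodel.fit(spec)", "pahfit_fit.prof")
pstats.Stats("pahfit_fit.prof").sort_stats("cumulative").print_stats(20)
```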
Numba is a nice and very easy-to-use decorator-based framework for JIT-compiling numerical Python (with numpy support) to fast machine code. It is not compatible with scipy functions (which themselves call compiled C/FORTRAN code), but the bottommost objective function apparently can be `@jit`’d: see “Example of optimize.leastsq with Numba”.

To use numba we would probably need to re-write the model as a set of simple functions called from the objective function, and give up on `CompoundModel`.
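For instance, the innermost residual computation might be decorated roughly like this (a sketch: the toy single-Gaussian model and parameter layout are assumptions, not PAHFIT’s actual objective):

```python
# Sketch: a numba-jitted residual function usable from a scipy least-squares call.
import numpy as np
from numba import njit

@njit(cache=True)
def residuals(params, wave, flux, unc):
    cont = params[0]      # flat continuum level
    power = params[1]     # integrated line power
    center = params[2]    # line center (microns)
    sigma = params[3]     # Gaussian sigma (microns)
    model = cont + power / (sigma * np.sqrt(2.0 * np.pi)) * \
        np.exp(-0.5 * ((wave - center) / sigma) ** 2)
    return (flux - model) / unc

# First call triggers compilation; subsequent calls are fast.
wave = np.linspace(12.5, 13.1, 300)
flux = np.ones_like(wave)
unc = np.full_like(wave, 0.05)
print(residuals(np.array([1.0, 0.4, 12.813, 0.02]), wave, flux, unc)[:3])
```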
The `fit_deriv` callable for fittable models can be used to speed up derivative estimation.

This would mean giving up on the conveniences of the `CompoundModel`, which in practice might mean just switching to `scipy.optimize.least_squares` (newer than the `scipy.optimize.leastsq` that `LevMarLSQFitter` uses internally). It natively supports bounds.
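A minimal sketch of what that switch might look like, reusing a plain residual function and native box bounds (the toy model, parameter layout, and synthetic data are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import least_squares

def line_model(params, wave):
    """Toy model: flat continuum + one Gaussian line (illustrative only)."""
    cont, power, center, sigma = params
    return cont + power / (sigma * np.sqrt(2 * np.pi)) * \
        np.exp(-0.5 * ((wave - center) / sigma) ** 2)

def residuals(params, wave, flux, unc):
    return (flux - line_model(params, wave)) / unc

# Synthetic data for the sketch
wave = np.linspace(12.5, 13.1, 300)
truth = [1.0, 0.4, 12.813, 0.02]
flux = line_model(truth, wave) + np.random.normal(0, 0.02, wave.size)
unc = np.full_like(wave, 0.02)

# Native box bounds, e.g. wavelength limited to [-10%, +1%] of its guess
x0 = [1.0, 0.3, 12.8, 0.03]
lower = [0.0, 0.0, 12.8 * 0.90, 0.005]
upper = [np.inf, np.inf, 12.8 * 1.01, 0.1]
result = least_squares(residuals, x0, bounds=(lower, upper), args=(wave, flux, unc))
print(result.x)
```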
A very common source of optimization not converging well is human error in the computation of the gradient. You can use `scipy.optimize.check_grad()` to check that your gradient is correct. It returns the norm of the difference between the gradient given and a gradient computed numerically.
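For example (a sketch using a deliberately simple function; `check_grad` is the real scipy routine, but the function below is not PAHFIT’s):

```python
import numpy as np
from scipy.optimize import check_grad

def f(x):
    return np.sum(x ** 2)

def grad(x):
    return 2 * x

x0 = np.array([1.0, -2.0, 3.5])
print(check_grad(f, grad, x0))   # ~0 when the analytic gradient is correct
```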
Some consider analytic derivatives not worth the overhead. Probably this would be a last resort.
`scipy.optimize.leastsq` often (almost always?) returns `None` for `cov_x`, which leads to no `param_cov` being returned from `astropy.modeling`. As stated in the `leastsq` docs:

> A value of `None` indicates a singular matrix, which means the curvature in parameters x is numerically flat.
It’s not entirely clear why such numerical flatness occurs. `LevMarLSQFitter` simply sets up and drives `leastsq`, forcibly truncating parameters at their boundaries during the fit, which may interfere with the numerically computed Jacobian/Hessian (this is also how MPFIT handled bounds). One idea is to use the Hessian estimated by numdifftools after the model has converged at the best fit, and invert it for the covariance matrix. Would need to check the normalization, as in this stackexchange post.
At R~3000 (100 km/s resolution), non-negligible kinematic broadening or sub-structure may be present in complex regions. This is clearly not the domain of the instrument pack, but squarely in the science pack regime. But it would also add too much complexity to PAHFIT. A simple handling would be to allow (at model construction time) some `sigma_velocity`, for quadrature addition to the instrument-pack-returned widths, to slightly dilate all lines/dust features. Or this could be set per line or group. Note that this doesn’t belong in the science pack, since it varies source to source.
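As an illustration of the quadrature addition (a sketch; the 2.355 factor converts a Gaussian velocity dispersion to FWHM, and `sigma_velocity` here is the hypothetical model-construction option, not an existing parameter):

```python
import numpy as np

C_KMS = 299792.458   # speed of light, km/s

def broadened_fwhm(fwhm_instrument, wavelength, sigma_velocity):
    """Add kinematic broadening in quadrature to the instrument-pack FWHM.

    fwhm_instrument and wavelength in microns; sigma_velocity (dispersion) in km/s.
    """
    fwhm_kinematic = 2.355 * sigma_velocity / C_KMS * wavelength
    return np.sqrt(fwhm_instrument**2 + fwhm_kinematic**2)

# Example: a line at 12.813 microns, instrumental FWHM 0.004 micron, sigma_v = 100 km/s
print(broadened_fwhm(0.004, 12.813, 100.0))
```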
Much (much) fancier methods exist. Many other tools specialize, and discriminate between various species in their kinematics. More sensible to zoom in on individual (sets of) lines and use one of those bespoke kinematic decomposition tools.