Merge branch 'master' into nirspec_bots_speedup_part1

t-brandt · Jun 26, 2024 · b5df212 · b5df212
2 parents fb50e41 + 619cb3b
commit b5df212
Show file tree

Hide file tree

Showing 19 changed files with 286 additions and 305 deletions.
diff --git a/CHANGES.rst b/CHANGES.rst
@@ -89,6 +89,16 @@ documentation
 
 - Added documentation for NIRCam GRISM time series pointing offsets. [#8449]
 
+emicorr
+--------
+
+- Improved running time by introducing a new parameter, use_n_cycles, that can
+  be modified by the user and has a default of 3. Now the user can either provide
+  the number of integrations to phase or the number of cycles to calculate the
+  number of integrations necessary to phase. Additionally fix a bug in applying
+  emicorr to data with nints > 1, and improve runtime in the phase amplitude
+  loop. [#8589]
+
 exp_to_source
 -------------
 
@@ -148,6 +158,9 @@ extract_2d
 - Added support for slit names that have string values instead of integer
   values, necessary for processing combined NIRSpec MOS and fixed slit
   data products. [#8467]
+
+- Assign slit ``source_xpos`` and ``source_ypos`` attributes here instead of
+  in ``wavecorr`` for NIRSpec FS data. [#8569]
 
 flat_field
 ----------
@@ -196,6 +209,9 @@ master_background
   Master background correction for MOS mode should be performed
   via ``master_background_mos``, called in ``calwebb_spec2``. [#8467]
 
+- Use zero values for master background outside the background
+  wavelength range instead of NaN to avoid NaN-ing out entire
+  sets of science data when backgrounds are missing. [#8597]
 
 master_background_mos
 ---------------------
@@ -354,6 +370,12 @@ residual_fringe
 
 - Use DQ plane to exclude pixels marked as DO_NOT_USE in correction. [#8381]
 
+srctype
+-------
+
+- Reset ``source_xpos`` and ``source_ypos`` values to zero for extended sources 
+  in NIRSpec FS data to enable assigning those attributes in ``extract_2d``. [#8569]
+
 saturation
 ----------
 
@@ -395,6 +417,9 @@ wavecorr
   Point source position is calculated from dither offsets only for standard
   fixed slit processing. [#8467]
 
+- Assign slit ``source_xpos`` and ``source_ypos`` attributes in ``extract_2d``
+  instead of in ``wavecorr`` for NIRSpec FS data. [#8569]
+
 wfss_contam
 -----------
 

diff --git a/docs/jwst/emicorr/arguments.rst b/docs/jwst/emicorr/arguments.rst
@@ -25,3 +25,6 @@ The ``emicorr`` step has the following step-specific arguments.
     This is to tell the code to do correction for the frequencies in
     the list with a reference file created on-the-fly instead of CRDS.
 
+``--use_n_cycles`` (integer, default=3)
+    Number of cycles to use to calculate the phase. To use all
+    integrations, set to None.
diff --git a/docs/jwst/extract_2d/main.rst b/docs/jwst/extract_2d/main.rst
@@ -69,6 +69,26 @@ corresponding to the FITS keywords "SLTNAME", "SLTSTRT1", "SLTSIZE1",
 "SLTSTRT2", and "SLTSIZE2."  Keyword "DISPAXIS" (dispersion direction)
 will be copied from the input file to each of the output cutout images.
 
+The "SRCXPOS" and "SRCYPOS" keywords in the SCI extension header of each slitlet 
+are also populated with estimates of the source
+x (dispersion) and y (cross-dispersion) location within the slitlet.
+For MOS data, these values are taken from the :ref:`MSA metadata file<msa_metadata>`.
+Fixed slit data do not have an *a priori* estimate of the source
+location within a given slit, so the estimated source location is
+computed by the ``extract_2d`` step. It uses the target coordinates in
+conjunction with the aperture reference point in V2/V3 space to
+estimate the fractional location of the source within the given slit.
+Note that this computation can only be performed for the primary slit
+in the exposure, which is given in the "FXD_SLIT" keyword. The positions
+of sources in any additional slits cannot be estimated and therefore
+are set to 0.0 (the center of the slit).\ :sup:`1`
+
+:sup:`1`\ Note that fixed slits that are planned as part of a combined
+MOS and FS observation do have *a priori* estimates of their source
+locations, via the :ref:`MSA metadata file<msa_metadata>`. When available,
+these source locations are directly used, instead of recomputing the source
+position from the WCS information.
+
 
 NIRCam and NIRISS WFSS
 ++++++++++++++++++++++

diff --git a/docs/jwst/outlier_detection/arguments.rst b/docs/jwst/outlier_detection/arguments.rst
@@ -24,6 +24,14 @@ that control the behavior of the processing:
   Any floating-point value, given as a string, is valid.
   A value of 'INDEF' will use the last zero weight flux.
 
+``--nlow`` (integer, default=0)
+  The number of low values in each pixel stack to ignore
+  when computing the median value.
+
+``--nhigh`` (integer, default=0)
+  The number of high values in each pixel stack to ignore
+  when computing the median value.
+
 ``--maskpt`` (float, default=0.7)
   The percent of maximum weight to use as lower-limit for valid data;
   valid values go from 0.0 to 1.0.

diff --git a/docs/jwst/outlier_detection/outlier_detection_imaging.rst b/docs/jwst/outlier_detection/outlier_detection_imaging.rst
@@ -69,6 +69,8 @@ Specifically, this routine performs the following operations:
 
    * The median image is created by combining all grouped mosaic images or
      non-resampled input data (as planes in a ModelContainer) pixel-by-pixel.
+   * The ``nlow`` and ``nhigh`` parameters specify how many low and high values
+     to ignore when computing the median for any given pixel.
    * The ``maskpt`` parameter sets the percentage of the weight image values to
      use, and any pixel with a weight below this value gets flagged as "bad" and
      ignored when resampled.

diff --git a/docs/jwst/outlier_detection/outlier_detection_tso.rst b/docs/jwst/outlier_detection/outlier_detection_tso.rst
@@ -21,6 +21,8 @@ a few variations to accomodate the nature of these 3D data.
    * The ``rolling_window_width`` parameter specifies the number of integrations over
      which to compute the median. The default is 25. If the number of integrations
      is less than or equal to ``rolling_window_width``, a simple median is used instead.
+   * The ``nlow`` and ``nhigh`` parameters specify how many low and high values
+     to ignore when computing the median for any given pixel.
    * The ``maskpt`` parameter sets the percentage of the weight image values to
      use, and any pixel with a weight below this value gets flagged as "bad" and
      ignored when the median is taken.
@@ -36,4 +38,4 @@ a few variations to accomodate the nature of these 3D data.
 #. The input data model DQ arrays are updated with the mask of detected outliers.
 
 
-.. automodapi:: jwst.outlier_detection.outlier_detection_tso
+.. automodapi:: jwst.outlier_detection.outlier_detection_tso
diff --git a/docs/jwst/wavecorr/description.rst b/docs/jwst/wavecorr/description.rst
@@ -41,26 +41,13 @@ produces a new wavelength corrected slit frame, ``wavecorr_frame``.
 
 NIRSpec Fixed Slit (FS)
 -----------------------
-Fixed slit data do not have an *a priori* estimate of the source
-location within a given slit, so the estimated source location is
-computed by the ``wavecorr`` step. It uses the target coordinates in
-conjunction with the aperture reference point in V2/V3 space to
-estimate the fractional location of the source within the given slit.
-Note that this computation can only be performed for the primary slit
-in the exposure, which is given in the "FXD_SLIT" keyword. The positions
-of sources in any additional slits cannot be estimated and therefore
-the wavelength correction is only applied to the primary slit.\ :sup:`1`
 
-The estimated position of the source within the primary slit (in the
-dispersion direction) is then used in the same manner as described above
+The source position within the primary slit is estimated based on the 
+target coordinates and aperture reference point during the 
+:ref:`extract_2d <extract_2d_step>` step. This estimated position (in the 
+dispersion direction) is used in the same manner as described above
 for MOS slitlets to update the slit WCS and compute corrected wavelengths
-for the primary slit.
+for the primary slit only. 
 
 Upon successful completion of the step, the status keyword "S_WAVCOR"
 is set to "COMPLETE".
-
-:sup:`1`\ Note that fixed slits that are planned as part of a combined
-MOS and FS observation do have *a priori* estimates of their source
-locations, via the :ref:`MSA metadata file<msa_metadata>`. When available,
-these source locations are directly used, instead of recomputing the source
-position in the ``wavecorr`` step.
diff --git a/jwst/emicorr/emicorr.py b/jwst/emicorr/emicorr.py
@@ -4,7 +4,7 @@
 
 import numpy as np
 import logging
-
+from astropy.stats import sigma_clipped_stats as scs
 from stdatamodels.jwst import datamodels
 
 log = logging.getLogger(__name__)
@@ -93,14 +93,16 @@ def do_correction(input_model, emicorr_model, save_onthefly_reffile, **pars):
     nbins = pars['nbins']
     scale_reference = pars['scale_reference']
     onthefly_corr_freq = pars['onthefly_corr_freq']
+    use_n_cycles = pars['use_n_cycles']
 
     output_model = apply_emicorr(input_model, emicorr_model,
                         onthefly_corr_freq, save_onthefly_reffile,
                         save_intermediate_results=save_intermediate_results,
                         user_supplied_reffile=user_supplied_reffile,
                         nints_to_phase=nints_to_phase,
                         nbins_all=nbins,
-                        scale_reference=scale_reference
+                        scale_reference=scale_reference,
+                        use_n_cycles=use_n_cycles
                         )
 
     return output_model
@@ -109,7 +111,8 @@ def do_correction(input_model, emicorr_model, save_onthefly_reffile, **pars):
 def apply_emicorr(input_model, emicorr_model,
         onthefly_corr_freq, save_onthefly_reffile,
         save_intermediate_results=False, user_supplied_reffile=None,
-        nints_to_phase=None, nbins_all=None, scale_reference=True):
+        nints_to_phase=None, nbins_all=None, scale_reference=True,
+        use_n_cycles=3):
     """
     -> NOTE: This is translated from IDL code fix_miri_emi.pro
 
@@ -166,6 +169,9 @@ def apply_emicorr(input_model, emicorr_model,
     scale_reference : bool
         If True, the reference wavelength will be scaled to the data's phase amplitude
 
+    use_n_cycles : int
+        Only use N cycles to calculate the phase to reduce code running time
+
     Returns
     -------
     output_model : JWST data model
@@ -216,10 +222,6 @@ def apply_emicorr(input_model, emicorr_model,
     # Initialize the output model as a copy of the input
     output_model = input_model.copy()
     nints, ngroups, ny, nx = np.shape(output_model.data)
-    if nints_to_phase is None:
-        nints_to_phase = nints
-    elif nints_to_phase > nints:
-        nints_to_phase = nints
 
     # create the dictionary to store the frequencies and corresponding phase amplitudes
     if save_intermediate_results and save_onthefly_reffile is not None:
@@ -265,6 +267,20 @@ def apply_emicorr(input_model, emicorr_model,
         # e.g. ((1./390.625) / 10e-6) = 256.0 pix and ((1./218.52055) / 10e-6) = 457.62287 pix
         period_in_pixels = (1./frequency) / 10.0e-6
 
+        if nints_to_phase is None and use_n_cycles is None:  # user wats to use all integrations
+            # use all integrations
+            nints_to_phase = nints
+        elif nints_to_phase is None and use_n_cycles is not None: # user wants to use nints_to_phase
+            # Calculate how many integrations you need to get that many cycles for
+            # a given frequency (rounding up to the nearest Nintegrations)
+            nints_to_phase = (use_n_cycles * period_in_pixels) / (frameclocks * ngroups)
+            nints_to_phase = int(np.ceil(nints_to_phase))
+        elif nints_to_phase is not None and use_n_cycles == 3:
+            # user wants to use nints_to_phase because default value for use_n_cycles is 3
+            # make sure to never use more integrations than data has
+            if nints_to_phase > nints:
+                nints_to_phase = nints
+
         start_time, ref_pix_sample = 0, 3
 
         # Need colstop for phase calculation in case of last refpixel in a row. Technically,
@@ -274,7 +290,7 @@ def apply_emicorr(input_model, emicorr_model,
         colstop = int( xsize/4 + xstart - 1 )
         log.info('doing phase calculation per integration')
 
-        for ninti in range(nints_to_phase):
+        for ninti in range(nints):
             log.debug('  Working on integration: {}'.format(ninti+1))
 
             # Remove source signal and fixed bias from each integration ramp
@@ -332,7 +348,7 @@ def apply_emicorr(input_model, emicorr_model,
                 # point to the first pixel of the next frame (add "end-of-frame" pad)
                 start_time += extra_rowclocks
 
-            # Convert "times" to phase each integration. Nothe times has units of
+            # Convert "times" to phase each integration. Note that times has units of
             # number of 10us from the first data pixel in this integration, so to
             # convert to phase, divide by the waveform *period* in float pixels
             phase_this_int = times_this_int / period_in_pixels
@@ -365,12 +381,15 @@ def apply_emicorr(input_model, emicorr_model,
         # Define the binned waveform amplitude (pa = phase amplitude)
         pa = np.arange(nbins, dtype=float)
         # keep track of n per bin to check for low n
-        nu = np.arange(nbins)
+        nb_over_nbins = [nb/nbins for nb in range(nbins)]
+        nbp1_over_nbins = [(nb + 1)/nbins for nb in range(nbins)]
+        # Construct a phase map and dd map for only the nints_to_phase
+        phase_temp = phaseall[0:nints_to_phase,:,:,:]
+        dd_temp = dd_all[0:nints_to_phase,:,:,:]
         for nb in range(nbins):
-            u = np.where((phaseall > nb/nbins) & (phaseall <= (nb + 1)/nbins) & (np.isfinite(dd_all)))
-            nu[nb] = phaseall[u].size
+            u = np.where((phase_temp > nb_over_nbins[nb]) & (phase_temp <= nbp1_over_nbins[nb]))
             # calculate the sigma-clipped mean
-            dmean, _, _, _ = iter_stat_sig_clip(dd_all[u])
+            dmean,_,_ = scs(dd_temp[u])
             pa[nb] = dmean   # amplitude in this bin
 
         pa -= np.median(pa)
@@ -431,15 +450,21 @@ def apply_emicorr(input_model, emicorr_model,
 
         # Interleave (straight copy) into 4 amps
         noise = np.ones((nints, ngroups, ny, nx))   # same size as input data
+        noise_x = np.arange(nx4) * 4
         for k in range(4):
-            noise_x = np.arange(nx4)*4 + k
-            noise[..., noise_x] = dd_noise
+            noise[:, :, :, noise_x + k] = dd_noise
 
         # Subtract EMI noise from the input data
         log.info('Subtracting EMI noise from data')
         corr_data = orig_data - noise
         output_model.data = corr_data
 
+        # clean up
+        del data
+        del dd_all
+        del times_this_int
+        del phaseall
+
     if save_intermediate_results and save_onthefly_reffile is not None:
         if 'FAST' in readpatt:
             freqs_dict = {readpatt: freqs2correct}
@@ -642,74 +667,6 @@ def get_frequency_info(freqs_names_vals, frequency_name):
                 break
         return freq_number, phase_amplitudes
 
-
-def iter_stat_sig_clip(data, sigrej=3.0, maxiter=10):
-    """ Compute the mean, mediand and/or sigma of data with iterative sigma clipping.
-    This funtion is based on djs_iterstat.pro (authors therein)
-
-    Parameters
-    ----------
-    data : numpy array
-
-    sigrej : float
-        Sigma for rejection
-
-    maxiter: int
-        Maximum number of sigma rejection iterations
-
-    Returns
-    -------
-    dmean : float
-        Computed mean
-
-    dmedian : float
-        Computed median
-
-    dsigma : float
-        Computed sigma
-
-    dmask : numpy array
-        Mask set to 1 for good points, and 0 for rejected points
-    """
-    # special cases of 0 or 1 data points
-    ngood = np.size(data)
-    dmean, dmedian, dsigma, dmask = 0.0, 0.0, 0.0, np.zeros(np.shape(data))
-    if ngood == 0:
-        log.debug('No data points for sigma clipping')
-        return dmean, dmedian, dsigma, dmask
-    elif ngood == 1:
-        log.debug('Only 1 data point for sigma clipping')
-        return dmean, dmedian, dsigma, dmask
-
-    # Compute the mean + standard deviation of the entire data array,
-    # these values will be returned if there are fewer than 2 good points.
-    dmask = np.ones(ngood, dtype='b') + 1
-    dmean = np.sum(data * dmask) / ngood
-    dsigma = np.sqrt(sum((data - dmean)**2) / (ngood - 1))
-    iiter = 1
-
-    # Iteratively compute the mean + stdev, updating the sigma-rejection thresholds
-    # each iteration
-    nlast = -1
-    while iiter < maxiter and nlast != ngood and ngood >= 2:
-        loval = dmean - sigrej * dsigma
-        hival = dmean + sigrej * dsigma
-        nlast = ngood
-
-        dmask[data < loval] = 0
-        dmask[data > hival] = 0
-        ngood = sum(dmask)
-
-        if ngood >= 2:
-            dmean = np.sum(data*dmask) / ngood
-            dsigma = np.sqrt( sum((data - dmean)**2 * dmask) / (ngood - 1) )
-            dsigma = dsigma
-
-        iiter += 1
-
-    return dmean, dmedian, dsigma, dmask
-
-
 def rebin(arr, newshape):
     """Rebin an array to a new shape.