JP-3654: NSClean subarray speedup #8745

t-brandt · 2024-08-30T18:55:20Z

This PR changes the behavior of NSClean in subarray mode. Previously, the code would fit all unilluminated pixels simultaneously. This required very large amounts of time and memory for larger subarrays. This PR changes that behavior to compute the correction in overlapping regions, resulting in large speedups with little change in the results. The reference pixels are no longer used in the correction, which helps with some ringing artifacts near the detector edges.

Checklist for PR authors (skip items if you don't have permissions or they are not applicable)

added entry in CHANGES.rst within the relevant release section
~~updated or added relevant tests~~
updated relevant documentation
added relevant milestone
added relevant label(s)
ran regression tests, post a link to the Jenkins job below.
How to run regression tests on a PR
All comments are resolved
Make sure the JIRA ticket is resolved properly

codecov · 2024-08-30T19:37:47Z

Codecov Report

Attention: Patch coverage is 3.84615% with 50 lines in your changes missing coverage. Please review.

Project coverage is 60.79%. Comparing base (8f6814f) to head (12f5d3e).

Files with missing lines	Patch %	Lines
jwst/nsclean/nsclean.py	2.12%	46 Missing ⚠️
jwst/nsclean/lib.py	20.00%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #8745      +/-   ##
==========================================
- Coverage   60.85%   60.79%   -0.06%     
==========================================
  Files         373      373              
  Lines       38657    38690      +33     
==========================================
  Hits        23523    23523              
- Misses      15134    15167      +33

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

melanieclarke

Thanks for this! It looks good to me, just a couple small requests.

I will run regression tests and test this branch locally on some subarray.

It looks like there's a conflict in the change log. Can you please fix it?

jwst/nsclean/lib.py

jwst/nsclean/nsclean.py

melanieclarke · 2024-09-06T14:31:48Z

@t-brandt - I'd like to get this in if we can, since we're closing in on the end of the development period for the current build. Would you mind if I go ahead and make the last clean up changes?

Co-authored-by: Melanie Clarke <[email protected]>

t-brandt · 2024-09-06T15:40:20Z

Please do go ahead and do the last cleanup. Thanks!

melanieclarke · 2024-09-06T16:07:33Z

Testing locally, results look similar to previous, and processing time is greatly improved.

It's now possible to process subarray=ALLSLITS data, so I removed the restriction on that data type.

Test data was:
SUBS200A1: jw02757002001_03104_00001_nrs1_rate.fits
ALLSLITS: jw01128004001_0310g_00003_nrs1_rate.fits

melanieclarke · 2024-09-06T16:19:33Z

~~Regression tests started here: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1690/~~

Tests aborted - looking at some results locally first.

melanieclarke · 2024-09-06T17:25:35Z

@t-brandt - Looking at local regression test results, I see a couple things we should maybe discuss some more.

While the ALLSLITS data runs through nsclean, results are quite poor with mask_spectral_regions=True (the default). I think this is just the known issue with with overfitting highly masked regions, and the workaround is to run with mask_spectral_regions = False and modify nsigma as needed. I don't think we need to address that in this PR, but we should consider different default values.

Also, the change to remove forcing the edges to be included in the fit (mask=True) seems to be introducing some edge effects in some cases. For the SUBS200A1 regression test data, it shows up only in the reference columns (they are not NaN in this older input rate data). For the ALLSLITS regression test data, it shows up in the next couple columns too.

Are the edge issues something we can fix in the subarray correction now? Or should we revert the change to the reference pixel masking?

Here's what the ALLSLITS regression test data looks like. Top is nscleaned, bottom is the rate file.

melanieclarke · 2024-09-06T17:53:25Z

For comparison, here's what the same data looks like with better masking (top) and with better masking + the reference edges included in the fit (bottom).

t-brandt · 2024-09-06T20:45:05Z

Maybe what I/we should do then is to zero out the 1/f correction in the reference pixels? Right now they are masked, and the correction is extrapolated into those pixels (which is a terrible approximation). Zeroing out the correction in those pixels is a one or two line change. I don't think leaving the reference pixels in the mask should be an option, since they should have already been used in the refpix step and will have no more information to add.

melanieclarke · 2024-09-09T13:48:55Z

That sounds worth trying - it's similiar to the way the full frame correction sets the correction to zero if there are no unmasked pixels in the column. For the ALLSLITS case, the edge effects show up in the first 6 columns, so refpix + 2 real science columns. Would that address those columns too?

t-brandt · 2024-09-09T21:09:24Z

Ok, I have pushed several small changes to fix the edge effects. These happened when a large, contiguous part of the area used for the correction was masked for fitting 1/f. In those cases, the matrices became very poorly conditioned. The approach now skips rows at the end without any available pixels for fitting 1/f.

melanieclarke · 2024-09-09T21:20:53Z

Local regression tests and spot checks with my other test data sets look good. Thanks for the fix!

I'll run the full regression test set now and hopefully we can get this merged!

Regression tests running here:
https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1697/

t-brandt · 2024-09-10T02:13:12Z

The overfitting present with the default parameters is because the range of frequencies that NSClean is fitting is too large when a large fraction of the array is missing due to the mask. This can be fixed by changing these frequencies (see the image below). If we want to actually implement this is will be a change of a few lines; I can do it tomorrow. I think the most natural way to implement the change would be to check for allslits mode and mask_spectral_regions set to True, and if both of these are set, change the frequency cutoffs from current the default values.

melanieclarke · 2024-09-10T13:43:25Z

@t-brandt - that sounds reasonable to me to include here, since this is the first time ALLSLITS data will be correctable with NSClean, and since it really is unusable with current default parameters.

t-brandt · 2024-09-10T15:02:45Z

Ok, the default behavior for ALLSLITS should now be reasonable.

melanieclarke · 2024-09-10T17:37:22Z

Regression tests restarted here:
https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1702/

melanieclarke · 2024-09-11T13:41:17Z

@hayescr - tagging you in to this conversation for NIRSpec's information.

This PR is the fix for subarray performance, which enables ALLSLITS subarrays to be corrected. The default correction frequencies worked very poorly when spectral regions are masked, so we are modifying the cutoff frequencies for that subarray specifically. We should discuss at a later date if we want to do something similar for the other modes that frequently get overcorrected, like MOS.

melanieclarke

Regression test results look as expected.

There are changes to nsclean for all NIRSpec data in spec2:

ALLSLITS data is corrected for the first time.
FS subarray corrections are slightly different due to the different algorithm, and due to removing the forced zeros at the edges of the array.
Full frame corrections for FS, MOS, and IFU are slightly different due to removing the forced zeros at the edges of the array.

Reviewing results locally, all nsclean corrections are visually similar to previous versions. There are also regression test changes for all products downstream of nsclean, as expected.

All of the remaining failures reported are unrelated.

I think this PR is ready to merge now. When it is in, I will incorporate the changes into the clean_flicker_noise step (#8669).

t-brandt · 2024-09-11T13:57:40Z

Great, thanks @melanieclarke! @hayescr and @melanieclarke, I think that the issue with overcorrection in various modes is best solved by exposing a single parameter (provisionally called aggressiveness) to the user. This would tune the frequencies to be corrected and the coupling of the four outputs. We could set a default with reasonable behavior, with more aggressive removal appropriate to cases where at least ~half of the array is available as a reference for the correction, and less aggressive removals appropriate to cases where most of the array is unavailable for a correction. I envision this applying to both subarray and full frame corrections.

melanieclarke · 2024-09-11T13:59:57Z

That sounds good to me @t-brandt - it would be nice to be able to tune that without asking the user to modify frequency cutoffs themselves!

I think we don't yet have a ticket for the aggressiveness parameter - can you please file one, and we'll continue the conversation there? Time is short on this build, but we should have plenty of time to work on it in the next one.

t-brandt · 2024-09-11T14:08:23Z

Sounds perfect. The ticket would modify the full frame NSClean behavior and create an aggressiveness parameter. I will make one.

hayescr · 2024-09-11T14:28:37Z

Thanks @melanieclarke and @t-brandt, agreed that we might want slightly different cutoff frequencies for different cases, perhaps other subarrays when the spectral regions are masked as well. I had a look at some of Tim's tests of setting this aggressiveness and there were some cases where different cases looked better with different values, so this would definitely be worth investigating. Feel free to tag me on that once you open it Tim, I'm happy to continue the conversation more there.

Timothy Brandt and others added 3 commits July 23, 2024 15:48

Implemented new subarray mode for NSClean

10d9f91

Merge branch 'spacetelescope:master' into nsclean_subarray_only

555737f

Masking reference pixels

2a49a7c

t-brandt requested a review from a team as a code owner August 30, 2024 18:55

t-brandt added 2 commits August 30, 2024 14:56

Added PR issue number

f1444c8

Moved CHANGES comment to correct release

acb0c15

stscijgbot-jp mentioned this pull request Aug 30, 2024

Modify NSClean behaviour for subarray data #8551

Closed

melanieclarke reviewed Sep 3, 2024

View reviewed changes

jwst/nsclean/lib.py Outdated Show resolved Hide resolved

jwst/nsclean/nsclean.py Outdated Show resolved Hide resolved

jwst/nsclean/nsclean.py Outdated Show resolved Hide resolved

jwst/nsclean/nsclean.py Outdated Show resolved Hide resolved

melanieclarke changed the title ~~Nsclean subarray speedup~~ JP-3654: NSClean subarray speedup Sep 4, 2024

Update jwst/nsclean/lib.py

155956d

Co-authored-by: Melanie Clarke <[email protected]>

melanieclarke added 3 commits September 6, 2024 11:46

Merge branch 'master' into nsclean_subarray_only

30c8d7d

Code style clean up

7dd9446

Remove restriction on ALLSLITS data

facf275

melanieclarke added this to the Build 11.1 milestone Sep 6, 2024

melanieclarke added the nsclean NIRSpec NSClean algorithm label Sep 6, 2024

Update docs for ALLSLITS change

edd74c4

github-actions bot added the documentation label Sep 6, 2024

Fix parameter name

19d46a0

melanieclarke mentioned this pull request Sep 9, 2024

JP-3639: Implement 1/f noise correction for ramp data #8669

Merged

8 tasks

Small changes to avoid artifacts near detector edges.

2608213

Merge branch 'master' into nsclean_subarray_only

b52bb05

Changed frequencies with ALLSLITS and mask_spectral_regions.

11c20f1

Changed default frequencies with ALLSLITS.

e0008ab

melanieclarke added 2 commits September 10, 2024 13:43

Merge branch 'master' into nsclean_subarray_only

3243b07

Merge branch 'master' into nsclean_subarray_only

12f5d3e

melanieclarke approved these changes Sep 11, 2024

View reviewed changes

melanieclarke merged commit 7b6b1a0 into spacetelescope:master Sep 11, 2024
28 of 29 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JP-3654: NSClean subarray speedup #8745

JP-3654: NSClean subarray speedup #8745

t-brandt commented Aug 30, 2024 •

edited by melanieclarke

Loading

codecov bot commented Aug 30, 2024 •

edited

Loading

melanieclarke left a comment

melanieclarke commented Sep 6, 2024

t-brandt commented Sep 6, 2024

melanieclarke commented Sep 6, 2024 •

edited

Loading

melanieclarke commented Sep 6, 2024 •

edited

Loading

melanieclarke commented Sep 6, 2024

melanieclarke commented Sep 6, 2024

t-brandt commented Sep 6, 2024 •

edited

Loading

melanieclarke commented Sep 9, 2024

t-brandt commented Sep 9, 2024

melanieclarke commented Sep 9, 2024 •

edited

Loading

t-brandt commented Sep 10, 2024

melanieclarke commented Sep 10, 2024

t-brandt commented Sep 10, 2024

melanieclarke commented Sep 10, 2024

melanieclarke commented Sep 11, 2024

melanieclarke left a comment •

edited

Loading

t-brandt commented Sep 11, 2024

melanieclarke commented Sep 11, 2024 •

edited

Loading

t-brandt commented Sep 11, 2024

hayescr commented Sep 11, 2024

JP-3654: NSClean subarray speedup #8745

JP-3654: NSClean subarray speedup #8745

Conversation

t-brandt commented Aug 30, 2024 • edited by melanieclarke Loading

codecov bot commented Aug 30, 2024 • edited Loading

Codecov Report

melanieclarke left a comment

Choose a reason for hiding this comment

melanieclarke commented Sep 6, 2024

t-brandt commented Sep 6, 2024

melanieclarke commented Sep 6, 2024 • edited Loading

melanieclarke commented Sep 6, 2024 • edited Loading

melanieclarke commented Sep 6, 2024

melanieclarke commented Sep 6, 2024

t-brandt commented Sep 6, 2024 • edited Loading

melanieclarke commented Sep 9, 2024

t-brandt commented Sep 9, 2024

melanieclarke commented Sep 9, 2024 • edited Loading

t-brandt commented Sep 10, 2024

melanieclarke commented Sep 10, 2024

t-brandt commented Sep 10, 2024

melanieclarke commented Sep 10, 2024

melanieclarke commented Sep 11, 2024

melanieclarke left a comment • edited Loading

Choose a reason for hiding this comment

t-brandt commented Sep 11, 2024

melanieclarke commented Sep 11, 2024 • edited Loading

t-brandt commented Sep 11, 2024

hayescr commented Sep 11, 2024

t-brandt commented Aug 30, 2024 •

edited by melanieclarke

Loading

codecov bot commented Aug 30, 2024 •

edited

Loading

melanieclarke commented Sep 6, 2024 •

edited

Loading

melanieclarke commented Sep 6, 2024 •

edited

Loading

t-brandt commented Sep 6, 2024 •

edited

Loading

melanieclarke commented Sep 9, 2024 •

edited

Loading

melanieclarke left a comment •

edited

Loading

melanieclarke commented Sep 11, 2024 •

edited

Loading