-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
group_id
in romancal
#1259
Comments
https://github.com/spacetelescope/jwst/blob/master/jwst/datamodels/container.py#L102-L129 Is there a reason not to take advantage of this? |
This issue is tracked on JIRA as RCAL-852. |
Thanks Brett. Broadly, I suspect that we should use this, with exposures defining the group; it would sound like that would mean adding group_id = L2 file name minus the suffix and the SCA name. Some more specific comments:
|
Thanks! To check that I'm following the discussion so far:
Since there are 2 uses for grouping (outlier detection and resample) and the code currently expects the grouping to be done in the container it sound like it makes sense to keep the grouping API on the container. This also has the benefit that if another step wants to used the groupings it's already part of the common code (as it is now with One additional question is do we want to support having the group id in the association? As @nden mentioned this is much more efficient (currently |
Looking at the group id calculation the current
Looking at the RAD schemas, would using |
Yes. Re sky-matching, we're not intending to fit a single offset per exposure at present, but there is a lot of uncertainty about what level of flexibility the sky matching will end up needing to have. So if it's very easy to have an option to either group by exposure or do individual SCAs, that would be nice. But it sounds like we're following our intended baseline.
That's fine; I don't feel strongly here. The only vague awkwardness about defining it in the file is that I really don't see many cases where we want the group to be something other than an exposure. So there's a little redundancy, but if that redundancy leads us to match Webb it's probably worth it. I can confirm that recently all L1 files produced by romanisim started including obs_id, and that it looks to me to include all we need to identify an exposure. A potential implementation concern is that it would be easy to try to make some fake exposures by copying real exposures and changing the file names, believing that one was generating exposures that would be treated as separate exposures. But if we fail to update obs_id you could be surprised when they're all part of the same exposure. Or, relatedly, we keep the same information in the actual file name, in meta.filename, and in obsid, and it can be hard to keep these in sync. So I might prefer a default group based on the first N characters of the file name. But I don't feel strongly here. |
Keeping things in sync is definitely a concern. Perhaps this is off topic but it looks like |
Right, obs_id is set in romanisim during the creation of the L1 files and then not changed by romancal. Yes, I would be sympathetic to removing meta.filename, but I'm told stpipe cares about it and so that would be a bigger lift. I do think there's a use to having an overall exposure id (obs_id presently?) in the downstream archive, etc., so I'm not really militating for getting rid of obs_id. |
Thanks! Good point about stpipe and |
@braingram I supposed this issue has been resolved by ModelLibrary, correct? |
Thanks! Yeah. I'll close the issue. |
Part of this closed issue was what Roman wants to do for grouping. I think conceptually the answer is:
|
I reopened this as I think what you described doesn't match what's currently used in romancal. ModelLibrary matched the previous grouping by ModelContainer and now generates a group_id using: romancal/romancal/datamodels/library.py Lines 79 to 81 in 1945dad
which is used for all steps that use grouping. The steps that use grouping are:
This doesn't quite match There is also no grouping in sky match. Is that the same as "group by SCA"? |
Thank you! It looks to me like obs_id is more correct; i.e., the current approach will group exposures that have different execution plans, passes, and segments, but the same other keywords. The way that I think about this, Roman data is a stream of exposures, each of which has 18 SCAs. If I controlled the Roman file names, they would just be r{EXPID}_wfi{SCA}.asdf, where EXPID just incremented exposure by exposure. We've chosen a more complicated structure for EXPID that has a lot of different pieces that we have endowed with some meaning, but it's rather artificial and we don't want the grouping to know about it. I expect library.py should just use obsid. No grouping in sky match is the same as grouping by SCA, since each WfiImage is one SCA, so no changes there, thanks. |
Switching the grouping to I started a draft PR here: |
For some operations (tweakreg, skymatch, resample) models are grouped based on a
group_id
.This is currently defined in
ModelContainer.models_grouped
:romancal/romancal/datamodels/container.py
Lines 498 to 504 in ed6187f
which assigns a computed value to
model.meta.group_id
models_grouped
is used in:Currently group id is always defined in
models_grouped
ignoring anygroup_id
in the model metadata (and ignoring anygroup_id
in the association, which is used for jwst).Most of the above steps are part of the mosaic pipeline with the only exception being tweakreg which is part of the exposure level pipeline.
There is no test for the exposure level pipeline that doesn't skip tweakreg so it is difficult to guess what should be done here.
For the mosaic pipeline it is possible the mosaic will contain models from different groups which will result in:
ResampleData
so this is likely always the case). Duringmany_to_one
each model is drizzled separately however the grouping is used for getting the number of pointings and for updating the exposure times.Some questions I have:
The text was updated successfully, but these errors were encountered: