Using opsim cadence #29

Open
rbiswas4 opened this issue Jul 28, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@rbiswas4

This is in response to the question raised by @sibirrer as to whether cadences from OpSim can be obtained by OpSimSummary for use in the sim-pipeline software.

So, OpSimSummary is public software in DESC and is therefore certainly available. It was originally written to enable supernova simulations based on the LSST cadence.

Its basic functionality (let us call this functionality 1) is to provide the set of opsim visits over the full LSST survey that would contain a particular point (e.g. a transient location), and then obtain the full set of metadata describing each visit. If these visits are found by a simple distance check around the point, every transient requires going through ~2-2.5 million rows (roughly the number of visits in any opsim output). This is made a little faster using trees.
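As a rough illustration of functionality 1, here is a minimal sketch of the simple distance check, plus a cheap declination-band prefilter standing in for the tree-based speed-up. This is not the OpSimSummary implementation or API: the visit table, the function names and the 1.75 degree circular field-of-view radius are all assumptions made for the example.

```python
import math
from bisect import bisect_left, bisect_right

FOV_RADIUS_DEG = 1.75  # assumption: circular field of view of this radius


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


# Hypothetical visit table: (visit_id, ra_deg, dec_deg) of each pointing centre.
# A real opsim output would have millions of rows and many more columns.
visits = [
    (0, 10.0, -30.0),
    (1, 10.5, -30.2),
    (2, 50.0, -30.0),
    (3, 10.0, -10.0),
    (4, 11.0, -31.0),
]


def visits_brute_force(ra, dec, visits, radius=FOV_RADIUS_DEG):
    """Simple distance check: test every visit against the point."""
    return [vid for vid, vra, vdec in visits
            if angular_sep_deg(ra, dec, vra, vdec) <= radius]


# Pre-sort once by declination so each query only scans a narrow band.
by_dec = sorted(visits, key=lambda v: v[2])
dec_keys = [v[2] for v in by_dec]


def visits_dec_band(ra, dec, radius=FOV_RADIUS_DEG):
    """Same answer as the brute-force check, but only visits whose
    declination lies within `radius` of the point are tested."""
    lo = bisect_left(dec_keys, dec - radius)
    hi = bisect_right(dec_keys, dec + radius)
    return [vid for vid, vra, vdec in by_dec[lo:hi]
            if angular_sep_deg(ra, dec, vra, vdec) <= radius]


print(sorted(visits_brute_force(10.2, -30.1, visits)))
print(sorted(visits_dec_band(10.2, -30.1)))
```

The band prefilter is valid because the angular separation between two points is never smaller than their difference in declination. A proper spatial tree prunes in both coordinates and scales better, but the idea is the same.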

For SNANA work, the list of visits and their metadata is required in a particular format for a discrete set of points, in a file usually called a simlib. This can be generated using functionality 1 above. Let us call simlib generation functionality 2.

Some of the recent changes:

  • As we know, OpSim output formats have changed over time. There is currently a branch, used in the paper we saw, in which the basic functionality described above works with opsim versions 2.99 and 3.0. These versions do not have a proposal table (a table listing the different kinds of surveys, DDF, WFD etc., as was done in the past).
  • I am trying to check that simlib generation also works in these versions, treating the DDF and other kinds of fields the same.
  • @chrisfrohmaier has submitted a PR with some MAF-based code that essentially sets up the equivalent of the proposal table and then runs opsimsummary using the identifications of fields. I plan to merge this here as well.
  • I also want to update the documentation a bit (which should include a snippet of code to run this on the latest versions of opsim).

I will also want to take a look at the gg-lensing branch you pointed to ... Thanks !

Questions I have about your use of Opsim:

For the strong lensing sim-pipeline case:

  1. Do you intend to use functionality 1? Or 2?
  2. Do you care about knowing which observation is DDF or something else?
@rbiswas4
Author

If you go with 1, I think the kind of example that would be of help (once updated to the latest cadence) is https://github.com/LSSTDESC/OpSimSummary/blob/Issue%23325/proposalTables/example/Demo_SynOpSim.ipynb.

Obviously it needs an update to work with current versions (which happens in the branch I mentioned).

@sibirrer
Contributor

Thank you very much @rbiswas4 ! This is super helpful and we are going to look into it and reach out if/when we have questions.

On your questions:

  1. Certainly option 1 would be helpful. I wonder if there is also a version to create a set of observation sequences for a given point on-the-fly with monte-carlo processes (instead of querying a long table asking for matched observations) for speed-up, such that we can produce a lot of training sets of time-series images. If option 1 comes with PSF properties, noise properties etc., we might be able to simulate single-visit images from there. But if simlib is flexible enough to also allow injecting other irregular sources (such as lensed arcs in GalSim format or as sharp images), then option 2 might also be very useful.

  2. Yes, as lens searches might be different in the DDF and regular survey, effectively needing different training sets and cuts.

And in more general terms: we hope that the sim-pipeline will be able to simulate all sorts of lensed transients at the population level, and as such we are mostly interested in how you want to use lensed SNe simulations (or other transients). The package aims to be modular and we are building different source and lens populations.

I am linking @nkhadka21 here, as this might come up soon once we start building the simulations for transients.

@sibirrer sibirrer added the enhancement New feature or request label Jul 28, 2023
@rbiswas4
Author

Thank you @sibirrer

I wonder if there is also a version to create a set of observation sequences for a given point on-the-fly with monte-carlo processes (instead of querying a long table asking for matched observations) for speed-up, such that we can produce a lot of training sets of time-series images.

So, if I understand correctly, you are thinking about obtaining an observation sequence which is not taken from the OpSim database but has the same statistical properties as observations in the OpSim database. No, I don't have anything like that.

Are you imagining this for speedup? If the rest of your simulation pipeline is fast enough that this is a bottleneck, we could run opsimsummary on a discrete set of points sampled from the LSST footprint and save the results to disk. During your sim pipeline, you could simply read off the saved file. This is what the simlib idea (functionality 2) is (aside from things like file formats). This would be super fast, but of course not on the fly like using functionality 1.
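The precompute-and-read-back idea could look roughly like the sketch below. It is only an illustration under stated assumptions: `toy_find_visits` is a hypothetical stand-in for a functionality 1 query, the helper names are made up, and JSON is used purely for convenience (it is not the SNANA simlib format).

```python
import json
import math


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def toy_find_visits(ra, dec):
    """Hypothetical stand-in for a functionality 1 query; returns fake visits."""
    return [{"mjd": 60000.0 + 3 * i, "band": "gri"[i % 3]} for i in range(3)]


def build_cache(points, find_visits, path):
    """Run the (slow) visit query once per sampled point and save to disk."""
    records = [{"ra": ra, "dec": dec, "visits": find_visits(ra, dec)}
               for ra, dec in points]
    with open(path, "w") as fh:
        json.dump(records, fh)


def load_cache(path):
    """Read the precomputed observation sequences back from disk."""
    with open(path) as fh:
        return json.load(fh)


def nearest_record(cache, ra, dec):
    """Return the cached record whose sampled point is closest on the sky."""
    return min(cache, key=lambda r: angular_sep_deg(ra, dec, r["ra"], r["dec"]))
```

In a simulation loop, `build_cache` would be run once up front, and each simulated transient would then call `nearest_record` on the loaded cache instead of querying the full visit table.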

Or are you thinking about this to create a very large number of 'independent' samples? In that case, you would need something like a sampler?

@sibirrer
Contributor

@rbiswas4 thanks a lot! Ah, indeed, so this would be a feature of simlib then. We are certainly planning for large 'independent' time series, but we might also be able to patch them together from a smaller set of discrete observation points to give them enough 'independence'.

For very large training sets/series, we might still want an on-the-fly generation, but perhaps it's not needed. So we can keep these options on the books.

3 participants