Datasets with composite products VS datasets with imagery #45

danielfdsilva · 2022-03-09T10:08:17Z

It is planned to have datasets available in two different flavors.

⚠️ Nomenclature is still a bit fuzzy. @leothomas @anayeaye if you know better/official names for these let us know.

Composite products

Defined by a product that was already processed and is shown to the user with several variations. An example would be "NO2" or "CMIP6".
The different variations are available in the form of layers that can be seen on the map and may have compare options.

🤓 In STAC terms this means that the user can see different collections.

Interface

Dataset configuration

id: no2
name: Nitrogen Dioxide
layers:
  - id: no2-monthly
    name: Avg ...
  - id: no2-diff
    name: Diff ...

Imagery

Defined by "raw satellite imagery" whose different bands can be manipulated. An example could be "Landsat 8" data.
In this dataset the user can only explore one layer, but can view it in different ways by changing the bands.

🤓 In STAC terms this means that the user sees only 1 collection, with tiles rendered according to the provided band params.

Interface

Dataset configuration

We still have to think about how to configure a dataset that shows an imagery layer, but it could be something along these lines:

id: ls8
name: Landsat 8

layer:
  # Id needed for the stac collection
  id: landsat8
  # Different presets chosen from a list of options.
  # These are shown in the interface alongside "Custom"
  presets:
    - natural
    - false-color

@ricardoduplos does this capture the conversation we had yesterday? cc @olafveerman

@leothomas @anayeaye Does this make sense backend wise?

olafveerman · 2022-03-09T10:34:36Z

@danielfdsilva I think it would be useful to articulate why this distinction is necessary in the interface & configuration. What problem does it solve?

Interaction on 'composite' products

The different variations are available in the form of layers that can be seen on the map and may have compare options

Do you foresee other ways that people can interact with the data beyond compare?
For example, for CO2 we've spoken about two potential ways to visualize the product:

with a scale that uses the full dataset's min-max. This highlights year on year growth and shows how CO2 builds up over time
with a scale that uses a narrower min-max. This is useful to look at the strong seasonal cycles within a year

I can imagine this is a custom preset available to CO2.
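
The two CO2 scales described above could each be derived from per-timestep statistics. A minimal sketch, assuming the min/max of every timestep is already known; the `stats` mapping, its values, and the function names are hypothetical and only illustrate the idea:

```python
from typing import Dict, Tuple

Range = Tuple[float, float]


def full_range(stats: Dict[str, Range]) -> Range:
    """Min-max over every timestep: highlights the long-term build-up."""
    if not stats:
        raise ValueError("no statistics available")
    lows, highs = zip(*stats.values())
    return min(lows), max(highs)


def range_for_year(stats: Dict[str, Range], year: int) -> Range:
    """Narrower min-max limited to one year: highlights the seasonal cycle."""
    prefix = f"{year}-"
    return full_range({d: r for d, r in stats.items() if d.startswith(prefix)})


def rescale_param(value_range: Range) -> str:
    """Format a range as the "min,max" string used by a `rescale` setting."""
    low, high = value_range
    return f"{low:g},{high:g}"


# Placeholder numbers keyed by "YYYY-MM"; these are not real CO2 measurements.
stats = {
    "2019-01": (400.0, 410.0),
    "2019-07": (398.0, 408.0),
    "2020-01": (403.0, 413.0),
}
full = rescale_param(full_range(stats))
seasonal = rescale_param(range_for_year(stats, 2020))
```

Both results could then be offered as named presets on the CO2 layer, so that the user switches between them rather than typing values.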

Nomenclature

Regarding nomenclature. Is this similar to the EOSDIS Data Processing Levels, and if so, should we use that?

danielfdsilva · 2022-03-09T12:23:19Z

@olafveerman we're accounting for the possibility of dynamically changing the rescale value based on the date, so that different dates can have different values.

However, this is automatic (in a way) and does not depend on user input. Changing this behavior to be a knob (or a setting for a layer) is something that could be considered.
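
If the automatic behavior were later exposed as a layer setting, one possible resolution order is: explicit user choice first, then the value configured for the date, then a layer-wide default. A minimal sketch under that assumption; `resolve_rescale` and its arguments are hypothetical names, not existing dashboard code:

```python
from typing import Dict, Optional, Tuple

Rescale = Tuple[float, float]


def resolve_rescale(
    date: str,
    per_date: Dict[str, Rescale],
    layer_default: Rescale,
    user_override: Optional[Rescale] = None,
) -> Rescale:
    """Pick the rescale range for a date; a user-set knob wins if present."""
    if user_override is not None:
        return user_override
    # Automatic path: per-date value, falling back to the layer default.
    return per_date.get(date, layer_default)


# Placeholder values for illustration only.
per_date = {"2020-01": (0.0, 10.0), "2020-02": (0.0, 15.0)}
automatic = resolve_rescale("2020-02", per_date, (0.0, 20.0))
fallback = resolve_rescale("2021-01", per_date, (0.0, 20.0))
manual = resolve_rescale("2020-02", per_date, (0.0, 20.0), user_override=(2.0, 8.0))
```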

anayeaye · 2022-03-09T17:36:16Z

Thanks for this breakdown, this does make sense to me backend wise. I wrote down some more stac-specific thoughts on the dataset flavors (practice for what corresponding documentation might go into the ingest pipelines). I have a slight (not strong) preference on the naming convention but I think your overview is clear either way. I also think we can align with titiler on the presets configuration.

tl;dr

Composite Products → Just Products or Metrics?
Imagery → Spectral or Multi Band?
Presets: Check out the developing "defaults" concept in titiler-pgstac which might be something we can leverage if we store this kind of preset recipe info in config, or collections, or both

Product Datasets

Derived or modeled data that represent an interpretable metric, such as no2 or no2-diff, which are currently stored in a single band asset named cog_default. As far as describing a related set of collections, I feel like we may not want to use "composite" because it is a little loaded in remote sensing. Using processing level language is interesting (I guess no2 is Level 4?) but I'm not quite sure how that goes.

Can this just be "products" which are viewable as one or more layers?

Spectral Datasets, Multi Band Datasets, or Imagery Datasets

Raw multi band data that generally require preset or default band combinations to display a meaningful metric such as NDVI. Spectral/imagery collections like Landsat 8 will always be ingested using the eo extension. The extension is more relevant to science users but I am providing a link in case any of the descriptive information is useful for us.

These collections will have multiple band assets that will be summarized in the collection's "item_assets" property. We do not yet have any notion of storing stac information about how the individual bands should be assembled into meaningful products but...

Defaults/Presets

"presets" are similar to titiler-pgstac "defaults" so it may be useful to align on this. I think we should consider storing the information for default band combinations (product recipes) in the Collection metadata as well as delta-config. I agree that defaults/presets for a predefined min/max rescale range could be useful here, too, to enhance the stretch for specific purposes that might fade when stretched to the full collection range.

<SNIP>
    // OPTIONAL. Default: null. A set of `defaults` configuration.
    "defaults": {
        "true_color": {
            "assets": ["B4", "B3", "B2"],
            "color_formula": "Gamma RGB 3.5 Saturation 1.7 Sigmoidal RGB 15 0.35"
        },
        "ndvi": {
            "expression": "(B4-B3)/(B4+B3)",
            "rescale": "-1,1",
            "colormap_name": "viridis"
        }
    }
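
To illustrate how such a "defaults" entry could be turned into tile request parameters, here is a minimal sketch using only the Python standard library. It assumes the tiler accepts the same key names as query parameters and takes list values as repeated parameters; `default_to_query` is a hypothetical helper, not part of titiler-pgstac:

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode


def default_to_query(default: Dict[str, Any]) -> str:
    """Flatten one `defaults` entry into a URL query string."""
    params: List[Tuple[str, str]] = []
    for key, value in default.items():
        if isinstance(value, (list, tuple)):
            # Assumption: list values become repeated parameters.
            params.extend((key, str(item)) for item in value)
        else:
            params.append((key, str(value)))
    return urlencode(params)


# The two entries from the snippet above.
defaults = {
    "true_color": {
        "assets": ["B4", "B3", "B2"],
        "color_formula": "Gamma RGB 3.5 Saturation 1.7 Sigmoidal RGB 15 0.35",
    },
    "ndvi": {
        "expression": "(B4-B3)/(B4+B3)",
        "rescale": "-1,1",
        "colormap_name": "viridis",
    },
}
ndvi_query = default_to_query(defaults["ndvi"])
true_color_query = default_to_query(defaults["true_color"])
```

Keeping the recipe as data and deriving the query string from it means the same entry could live in config, in collection metadata, or in both.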

sharkinsspatial · 2022-03-16T23:05:38Z

@danielfdsilva @anayeaye This is awesome. I'm commenting here as @abarciauskas-bgse cc'ed me on a related issue. It seems that questions around how to model your STAC data store and then link it to the interface are distributed across a few issues in the delta-ui and delta-backend repositories. To simplify things, I'd like to focus this issue on how to expose tiling conventions for STAC collections to the UI. The biggest unknown is what information lives where across these 3 locations:

STAC collection endpoint
UI configuration yaml
pgstac-tiler mosaic endpoints

I like @danielfdsilva's breakdown (I agree with @anayeaye that some of the terminology could be tweaked, but conceptually this looks great). One way to view this is to start at the bottom (data) and move up through layers of abstraction to the UI configuration.

Let's use HLS L30 as a Multiband Imagery example. The dataset's underlying asset structure and data values are what drives the need for expression, color_formula and rescale values used by the tiler. In my mind this means that as @anayeaye mentioned we should be storing and advertising this information at the collection level (this makes the assumption that all of the items in your collection share a common asset structure and underlying data value consistency). Modeling this information in STAC has been proposed in the draft composite extension and could be easily applied to existing collections. I'm not fully sure of the implications of using this extension at the collection level rather than the item level but this should be feasible.

With this information advertised by the STAC API, the UI config yaml can simply list collection ids and naming (but could be extended as well to allow only particular virtual:assets). For example, this would be a sample (truncated) HLS L30 collection:

{
    "id": "HLSS30",
    "type": "Collection",
    "virtual:assets": {
        "NDVI": {
            "href": ["#B04", "#B05"],
            "title": "Normalized Difference Vegetation Index",
            "processing:expression": {
                "format": "rio-calc",
                "expression": "(B05-B04)/(B05+B04)"
            },
            "raster:bands": [
                {
                    "statistics": {
                        "minimum": -1,
                        "maximum": 1
                    }
                }
            ]
        }
    }
}

With a corresponding UI configuration yaml

id: hlsl30
name: HLS L30

layer:
  # Id needed for the stac collection
  id: HLSS30
  # Different presets chosen from a list of options.
  # These are shown in the interface alongside "Custom"
  presets:
    - NDVI

The UI can then query the STAC API collection at runtime to determine the corresponding expression and apply it to the tile layer's url template.
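
The runtime lookup described here could look roughly like the sketch below. It assumes the collection JSON has already been fetched from the STAC API and follows the `virtual:assets` layout of the sample above; the tiler URL template and the `tile_url_for_preset` helper are hypothetical, and the `expression` and `rescale` parameter names are taken from the titiler "defaults" snippet earlier in this thread:

```python
from typing import Any, Dict
from urllib.parse import urlencode

# Hypothetical template; a real deployment would supply its own tiler URL.
TILE_URL_TEMPLATE = "https://tiler.example.com/collections/{collection}/tiles/{z}/{x}/{y}"


def tile_url_for_preset(collection: Dict[str, Any], preset: str) -> str:
    """Build a tile URL template for one virtual asset of a collection."""
    asset = collection["virtual:assets"][preset]
    params = {"expression": asset["processing:expression"]["expression"]}

    # Use the advertised statistics, when present, as the rescale range.
    bands = asset.get("raster:bands") or []
    stats = bands[0].get("statistics", {}) if bands else {}
    if "minimum" in stats and "maximum" in stats:
        params["rescale"] = f"{stats['minimum']},{stats['maximum']}"

    # Only {collection} is filled in; {z}/{x}/{y} stay for the map library.
    base = TILE_URL_TEMPLATE.replace("{collection}", collection["id"])
    return f"{base}?{urlencode(params)}"


collection = {
    "id": "HLSS30",
    "type": "Collection",
    "virtual:assets": {
        "NDVI": {
            "href": ["#B04", "#B05"],
            "title": "Normalized Difference Vegetation Index",
            "processing:expression": {
                "format": "rio-calc",
                "expression": "(B05-B04)/(B05+B04)",
            },
            "raster:bands": [{"statistics": {"minimum": -1, "maximum": 1}}],
        }
    },
}
ndvi_url = tile_url_for_preset(collection, "NDVI")
```

A preset named in the UI config but missing from the collection raises `KeyError` here, which is one simple way to surface the UI/API synchronization issues mentioned below.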

This should work well in most use cases but unfortunately url expressions are not sufficiently rich for all common rendering use cases. More complex rendering functions can be injected into titiler using the DatasetParams plugin approach NASA-IMPACT/eoAPI@8457367. Currently, there is no approach for advertising these plugins via the titiler endpoints so we would most likely need to hardcode deployed plugins into the UI configuration yaml. This does leave us open to potential UI - API synchronization issues but these are omnipresent anyway.

abarciauskas-bgse · 2022-03-25T19:25:17Z

🙌🏽 Great discussion on this topic. To summarize: Given we have use cases where we need to include Spectral Datasets, Multi Band Datasets, or Imagery Datasets layers in the dashboard (specifically HLS for the environmental justice story) we should have some way for those datasets to have one or more possible band combinations (Natural Color, NDVI, etc).

Next steps:

Use the composite extension https://github.com/stac-extensions/composite#dynamic-tile-servers-integration for HLS collection metadata
Allow for the presets configuration yaml in delta-config, which presumably would allow for multiple layers for that imagery dataset in the dashboard (one for each preset). When a user selects a layer, the front end will call the STAC API to load the virtual assets expression from the collection metadata then call the dynamic tiler with that expression.

Future work may include additional complex rendering functions to be created and injected into titiler, following Sean's example.

A few implementation questions:

Do we need any additional configuration in the dataset yaml to identify composite vs imagery datasets?
Do we need to specify the presets if we just want to include all virtual assets as options to the user?
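
One possible answer to both questions, sketched under the assumption that the config shapes shown earlier in this thread are kept: the flavor is inferred from `layers` (plural) versus `layer`, and an omitted `presets` list means "offer every virtual asset". The function names and the "product"/"imagery" labels are hypothetical; nothing here was decided in the discussion:

```python
from typing import Any, Dict, List


def dataset_flavor(config: Dict[str, Any]) -> str:
    """Infer the dataset flavor from the config shape, with no extra flag."""
    if "layers" in config:
        return "product"
    if "layer" in config:
        return "imagery"
    raise ValueError(f"dataset {config.get('id')!r} has neither 'layers' nor 'layer'")


def available_presets(config: Dict[str, Any], collection: Dict[str, Any]) -> List[str]:
    """Return the presets to show for an imagery dataset."""
    advertised = list(collection.get("virtual:assets", {}))
    requested = config["layer"].get("presets")
    if not requested:
        # No explicit list: expose every virtual asset the collection advertises.
        return advertised
    unknown = [name for name in requested if name not in advertised]
    if unknown:
        raise ValueError(f"presets not advertised by the collection: {unknown}")
    return list(requested)


no2 = {"id": "no2", "layers": [{"id": "no2-monthly"}, {"id": "no2-diff"}]}
hls = {"id": "hlsl30", "layer": {"id": "HLSS30", "presets": ["NDVI"]}}
hls_all = {"id": "hlsl30", "layer": {"id": "HLSS30"}}
# Hypothetical collection advertising two virtual assets.
collection = {"id": "HLSS30", "virtual:assets": {"NDVI": {}, "natural": {}}}

flavors = (dataset_flavor(no2), dataset_flavor(hls))
listed = available_presets(hls, collection)
everything = available_presets(hls_all, collection)
```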

aboydnw · 2024-05-20T19:16:20Z

This ticket is pretty old, although my sense is that, even if the needs and use cases have changed, this still represents a gap in the dashboard's functionality. I'm going to leave it open, as it might relate to some future work on our roadmap about accommodating more science use cases, but @anayeaye @abarciauskas-bgse @hanbyul-here can you let me know if there is some more immediate work that needs to get done here?

aboydnw · 2024-08-16T19:39:33Z

Closing as stale. Let me know if we need to reopen.

This was referenced Mar 10, 2022

STAC Collection Creation Conventions (Dashboard Specific) NASA-IMPACT/veda-backend#29

Closed

Define HLS layers for environmental justice story NASA-IMPACT/veda-data-pipelines#89

Closed

sharkinsspatial mentioned this issue Mar 29, 2022

[WIP] Define best practices for defining collections when produced for specific stories NASA-IMPACT/veda-data#58

Open

anayeaye mentioned this issue Jun 7, 2022

Optional assume role in raster-api for external S3 bucket read permissions NASA-IMPACT/veda-backend#56

Merged

aboydnw closed this as not planned Aug 16, 2024
