Remove Commons metadata / rework bands object #762

Closed
cholmes opened this issue Apr 12, 2020 · 10 comments

@cholmes
Contributor

cholmes commented Apr 12, 2020

Opening an issue for discussion that came up on our last STAC call - considering removing the 'commons' extension. A few reasons came up:

  • It's confusing to have both a 'commons extension' and 'common metadata'
  • We now have 'summaries' which are a cleaner way to do one of the things commons metadata was used for.
  • The only case it seems really useful is the 'bands' object.

The key thing to do is to figure out another way to handle the bands. I wonder if one option is to just make it something that can be added at the collection level?

cholmes added this to the 1.0.0-beta1 milestone Apr 12, 2020
@matthewhanson
Collaborator

Some additional thoughts:

The Commons extension served two purposes originally.

One was to present users with the fields that were the same across all Items in a Collection. As @cholmes points out, the summaries extension now does that in a better way. Fields with a single value have a summary of one value, but summaries can also cover all the other fields.
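
To make the summaries idea concrete, here is a minimal Python sketch of how a collection-level summary could be derived from Item properties. The `summarize` helper and the sample Items are hypothetical and not part of any STAC tooling; the only point is that a field shared by every Item collapses to a one-value summary, while a varying field lists several values.

```python
def summarize(items, fields):
    """Collect the distinct values of each field across all items, in first-seen order."""
    summaries = {}
    for field in fields:
        seen = []
        for item in items:
            value = item["properties"].get(field)
            if value is not None and value not in seen:
                seen.append(value)
        summaries[field] = seen
    return summaries


# Hypothetical Items: the constellation is shared, the cloud cover varies.
items = [
    {"properties": {"constellation": "example-constellation", "eo:cloud_cover": 10}},
    {"properties": {"constellation": "example-constellation", "eo:cloud_cover": 35}},
]

summaries = summarize(items, ["constellation", "eo:cloud_cover"])
```

With these sample Items, `constellation` ends up with a single-element list, which is the case the Commons extension used to cover.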

The other was to avoid redundancy. However, in practice there are really only a couple of fields (with the exception of eo:bands) that this ends up being used for, such as instruments or constellation. This is not much of a saving. And in an API it makes more sense not to use commons, as it makes querying extremely difficult. Furthermore, the confusion for a user who looks at an Item that does not actually contain all of its values does not seem worth it.
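
The querying problem can be sketched as follows. This is an illustration only: the `resolve_properties` helper, the sample data, and the rule that an Item's own values override the Collection's are assumptions made for the example, not something stated in this thread.

```python
def resolve_properties(item, collection):
    """Return the item's effective properties: collection-level values,
    overridden by whatever the item defines itself (assumed merge rule)."""
    merged = dict(collection.get("properties", {}))
    merged.update(item.get("properties", {}))
    return merged


# Hypothetical Collection carrying the shared ("commons") fields.
collection = {
    "id": "example-collection",
    "properties": {
        "instruments": ["example-sensor"],
        "constellation": "example-constellation",
    },
}

# Hypothetical Item that relies on the Collection for those fields.
item = {
    "id": "example-item",
    "collection": "example-collection",
    "properties": {"datetime": "2020-04-12T00:00:00Z"},
}

stored = item["properties"]                       # what a reader of the Item sees
effective = resolve_properties(item, collection)  # what a search would need to index
```

A search on `instruments` cannot match `stored`; it only works if every Item is merged with its Collection first, which is the extra step that makes querying awkward.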

The additional issue with the eo:bands properties is that it's a list of objects, so it's not queryable in any way currently. I also find it confusing to map out assets to bands.

I would propose two related changes:

  • return the Assets definition to core as an integral part of a collection
  • define spectral bands using the eo extension in the asset definition. Each asset will have an associated spectral_bands array of objects indicating the spectral bands in that asset. This might introduce some redundant fields (multiple assets which contain the same band), but since this is at the collection level and not on items, this really isn't a big deal.
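
As a rough sketch of what this proposal could look like, the Python dict below mirrors a Collection with asset definitions that each carry a `spectral_bands` array. The asset keys, band names, and the `assets_with_band` helper are hypothetical; only the `spectral_bands` field name comes from the proposal above.

```python
# Hypothetical Collection with collection-level asset definitions.
collection = {
    "id": "example-collection",
    "assets": {
        "B4": {
            "title": "Red band",
            "spectral_bands": [{"name": "B4", "common_name": "red"}],
        },
        "visual": {
            "title": "True color composite",
            # The red band is repeated here: the redundancy mentioned above.
            "spectral_bands": [
                {"name": "B4", "common_name": "red"},
                {"name": "B3", "common_name": "green"},
                {"name": "B2", "common_name": "blue"},
            ],
        },
    },
}


def assets_with_band(collection, common_name):
    """Return the keys of asset definitions containing a band with this common name."""
    return [
        key
        for key, asset in collection["assets"].items()
        if any(
            band.get("common_name") == common_name
            for band in asset.get("spectral_bands", [])
        )
    ]


red_assets = assets_with_band(collection, "red")
```

Because each asset lists its own bands, finding which assets hold a given band is a direct lookup with no separate band array to cross-reference.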

I suspect @m-mohr might have a problem with this since he has a use case where he doesn't have Items, but in that case I'm wondering how you use the eo:bands object anyway.

@m-mohr
Collaborator

m-mohr commented Apr 13, 2020

If you are exposing data as a cloud API / service, as Google and openEO do, you don't really want to expose the structure of the assets to users, which is why assets won't work well. What we do is just specify the bands in the summaries, as with all other fields. (In addition, openEO uses the data cube extension, where bands are a dimension.)

So, just having bands in assets is not ideal, and we discussed on the last call that the "properties"/"summaries" should still include something if there's a field specified in the assets (like with gsd). Still thinking myself about the best solution...

@cholmes
Contributor Author

cholmes commented Apr 14, 2020

We talked about this on the 4/13 call and agreed that putting the bands in assets is not ideal, and the path forward is just to move the bands object to the collection level. The structure would stay the same, it'd be defined in the extension (so not something that confuses people who are using STAC for non-EO purposes), and it would live as a top-level catalog thing. There is already precedent for other extensions defining top-level collection fields.

@m-mohr
Copy link
Collaborator

m-mohr commented Apr 14, 2020

I forgot about that yesterday: What happens for Items that don't have collections? Do they still list bands in properties? I'm just thinking about the use case Seth pushed earlier (single items) and I'm using now in openEO, too. That's basically what we added providers and license for in common metadata. Having bands only at the collection level would break the use case.

Other than that, I'm very happy to remove Commons. It solves so many ugly issues...

@cholmes
Contributor Author

cholmes commented Apr 14, 2020

I thought Seth had been fine with breaking that use case when we got the single file STAC extension. Like to just have one file where you have a collection + item (though I forget if it ever achieved that goal).

If we do want to support it I suppose we could just say that bands object can also be inserted at the properties level.

@m-mohr
Collaborator

m-mohr commented Apr 14, 2020

> I thought Seth had been fine with breaking that use case when we got the single file STAC extension. Like to just have one file where you have a collection + item (though I forget if it ever achieved that goal).

So we can also remove license and providers from the common metadata? I like that... ;-)
Whatever we do, it should be consistent. So if we add bands to collections and not to items, I'd remove at least providers (and probably also license).

> If we do want to support it I suppose we could just say that bands object can also be inserted at the properties level.

Agreed.

@cholmes
Contributor Author

cholmes commented Apr 14, 2020

> So we can also remove license and providers from the common metadata? I like that... ;-)

Yeah, I had thought we got there, though perhaps am misremembering, or maybe I dreamed it.

@matthewhanson
Collaborator

After discussion this morning with @anayeaye, @jbants and @m-mohr, we came to a more flexible solution here for the bands object.

We're closing #785 without merging. eo:bands stays at the Item properties level, which keeps it useful for eventual search or inclusion in summaries. With #760, eo:bands can be specified for a specific asset, so the index referencing can be removed. If you want to specify what bands an asset has, you don't reference another array; you just define an array of those bands.
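
A small Python sketch of that change, assuming the older layout had each asset hold integer indexes into the Item-level eo:bands array (which is how "index referencing" is read here). The `inline_asset_bands` helper, the hrefs, and the band values are hypothetical and only illustrate the before and after shapes.

```python
import copy


def inline_asset_bands(item):
    """Return a copy of the item where integer indexes in each asset's
    eo:bands are replaced by the band objects they point to."""
    result = copy.deepcopy(item)
    bands = result["properties"].get("eo:bands", [])
    for asset in result["assets"].values():
        refs = asset.get("eo:bands")
        if refs is not None:
            asset["eo:bands"] = [
                copy.deepcopy(bands[ref]) if isinstance(ref, int) else ref
                for ref in refs
            ]
    return result


# Hypothetical Item in the older, index-referencing style.
old_item = {
    "properties": {
        "eo:bands": [
            {"name": "B2", "common_name": "blue"},
            {"name": "B3", "common_name": "green"},
            {"name": "B4", "common_name": "red"},
        ]
    },
    "assets": {
        "B4": {"href": "./B4.tif", "eo:bands": [2]},
        "visual": {"href": "./visual.tif", "eo:bands": [2, 1, 0]},
    },
}

new_item = inline_asset_bands(old_item)
```

After the conversion each asset describes its own bands directly, while the Item-level eo:bands array is left in place for search and summaries.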

Finally, we will issue a new proposal for extending the asset definitions to do something similar to the Commons extension that is getting removed, but in a much more limited fashion.

@cholmes
Contributor Author

cholmes commented May 1, 2020

Just a note - whenever you do the commons extension removal be sure to remove this paragraph in the collection spec:

> A group of STAC Item objects from a single source can share a lot of common metadata. This is especially true with satellite imagery that uses the STAC EO or SAR extension. Rather than including these common metadata fields on every Item, they can be provided in the properties of the STAC Collection that the STAC Items belong to.

It's the second paragraph. Not sure how that got in there without even an indication that you need an extension to do this (unless I'm reading it wrong?).

@m-mohr
Collaborator

m-mohr commented May 1, 2020

I guess we forgot to change this when moving this behavior from core to the extension.
