Unique ids #883
Just put this up for discussion. Wondering if we should be more explicit about what makes an ID globally unique. Is it the collection ID plus the URL of one of the provider roles? Should that role be producer? Or host? Host is the one where we say explicitly there should be just one, so perhaps that makes the most sense.
I think these constraints would solve this:
I'd advocate for not having to parse out Provider information in order to make collection IDs unique. Providers aren't required, and the constraint would be simpler across catalogs and collections if the ID were required to carry all the information necessary for it to be globally unique (i.e. the STAC creator can insert the provider name into the IDs if that's needed to make them unique).
How would it align with radiantearth/stac-api-spec#36? P.S. I know that it is a bit unclear what would happen with the Aggregation Extension, but still decided to ask this question.
@pomadchin I'm unfamiliar with the aggregation extension, but from a read through it seems like it returns information about a range of items and not the items specifically - is that right? Or are you saying that the aggregation would return an actual Would this work if the aggregation extension, when it has to return references to individual items, were required to identify the items by both their collection ID and item ID?
@lossyrob It is a bit unclear (since that was just an oral conversation), but from what I understand it can be an actual It would definitely work if items were identified by both collection ID and item ID in such a collection. 👍
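To illustrate the idea of identifying an Item by both its collection ID and its item ID, here is a minimal sketch. It assumes Items are plain dicts parsed from STAC JSON, with the standard `id` field and the optional `collection` field; the function name `item_key` and the sample IDs are hypothetical and not part of the spec or of any library.

```python
from typing import Optional, Tuple


def item_key(item: dict) -> Tuple[Optional[str], str]:
    """Return a composite (collection ID, item ID) key for a STAC Item dict.

    Items without a `collection` field get None as the first element, so
    their item ID alone has to carry the uniqueness.
    """
    return (item.get("collection"), item["id"])


# Two hypothetical Items that share an item ID but live in different collections.
a = {"id": "scene-001", "collection": "collection-a"}
b = {"id": "scene-001", "collection": "collection-b"}

# The item IDs collide, but the composite keys do not.
same_id = a["id"] == b["id"]
same_key = item_key(a) == item_key(b)
```

With such a key, an aggregation that has to reference individual Items could do so without requiring item IDs to be unique outside their collection.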
I'm fine with @lossyrob's proposal, except that I would say:
Reason is that if you combine other catalogs/collections in new catalogs, then you can't always guarantee the uniqueness. Also, what is actually the root catalog? If there are two independent catalogs and I link to them from a new catalog, what is then the root? Re aggregations: I understood that aggregations could be Collections, but I don't think they should duplicate IDs. They are somewhat "virtual" anyway.
I don't think this is restrictive enough. The benefit of having globally unique collections and catalogs is that they can always be referenced by their IDs, and items can be globally referenced by their
Good point regarding APIs. But then I'd say they need to be unique for the parent collection, which is usually different from the root catalog. That works for APIs and makes combining several collections into whatever catalogs easier. You just can't always guarantee uniqueness if you combine different sources, like Matt does with Earth Search or I do with STAC Index (which are themselves catalogs again). If there's no collection, then it seems it must be the root catalog though.
Picking this up after a long delay... And I'm confused as to who is advocating what. The core recommendation seems to be that IDs should be unique within the parent collection. And we strongly recommend that. If items don't have a parent collection then it seems like they do need to be unique in the 'root'. There is the case where some meta catalog wants to include a catalog that doesn't have collections, and thus can't guarantee that the ID is unique. I'd say for that we just warn people that if they define items without collections then their catalogs just won't be nearly as widely used.
Ah, now I see that I was confused and was mixing up collections and items. I just committed an attempt at this. It basically says that IDs in a collection need to be unique, and that collection IDs should aim to be globally unique. It handles the 'no collection' use case by saying that Items should attempt to make their IDs globally unique (and reminds people we strongly recommend a catalog).
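The rule described here (item IDs unique within their collection, and Items without a collection competing with each other for uniqueness) can be checked with a short script. This is only a sketch under stated assumptions: Items are plain dicts with the standard `id` and optional `collection` fields, and `duplicate_item_ids` is a hypothetical helper, not an existing validator.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def duplicate_item_ids(items: Iterable[dict]) -> List[Tuple[Optional[str], str]]:
    """Return the (collection ID, item ID) pairs that occur more than once.

    Items that have no `collection` field are all grouped under None, so
    they are checked against each other as if they shared one scope.
    """
    counts = Counter((item.get("collection"), item["id"]) for item in items)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical sample data: one real clash inside collection-a, one
# harmless reuse of the same item ID in a different collection.
items = [
    {"id": "scene-001", "collection": "collection-a"},
    {"id": "scene-001", "collection": "collection-a"},
    {"id": "scene-001", "collection": "collection-b"},
    {"id": "orphan-1"},
]
clashes = duplicate_item_ids(items)
```

Only the pair repeated inside the same collection is reported; the same item ID in another collection is not treated as a conflict.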
In general, STAC versions can be mixed, but please keep the [recommended best practices](../best-practices.md#mixing-stac-versions) in mind.

#### id
I'd align with the order in the table and move this below `stac_extensions`.
@cholmes I guess you've not seen this?
I'm fine with unique Item IDs in a collection. I have issues with the requirement that the collection ID must be globally unique. I see that collection IDs should be unique across a provider, but no one can really guarantee global uniqueness. I don't think there will be broad adoption, especially as the PR doesn't say why this is useful. Additionally, most people have their collection IDs already defined before adopting STAC and I don't think many people will change them (at least I don't see anyone in openEO doing it, and I doubt GEE would do that; the obvious reason is that we use them in processing workflows, and having different IDs just in STAC would break things and confuse users).
Added suggested changes, otherwise +1
Co-authored-by: Rob Emanuele <[email protected]>
Ok, merging this. Committed Rob's changes, as Matt, Rob and Matthias all agree on not having the 'globally unique' stuff for collections.
Related Issue(s): #1011 #822