Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why is stac_version not in default_includes? #253

Closed
fwfichtner opened this issue Sep 7, 2021 · 8 comments
Closed

Why is stac_version not in default_includes? #253

fwfichtner opened this issue Sep 7, 2021 · 8 comments

Comments

@fwfichtner
Copy link

fwfichtner commented Sep 7, 2021

I might be missing something, but why is stac_version not part of default_includes?

This way you always need to specify it as a field when searching if you want to determine the stac object with pystac because it is expected (here or here).

curl http://localhost:5442/collections/dem90/items/Copernicus_DSM_COG_30_N00_00_E006_00_DEM would give me the item like expected, but when searching, e.g. curl http://localhost:5442/search?ids=Copernicus_DSM_COG_30_N00_00_E006_00_DEM stac_version is getting removed here.

Python example:

from pystac_client import Client

catalog = Client.open("http://localhost:5442/")

mysearch = catalog.search(collections=['dem90'], ids="Copernicus_DSM_COG_30_N00_00_E006_00_DEM")
for item in mysearch.get_items():
    print(item.id)

Leads to:

Traceback (most recent call last):
  File "...test.py", line 26, in <module>
    for item in mysearch.get_items():
  File ...pystac_client\item_search.py", line 415, in get_items
    for item_collection in self.get_item_collections():
  File "...pystac_client\item_search.py", line 403, in get_item_collections
    yield ItemCollection.from_dict(page, root=self.client)
  File "...pystac\item_collection.py", line 154, in from_dict
    raise STACTypeError("Dict is not a valid ItemCollection")
pystac.errors.STACTypeError: Dict is not a valid ItemCollection
@jaysnm
Copy link
Contributor

jaysnm commented Sep 20, 2021

Hi @fwfichtner

I have exactly same issue and now unable to proceed with search.get_all_items() of pystac client which is using stac_version to validate STAC_ITEM as you have pointed in your ticket.

@geospatial-jeff
Copy link
Collaborator

Thanks for the issue! Very good question. I can't explain why this is, but it should be fixed. The fields extension is supposed to return at minimum a valid STAC item and stac_version is a required field. PRs are welcome :)

@jaysnm
Copy link
Contributor

jaysnm commented Sep 20, 2021

Thanks for reverting @geospatial-jeff

I just found out that besides stac_version attribute that is required by pystac client, ODC tools require at least properties.proj:epsg and properties.odc:product attributes advertised in order to be able to to index STAC Items on OpenDataCube. None value returned by https://github.com/opendatacube/odc-tools/blob/develop/libs/stac/odc/stac/transform.py#L71 causes error on line https://github.com/opendatacube/odc-tools/blob/develop/libs/stac/odc/stac/transform.py#L333. properties.proj:epsg is required by line https://github.com/opendatacube/odc-tools/blob/develop/libs/stac/odc/stac/transform.py#L314

@geospatial-jeff
Copy link
Collaborator

geospatial-jeff commented Sep 20, 2021

properties.proj:epsg and properties.odc:product

That's interesting. I hesitate to include these in default_includes because not all users may implement these extensions. I'd be curious to understand what pydantic's default behavior is if you include a field that doesn't exist in the model. If that doesn't cause any fuss I think an argument could be made for including at least proj:epsg as it is mentioned in the best practices doc (https://github.com/radiantearth/stac-spec/blob/master/best-practices.md#common-use-cases-of-additional-fields-for-assets).

@kylebarron
Copy link
Contributor

kylebarron commented Sep 20, 2021

pydantic's default behavior is if you include a field that doesn't exist in the model

https://pydantic-docs.helpmanual.io/usage/model_config/

extra
whether to ignore, allow, or forbid extra attributes during model initialization. Accepts the string values of 'ignore', 'allow', or 'forbid', or values of the Extra enum (default: Extra.ignore). 'forbid' will cause validation to fail if extra attributes are included, 'ignore' will silently ignore any extra attributes, and 'allow' will assign the attributes to the model.

So you might want to consider extra: allow

@jaysnm
Copy link
Contributor

jaysnm commented Sep 21, 2021

Another feasible alternative to including custom extension fields by default is to make the fields extensible through environment variables. pydantic extra: allow would come in handy here!

moradology added a commit that referenced this issue Sep 21, 2021
@geospatial-jeff
Copy link
Collaborator

I don't think extra: allow is relevent here. In the sqlalchemy backend the inclusion/exclusion of fields happens upon exporting the model (https://pydantic-docs.helpmanual.io/usage/exporting_models/#advanced-include-and-exclude). But it looks like you can have fields specified in the includes that aren't part of the model and pydantic doesn't complain. Which means its feasible to think about including extension-specific fields like in default_includes.

>>> from pydantic import BaseModel
>>> class MyModel(BaseModel):
...     foo: int
... 
>>> inst = MyModel(foo=1)
>>> inst.dict(include={'bar'})
{}

I wonder how pgstac behaves.

@geospatial-jeff
Copy link
Collaborator

Closed with #268 . Thanks all 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants