replacement of retPath by "links" #86

Closed
petersilva opened this issue Apr 10, 2022 · 20 comments

Comments

@petersilva
Contributor

As part of #78, the retrievalPath field is also rejected in favour of a complete URL for use by APIs.
The STAC standard is referenced as a model, with their "links" field.

@petersilva
Contributor Author

Instead of components concatenated together to create an unambiguous URL, the "links" mechanism proposes a set of links with relations. Each link is expected to have a relation (link-relation) with the message. Which link relation is used to designate a retrieval URL? Is it always the same, or will it vary in different situations? I think people are currently proposing "canonical" as the link relation to use.

@petersilva
Contributor Author

@josusky @tomkralidis

  • what relation should the link use?
  • Do we profile so that only that relation is permitted, or do we need to permit other links?

my current v04 prototype is missing the links field because I don't know what links should contain yet.

@petersilva
Contributor Author

The example from ET-AT specifies a "canonical" relation, but it isn't clear whether that is an example or a specification. The ET-AT design has caches: will the links supplied by caches also be canonical, or perhaps alternate, duplicate, or something else? Can compliant clients simply take the first link listed in links, or do they have to pay attention to "rel", and if so, what are the rules for deciding among multiple links with different rel values?

@petersilva
Contributor Author

I can't find any standard source for "links"... it's not part of JSON or GeoJSON... I guess it is from STAC?
https://github.com/radiantearth/stac-spec/blob/master/best-practices.md#use-of-links

@petersilva
Contributor Author

petersilva commented May 10, 2022

OK, reading the above: "alternate" is for alternate forms (e.g. TAC vs. BUFR), so it is not good for binary copies. "duplicate" would appear more appropriate for cache links.

@tomkralidis

tomkralidis commented May 11, 2022

> @josusky @tomkralidis
>
> * what relation should the link use?
> * Do we profile so that only that relation is permitted, or do we need to permit other links?
>
> my current v04 prototype is missing the links field because I don't know what links should contain yet.

We should allow for 1..n links, with at least a rel=canonical to identify the actual download link.
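A minimal sketch of what that rule could look like on the client side, assuming each entry in "links" is a JSON object with "rel" and "href" keys. The function name and the example hosts are made up for illustration; nothing here is taken from an agreed specification.

```python
def find_download_link(links):
    """Return the href of the first link whose rel is "canonical".

    Assumption: a compliant message carries at least one such link,
    so its absence is treated as an error rather than guessed around.
    """
    for link in links:
        if link.get("rel") == "canonical":
            return link["href"]
    raise ValueError("message has no link with rel=canonical")


# Hypothetical message content: list order is deliberately not significant.
example_links = [
    {"rel": "duplicate", "href": "https://cache.example/p0"},
    {"rel": "canonical", "href": "https://nc.example/p0"},
]
print(find_download_link(example_links))
```

Selecting by "rel" rather than by position means a client does not depend on the order in which a publisher happens to list its links.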

@tomkralidis

> The example from ET-AT specifies a "canonical" relation, but it isn't clear whether that is an example or a specification. The ET-AT design has caches: will the links supplied by caches also be canonical, or perhaps alternate, duplicate, or something else? Can compliant clients simply take the first link listed in links, or do they have to pay attention to "rel", and if so, what are the rules for deciding among multiple links with different rel values?

Whether the data is cached depends on data policy [core/recommended] or on other considerations (e.g. satellite data too heavy to be cached). If the data is cached, then rel=canonical is the cache; otherwise the upstream/originating source is rel=canonical.

@tomkralidis

> I can't find any standard source for "links"... it's not part of JSON or GeoJSON... I guess it is from STAC? https://github.com/radiantearth/stac-spec/blob/master/best-practices.md#use-of-links

See discussion in https://gist.github.com/tomkralidis/bcd7067b02e5321478ff219d3edf9cd5?permalink_comment_id=4145491#gistcomment-4145491

@petersilva
Contributor Author

petersilva commented May 11, 2022

@tomkralidis That's not helpful... I'm asking what standard is being referenced. Does this mean no standard is being referenced, and links is an invention for this application? If so, I fail to understand the interop argument.

On the other point about rel=: when we start adding multiple links to a message, it becomes interesting. Is the original included as a duplicate?
Walking through a case:

  • nc produces an "original" posted to a global broker on a channel the caches are subscribed to. Call this message p0nc: product 0 (identified by the hierarchical product identifier), announced with a canonical link from the nc.
  • caches c1, c2, c3, ... cn are expected to download from nc or a peer cX to have a cached copy.
  • c1 gets p0nc, downloads and re-publishes p0c1nc with a link from c1 as canonical, and nc as duplicate. (this notation the first node identifier after
  • c2 gets p0nc, and does likewise, with p0c2nc
  • c2 gets m0c1, and publishes (without download) p0c1c2nc
  • c1 gets m0c2, and publishes (without download) p0c2c1nc

Question: when c1 gets the message from c2, should it make a new message p0c1c2 with the nc and the other cache listed as duplicates? Then we have two messages with the same content, p0c1c2 and p0c2c1, each with three links but a different one designated as canonical. So for each product, we end up circulating a number of messages equal to the square of the number of caches c (if the nc participates, then c+1): in the case of three caches, 9 messages per product.

That's what you want?
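To make the count concrete, here is a small sketch that enumerates the messages under a simplified one-hop model: every cache re-publishes the nc's announcement once, and then re-publishes each peer's first-round message once (without downloading again). The function name is hypothetical, and the labels follow the walkthrough's notation (p0, then the sender, then the receiver, then nc).

```python
def simulate_republication(caches, origin="nc"):
    """List (publisher, label) pairs for every message the caches publish."""
    published = []

    # Round 1: each cache downloads from the origin and re-publishes,
    # e.g. c1 publishes p0c1nc.
    for cache in caches:
        published.append((cache, f"p0{cache}{origin}"))

    # Round 2: each cache receives every peer's round-1 message and
    # re-publishes it with itself added, e.g. c2 gets c1's message
    # and publishes p0c1c2nc.
    for receiver in caches:
        for sender in caches:
            if sender != receiver:
                published.append((receiver, f"p0{sender}{receiver}{origin}"))

    return published


messages = simulate_republication(["c1", "c2", "c3"])
print(len(messages))  # c + c*(c-1) = c**2 messages for c caches
for publisher, label in messages:
    print(publisher, label)
```

The model stops after the second round; if caches also re-published second-round messages, the total would grow further.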

@petersilva
Contributor Author

petersilva commented May 11, 2022

Follow-up: one way of reducing from n**2, using examples of other messages in the set:

  • p0c1c2c3nc, p0c1c3c2nc.

For all practical purposes these messages are identical, so consider them as such. According to the current ET-AT specification they should be assigned different message IDs. Is that also what we want?
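One possible way to "consider them as such" is to compare messages by a key that ignores the order of the non-canonical links. This is only a sketch: the "product" field, the function name, and the hosts are hypothetical, and it assumes every message has exactly one rel=canonical link.

```python
def equivalence_key(message):
    """Build a key that is identical for messages differing only in link order."""
    links = message["links"]
    # Assumption: exactly one canonical link per message.
    canonical = next(l["href"] for l in links if l["rel"] == "canonical")
    # A frozenset discards the ordering of the remaining links.
    others = frozenset(l["href"] for l in links if l["rel"] != "canonical")
    return (message["product"], canonical, others)


def make_message(message_id, hosts):
    """Helper for the example: first host is canonical, the rest duplicates."""
    links = [{"rel": "canonical", "href": f"https://{hosts[0]}.example/p0"}]
    links += [{"rel": "duplicate", "href": f"https://{h}.example/p0"} for h in hosts[1:]]
    return {"id": message_id, "product": "p0", "links": links}


p0c1c2c3nc = make_message("id-a", ["c1", "c2", "c3", "nc"])
p0c1c3c2nc = make_message("id-b", ["c1", "c3", "c2", "nc"])
print(equivalence_key(p0c1c2c3nc) == equivalence_key(p0c1c3c2nc))
```

The message IDs still differ, so a subscriber using such a key would be deduplicating on content rather than on ID.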

@tomkralidis

> @tomkralidis That's not helpful... I'm asking what standard is being referenced. Does this mean no standard is being referenced, and links is an invention for this application? If so, I fail to understand the interop argument.

As mentioned we are leveraging the OGC standards for links. OGC API - Features provides a link model which is based on OGC API - Common (http://docs.ogc.org/DRAFTS/19-072.html#link-conventions).

@petersilva
Contributor Author

That last link is very helpful. Thanks!

@antje-s
Contributor

antje-s commented May 23, 2022

For better understanding...
Example:

  1. WIS2 Data Producer Centers publish messages with a link to their target with rel=canonical
  2. WIS2 Global Caches re-publish the message with the cache link for download with rel=canonical
     • also including the original link as rel=original ?
  3. WIS2 Message Brokers re-publish all messages unmodified (as received from subscriptions to Global Caches AND as received from Producer Centers)
  4. NO WIS2 Center should re-publish modified messages
     • if there are special agreements with partners, e.g. that publishing is to be taken over, the message is published only by this WIS2 Center as rel=original

All subscribers should use the link with rel=canonical for automated download (standard case)
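As a rough sketch of step 2 in the example above, a Global Cache could rewrite the links like this. It assumes a message is a dict with a "links" list; the function name and hosts are hypothetical, and "original" is the relation asked about in the list, not an agreed value.

```python
import copy


def republish_from_cache(message, cache_href):
    """Return a copy of the message whose canonical link points at the cache."""
    out = copy.deepcopy(message)  # leave the received message untouched
    new_links = [{"rel": "canonical", "href": cache_href}]
    for link in out.get("links", []):
        if link.get("rel") == "canonical":
            # Demote the producer's link; "original" is an assumption.
            link = dict(link, rel="original")
        new_links.append(link)
    out["links"] = new_links
    return out


producer_message = {
    "id": "example-id",
    "links": [{"rel": "canonical", "href": "https://nc.example/data/p0"}],
}
cached = republish_from_cache(producer_message, "https://cache.example/data/p0")
for link in cached["links"]:
    print(link["rel"], link["href"])
```

With this shape, a subscriber that only ever follows rel=canonical downloads from the cache without needing to know that a cache is involved.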

How would you design messages for data in different download formats?

  1. should a separate message be created for each download format?
  2. OR should the message contain further links with "rel": "alternate" and the respective other type value?

With option 2, fewer messages are transmitted, but more logic would be necessary in the client...

@josusky
Contributor

josusky commented May 26, 2022

I am getting confused here. The list at [https://defs.opengis.net/vocprez/object?uri=http%3A//www.opengis.net/def/rel] shows some other options. To me, the most suitable seems "data" (aka "http://www.opengis.net/def/rel/ogc/1.0/data"). That would allow publishing notifications about data available in multiple formats at once like this:

"links": [
    {
      "href": "http://www.example.com/data/4Pubsub/my-data.bufr",
      "rel": "http://www.opengis.net/def/rel/ogc/1.0/data",
      "type": "application/x-bufr"
    },
    {
      "href": "http://www.example.com/data/4Pubsub/my-data.json",
      "rel": "http://www.opengis.net/def/rel/ogc/1.0/data",
      "type": "application/json"
    }
  ]

Of course, we may decide to allow only one "data" link in our profile. But how do these OGC link "rel" types interoperate with the IANA registered types, that is with "canonical", "duplicate", "alternate" etc.?

@tomkralidis

For a data object with multiple representations, we should publish a single message with multiple link objects.
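A sketch of the client logic this implies, reusing the two link objects from the example above. It assumes the "data" relation URI proposed earlier in the thread and that the client expresses its preference as an ordered list of media types; the function name is hypothetical.

```python
DATA_REL = "http://www.opengis.net/def/rel/ogc/1.0/data"


def pick_representation(links, preferred_types, rel=DATA_REL):
    """Return the href of the first preferred media type offered, or None."""
    # Index the links carrying the wanted relation by their media type.
    by_type = {l.get("type"): l for l in links if l.get("rel") == rel}
    for media_type in preferred_types:
        if media_type in by_type:
            return by_type[media_type]["href"]
    return None


links = [
    {
        "href": "http://www.example.com/data/4Pubsub/my-data.bufr",
        "rel": DATA_REL,
        "type": "application/x-bufr",
    },
    {
        "href": "http://www.example.com/data/4Pubsub/my-data.json",
        "rel": DATA_REL,
        "type": "application/json",
    },
]
print(pick_representation(links, ["application/json", "application/x-bufr"]))
```

This is the extra client logic the single-message approach trades for fewer messages: the subscriber has to look at both "rel" and "type".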

The OGC specific rel types are defined in various OGC specifications. This means they are extensions to the IANA link relations and allowable in the context of the given specification (example 1, example 2).

This also means we (WMO) can define our own rel types as part of the message specification if required.

@petersilva
Contributor Author

There is a new format here:

https://github.com/wmo-im/wis2-notification-message

Should all discussion of message format transition there?

@josusky
Contributor

josusky commented Jul 11, 2022

Sorry, I did not notice that proposal. It would be nice to work in one repository, preferably in this one, that is https://github.com/wmo-im/GTStoWIS2/tree/main/message_format. I created a PR there earlier today.

@petersilva
Contributor Author

No worries... I'm asking ET-AT to clarify what they are expecting tt-protocols to do. Given this whole replacement proposal, I'm probably going to pause the committee work for now.

@petersilva
Contributor Author

ET-AT is taking over the specification. Further discussion here:

https://github.com/wmo-im/wis2-notification-message
