-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Files endpoint #360
Files endpoint #360
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single comment for now. I'll do another review later with pedantic syntax corrections when we all start to agree that this should go in.
Great effort on this though - and I'm not at all against an explicit /files
endpoint. It makes it more specific and clear where to place these kind of entries - instead of in a meta
or as a links
value... Although, the latter there might be a solution if a single file represents any one entry (e.g., a CIF file for a structure, or a single paper for a reference).
Finally, for completeness I'd like to keep the JSON API philosophy of keeping all links or rather URL-related values as a mix of a string, a Links object, and null
. For the specific attribute url
here, it's fine to not allow null
, but it doesn't make sense to me to disallow the use of a Links object - other than just because. It might be that one wants to specify a meta
value for the links that this links only works locally on a specific computer (which is very weird) or that the link points to an external resource. E.g., I could think of a case where a structure in an AiiDA DB comes from COD, so the file is kept and the url
can then both point to the COD link and state as much.
To counter-argue my last point, one could say that this information could also be put directly into the files
-entry's meta
field (that's at the same level as id
, type
, attributes
, etc.)
Thanks!
I agree that single file could be linked more easier, but I prefer more homogeneous approach.
My motivation for describing |
Looks like the CI was initially failing here due to the Fastly outage (which impacted PyPI), everything seems okay now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, except for some URL adjustments and a suggested addition.
Also a question: what is it in the present specification that says which entry types/endpoints are optional to implement? Based on what in the specification can I presently decide not to implement the /files
endpoint?
Co-authored-by: Rickard Armiento <[email protected]>
Co-authored-by: Rickard Armiento <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this up @merkys, we should definitely add this functionality to OPTIMADE somehow.
I have two general comments (with more specific comments below):
- It looks like the aim isn't really to allow querying on a
files
endpoint, so perhaps we could simplify this by just adding the fields described here into themeta
of a "special" kind of JSON:API link entry? That way, instead of needing to define a new entry type (with corresponding endpoints), files can be served as JSON:APIlinks
(per structure entry), rather than therelationships
/included
route. - If we decide to go ahead with this approach (rather than my point above), does that make a
/files
endpoint mandatory? I would rather not maintain a database of files but instead serve everything on-the-fly (I think this would be a common use case for people with computational data). In that case, my/files
endpoint might return no data, but a particular file id/files/structure/1.cif
might have a response (where the file ID isstructure/odbx/1.cif
)...
I think we have discussed this approach in #211 (it was my initial suggestion), and the consensus there was that
I see your point in not serving the
On the other hand, if served on-the-fly, entry listing could be easily populated with just minimal data: {
"id": "1.cif",
"type": "files",
"attributes": {
"url": "/files/structure/1.cif"
}
} |
Just a thought – it is usually very useful when files in any file system have as a subset attributes mandated by POSIX; in particular atime, mtime, ctime. Should we maybe include them? |
I can in principle introduce provisions for optional |
@CasperWA @ml-evs @rartino @sauliusg Are there any blockers left for this PR? Please let me know if I did not address any of your comments. I think standardizing of the way we link in files is of relatively high priority. For example, we at the COD very much would want to provide CIF files for the COD entries, since CIF files are "the ultimate source of truth" for the crystal data at the COD. |
A comment arising from a discussion I just had with @CasperWA: how do we see this endpoint interacting with the proposed #364 endpoint? In my mind, this endpoint already fulfills many of the criteria we discussed at the workshop for Do we feel that this is overloading/missing the point of the current |
@ml-evs, @CasperWA: I gave #364 a glance, and to me Additionally, |
During the discussions about the trajectories, a point was raised that it could be a good idea to include data about the "role" of a file. I was wondering whether you prefer to discuss this as part of the Files endpoint PR or you would prefer if I open a seperate Issue for this. |
@JPBergsma I see the "role" as a relationship attribute. Thus I think "role" belongs in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
In #211 the need to provide links to files from entries has been discussed. @rartino proposed introducing
files
endpoint as the most universal way to deal with files, recycling JSON API relationships to provide links between files and other types of entries.Differences from suggestions in #211:
url
attribute is merely a string instead of JSON API links object, as I cannot think of a use case where the links object would be of use;files
. This could be done later on in other PRs.Closes #211. Addresses the original intent of #343 superseding it.
Edit:
url
redefined as JSON API links object.